Tuesday, February 24, 2009

Linux SA Notes

ADD STATIC ROUTE IN LINUX
# Add static route to netbackup subnet
route add -net 172.24.16.0 netmask 255.255.252.0 gw 172.24.69.1

CHECK INTERFACE STATUS
# mii-tool
eth0: 100 Mbit, full duplex, link ok
eth1: no autonegotiation, 100baseTx-HD, link ok
eth2: 100 Mbit, full duplex, link ok
eth3: no link
eth4: 100 Mbit, full duplex, no link
eth5: no link
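
On a host with many ports it can help to list only the interfaces that report no link. A minimal sketch, assuming mii-tool output in the same format as the listing above; the function name no_link_ifaces is made up for illustration.

```shell
#!/bin/sh
# Print the names of interfaces that mii-tool reports as having no link.
# Reads mii-tool style output on stdin (format assumed to match the
# listing above: "ethN: ..., no link").
no_link_ifaces() {
    awk '/no link/ { sub(/:$/, "", $1); print $1 }'
}

# Typical use on a live system (needs root):
#   mii-tool | no_link_ifaces
```

Matching on the literal text "no link" keeps lines such as "no autonegotiation ... link ok" out of the result.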

# ethtool eth0 (replace eth0 with the interface to check)


CHECK STATUS OF INTERFACE ON BOND
# cat /proc/net/bonding/bond0
# cat /etc/modprobe.conf


NTP

- stop ntpd
# service ntpd stop
# ntpdate -d hostname
# vi /etc/ntp/step-tickers (to include all the ntp servers) cross check with tokrs00001
# service ntpd start
# ntpq -pn

NIS

NIS client setup
# vi /etc/sysconfig/network
NISDOMAIN=sea.rbsfm.com
# vi /etc/yp.conf
domain sea.rbsfm.com server hkg8082xus
domain sea.rbsfm.com server hkg1121xus
domain sea.rbsfm.com server hkg0024ous
domain sea.rbsfm.com server hkg0025ous
# vi /etc/hosts
#NIS servers
175.3.208.23 hkg0023ous
175.3.220.82 hkg8082xus
- check /etc/nsswitch.conf
passwd: files nis
shadow: files nis
group: files nis
- make sure portmap is running
# service portmap start
# chkconfig portmap on
- start ypbind
# service ypbind start
# chkconfig ypbind on
- test it
# ypcat passwd
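
If ypcat works but NIS users still cannot log in, the nsswitch.conf entries are worth a second look. The sketch below reports which of the passwd, shadow and group databases do not list nis as a source. The function name missing_nis is hypothetical, and the check assumes the usual "database: source source" layout.

```shell
#!/bin/sh
# Report which of passwd, shadow and group lack "nis" in an
# nsswitch.conf-style file. Prints one database name per line and
# prints nothing when all three include nis.
# $1 = file to check (defaults to /etc/nsswitch.conf)
missing_nis() {
    conf=${1:-/etc/nsswitch.conf}
    for db in passwd shadow group; do
        # "nis" must appear as a whole word after the database name
        grep -Eq "^${db}:.*[[:space:]]nis([[:space:]]|\$)" "$conf" || echo "$db"
    done
}

# Usage: missing_nis            (checks /etc/nsswitch.conf)
```

Database names are matched in lowercase at the start of the line, so commented-out entries and sources such as nisplus do not count.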



MOUNT CDROM

# mount -t iso9660 -o ro /dev/cdrom /mnt (or any mount point)
To check whether the driver is loaded properly, type:
# lsmod

MOUNT USB
# fdisk -l
# mount /dev/sdb1 /mnt

EMCGRAB for Linux
# mount lon0583xus:/apps/packages/ /apps
# cd /apps/EMC/EMC_Grabs/Linux
# mkdir -p /tmp/emc
# cp emcgrab_Linux_v3.9_1.tar /tmp/emc
# cd /tmp/emc
# tar xvf emcgrab_Linux_v3.9_1.tar
# cd emcgrab
# ./emcgrab.sh

CHANGE FILE ATTRIBUTE
# chattr +i /etc/hosts.allow
# chattr -i /etc/hosts.allow (remove the immutable flag so the file can be saved)


GET CPU INFO
# egrep "processor|MHz|model name" /proc/cpuinfo

GET SHARED MEMORY AND SEMAPHORES
# sysctl -a | grep shm >> for shared memory

# sysctl -a | grep -i sem >> for semaphores

CHECK INCOMING NETWORK PACKETS
# lsof -i | grep 9495
# netstat -an | grep 9495

CHECK ADAPTER
# lspci|grep -i ql

CHECK RUNNING PROCESS NOT OWNED BY ROOT
# ps -ef | grep -iv root
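
Note that grep -iv root drops every line that mentions "root" anywhere, including a non-root process whose command path happens to contain it. A small sketch that filters on the owner column instead; the function name non_root_procs is made up for illustration.

```shell
#!/bin/sh
# List processes whose owner (first column of `ps -ef`) is not root.
# Reads `ps -ef` output on stdin; the header line is skipped.
non_root_procs() {
    awk 'NR > 1 && $1 != "root"'
}

# Usage: ps -ef | non_root_procs
```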

HD CANNOT BE DETECTED
- reboot machine
- press F8 to go to Smart Array configuration menu
- create a new raid
- exit to reboot
- fdisk -l (list current partitions)
- create a new partition
# fdisk /dev/cciss/c0d2 (format to create new partition)
- create filesystem on the new partition
# mke2fs -j /dev/cciss/c0d2p1 (the partition created above, not the whole disk)
- mount the partition
# mount -t ext3 /dev/cciss/c0d2p1 /export1

CHECK LIST OF GDS PRINTERS
# ypcat -k printers.conf.byname|grep sin


ADD A NEW GDS PRINTER
-check if a similar printer exists
# ypcat -k printers.conf.byname|grep sin

-check if there are running printer jobs
# lpstat -o (sin2111xxp)

-if there is a running job, cancel it
# cancel sin2111xxp-206

-go to /opt/hpnpl
# cd /opt/hpnpl
# ./hppi
- select 1 for Spooler Administration
- select 1 to add printer

RSYNC
# rsync -av -e ssh lon0261xus:/var/satellite/esxbuild /export1/satellite/ --progress --stats

CHECK BIOS
# hpasmcli -s "show server"

CHECK FANS and REDUNDANT PSUs
# hplog -f
# hplog -p

CHECK HP SIM
# hplog -v

REDUCE LINUX CACHED MEMORY

# grep ^Cached: /proc/meminfo

# dd if=/dev/zero bs=1024k count= of=dummy.tmp

# grep ^Cached: /proc/meminfo


ERROR/s Encountered


ERROR: Unable to communicate with the RILOE II/iLO.

RESOLUTION:
To restore communication with the HP Lights-Out Online Configuration Utility for Linux, perform the following:

Stop the HP SNMP Agents (hp-snmp-agents) by typing the following command:
# /etc/init.d/hp-snmp-agents stop

Restart HP ProLiant Channel Interface Device Driver for iLO / iLO 2 (hp-ilo) by typing the following command:
# /etc/init.d/hp-ilo restart

Start the HP SNMP Agents (hp-snmp-agents) by typing the following command:
# /etc/init.d/hp-snmp-agents start

ERROR: User does not exist (NIS related error)

RESOLUTION:
Put a plus sign (+) in front of the userid in the /etc/passwd and /etc/shadow files
e.g. on /etc/passwd
+@all-all-users-tc-default-users
+bradmark
+@all-all-users-tc-oracle-dba

Solaris SA Notes

ADD STATIC ROUTE IN UNIX SOLARIS
#route add net 28.133.0.0/18 172.24.105.1
-make the route persist across reboots-
#vi /etc/rc2.d/S69static_routes

ODS

# metastat|grep -i maint
# metastat -p (to check the controller where metadevices are originally mounted)
# format (to check the current controller)
# metastat |grep -i replace (to check the metadevices to be replaced)
(and then do the replacement)
# metareplace d0 c2t0d0s0 c1t0d0s0 (note that c1 is the new controller)
# metadb -a -c3 /dev/dsk/c1t0d0s3 (create meta database for the new controller c1)
# metastat | grep resync
# while true
>do
>metastat | grep resync
>sleep 1
>done
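
The loop above runs until it is interrupted by hand. A variant that stops by itself once no resync is reported could look like the sketch below. The function name wait_for_resync and the POLL_SECS variable are assumptions for illustration, not part of Solaris Volume Manager.

```shell
#!/bin/sh
# Poll a status command until its output no longer mentions "resync".
# Matching lines are printed on every pass, like the loop above.
# POLL_SECS (hypothetical knob, default 10) sets the delay between checks.
wait_for_resync() {
    while "$@" 2>/dev/null | grep -i resync; do
        sleep "${POLL_SECS:-10}"
    done
    echo "no resync in progress"
}

# Usage: wait_for_resync metastat
```

Passing the status command as an argument keeps the function usable with metastat on a specific metadevice, for example wait_for_resync metastat d0.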


RSH

Enable rsh in Solaris 10
# svcadm enable svc:/network/login:rlogin
# svcadm enable svc:/network/shell:default

Enable rsh in solaris 8
# vi /etc/inetd.conf
- uncomment
shell stream tcp nowait root /usr/sbin/in.rshd in.rshd
- restart inetd daemon
# pkill -HUP inetd

AUTOFS

- check the files /etc/auto_master and /etc/auto.dumps
- once changes are made, you need to run 'automount -v' to pick up the config files
- on the server, share the filesystem
# share -F nfs -o ro /dumps
- edit /etc/dfs/dfstab and add the same share line so it persists across reboots
share -F nfs -o ro /dumps
- restart nfs
# svcadm restart nfs/server
# dfshares or exportfs (to verify nfs shares)

NFS

- mount the dir
# mount -F nfs hkgfiler2:/remedy /opt/ar/directa

- edit /etc/vfstab to mount it upon boot up
hkgfiler2:/remedy - /opt/ar/directa nfs yes rw,soft



NTP

# svcadm disable ntp
# cp /etc/inet/ntp.client /etc/inet/ntp.conf (and type-in ntp servers)
e.g.

server hkgba00001
server hkgba00002

#ntpdate -d
#svcadm enable svc:/network/ntp

DU

Check for Top directory users
# du -sk /export/home/* | sort -nr

TAR

Un-tar tar.gz file IN SOLARIS
# gunzip -c file_name.tar.gz | tar xvf -
or

# gzip -d snap_sol10.tar.gz
# tar xf snap_sol10.tar


Create tar file
# tar cvf dir.tar dir

Create .gz
# gzip dir.tar


HBA

Check HBA info:
# fcinfo hba-port

NIC

Check network interface
# cat /etc/path_to_inst
# grep bge /etc/path_to_inst
"/pci@1f,700000/network@2" 0 "bge"
"/pci@1f,700000/network@2,1" 1 "bge"
"/pci@1d,700000/network@2" 2 "bge"
"/pci@1d,700000/network@2,1" 3 "bge"

Change network interface to auto neg “on”
# vi /platform/sun4u/kernel/drv/bge.conf

Check for network collision
# netstat -i
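
For the path_to_inst check earlier in this section, the instance number in the second column combines with the driver name to give the interface name (bge0, bge1, ...). A minimal sketch that prints that mapping, assuming the three-column path_to_inst layout shown in the grep output; the function name driver_instances is hypothetical.

```shell
#!/bin/sh
# Map driver instances to device paths from path_to_inst-style input.
# $1 = driver name (e.g. bge); input is read from stdin.
# Output: "<driver><instance> <device path>", one per line.
driver_instances() {
    awk -v drv="$1" '$3 == ("\"" drv "\"") {
        gsub(/"/, "", $1)      # strip the quotes around the device path
        print drv $2, $1
    }'
}

# Usage: driver_instances bge < /etc/path_to_inst
```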

EEPROM

Disable system check during reboot
# eeprom "diag-switch?=false"


PKGADD

To add a package
# pkgadd -d .

REXEC

Disable rexec in Solaris 8
# vi /etc/inetd.conf
- comment out
exec stream tcp nowait root /usr/sbin/tcpd in.rexecd
exec stream tcp6 nowait root /usr/sbin/tcpd in.rexecd

- then restart inetd daemon
# pkill -HUP inetd

SWAP

- create local swap area - in kilobytes(k), blocks(b) or megabytes (m)
# mkfile 1024m /new_swap (create 1GB swap)
- activate the swap area
# swap -a /new_swap
- verify
# swap -l

LOCALE

- set system-wide locale
# vi /etc/TIMEZONE

- set user locale
# vi .profile

LVM (Logical Volume Management)

- install lvm software
- partition disks
# fdisk partition (e.g. /dev/sdb1)

- create physical volumes
# pvcreate /dev/sdb1
# pvcreate /dev/sdc1
# pvcreate /dev/sdd1
# pvscan

- create volume group and make it available
# vgcreate /dev/sdb1 /dev/sdc1 /dev/sdd1
# vgchange -a y
# vgdisplay
# lvcreate --size 3493G --name
# lvdisplay
# mount /dev/

- rename a logical volume group
# vgdisplay -v
# vgchange -a n /dev/
# vgrename /dev/ /dev/
# vgchange -a y /dev/
# vgdisplay -v
# vgdisplay -v ext

- rename a logical volume
# lvdisplay /dev/ext/
# lvrename /dev/ext/ /dev/ext/
# lvdisplay /dev/ext/

- make a filesystem on the disk
# mkfs.xfs -f /dev/fhome/fanhome
# mount /dev/fhome/fanhome /home
# mount -t xfs /dev/ext/nfsdata /mnt

- add a new drive to a volume group
# vgdisplay -v (check active disks)
# fdisk to create partition
# umount /bak/backups (filesystem to extend)
# pvcreate /dev/sdf1 (prepare new partition)
# vgextend bak /dev/sdf1
# lvresize --size 5.999T /dev/shome/home (increase size)
# mount /dev/bak/backups /bak/backups
# xfs_growfs /bak/backups


ERROR/s Encountered

Error: SOLARIS box running slow

Resolution:
# mpstat 5 5 (check usr,sys usage)
# prstat -s cpu -n 5 (list by cpu usage and top 5 processes)
# prstat -s cpu -a -n 5 (list usage per user)


Error: Veritas config daemon vxconfigd not running

Resolution:
# modinfo|grep vx
If DMP is enabled, /etc/system will contain a forceload entry for the "vxdmp" driver
# grep vxdmp /etc/system
forceload: drv/vxdmp

# vxdctl mode
mode: enabled

Next, run vxinstall or do a vxconfigd -k -m enable

Error: Timed out while waiting for NIS to come up

Resolution:

- logon to console
- do a “send brk” to get to “ok prompt”
- boot to single user mode
ok> boot -s
- temporarily disable RPC service or remove /etc/defaultdomain to prevent NIS from starting at system boot
# mv /etc/rc2.d/S71rpc /etc/rc2.d/NOS71rpc or
# rm /etc/defaultdomain
# or exit to continue booting to default runlevel




Error: “Umount: I/O error” or “umount: cannot unmount /mount_point”

Resolution:

# fuser -c /file_system
# kill -9 process_id (from the above fuser command)
# lsof +D /file_system
# umount -f /file_system
# mount | grep /file_system

Error: No directory – logged on with (/) directory

Resolution:

- restart autofs
# /etc/init.d/autofs stop; /etc/init.d/autofs start
- re-login


Error: Media Error (hard disk with bad sectors)

Resolution:

# format
- select the disk to read test
> anal (select analyze)
> read

Friday, January 16, 2009

Replace failed disk in Veritas

To replace a failed disk in Veritas, please follow the below procedure.

1. Check the failed disk using the command 'vxdisk list'

2. Run the 'format' command to see whether the disk is 'offline' or 'not responding to selection'.

3. Log a service call to hardware vendor.

4. Remove the failed disk from volume manager control using the below commands.

a. Run 'vxdiskadm' as root.

b. choose option 4: Remove a disk for replacement

c. Choose the logical name corresponding to the disk that has failed (for ex. data02)

5. Get the disk replaced by the vendor.

6. Make sure the disk appears fine in the format command(no need to do any partition).

7. Run 'vxdctl enable' so that vxconfigd detects the replaced device

8. Run 'vxdiskadm' command again and follow the below steps.

a. Choose option:5 Replace a failed or removed disk.

b. Choose the disk that was removed in step 4c (for ex. data02).

c. Choose the device corresponding to the logical name(for ex. c1t10d0)

d. Say no to 'encapsulate' and choose okay to initialise the disk to replace the failed one.

e. Accept default (no - option) for FMR plex resync option

f. Once "completed successfully" appears at the prompt, exit vxdiskadm

9. Check the disks are online by running 'vxdisk list'.


vxprint -ht

Moving hot-relocated subdisk back to disk

# vxdiskadm

Choose option 14


Move hot-relocated subdisks back to a disk
Menu: VolumeManager/Disk/UnrelocateDisk
Use this operation to move subdisks which were hot-relocated back
onto the original disk that has been replaced due to a disk failure.
This operation takes, as input, the original disk name. If the
failed drive was replaced with a disk using a different name, this
operation also provides an option to specify the new name.
Enter the original disk name [,list,q,?] list
datadg0211
datadg03

Enter the original disk name [,list,q,?] datadg0211
Unrelocate to a new disk [y,n,q,?] (default: n)
Requested operation is to move all the subdisks which were hot-relocated
from datadg0211 back to datadg0211 of disk group datadg02.
Continue with operation? [y,n,q,?] (default: y)
Use -f option to unrelocate the subdisks if moving to the exact offset fails?
[y,n,q,?] (default: n)

Thursday, January 15, 2009

Go to ok prompt from ILOM of Sun T5120

Follow below procedure to get to "ok" prompt from ILOM.

1. ssh to ILOM hostname


2. From the ILOM prompt, type the commands below.

-> set /HOST send_break_action=break

-> start /SP/console (to get to the ok prompt)


Manual system reset from the ILOM prompt.

-> set /HOST/bootmode script="setenv auto-boot? false"

-> reset /SYS

SUN M4000 ALOM

Sun's M4000 server has a new management interface called XSCF. It's different from the usual "sc" (ALOM) of some low-end servers.


Log on to the XSCF of the Sun M4000:

# ssh or
username: eis-installer or your_username
password: password


To connect to console:

XSCF> console -d 0

If somebody is already using the console, you can force connect

XSCF> console -d 0 -f

To go back to XSCF prompt:

type "#." (without the quotes)

To reset the server/domain:

XSCF> reset -d 0 por [resets domain 0]
XSCF> reset -d 0 xir [resets domain 0 with XIR reset]

To send break:

XSCF> sendbreak -d 0

To reboot XSCF system:

XSCF> rebootxscf


Other commands below:


XSCF> showstatus
XSCF> showversion -c xcp -v [shows xcp firmware, version, openboot prom version]
XSCF> showenvironment
XSCF> showenvironment temp
XSCF> showenvironment volt
XSCF> showhardconf
XSCF> showdcl -va [check domain id...]
XSCF> showdomainstatus -a
XSCF> showboards -a
XSCF> poweron -a [powers up all domains]
XSCF> poweroff -a [powers off all domains]
XSCF> poweron -d 0 [powers on domain 0]
XSCF> poweroff -d 0 [powers off domain 0]
XSCF> poweroff -f -d 0 [forces a power off domain 0]
XSCF> sendbreak -d 0 [sends break command to domain 0]
XSCF> setautologout -s 60 [sets autologout to 60 minutes]
XSCF> showautologout
XSCF> shownetwork -a
XSCF> setnetwork xscf#0-lan#0 -m 255.255.255.0 10.10.10.5
XSCF> sethostname xscf#0 fire-xscf
XSCF> sethostname -h host.org
XSCF> setroute -h host.org
XSCF> setnameserver 10.10.10.2 10.10.10.3
XSCF> setroute -c add -n 10.10.10.1 -m 255.255.255.0 xscf#0-lan#0

To add 2 additional memory boards:

XSCF> addboard -c assign -d 0 00-2
XSCF> addboard -c assign -d 1 00-3

XSCF> showboards -va

Veritas Netbackup Client Installation

The Netbackup client needs to be installed on every server that you want to back up.

For Linux

1. Check if the netbackup client is already installed

# rpm -qa | grep nbu

2. If it is not installed yet, install the netbackup client rpm package

# rpm -ivh SYSnbuc-6.0-4.i386.rpm (or the name of the package)

3. Edit bp.conf

# cd /usr/openv/netbackup
# vi bp.conf (remove everything and put your backup server's hostname)

SERVER = hostname

4. Create exclude_list file. This will exclude the specified file or dir from the backup. See example below.
[root@hostname] cat exclude_list
#Sparse file that can take 45 mins to process.
/var/log/lastlog
#2.6 kernel introduces the /sys tree which cannot be backed up
/sys

5. Check route of client to netbackup server

# netstat -rn (check routing table of the client)
If the route from the client to the backup server doesn't exist, create the static route, example below.

# route add -net 172.24.16.0 netmask 255.255.252.0 gw 172.24.27.1

Add this command to /etc/rc.local so that the route is added during boot up.

# ssh backup_server (connect to backup server via ssh)
# ping backup_client
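
The route check in step 5 can be scripted so that the static route is added only when it is missing. A minimal sketch, assuming `netstat -rn` prints the destination network in the first column; the function name has_route is made up for illustration.

```shell
#!/bin/sh
# Succeed (exit 0) if a routing table read from stdin has an entry for
# destination $1. Expects `netstat -rn` style rows with the destination
# in the first column.
has_route() {
    awk -v dest="$1" '$1 == dest { found = 1 } END { exit !found }'
}

# Usage on the client (addresses taken from the example above):
#   netstat -rn | has_route 172.24.16.0 ||
#       route add -net 172.24.16.0 netmask 255.255.252.0 gw 172.24.27.1
```

The comparison is an exact match on the destination field, so it does not check the netmask or the gateway.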


For Sun Solaris 8/10

1. Check whether the netbackup client is already installed - look for /opt/openv or /usr/openv
2. If neither exists, install the package
# pkgadd -d . netbackcl

3. Edit bp.conf
# cd /usr/openv/netbackup/
# vi bp.conf (remove everything and put below entry)
--

SERVER = hostname
---

4. create exclude_list file

For Solaris 8:

/proc/
/tmp/
/cdrom/
core
/home/
/apps/
/builds/
/patches/
/packages/

For Solaris 10:

#Start of OS standard excludes
/*arch*/
/*ora*dump*/
/ldoms/*
/oracrs*/*
/oravote*/*
/tmp/*
/var/SUNWsrspx/SRSQueueStore/store/.free
/var/SUNWsrspx/SUNWsrspx/SRSQueueStore/store/.free
/var/crash/*
/var/opt/SUNWsrspx/SRSQueueStore/store/.free
/var/tmp/*
#End of OS standard excludes
#Host specific excludes below

Building Sun Fire X4150 x86 machine

Building an X4150 machine is a little bit different from building Sun Fire V-series machines. To build this machine, you need to follow the steps below:

1. Configure the ILOM (Integrated Lights-Out-Management) service to gain remote access of the server. Sun engineer can help in setting this one up. You may also ask the engineer to enable the web gui, usually https://ILOM_hostname or https://ILOM_ipaddress

2. Connect to the server via ssh or telnet (depends on what has been enabled)

# ssh ILOM hostname / IP address

or via web GUI, https://ILOM_hostname

3. Launch server console by clicking "Launch Redirection" button under "Remote Control" tab

4. Reset the machine to configure BIOS and RAID. Click "Remote Power Control" tab and then select "Reset" in Power Control field and then click "Save".

5. In BIOS configuration menu, go to Server to check the NIC's mac addresses (you'll be needing this on jumpstart process)

6. Go to Boot and then select Boot Device Priority. Select "USB:Virtual DVD/CD" as the "1st Boot Device" and the "RAID disk" as the "2nd Boot Device". Save and Exit BIOS.

7. Configure Jumpstart to create an initial boot ISO.

8. Mount the iso image. From the ILOM remote console, Select "Devices" and then click on "CD-ROM image" and then select the iso file.

9. Proceed with the Jumpstart.

Note: Different errors might occur during jumpstart, like a Disk not found error. For this one you need to check if RAID has been configured. For ....... error (hehehe) you may want to check your tftp package or check your network connection.

AutoSys Admin Commands

A collection of autosys admin commands that I use to manage my company's Autosys Infrastructure.

First setup aliases to make your life easier!

Below is how you setup aliases in C shell. If you're using a different shell, then, RTFM!

Add the aliases in your profile

# Send Event
alias se sendevent -E

# Start Job
alias fsj sendevent -E FORCE_STARTJOB -J
alias sj sendevent -E STARTJOB -J

# Job Report
alias jr autorep -J

# Machine Report
alias mr autorep -M

Then, just do the following commands:

To view job details,
# jr jobname -q

To view job full name and box job
# jr jobname -w

To view job per page
# jr nbu.tok.prd% | pg

To force start a job
# fsj jobname

To ICE a job
# sendevent -E JOB_ON_ICE -J jobname

To un-ICE a job
# sendevent -E JOB_OFF_ICE -J jobname

To kill a job
# sendevent -E KILLJOB -J jobname

To mark job as success
# sendevent -E CHANGE_STATUS -s SUCCESS -J jobname

To mark job as terminated
# sendevent -E CHANGE_STATUS -s TERMINATED -J jobname

To check the job details
# jr jobname -d

To delete a job
# cat job.jil
delete_job: jobname
# jil < job.jil

To update a job
# cat job.jil
update_job: jobname
description: "New Description"
# jil < job.jil

To setup a backup job
# jr existing_job -q > newjob.jil (copy existing job)
# vi newjob.jil (edit entries appropriately)
# jil < newjob.jil (load the job)
# sendevent -E JOB_ON_ICE -J jobname (ice the job)
# fsj jobname (force start the job)

To rename a job
# jr old_job_name -q > new_job_name.jil
# vi new_job_name.jil (replace the old name with the new one)
# save the file
# jil < new_job_name.jil
# delete old_job_name, by doing
# jil
delete_job: old_job_name (enter)
press ctrl-d
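
The vi step of the rename can be scripted for simple jobs. The sketch below rewrites only the name on the insert_job line and assumes the dumped JIL puts "insert_job:" and the job name as the first two fields of that line. The function name rename_jil is made up for illustration. References to the old name elsewhere (box_name, condition) are not touched and still need a manual check.

```shell
#!/bin/sh
# Rewrite the job name on the insert_job line of a JIL definition.
# $1 = old job name, $2 = new job name; JIL is read from stdin.
# Only the insert_job line is changed; spacing on that line is
# collapsed to single spaces, all other lines pass through unchanged.
rename_jil() {
    awk -v old="$1" -v new="$2" '
        $1 == "insert_job:" && $2 == old { $2 = new }
        { print }'
}

# Usage:
#   autorep -J old_job_name -q | rename_jil old_job_name new_job_name > new_job_name.jil
```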

That's it!!!