Vision: 12/1/10

Sunday, December 5, 2010

Boot Disk Mirroring Using Solaris Volume Manager Software

Applicable OS Version: Solaris 9 Operating System (OS), Solaris 8 OS with Solstice DiskSuite 4.2.1 Software with Patch 108693-06 (SPARC Platform Edition)

Note: I do not guarantee that this will work as is for everyone. Please tweak as needed.

The controller and target numbers used in the following steps are arbitrary examples. They will vary from host to host.

Also, it's a good idea to mirror across controllers instead of mirroring across the same controller and still having the controller as a single point of failure.

1) Important precaution:

Copy /etc/vfstab and /etc/system before you go ahead:

cp -p /etc/system /etc/system.orig."date"

cp -p /etc/vfstab /etc/vfstab.orig."date"

In case /etc/system gets messed up, we can still boot with boot -a from the OK prompt and, when asked for the name of the system file, specify:

/etc/system.orig."date"
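The "date" in the file names above is a placeholder for a date stamp. Here is a minimal sketch of one way to fill it in, assuming a YYYYMMDD stamp is what you want. backup_with_date is a hypothetical helper name of mine, not a Solaris command, and it uses backticks so that it also runs under the older /bin/sh.

```shell
# Hypothetical helper (not a Solaris command): copy FILE to FILE.orig.YYYYMMDD,
# preserving mode and timestamps, and print the name of the backup.
backup_with_date() {
    src=$1
    stamp=`date +%Y%m%d`          # for example 20101205
    cp -p "$src" "$src.orig.$stamp" && echo "$src.orig.$stamp"
}

# Usage on the host, as root:
#   backup_with_date /etc/system
#   backup_with_date /etc/vfstab
```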

2) Make sure that you have an extra disk to mirror the root disk and there is no data on it.

3) Create a small slice of 25 Mbyte (10 Mbyte is also fine) for storing volume databases on the "rootdisk" and label the disk.

If you don't have any space on your root disk, create a small slice by deleting and re-adding swap space.

Make sure that there is not a lot of activity on the box while you do this.

3.1) To list your swap, use: swap -l

(It's good if you have more than one slice configured as swap.)

3.2) Execute:

swap -d swap-name ( /dev/dsk/c?t?d?s?)

Change your partition table to incorporate a new slice by reducing the size or cylinder length of the swap partition.

3.3) Execute:

swap -a swap-name ( /dev/dsk/c?t?d?s?)

4) The VTOC (volume table of contents) on the root disk and root mirror must be the same. Copy the VTOC using prtvtoc and fmthard.

# prtvtoc /dev/rdsk/c?t?d?s2 | fmthard -s - /dev/rdsk/c?t?d?s2
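After copying the VTOC it may be worth checking that the two tables really match. The sketch below compares two saved prtvtoc listings and ignores the comment lines that start with "*", since those include the device name. same_vtoc and the file names are my own assumptions, not part of Solaris.

```shell
# Hypothetical helper: succeed if two saved prtvtoc listings describe the same
# partition table. Lines starting with "*" are prtvtoc comments (they include
# the device name, so they always differ) and are ignored.
same_vtoc() {
    a=/tmp/vtoc_a.$$
    b=/tmp/vtoc_b.$$
    grep -v '^\*' "$1" > "$a"
    grep -v '^\*' "$2" > "$b"
    cmp -s "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Usage on the host, with example device names:
#   prtvtoc /dev/rdsk/c1t1d0s2 > /tmp/rootdisk.vtoc
#   prtvtoc /dev/rdsk/c1t0d0s2 > /tmp/rootmirror.vtoc
#   same_vtoc /tmp/rootdisk.vtoc /tmp/rootmirror.vtoc && echo "VTOCs match"
```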

5) Create metadatabases on the small slice created on rootdisk:

# metadb -f -a -c3 c?t?d?s6 (Slice 6 is my small slice here)
# metadb -a -c3 c?t?d?s6 (Slice 6 on rootmirror)

6) Now we can create a mirror for each and every slice in the partition table.

For root or / partition:

# metainit -f d10 1 1 c?t?d?s? 

# metainit d20 1 1 c?t?d?s?

(create a md d0 and attach one submirror)
# metainit d0 -m d10 

(set up system files for root (/) metadevice, that is, 
     changes to /etc/system and /etc/vfstab)
# metaroot d0 

# lockfs -fa (flush all mounted UFS file systems so the on-disk state is current)

7) The naming convention for the other metadevices follows. (Note for those who are new to this software: We do not run the metaroot and lockfs steps for the other file systems.)

The submirrors will be named d10, d20, and so on.

In d10, 1 is the submirror number, and 0 is the slice number.

If we have swap on partition/slice 1, we would do this:

# metainit -f d11 1 1 c?t?d?s1

# metainit d21 1 1 c?t?d?s1

# metainit d1 -m d11

8) Repeat for every file system on your boot disk.
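To make the repetition less error-prone, the naming convention from step 7 can be turned into a small dry-run generator. This is only a sketch: print_metainit is a hypothetical helper, the disk names c1t1d0 and c1t0d0 are examples, and it prints the commands instead of running them so you can review them first.

```shell
# Hypothetical helper: print (not run) the metainit commands for one slice,
# following the d<submirror><slice> convention: d1N and d2N are the
# submirrors of mirror dN.
print_metainit() {
    rootdisk=$1     # example: c1t1d0
    mirrordisk=$2   # example: c1t0d0
    slice=$3        # slice number, 0-7
    echo "metainit -f d1${slice} 1 1 ${rootdisk}s${slice}"
    echo "metainit d2${slice} 1 1 ${mirrordisk}s${slice}"
    echo "metainit d${slice} -m d1${slice}"
}

# Swap on slice 1, as in step 7:
print_metainit c1t1d0 c1t0d0 1
```

For slice 1 this prints the same three commands shown in step 7, with the example disk names filled in.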

9) Make changes to your /etc/vfstab. The md entry for root will already be updated by the metaroot command.

A sample copy of /etc/vfstab looks like this:

#device              device              mount         FS     fsck  mount    mount
#to mount            to fsck             point         type   pass  at boot  options
#
fd                   -                   /dev/fd       fd     -     no       -
/proc                -                   /proc         proc   -     no       -
##/dev/dsk/c1t1d0s1  -                   -             swap   -     no       -
/dev/md/dsk/d1       -                   -             swap   -     no       -
/dev/md/dsk/d0       /dev/md/rdsk/d0     /             ufs    1     no       -
##/dev/dsk/c1t1d0s7  /dev/rdsk/c1t1d0s7  /export/home  ufs    2     yes      -
/dev/md/dsk/d7       /dev/md/rdsk/d7     /export/home  ufs    2     yes      -
##/dev/dsk/c1t1d0s3  /dev/rdsk/c1t1d0s3  /opt/uc4      ufs    2     yes      -
/dev/md/dsk/d3       /dev/md/rdsk/d3     /opt/uc4      ufs    2     yes      -
swap                 -                   /tmp          tmpfs  -     yes      -

10) Configure your dump device using dumpadm.

11) Make the following entry in the /etc/system file, in the mdd info section:

set md:mirrored_root_flag=1

When the root disk becomes unavailable, the database copies stored on the root disk are also unavailable.

Solaris Volume Manager software expects more than 50 percent of the databases to be available to boot up normally or else it may complain about the insufficient number of database replicas. The preceding change is made in order for Solaris Volume Manager software to boot up with at least 50 percent of the copies.
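The arithmetic behind this is easy to check. Step 5 put three replicas on each disk, so losing one disk leaves exactly half of them, which is not "more than 50 percent". The sketch below only illustrates that rule as described above. replica_quorum is a hypothetical helper, and it needs a POSIX shell such as ksh or bash for the $(( )) arithmetic.

```shell
# Hypothetical helper: classify how many state database replicas survive.
#   $1 = total replicas configured, $2 = replicas still available
replica_quorum() {
    total=$1
    doubled=$(( $2 * 2 ))
    if [ "$doubled" -gt "$total" ]; then
        echo "majority"        # more than 50 percent: normal boot
    elif [ "$doubled" -eq "$total" ]; then
        echo "exactly-half"    # 50 percent: the case md:mirrored_root_flag=1 covers
    else
        echo "below-half"      # less than 50 percent
    fi
}

# Three replicas per disk (-c3 in step 5), six in total.
replica_quorum 6 6    # both disks present
replica_quorum 6 3    # one disk lost
```

The first call prints "majority" and the second prints "exactly-half", which is the situation the /etc/system entry is there for.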

12) Execute:

sync; sync; init 6

13) Once the system comes up, attach the other submirror:

# metattach d0 d20

(Note: It's "metattach" and not "metaattach")

# metattach d1 d21

and so on.

14) To see whether the FS syncing is done or not, do this:

metastat | grep progress

15) Determine the device path to the boot devices for both the primary and mirror:

ls -l /dev/dsk/c1t1d0s0 /dev/dsk/c1t0d0s0
lrwxrwxrwx 1 root root 43 Dec 23 17:51 /dev/dsk/c1t0d0s0 -> \
   ../../devices/pci@1c,600000/scsi@2/sd@0,0:a
lrwxrwxrwx 1 root root 43 Dec 23 17:51 /dev/dsk/c1t1d0s0 -> \
   ../../devices/pci@1c,600000/scsi@2/sd@1,0:a

# eeprom "nvramrc=devalias rootdisk /pci@1c,600000/scsi@2/disk@1,0 
devalias rootmirror /pci@1c,600000/scsi@2/disk@0,0"

(Please note that "sd" in the ls -l output becomes "disk" in the device alias, and the trailing ":a" is dropped.)

# eeprom "use-nvramrc?=true"

You can also change the boot-device value so that the system tries to boot from the mirror if the root disk is not available.

# eeprom boot-device="rootdisk rootmirror net"
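The conversion from the ls -l symlink target to the path used in the devalias lines can be sketched with sed. to_obp_path is a hypothetical helper name, and it only handles the "sd" case shown in this step. Other disk drivers may need a different substitution.

```shell
# Hypothetical helper: turn the symlink target printed by ls -l into the
# device path used in the devalias lines. It strips everything up to
# /devices, renames sd@ to disk@, and drops the trailing slice letter.
to_obp_path() {
    echo "$1" | sed -e 's|^.*/devices/|/|' -e 's|/sd@|/disk@|' -e 's|:[a-h]$||'
}

to_obp_path '../../devices/pci@1c,600000/scsi@2/sd@1,0:a'
```

With the target for c1t1d0s0 from the listing above, this prints /pci@1c,600000/scsi@2/disk@1,0, which matches the rootdisk alias.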

16) Once the syncing is complete, test your system by removing the root disk.

Recovering a Bad Sector Disk on the Solaris 9 OS

A disk can start causing trouble if it has bad sectors. We can try to verify and repair the defective sectors. For example, the following message shows that block 100 is defective:

WARNING: /io-unit@f,e0200000...
   Error for command 'read' Error Level: Retryable
   Requested Block 243, Error Block 100
   Sense Key: Media Error
   Vendor ...
   ASC = 0x11 (unrecovered read error) ...

We can try to take corrective action by performing a surface scan analysis. First we need to unmount all slices on the defective disk and then invoke the format utility. (Note: This example shows only s0 mounted on target 2.)

# umount /dev/dsk/c0t2d0s0 
# format

When we are asked to select the disk, provide the number:

Specify disk (enter its number): 1
selecting c0t2d0:
[disk formatted]
Warning: Current Disk has mounted partitions.

Now we should invoke the analyze menu and provide the parameters as asked:

format> analyze
analyze> setup
Analyze entire disk [yes]? n
Enter starting block number [0, 0/0/0]: enter start block
Enter ending block number [2052287, 2035/13/71]: enter end block
Loop continuously [no]: y
Repair defective blocks [yes]: n
Stop after first error [no]: n
Use random bit patterns [no]: n
Enter number of blocks per transfer [126, 0/1/54]: 1
Verify media after formatting [yes]: y
Enable extended messages [no]: n
Restore defect list [yes]: y
Create defect label [yes]: y

analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptible with Control-C. Continue? Y
   pass 0
   ...
   pass 1
   block 100, Corrected media error (hard data ecc)
   ...
   Total of 1 defective blocks repaired.

Now we have found the absolute block number of the defective block on the disk, and we will repair it.

analyze> q
format> repair
Enter absolute block number of defect: 100
Ready to repair defect, continue? y
Repairing block 100 ...ok.
format> q

Changing Hostname on RHEL

1. Change the line beginning with HOSTNAME= in /etc/sysconfig/network

2. Change the hostname (FQDN and alias) in /etc/hosts

3. Run /bin/hostname new_hostname for the hostname change to take effect immediately.

4. Run /sbin/service syslog restart for syslog to log using the new hostname.

A reboot is not required to change the system hostname.
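Steps 1, 3 and 4 can be scripted. The following is a rough sketch, assuming GNU sed (which RHEL ships) and a hypothetical helper name set_hostname_line. The hostname new.example.com is a placeholder, and /etc/hosts from step 2 is still best edited by hand.

```shell
# Hypothetical helper: rewrite the HOSTNAME= line in a RHEL-style network file.
# A copy of the original is left next to it with a .bak suffix.
set_hostname_line() {
    file=$1
    new=$2
    sed -i.bak "s/^HOSTNAME=.*/HOSTNAME=$new/" "$file"
}

# Usage on the host, as root (new.example.com is a placeholder):
#   set_hostname_line /etc/sysconfig/network new.example.com
#   /bin/hostname new.example.com
#   /sbin/service syslog restart
```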

Thursday, December 2, 2010

Understanding /proc/cpuinfo

Example:

$ uname -r
2.6.18-8.el5

How many physical processors are there?

$ grep 'physical id' /proc/cpuinfo | sort | uniq | wc -l
2

How many virtual processors are there?

$ grep ^processor /proc/cpuinfo | wc -l
4

Are the processors dual-core (or multi-core)?

$ grep 'cpu cores' /proc/cpuinfo
cpu cores       : 2
cpu cores       : 2
cpu cores       : 2
cpu cores       : 2

"2" indicates the two physical processors are dual-core, resulting in 4 virtual processors.

If "1" were returned, the two physical processors would be single-core. If the processors are single-core, and the number of virtual processors is greater than the number of physical processors, the CPUs are using hyper-threading. Hyper-threading is supported if ht is present in the CPU flags and you are using an SMP kernel.
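The three counts above can be collected in one pass. This is a small sketch using awk. cpu_summary is a hypothetical helper name, and the optional file argument exists only so that it can be tried against a saved copy of cpuinfo.

```shell
# Hypothetical helper: summarise sockets, logical CPUs and cores per socket
# from a cpuinfo-style file (defaults to /proc/cpuinfo).
cpu_summary() {
    awk -F: '
        /^processor/   { logical++ }            # one entry per virtual processor
        /^physical id/ { sockets[$2] = 1 }      # distinct ids = physical processors
        /^cpu cores/   { cores = $2 + 0 }       # cores per physical processor
        END {
            n = 0
            for (s in sockets) n++
            printf "physical=%d logical=%d cores_per_socket=%d\n", n, logical, cores
        }' "${1:-/proc/cpuinfo}"
}

# Usage:
#   cpu_summary                  # reads /proc/cpuinfo
#   cpu_summary /tmp/cpuinfo.txt # reads a saved copy
```

On the example system above this would report physical=2 logical=4 cores_per_socket=2.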

Are the processors 64-bit?

A 64-bit processor will have lm ("long mode") in the flags section of cpuinfo. A 32-bit processor will not.

flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8legacy ts fid vid ttp tm stc
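Checking for the flag by eye is error-prone because other flags, such as lahf_lm, contain the same letters. Here is a minimal sketch that matches lm only as a whole word. has_lm_flag is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the first "flags" line of a cpuinfo-style
# file (defaults to /proc/cpuinfo) contains lm as a whole word.
# grep -w keeps flags such as lahf_lm from matching.
has_lm_flag() {
    grep '^flags' "${1:-/proc/cpuinfo}" | head -n 1 | grep -w lm > /dev/null
}

# Usage:
#   has_lm_flag && echo "64-bit capable" || echo "32-bit only"
```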

Changing IP address

The following steps may be used to change the IP address of a Solaris system.

1. Change the host's IP in /etc/hosts for the change to take effect after reboot. If you are using Solaris 10, you must also change the host's IP in /etc/inet/ipnodes for the change to take effect after reboot.

2. Run ifconfig interface ip_address netmask broadcast_address for the IP address change to take effect immediately. The netmask and broadcast_address should be specified if you are using variable length subnet masks (VLSM), but may be omitted otherwise.

3. If you are using variable length subnet masks (VLSM), add the host's network number and subnet mask to /etc/netmasks.
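The network number for /etc/netmasks is the bitwise AND of the IP address and the subnet mask. The sketch below works it out. network_number is a hypothetical helper, the addresses are made-up examples, and it needs a POSIX shell such as ksh or bash for the $(( )) arithmetic.

```shell
# Hypothetical helper: print the network number for an IPv4 address and a
# dotted-quad netmask by ANDing them octet by octet.
# The body runs in a subshell so the IFS change stays local.
network_number() (
    IFS=.
    set -- $1 $2    # split into eight octets: $1-$4 address, $5-$8 mask
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# Example values only:
network_number 192.168.10.77 255.255.255.192
```

For these example values it prints 192.168.10.64, so the /etc/netmasks entry would pair 192.168.10.64 with 255.255.255.192.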

If the new IP address places the system on a different network:

1. Place the host's new default gateway in /etc/defaultrouter

2. Run route add default new_gateway for the new default gateway to take effect immediately.

3. Run route delete default old_gateway to delete the old default gateway from the routing table.

Creating a Flash archive

1. If the root disk is encapsulated by Veritas Volume Manager (VxVM), unencapsulate it before continuing.

2. I recommend booting to single-user mode, as you generally do not want to include NFS mounts or other file systems mounted in later run levels as part of your Flash archive.

# reboot -- -s

3. Create the Flash archive.

flarcreate -n name -a author -S -c archive_name
For example: flarcreate -n "Solaris 9 image" -a "shiroy" /var/tmp/sol8.archive

Notes:
With the -S flag, flarcreate does not determine the size of the archive beforehand. Personally, I have seen flarcreate take an inordinate amount of time calculating the size of the archive without it.
The -c flag enables archive compression via the compress command.

4. If applicable, re-encapsulate the rootdisk with the vxdiskadm command. Reboot the system for the encapsulation to take effect.

NixCraft – Linux Administration

Here is a great Linux admin blog with plenty of content for both the beginner and the advanced Linux admin. The blog is called nixCraft. It has been in my RSS reader for a while, and I wanted to share a couple of the scripts and links that I pulled out and have used.

The first is a quick and easy MySQL database backup script that you can put in cron. Many open source projects use MySQL, and it always pays to have a backup, especially when upgrading, so take a look at the post called backing up your mysql database server.

The second is an rsync replication script that we can use between a couple of clustered web servers. The script is called resync backup replication script.

The last example is for the beginner administrator. This post collects a number of Unix/Linux commands and cheat sheets that are worthwhile for the new administrator.

If you are into Linux from a support or development perspective, then I encourage you to take a look at the nixCraft site, as I am sure that you will find something useful.