Friday, June 22, 2012

Upgrading ASMLib and OS in 11gR1 RAC Environment

Upgrading the operating system (OS) involves several considerations and tasks.
1. If ASMLib is used, the ASMLib kernel driver must be upgraded as part of the OS upgrade.
2. After the upgrade, the Oracle and clusterware binaries should be relinked.
In this case a two-node 11gR1 (11.1.0.7) RAC environment running on RHEL 5 (2.6.18-194.el5) will be upgraded to RHEL 5 (2.6.18-308.el5).
As per (743649.1), Oracle supports a rolling upgrade of the OS when both OS versions are certified for the database version that is running. The mixed state is supported only for the duration of the upgrade. However, according to (1391807.1), different versions of ASMLib across the cluster may not be compatible. To avoid such incompatibilities while keeping total downtime low, one node could be upgraded first and the other node shut down before the upgraded node, with its new version of ASMLib, is started. This way only one node is active in the cluster while the two nodes have different ASMLib versions. In this case, however, all nodes are upgraded at the same time, incurring a total system outage.

1. Find out whether ASMLib drivers are available for the kernel version to which the system is being upgraded. If not, move the ASMLib devices to block devices before the upgrade (this also applies when upgrading to RHEL 6, where obtaining ASMLib requires a ULN account), or see whether (462618.1) could be of any help.
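One way to make this check concrete is to compare the target kernel release against the names of the oracleasm kernel driver packages that have been downloaded. The sketch below assumes the naming pattern seen later in this post (oracleasm-<kernel release>-<driver version>); the function name has_asm_driver_for is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any of the given package names is an
# oracleasm kernel driver built for the given kernel release.
# Assumes the naming pattern oracleasm-<kernel release>-<driver version>.
has_asm_driver_for() {
    kernel=$1
    shift
    for pkg in "$@"; do
        case $pkg in
            oracleasm-"$kernel"-*) return 0 ;;
        esac
    done
    return 1
}

# Example (run in the directory holding the downloaded rpms):
#   has_asm_driver_for 2.6.18-308.el5 oracleasm-*.rpm && echo "driver available"
```

The oracleasm-support and oracleasmlib packages do not match the pattern, which is intended, since they are not tied to a kernel release.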

2. Make a note of the ownership and permissions of the following files in the $CRS_HOME/lib and $CRS_HOME/lib32 directories using the script below. Some files might be missing; this is not a problem.
#!/bin/sh
# Lists ownership and permissions of the clusterware client libraries
# in $CRS_HOME/lib and $CRS_HOME/lib32.

if [ -n "$CRS_HOME" ]; then
        echo "crs home exist"

        for i in clntsh.map clntst_1.lis clntst_2.lis clntst.lis libclntsh.so libclntsh.so.10.1 libclntst10.so libclntsh.so.11.1 libclntst11.so clntst
        do
                # copy in lib, if present
                if [ -f "$CRS_HOME/lib/$i" ]; then
                        echo "file exists"
                        ls -l "$CRS_HOME/lib/$i"
                fi

                # copy in lib32, checked in either case
                if [ -f "$CRS_HOME/lib32/$i" ]; then
                        echo "file in lib32"
                        ls -l "$CRS_HOME/lib32/$i"
                else
                        echo "file not exists"
                fi
        done
else
        echo "Please set CRS_HOME and run again"
fi

./cluster_file_check.sh
crs home exist
file exists
-rwxr--r-- 1 root root 4765562 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib/clntsh.map
file in lib32
-rw-r--r-- 1 root root 3985432 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib32/clntsh.map
file not exists
file not exists
file not exists
file exists
lrwxrwxrwx 1 root root 17 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so -> libclntsh.so.11.1
file in lib32
lrwxrwxrwx 1 root root 17 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so -> libclntsh.so.11.1
file exists
lrwxrwxrwx 1 oracle oinstall 51 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so.10.1 -> /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so
file in lib32
lrwxrwxrwx 1 oracle oinstall 53 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so.10.1 -> /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so
file not exists
file exists
-rwxr-xr-x 1 root root 48316147 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so.11.1
file in lib32
-rwxr-xr-x 1 root root 37081093 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so.11.1
file not exists
file not exists
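Since only ownership and permissions matter for the later comparison, it may be easier to save a listing that leaves out sizes and timestamps. The following is a minimal sketch, not part of the original procedure; snapshot_perms is a hypothetical name and the output file path in the example is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: print mode, owner, group and path for each existing
# file (or symlink) named on the command line. Sizes and timestamps are
# left out on purpose, since relinking is expected to change them.
# Uses GNU stat (-c), as shipped with RHEL.
snapshot_perms() {
    for f in "$@"; do
        if [ -e "$f" ] || [ -L "$f" ]; then
            stat -c '%A %U %G %n' "$f"
        fi
    done
}

# Example: save the state before the upgrade.
#   snapshot_perms "$CRS_HOME"/lib/*clnts* "$CRS_HOME"/lib32/*clnts* > /tmp/crs_perms.before
```

Running the same command after the relink into a second file and comparing the two with diff would then show only real ownership or permission changes.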
3. Make a note of the ASM disks' scsi_ids (this is a test setup using VirtualBox). The scsi_ids will be compared after the OS upgrade and must be the same as before the upgrade.
# blkid|grep sd.*oracleasm|while read a b;do echo -n $a$b" scsi_id=";(echo $a|tr -d [:digit:]|tr -d [:]|cut -d"/" -f3|xargs -i scsi_id -g -s /block/{})done;
/dev/sdc1:LABEL="DATA" TYPE="oracleasm" scsi_id=SATA     VBOX HARDDISK  VB35fd6dbb-14d87b49
/dev/sdd1:LABEL="FLASH" TYPE="oracleasm" scsi_id=SATA     VBOX HARDDISK  VB4b88ed44-eff71eb1
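The one-liner above is dense. Its middle part turns each partition reported by blkid (for example /dev/sdc1:) into the parent disk name that scsi_id expects under /block/. A more readable sketch of just that step could look like the following; parent_disk is a hypothetical name, and the function assumes sdX-style device names.

```shell
#!/bin/sh
# Hypothetical helper: turn a blkid device field such as "/dev/sdc1:" into
# the parent disk name ("sdc") that scsi_id expects under /block/.
# Assumes sdX-style names, where only the partition number is numeric.
parent_disk() {
    printf '%s\n' "$1" | sed -e 's/:$//' -e 's#^.*/##' -e 's/[0-9]*$//'
}

# Example, using the same scsi_id flags as the command above:
#   scsi_id -g -s "/block/$(parent_disk /dev/sdc1:)"
```

Unlike the original pipeline, which deletes every digit, this version strips only trailing digits; for sdX names the result is the same.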
4. Shut down the cluster stack across the cluster. If one node at a time is upgraded, do this only on that node. In this case, since all nodes are upgraded at the same time, the entire cluster is brought down.
crs_stop -all
5. Stop CRS and disable auto start of CRS. Do this on all nodes.
# crsctl stop crs
Stopping resources.
This could take several minutes.
Successfully stopped Oracle Clusterware resources
Stopping Cluster Synchronization Services.
Shutting down the Cluster Synchronization Services daemon.
Shutdown request successfully issued.

# crsctl disable crs
Oracle Clusterware is disabled for start-up after a reboot.
6. Comment out the entries that spawn the CRS processes in /etc/inittab. Do this on all nodes.
#h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
#h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
#h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null
Have init re-read /etc/inittab and verify that no CRS-related processes are running.
/sbin/init q

ps -ef|grep css
ps -ef|grep crs
ps -ef|grep evm
ps -ef|grep init
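The four grep commands can be folded into one filter that prints only lines that look like clusterware daemons. This is a sketch under assumptions: the init.cssd, init.crsd and init.evmd names come from the inittab entries above, while the crsd.bin, ocssd.bin and evmd.bin names are an assumption about a typical 11gR1 install and should be adjusted to the system at hand. list_crs_procs is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print only the
# lines that look like clusterware daemons. The *.bin daemon names are an
# assumption about a typical 11gR1 install; adjust them as needed.
# Exit status is 0 if anything matched and 1 if the list is clean.
list_crs_procs() {
    grep -E 'init\.(cssd|crsd|evmd)|(crsd|ocssd|evmd)\.bin'
}

# Example:
#   ps -ef | list_crs_procs || echo "no clusterware processes running"
```

Because the pattern is escaped, the grep process does not match its own command line, so no extra grep -v grep is needed.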
7. Stop oracleasm, disable it, and prevent it from loading on reboot. Do this on all nodes.
# /etc/init.d/oracleasm stop
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]

# /etc/init.d/oracleasm disable
Writing Oracle ASM library driver configuration: done
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]

# /sbin/chkconfig --list | grep oracleasm
oracleasm       0:off   1:off   2:on    3:on    4:on    5:on    6:off
# /sbin/chkconfig oracleasm off
# /sbin/chkconfig --list | grep oracleasm
oracleasm       0:off   1:off   2:off   3:off   4:off   5:off   6:off
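When this has to be repeated on several nodes, the chkconfig line can be checked by a small function instead of by eye. The sketch below is illustrative only; runlevels_on is a hypothetical name, and it expects a single line in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: given one line of `chkconfig --list` output, print
# the runlevels in which the service is still switched on (prints an
# empty line if it is off everywhere).
runlevels_on() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-6]:on$/)
                out = out (out == "" ? "" : " ") substr($i, 1, 1)
        print out
    }'
}

# Example:
#   line=$(/sbin/chkconfig --list | grep oracleasm)
#   [ -z "$(runlevels_on "$line")" ] && echo "oracleasm is off in all runlevels"
```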
8. Make a backup of /etc/sysconfig/oracleasm-_dev_oracleasm on all nodes.
# cp /etc/sysconfig/oracleasm-_dev_oracleasm /etc/sysconfig/oracleasm-_dev_oracleasm.bak
At this point the system is ready for the OS upgrade.

9. Carry out the OS upgrade. All Oracle home locations (ORACLE_HOME, ASM_HOME and CRS_HOME) must be the same before and after the upgrade.

10. Install the new version of the oracleasm kernel driver and remove the old version. The old version can only be removed after the new version has been installed, due to dependencies.
rpm -e oracleasm-2.6.18-194.el5-2.0.5-1.el5
error: Failed dependencies:
        oracleasm >= 1.0.4 is needed by (installed) oracleasmlib-2.0.4-1.el5.x86_64

# rpm -ivh oracleasm-2.6.18-308.el5-2.0.5-1.el5.x86_64.rpm
warning: oracleasm-2.6.18-308.el5-2.0.5-1.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 1e5e0159
Preparing...                ########################################### [100%]
   1:oracleasm-2.6.18-308.el########################################### [100%]

# rpm -e oracleasm-2.6.18-194.el5-2.0.5-1.el5

# rpm -qa | grep oracleasm
oracleasmlib-2.0.4-1.el5
oracleasm-2.6.18-308.el5-2.0.5-1.el5
oracleasm-support-2.1.3-1.el5
11. Configure oracleasm and make sure the contents of oracleasm-_dev_oracleasm.bak and oracleasm-_dev_oracleasm are the same.
/etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver.  The following questions will determine whether the driver is
loaded on boot and what permissions it will have.  The current values
will be shown in brackets ('[]').  Hitting <ENTER> without typing an
answer will keep that current value.  Ctrl-C will abort.

Default user to own the driver interface [oracle]:
Default group to own the driver interface [dba]:
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]:
Writing Oracle ASM library driver configuration: done
Initializing the Oracle ASMLib driver:                     [  OK  ]
Scanning the system for Oracle ASMLib disks:               [  OK  ]
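The comparison with the backup taken in step 8 can be done with diff. Below is a minimal sketch; asm_config_unchanged is a hypothetical name, and the default paths are the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: report whether the regenerated oracleasm
# configuration still matches the backup taken before the upgrade.
# Prints the differences and returns non-zero if the files differ.
asm_config_unchanged() {
    current=${1:-/etc/sysconfig/oracleasm-_dev_oracleasm}
    backup=${2:-$current.bak}
    diff "$backup" "$current"
}

# Example:
#   asm_config_unchanged && echo "oracleasm configuration unchanged"
```

If the backup was taken after oracleasm had been disabled, as in step 8 here, a difference in the start-on-boot setting is to be expected once the driver is configured to start on boot again; owner, group and scan settings should match.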
12. Enable oracleasm and set it up to start on reboot.
/etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: done
Initializing the Oracle ASMLib driver:                     [  OK  ]
Scanning the system for Oracle ASMLib disks:               [  OK  ]

chkconfig --list | grep oracleasm
oracleasm       0:off   1:off   2:off   3:off   4:off   5:off   6:off

# chkconfig oracleasm on

# chkconfig --list | grep oracleasm
oracleasm       0:off   1:off   2:on    3:on    4:on    5:on    6:off
13. Verify that the scsi_ids are the same as before the upgrade.
# blkid|grep sd.*oracleasm|while read a b;do echo -n $a$b" scsi_id=";(echo $a|tr -d [:digit:]|tr -d [:]|cut -d"/" -f3|xargs -i scsi_id -g -s /block/{})done;
/dev/sdc1:LABEL="DATA" TYPE="oracleasm" scsi_id=SATA     VBOX HARDDISK  VB35fd6dbb-14d87b49
/dev/sdd1:LABEL="FLASH" TYPE="oracleasm" scsi_id=SATA     VBOX HARDDISK  VB4b88ed44-eff71eb1
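If the output of the command was saved to a file before the upgrade and again afterwards, the two listings can be compared mechanically. The sketch below compares only the label and scsi_id part of each line, on the assumption that a change in device name (sdc becoming sdd, say) is harmless as long as each ASM label still maps to the same scsi_id. The function names and the file names in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: strip the leading device field ("/dev/sdc1:") from
# each saved line and sort the rest, leaving label and scsi_id only.
label_ids() {
    sed 's/^[^:]*://' "$1" | sort
}

# Hypothetical helper: succeed if both saved listings map every ASM label
# to the same scsi_id.
same_scsi_ids() {
    [ "$(label_ids "$1")" = "$(label_ids "$2")" ]
}

# Example:
#   same_scsi_ids /tmp/asm_ids.before /tmp/asm_ids.after && echo "scsi_ids unchanged"
```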
With this the ASMLib upgrade is complete. Next is to relink the Oracle binaries.

14. As the oracle user, relink both ASM_HOME and ORACLE_HOME. First verify which relink is in the PATH using
which relink
Relink ORACLE_HOME
$ cd $ORACLE_HOME/bin
$ relink all
Reset the PATH and ORACLE_HOME before running relink for ASM_HOME
export PATH=$ASM_HOME/bin:$PATH
export ORACLE_HOME=$ASM_HOME
Verify the relink binary from ASM_HOME is in path
which relink
Run the relink of ASM_HOME
$ cd $ASM_HOME/bin
$ relink all
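The sequence above (set ORACLE_HOME and PATH, confirm which relink is picked up, run relink all) is the same for every home, so it can be wrapped in a function. This is only a sketch of that idea; relink_home is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: run "relink all" for one Oracle home. The function
# body is a subshell, so the ORACLE_HOME and PATH changes do not leak into
# the caller and the next home starts from a clean environment.
relink_home() (
    ORACLE_HOME=$1
    PATH=$ORACLE_HOME/bin:$PATH
    export ORACLE_HOME PATH
    cd "$ORACLE_HOME/bin" || exit 1
    # Show which relink is first in the PATH, as "which relink" does above.
    echo "using $(command -v relink)"
    relink all
)

# Example, run as the oracle user:
#   for home in "$ORACLE_HOME" "$ASM_HOME"; do relink_home "$home" || break; done
```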
15. As per (743649.1 step 10), the clusterware client shared libraries should be relinked as the clusterware software owner. But running this as the software owner results in permission errors.
export ORACLE_HOME=$CRS_HOME    # important: ORACLE_HOME must point to the CRS_HOME location
cd $ORACLE_HOME/bin

./genclntsh
/bin/rm: cannot remove `/opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so.11.1': Permission denied
/bin/rm: cannot remove `/opt/crs/oracle/product/11.1.0/crs/lib/clntsh.map': Permission denied
./genclntsh: line 309: /opt/crs/oracle/product/11.1.0/crs/lib/clntsh.map: Permission denied
genclntsh: Failed to link libclntsh.so.11.1
The permission errors persist even after changing the ownership to oracle:oinstall. Running ./genclntsh as root recreates the files with a new timestamp and the same permissions as before (so the Metalink note may be wrong on this point). When genclntsh runs without any issue it prints no message and simply returns to the shell prompt. After the run, check that the file permissions are the same as before (step 2 above).
./cluster_file_check.sh
crs home exist
file exists
-rw-r--r-- 1 root root 4764652 Jun 20 16:49 /opt/crs/oracle/product/11.1.0/crs/lib/clntsh.map
file in lib32
-rw-r--r-- 1 root root 3984490 Jun 20 16:49 /opt/crs/oracle/product/11.1.0/crs/lib32/clntsh.map
file not exists
file not exists
file not exists
file exists
lrwxrwxrwx 1 root root 17 Jun 20 16:49 /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so -> libclntsh.so.11.1
file in lib32
lrwxrwxrwx 1 root root 17 Jun 20 16:49 /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so -> libclntsh.so.11.1
file exists
lrwxrwxrwx 1 oracle oinstall 51 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so.10.1 -> /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so
file in lib32
lrwxrwxrwx 1 oracle oinstall 53 Feb  8  2011 /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so.10.1 -> /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so
file not exists
file exists
-rwxr-xr-x 1 root root 48316147 Jun 20 16:49 /opt/crs/oracle/product/11.1.0/crs/lib/libclntsh.so.11.1
file in lib32
-rwxr-xr-x 1 root root 37081093 Jun 20 16:49 /opt/crs/oracle/product/11.1.0/crs/lib32/libclntsh.so.11.1
file not exists
file not exists
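Comparing the two listings by eye is error-prone because sizes and timestamps differ anyway. If the output of cluster_file_check.sh was saved to a file before the upgrade and again after running genclntsh, a small filter can reduce both to mode, owner, group and path. The function names below are hypothetical, and the sketch assumes the ls -l layout shown above (file name in the ninth field).

```shell
#!/bin/sh
# Hypothetical helper: reduce saved output of cluster_file_check.sh to
# "mode owner group path" lines, dropping sizes and timestamps, which are
# expected to change when the libraries are regenerated.
perm_summary() {
    awk '$1 ~ /^[-l][-rwxsStT]+[.+]?$/ && NF >= 9 { print $1, $3, $4, $9 }' "$1"
}

# Hypothetical helper: succeed if two saved listings agree on mode, owner
# and group for every file; otherwise print both summaries.
perms_match() {
    before=$(perm_summary "$1")
    after=$(perm_summary "$2")
    if [ "$before" = "$after" ]; then
        return 0
    fi
    printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
    return 1
}

# Example:
#   perms_match /tmp/crs_files.before /tmp/crs_files.after && echo "permissions unchanged"
```

Applied to the two listings in this post, such a comparison would flag lib/clntsh.map, whose mode reads -rwxr--r-- before the upgrade and -rw-r--r-- afterwards.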
16. Enable CRS and remove the comments from the CRS entries in /etc/inittab so that CRS can be started.
# crsctl enable crs
Oracle Clusterware is enabled for start-up after a reboot.

h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null
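Commenting out the three entries in step 6 and restoring them here can also be done with sed rather than by hand. The following is a sketch, not the method used in this post: both functions are hypothetical, they read an inittab on stdin and write the edited copy to stdout, and they touch only h1/h2/h3 respawn lines that call /etc/init.d/init.*d. Writing to a separate file first leaves room to review the result before replacing /etc/inittab.

```shell
#!/bin/sh
# Hypothetical filter: comment out the h1/h2/h3 clusterware entries.
comment_crs_entries() {
    sed '/^h[1-3]:[0-9]*:respawn:\/etc\/init\.d\/init\.[a-z]*d /s/^/#/'
}

# Hypothetical filter: remove the leading "#" from those same entries.
uncomment_crs_entries() {
    sed 's/^#\(h[1-3]:[0-9]*:respawn:\/etc\/init\.d\/init\.[a-z]*d \)/\1/'
}

# Example (review the new file before copying it over /etc/inittab):
#   uncomment_crs_entries < /etc/inittab > /tmp/inittab.new
#   diff /etc/inittab /tmp/inittab.new
```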
17. Start crs with crsctl or with OS init
crsctl start crs
Attempting to start Oracle Clusterware stack
The CRS stack will be started shortly
or
/sbin/init q
This concludes the ASMLib and OS upgrade.

Useful Metalink Notes
How to Relink Oracle Database Software on UNIX [ID 131321.1]
Is It Necessary To Relink Oracle Following OS Upgrade? [ID 444595.1]
Will an Operating System Upgrade Affect Oracle Clusterware? [ID 743649.1]
How To Upgrade ASMLib Kernel Driver as Part of Kernel Upgrade? [ID 1391807.1]
Do You Need to Relink Oracle Clusterware When Upgrading the Operating System? [ID 743649.1]
Is It Required To Relink The Oracle Binaries When Booting With The Old Kernel? [ID 461138.1]
Cannot Find Exact Kernel Version Match For ASMLib (Workaround using oracleasm_debug_link tool) [ID 462618.1]
RAC: Frequently Asked Questions [ID 220970.1] (Is a relink required for the clusterware home after an OS upgrade?)

Useful White Paper
Best Practices for Optimizing Availability During Planned Maintenance Using Oracle Clusterware and Oracle Real Application Clusters. Oracle Maximum Availability Architecture White Paper September 2007