Tuesday, February 14, 2017

Recover OCR without backup

After restarting my VM, the cluster services did not start.
·         I checked the cluster status on both nodes and got the output below.
[root@db-rac1 bin]# ./crsctl check cluster -all
**************************************************************
db-rac1:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
**************************************************************
db-rac2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
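
With more than two nodes this output gets long, so a small filter can help. The following is only a sketch: `failed_nodes` is a hypothetical helper name (not part of crsctl), and it assumes the output layout shown above, where each node name sits on its own line ending in a colon.

```shell
#!/bin/sh
# failed_nodes: hypothetical helper, not part of crsctl.
# Reads `crsctl check cluster -all` output on stdin and prints each node
# that reports CRS-4535, CRS-4530 or CRS-4534.
failed_nodes() {
    awk '
        # A line such as "db-rac1:" starts the section for that node.
        /^[^ ]+:$/ { node = substr($0, 1, length($0) - 1); next }
        # Report a node once, on its first failure code.
        /^CRS-(4535|4530|4534):/ && !(node in seen) { seen[node] = 1; print node }
    '
}
```

Usage would be something like `./crsctl check cluster -all | failed_nodes`, which for the output above should list both db-rac1 and db-rac2.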

·         crsd.bin was not running.

[root@db-rac2 ~]# ps -ef|grep pmon
root      5128  5064  0 23:13 pts/1    00:00:00 grep pmon
[root@db-rac2 ~]#
[root@db-rac2 ~]# ps -ef|grep d.bin
root      4440     1  1 23:12 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      4833     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      4844     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      4861     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      4871     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      4874     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      4885     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      4897     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
root      4912     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      4928     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      5005     1  1 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -M -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac2
root      5130  5064  0 23:14 pts/1    00:00:00 grep d.bin
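
Instead of reading the listing by eye, the same check can be scripted. This is a minimal sketch: `missing_daemons` is a hypothetical name, and the list of expected daemons is my assumption, taken from the processes that show up later in this post once the stack is healthy.

```shell
#!/bin/sh
# missing_daemons: hypothetical helper.
# Reads `ps -ef` output on stdin and prints every expected clusterware
# daemon that does not appear in it.
missing_daemons() {
    ps_output=$(cat)
    # Assumed list of daemons; adjust it to your own environment.
    for daemon in ohasd.bin ocssd.bin crsd.bin evmd.bin; do
        # Match on "/bin/<name>" so the "grep d.bin" line itself is ignored.
        printf '%s\n' "$ps_output" | grep -qF "/bin/$daemon" || echo "$daemon"
    done
}
```

For example, `ps -ef | missing_daemons` on the listing above should report crsd.bin and evmd.bin.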

·         I checked the ocssd.log file for the cause of the above issue.

db-rac1:-

[root@db-rac1 bin]# tail -300 /u01/app/grid/product/11.2.0/grid/log/db-rac1/cssd/ocssd.log|more

2017-02-12 23:22:54.823: [    CSSD][1081260352]clssgmDiscEndpcl: gipcDestroy 0x9182
2017-02-12 23:22:54.825: [    GPNP][1108064576]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2104] get-profile call to url "ipc://GPNPD_db-rac1" disco "" [f=0 claimed- host:
 cname: seq: auth:]
2017-02-12 23:22:54.830: [    GPNP][1108064576]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2234] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD
_db-rac1" disco ""
2017-02-12 23:22:54.830: [    CSSD][1108064576]clssscGetParameterProfile: buffer passed for parameter ASM discovery (3) is too short, required 23, passed 20
2017-02-12 23:22:54.830: [    CSSD][1108064576]clssnmReadDiscoveryProfile: voting file discovery string(/dev/oracleasm/disks/*)
2017-02-12 23:22:54.830: [    CSSD][1108064576]clssnmvDDiscThread: using discovery string /dev/oracleasm/disks/* for initial discovery
2017-02-12 23:22:54.830: [   SKGFD][1108064576]Discovery with str:/dev/oracleasm/disks/*:

2017-02-12 23:22:54.830: [   SKGFD][1108064576]UFS discovery with :/dev/oracleasm/disks/*:

2017-02-12 23:22:54.830: [   SKGFD][1108064576]Execute glob on the string /dev/oracleasm/disks/*
2017-02-12 23:22:54.830: [   SKGFD][1108064576]OSS discovery with :/dev/oracleasm/disks/*:

2017-02-12 23:22:54.830: [    CSSD][1108064576]clssnmvDiskVerify: Successful discovery of 0 disks
2017-02-12 23:22:54.830: [    CSSD][1108064576]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2017-02-12 23:22:54.830: [    CSSD][1108064576]clssnmvFindInitialConfigs: No voting files found
2017-02-12 23:22:54.830: [    CSSD][1108064576](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
2017-02-12 23:22:54.845: [    CSSD][1081260352]clssscSelect: cookie accept request 0xde70f0
2017-02-12 23:22:54.845: [    CSSD][1081260352]clssscevtypSHRCON: getting client with cmproc 0xde70f0
2017-02-12 23:22:54.845: [    CSSD][1081260352]clssgmRegisterClient: proc(5/0xde70f0), client(572/0xd55be0)
2017-02-12 23:22:54.845: [    CSSD][1081260352]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(
0xde70f0) client(0xd55be0)
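
The two lines that matter in this log are "Successful discovery of 0 disks" and the (:CSSNM00070:) retry message. A rough way to pull them out of a long log is sketched below. `voting_file_failures` is a hypothetical name, and the two search strings are simply copied from the log above.

```shell
#!/bin/sh
# voting_file_failures: hypothetical helper.
# Reads ocssd.log text on stdin and prints how many lines show that
# voting file discovery found nothing.
voting_file_failures() {
    grep -c -e 'Successful discovery of 0 disks' -e 'CSSNM00070'
}
```

It could be used as `tail -300 .../cssd/ocssd.log | voting_file_failures`; a non-zero count means CSSD keeps retrying because it cannot see any voting file.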

·         I found the same messages on both nodes: CSSD discovered 0 disks with the string /dev/oracleasm/disks/*, so no voting file was found.



·         Let’s check the ASM disk that contains the OCR file.

[root@db-rac1 bin]# /etc/init.d/oracleasm listdisks
[root@db-rac1 bin]#
[root@db-rac1 bin]# ls -lart /dev/oracleasm/disks
total 0
drwxr-xr-x 4 root root 0 Feb 12 23:11 ..
drwxr-xr-x 1 root root 0 Feb 12 23:11 .
[root@db-rac1 bin]#

Same on the 2nd node.

[root@db-rac2 bin]# /etc/init.d/oracleasm listdisks
[root@db-rac2 bin]#

[root@db-rac2 bin]# ls -lart /dev/oracleasm/disks
total 0
drwxr-xr-x 4 root root 0 Feb 12 23:11 ..
drwxr-xr-x 1 root root 0 Feb 12 23:11 .
[root@db-rac2 bin]#

The Oracle ASM disks are not available on either node.
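
The empty-directory check above is easy to script for a quick test on each node. This is a sketch only: `asm_disk_count` is a hypothetical helper, and the default path is the discovery directory seen in the log above.

```shell
#!/bin/sh
# asm_disk_count: hypothetical helper.
# Prints the number of entries in the ASMLib disk directory
# (default: /dev/oracleasm/disks). A missing directory counts as 0.
asm_disk_count() {
    dir=${1:-/dev/oracleasm/disks}
    [ -d "$dir" ] || { echo 0; return; }
    # ls -A lists everything except "." and "..".
    ls -A "$dir" | wc -l | tr -d ' '
}
```

On these nodes `asm_disk_count` should print 0, which matches the "Successful discovery of 0 disks" message in ocssd.log.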

·         Also, no OCR backup was available, and ocrcheck could not read the registry on either node.

Node 1:
[root@db-rac1 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-29701: unable to connect to Cluster Synchronization Service

Node 2:
[root@db-rac2 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-29701: unable to connect to Cluster Synchronization Service

Now force-stop all the cluster processes on both nodes.

[root@db-rac1 ~]# ps -ef|grep d.bin
root      4440     1  1 23:12 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      4833     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      4844     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      4861     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      4871     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      4874     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      4885     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      4897     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
root      4912     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      4928     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      5005     1  1 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -M -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac1
root      5130  5064  0 23:14 pts/1    00:00:00 grep d.bin

[root@db-rac1 bin]# ./crsctl stop crs -f


[root@db-rac2 ~]# ps -ef|grep d.bin
root      4440     1  1 23:12 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      4833     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      4844     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      4861     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      4871     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      4874     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      4885     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      4897     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
root      4912     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      4928     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      5005     1  1 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -M -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac2
root      5130  5064  0 23:14 pts/1    00:00:00 grep d.bin

[root@db-rac2 bin]# ./crsctl stop crs -f


Cross-check the processes again to confirm that none of them is still running.

·         Recreate the Oracle ASM OCR1 disk.
[root@db-rac2 bin]# /etc/init.d/oracleasm deletedisk /dev/sdc1 OCR1
Removing ASM disk "/dev/sdc1":                             [  OK  ]

[root@db-rac1 ~]# /etc/init.d/oracleasm createdisk OCR1 /dev/sdc1
Marking disk "OCR1" as an ASM disk:                        [  OK  ]

·         Deconfigure the cluster with the “rootcrs.pl” script located in $GRID_HOME/crs/install.
Run it on the 1st node first, then repeat the same on the 2nd node.

[root@db-rac1 install]# pwd
/u01/app/grid/product/11.2.0/grid/crs/install

[root@db-rac1 install]# ./rootcrs.pl -deconfig -force
Using configuration parameter file: ./crsconfig_params
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd

CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'db-rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'db-rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'db-rac1'
CRS-2677: Stop of 'ora.crf' on 'db-rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'db-rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'db-rac1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'db-rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'db-rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'db-rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'db-rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node
[root@db-rac1 install]#

Node 2:

[root@db-rac2 ~]# cd /u01/app/grid/product/11.2.0/grid/crs/install
[root@db-rac2 install]#
[root@db-rac2 install]# ./rootcrs.pl -deconfig -force
Using configuration parameter file: ./crsconfig_params
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd

CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'db-rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'db-rac2'
CRS-2673: Attempting to stop 'ora.crf' on 'db-rac2'
CRS-2677: Stop of 'ora.mdnsd' on 'db-rac2' succeeded
CRS-2677: Stop of 'ora.crf' on 'db-rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'db-rac2'
CRS-2677: Stop of 'ora.gipcd' on 'db-rac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'db-rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'db-rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'db-rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node
[root@db-rac2 install]#

·         Now run the root.sh script on the 1st node.

[root@db-rac1 ~]# cd /u01/app/grid/product/11.2.0/grid
[root@db-rac1 install]#
[root@db-rac1 grid]# ls -lart root*
-rwxr-xr-x 1 grid oinstall 558 Feb 11 05:45 rootupgrade.sh
-rwxr-x--- 1 grid oinstall 545 Feb 11 05:45 root.sh


[root@db-rac1 grid]# ./root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/grid/product/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'db-rac1'
CRS-2676: Start of 'ora.mdnsd' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'db-rac1'
CRS-2676: Start of 'ora.gpnpd' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'db-rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'db-rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'db-rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'db-rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'db-rac1'
CRS-2676: Start of 'ora.diskmon' on 'db-rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'db-rac1' succeeded

ASM created and started successfully.

Disk Group OCR created successfully.                                              

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Successful addition of voting disk 139516a22b594fd0bf2edbd9b2740ce8.
Successfully replaced voting disk group with +OCR.
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   139516a22b594fd0bf2edbd9b2740ce8 (/dev/oracleasm/disks/OCR1) [OCR]
Located 1 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'db-rac1'
CRS-2676: Start of 'ora.asm' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'db-rac1'
CRS-2676: Start of 'ora.OCR.dg' on 'db-rac1' succeeded
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@db-rac1 grid]#


·         Run root.sh on the 2nd node:

[root@db-rac2 grid]# ./root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/grid/product/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node db-rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@db-rac2 grid]#

·         Now check that crsd.bin has started on both nodes.

[root@db-rac2 grid]# ps -ef|grep d.bin
root      5952     1  0 03:42 ?        00:00:02 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      7733     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      7744     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      7762     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      7805     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
grid      7808     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      7828     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      7850     1  0 03:43 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      7905     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
root      7916     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/octssd.bin
root      7938     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      7963     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -m db-rac1 -r -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac2
root      8124     1  3 03:44 ?        00:00:07 /u01/app/grid/product/11.2.0/grid/bin/crsd.bin reboot
grid      8141     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmd.bin
grid      8237  8141  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmlogger.bin -o /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.log
grid      8273     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
root      8276     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
root      9196  4627  0 03:48 pts/1    00:00:00 grep d.bin


·         Check the cluster status and resources.

[root@db-rac1 bin]# ./crsctl check cluster -all
**************************************************************
db-rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
db-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[root@db-rac1 bin]#
[root@db-rac1 bin]#
[root@db-rac1 bin]# ./crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.OCR.dg
               ONLINE  ONLINE       db-rac1
               ONLINE  ONLINE       db-rac2
ora.asm
               ONLINE  ONLINE       db-rac1                  Started
               ONLINE  ONLINE       db-rac2                  Started
ora.gsd
               OFFLINE OFFLINE      db-rac1
               OFFLINE OFFLINE      db-rac2
ora.net1.network
               ONLINE  ONLINE       db-rac1
               ONLINE  ONLINE       db-rac2
ora.ons
               ONLINE  ONLINE       db-rac1
               ONLINE  ONLINE       db-rac2
ora.registry.acfs
               ONLINE  ONLINE       db-rac1
               ONLINE  ONLINE       db-rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db-rac1
ora.cvu
      1        ONLINE  ONLINE       db-rac1
ora.db-rac1.vip
      1        ONLINE  ONLINE       db-rac1
ora.db-rac2.vip
      1        ONLINE  ONLINE       db-rac2
ora.oc4j
      1        ONLINE  ONLINE       db-rac1
ora.scan1.vip
      1        ONLINE  ONLINE       db-rac1
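
To spot anything that is still down without scanning the whole table, the output can be filtered. The sketch below assumes the 11.2 `-t` layout shown above (a resource name line followed by indented TARGET/STATE/SERVER lines); `offline_resources` is a hypothetical name.

```shell
#!/bin/sh
# offline_resources: hypothetical helper.
# Reads `crsctl status resource -t` output on stdin and prints
# "<resource> <server>" for every line whose STATE is OFFLINE.
offline_resources() {
    awk '
        # Remember the current resource name, e.g. "ora.gsd".
        /^ora\./ { res = $1; next }
        res != "" && NF >= 2 {
            # Cluster resources carry a leading instance number,
            # which shifts the STATE column from field 2 to field 3.
            s = ($1 ~ /^[0-9]+$/) ? 3 : 2
            if ($s == "OFFLINE") print res, $(s + 1)
        }
    '
}
```

Run as `./crsctl status resource -t | offline_resources`, it should print only ora.gsd for both nodes here. In 11.2 that resource is normally offline unless Oracle 9i databases are managed by the cluster, so it is not a problem.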

Finally, the OCR recovery is complete, and ocrcheck confirms it:

[root@db-rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2604
         Available space (kbytes) :     259516
         ID                       : 1732998155
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded


         Logical corruption check succeeded
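
If you want to use this check in a script, the two "succeeded" lines are what to look for. The following is a minimal sketch: `ocr_healthy` is a hypothetical helper, and it assumes ocrcheck is run as root so that the logical corruption check is performed, as in the output above.

```shell
#!/bin/sh
# ocr_healthy: hypothetical helper.
# Reads `ocrcheck` output on stdin and returns success only when both the
# registry integrity check and the logical corruption check succeeded.
ocr_healthy() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'Cluster registry integrity check succeeded' &&
    printf '%s\n' "$out" | grep -q 'Logical corruption check succeeded'
}
```

For example: `./ocrcheck | ocr_healthy && echo "OCR looks fine"`. With the PROT-602 output from earlier in this post, the function returns a failure status instead.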


