After restarting my VM, the cluster services did not start.

I checked the cluster on both nodes and found the output below.
[root@db-rac1 bin]# ./crsctl check cluster -all
**************************************************************
db-rac1:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
**************************************************************
db-rac2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
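
When there are more than two nodes, reading this output by eye gets tedious. The following is a minimal sketch, not part of the original procedure, that lists every node for which crsctl reported one of the three messages seen here (CRS-4535, CRS-4530, CRS-4534). The function name unhealthy_nodes is made up for illustration, and the parsing assumes the output layout shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: reads `crsctl check cluster -all` output on stdin and
# prints each node name that has at least one CRS-4535/4530/4534 message.
unhealthy_nodes() {
    awk '
        # A node header is a single word followed by a colon, e.g. "db-rac1:"
        /^[A-Za-z0-9_.-]+:$/ { node = substr($0, 1, length($0) - 1); next }
        # Report the current node once if any of the failure codes appears
        /^CRS-(4535|4530|4534):/ && node != "" && !(node in seen) {
            seen[node] = 1
            print node
        }
    '
}

# Usage on a cluster node (run from the Grid home bin directory):
#   ./crsctl check cluster -all | unhealthy_nodes
```

An empty result would mean none of these three messages was reported for any node.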

The process list on db-rac2 showed that crsd.bin was not running, and no pmon process was up either.
[root@db-rac2 ~]# ps -ef|grep pmon
root      5128  5064  0 23:13 pts/1    00:00:00 grep pmon
[root@db-rac2 ~]#
[root@db-rac2 ~]# ps -ef|grep d.bin
root      4440     1  1 23:12 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      4833     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      4844     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      4861     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      4871     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      4874     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      4885     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      4897     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
root      4912     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      4928     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      5005     1  1 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -M -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac2
root      5130  5064  0 23:14 pts/1    00:00:00 grep d.bin

I then checked the ocssd.log file for the cause.
db-rac1:
[root@db-rac1 bin]# tail -300 /u01/app/grid/product/11.2.0/grid/log/db-rac1/cssd/ocssd.log|more
2017-02-12 23:22:54.823: [ CSSD][1081260352]clssgmDiscEndpcl: gipcDestroy 0x9182
2017-02-12 23:22:54.825: [ GPNP][1108064576]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2104] get-profile call to url "ipc://GPNPD_db-rac1" disco "" [f=0 claimed- host: cname: seq: auth:]
2017-02-12 23:22:54.830: [ GPNP][1108064576]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2234] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_db-rac1" disco ""
2017-02-12 23:22:54.830: [ CSSD][1108064576]clssscGetParameterProfile: buffer passed for parameter ASM discovery (3) is too short, required 23, passed 20
2017-02-12 23:22:54.830: [ CSSD][1108064576]clssnmReadDiscoveryProfile: voting file discovery string(/dev/oracleasm/disks/*)
2017-02-12 23:22:54.830: [ CSSD][1108064576]clssnmvDDiscThread: using discovery string /dev/oracleasm/disks/* for initial discovery
2017-02-12 23:22:54.830: [ SKGFD][1108064576]Discovery with str:/dev/oracleasm/disks/*:
2017-02-12 23:22:54.830: [ SKGFD][1108064576]UFS discovery with :/dev/oracleasm/disks/*:
2017-02-12 23:22:54.830: [ SKGFD][1108064576]Execute glob on the string /dev/oracleasm/disks/*
2017-02-12 23:22:54.830: [ SKGFD][1108064576]OSS discovery with :/dev/oracleasm/disks/*:
2017-02-12 23:22:54.830: [ CSSD][1108064576]clssnmvDiskVerify: Successful discovery of 0 disks
2017-02-12 23:22:54.830: [ CSSD][1108064576]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2017-02-12 23:22:54.830: [ CSSD][1108064576]clssnmvFindInitialConfigs: No voting files found
2017-02-12 23:22:54.830: [ CSSD][1108064576](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
2017-02-12 23:22:54.845: [ CSSD][1081260352]clssscSelect: cookie accept request 0xde70f0
2017-02-12 23:22:54.845: [ CSSD][1081260352]clssscevtypSHRCON: getting client with cmproc 0xde70f0
2017-02-12 23:22:54.845: [ CSSD][1081260352]clssgmRegisterClient: proc(5/0xde70f0), client(572/0xd55be0)
2017-02-12 23:22:54.845: [ CSSD][1081260352]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0xde70f0) client(0xd55be0)

I found the same messages on both nodes: CSSD discovered 0 disks and could not find a voting file.
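
Rather than paging through 300 lines of log, the two telltale messages can be counted directly. This is a small sketch under the assumption that the log uses the same wording as the excerpt above ("No voting files found" and the CSSNM00070 tag); the function name voting_file_failures is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the log lines that report a failed voting file
# discovery.
# $1: path to an ocssd.log file
voting_file_failures() {
    # -c prints the number of lines matching either pattern
    grep -c -e 'No voting files found' -e 'CSSNM00070' -- "$1"
}

# Usage with the log path from this post:
#   voting_file_failures /u01/app/grid/product/11.2.0/grid/log/db-rac1/cssd/ocssd.log
```

A count that keeps growing between runs suggests CSSD is still retrying the discovery every 15 seconds, as the log excerpt shows.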

Let’s check the ASM disk that holds the OCR.
[root@db-rac1 bin]# /etc/init.d/oracleasm listdisks
[root@db-rac1 bin]#
[root@db-rac1 bin]# ls -lart /dev/oracleasm/disks
total 0
drwxr-xr-x 4 root root 0 Feb 12 23:11 ..
drwxr-xr-x 1 root root 0 Feb 12 23:11 .
[root@db-rac1 bin]#
The same on the 2nd node:
[root@db-rac2 bin]# /etc/init.d/oracleasm listdisks
[root@db-rac2 bin]#
[root@db-rac2 bin]# ls -lart /dev/oracleasm/disks
total 0
drwxr-xr-x 4 root root 0 Feb 12 23:11 ..
drwxr-xr-x 1 root root 0 Feb 12 23:11 .
[root@db-rac2 bin]#
So the Oracle ASM disks were not available on either node.
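
The same check can be scripted so it runs before any attempt to start the stack. This is a minimal sketch; the function name has_asm_disks is made up, and the default path is the discovery directory seen in the ocssd.log above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the directory exists and contains at
# least one entry.
# $1: directory to check (defaults to the ASMLib disk directory)
has_asm_disks() {
    dir=${1:-/dev/oracleasm/disks}
    # ls -A lists entries other than . and .. ; empty output means no disks
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage:
#   has_asm_disks || echo "no ASM disks visible on $(hostname)"
```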

An OCR backup was not available either, and ocrcheck could not read the OCR.
Node 1:
[root@db-rac1 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-29701: unable to connect to Cluster Synchronization Service
Node 2:
[root@db-rac2 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-29701: unable to connect to Cluster Synchronization Service
Now stop all the cluster processes forcefully on both nodes.
[root@db-rac1 ~]# ps -ef|grep d.bin
root      4440     1  1 23:12 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      4833     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      4844     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      4861     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      4871     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      4874     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      4885     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      4897     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
root      4912     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      4928     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      5005     1  1 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -M -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac1
root      5130  5064  0 23:14 pts/1    00:00:00 grep d.bin
[root@db-rac1 bin]# ./crsctl stop crs -f
[root@db-rac2 ~]# ps -ef|grep d.bin
root      4440     1  1 23:12 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      4833     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      4844     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      4861     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      4871     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
grid      4874     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      4885     1  0 23:12 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      4897     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
root      4912     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      4928     1  0 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      5005     1  1 23:13 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -M -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac2
root      5130  5064  0 23:14 pts/1    00:00:00 grep d.bin
[root@db-rac2 bin]# ./crsctl stop crs -f
Cross-check the processes again to confirm that everything has stopped.
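
The cross-check can be made a bit stricter than a plain grep. The sketch below filters ps -ef output down to processes whose command starts inside the Grid home bin directory, so the grep process itself never shows up. The function name grid_home_procs is hypothetical, and it assumes the Linux ps -ef layout used in this post, where the command is the 8th column.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -ef` output on stdin and prints only the
# lines whose command (8th column) lives under <grid home>/bin/.
# $1: Grid home, e.g. /u01/app/grid/product/11.2.0/grid
grid_home_procs() {
    # index() == 1 means the command path begins with the given prefix
    awk -v prefix="$1/bin/" 'index($8, prefix) == 1'
}

# Usage: wait until nothing from the Grid home is left running.
#   while ps -ef | grid_home_procs /u01/app/grid/product/11.2.0/grid | grep -q .; do
#       sleep 5
#   done
```

Matching on the path prefix also catches cssdmonitor, cssdagent and ologgerd, which do not end in d.bin.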

Recreate the Oracle ASM disk OCR1.
[root@db-rac2 bin]# /etc/init.d/oracleasm deletedisk /dev/sdc1 OCR1
Removing ASM disk "/dev/sdc1": [ OK ]
[root@db-rac1 ~]# /etc/init.d/oracleasm createdisk OCR1 /dev/sdc1
Marking disk "OCR1" as an ASM disk: [ OK ]

Deconfigure the cluster with the “rootcrs.pl” script located in $GRID_HOME/crs/install.
Execute it on the 1st node first, then repeat the same on the 2nd node.
[root@db-rac1 install]# pwd
/u01/app/grid/product/11.2.0/grid/crs/install
[root@db-rac1 install]# ./rootcrs.pl -deconfig -force
Using configuration parameter file: ./crsconfig_params
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'db-rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'db-rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'db-rac1'
CRS-2677: Stop of 'ora.crf' on 'db-rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'db-rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'db-rac1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'db-rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'db-rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'db-rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'db-rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node
[root@db-rac1 install]#
Node 2:
[root@db-rac2 ~]# cd /u01/app/grid/product/11.2.0/grid/crs/install
[root@db-rac2 install]#
[root@db-rac2 install]# ./rootcrs.pl -deconfig -force
Using configuration parameter file: ./crsconfig_params
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'db-rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'db-rac2'
CRS-2673: Attempting to stop 'ora.crf' on 'db-rac2'
CRS-2677: Stop of 'ora.mdnsd' on 'db-rac2' succeeded
CRS-2677: Stop of 'ora.crf' on 'db-rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'db-rac2'
CRS-2677: Stop of 'ora.gipcd' on 'db-rac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'db-rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'db-rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'db-rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node
[root@db-rac2 install]#

Now execute the root.sh script on the 1st node.
[root@db-rac1 ~]# cd /u01/app/grid/product/11.2.0/grid
[root@db-rac1 install]#
[root@db-rac1 grid]# ls -lart root*
-rwxr-xr-x 1 grid oinstall 558 Feb 11 05:45 rootupgrade.sh
-rwxr-x--- 1 grid oinstall 545 Feb 11 05:45 root.sh
[root@db-rac1 grid]# ./root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/grid/product/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'db-rac1'
CRS-2676: Start of 'ora.mdnsd' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'db-rac1'
CRS-2676: Start of 'ora.gpnpd' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'db-rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'db-rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'db-rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'db-rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'db-rac1'
CRS-2676: Start of 'ora.diskmon' on 'db-rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'db-rac1' succeeded
ASM created and started successfully.
Disk Group OCR created successfully.
clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Successful addition of voting disk 139516a22b594fd0bf2edbd9b2740ce8.
Successfully replaced voting disk group with +OCR.
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   139516a22b594fd0bf2edbd9b2740ce8 (/dev/oracleasm/disks/OCR1) [OCR]
Located 1 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'db-rac1'
CRS-2676: Start of 'ora.asm' on 'db-rac1' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'db-rac1'
CRS-2676: Start of 'ora.OCR.dg' on 'db-rac1' succeeded
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@db-rac1 grid]#

Execute root.sh on the 2nd node:
[root@db-rac2 grid]# ./root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/grid/product/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node db-rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@db-rac2 grid]#

Now check that crsd.bin has started on both nodes.
[root@db-rac2 grid]# ps -ef|grep d.bin
root      5952     1  0 03:42 ?        00:00:02 /u01/app/grid/product/11.2.0/grid/bin/ohasd.bin reboot
grid      7733     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
grid      7744     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/mdnsd.bin
grid      7762     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gpnpd.bin
root      7805     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdmonitor
grid      7808     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/gipcd.bin
root      7828     1  0 03:43 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/cssdagent
grid      7850     1  0 03:43 ?        00:00:01 /u01/app/grid/product/11.2.0/grid/bin/ocssd.bin
root      7905     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
root      7916     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/octssd.bin
root      7938     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/osysmond.bin
root      7963     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/ologgerd -m db-rac1 -r -d /u01/app/grid/product/11.2.0/grid/crf/db/db-rac2
root      8124     1  3 03:44 ?        00:00:07 /u01/app/grid/product/11.2.0/grid/bin/crsd.bin reboot
grid      8141     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmd.bin
grid      8237  8141  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/evmlogger.bin -o /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/grid/product/11.2.0/grid/evm/log/evmlogger.log
grid      8273     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/oraagent.bin
root      8276     1  0 03:44 ?        00:00:00 /u01/app/grid/product/11.2.0/grid/bin/orarootagent.bin
root      9196  4627  0 03:48 pts/1    00:00:00 grep d.bin

Check the cluster status and resources.
[root@db-rac1 bin]# ./crsctl check cluster -all
**************************************************************
db-rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
db-rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[root@db-rac1 bin]#
[root@db-rac1 bin]#
[root@db-rac1 bin]# ./crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.OCR.dg
ONLINE ONLINE db-rac1
ONLINE ONLINE db-rac2
ora.asm
ONLINE ONLINE db-rac1 Started
ONLINE ONLINE db-rac2 Started
ora.gsd
OFFLINE OFFLINE db-rac1
OFFLINE OFFLINE db-rac2
ora.net1.network
ONLINE ONLINE db-rac1
ONLINE ONLINE db-rac2
ora.ons
ONLINE ONLINE db-rac1
ONLINE ONLINE db-rac2
ora.registry.acfs
ONLINE ONLINE db-rac1
ONLINE ONLINE db-rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db-rac1
ora.cvu
      1        ONLINE  ONLINE       db-rac1
ora.db-rac1.vip
      1        ONLINE  ONLINE       db-rac1
ora.db-rac2.vip
      1        ONLINE  ONLINE       db-rac2
ora.oc4j
      1        ONLINE  ONLINE       db-rac1
ora.scan1.vip
      1        ONLINE  ONLINE       db-rac1
Finally, the OCR recovery is done.
[root@db-rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2604
         Available space (kbytes) :     259516
         ID                       : 1732998155
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
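
To repeat this final check on every node without reading the whole report, the result can be reduced to a pass/fail. This is a minimal sketch that relies only on strings visible in the two ocrcheck outputs in this post: the success line above, and the PROT-/PROC- error prefixes seen before the recovery. The function name ocr_healthy is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ocrcheck` output on stdin and succeeds only if
# the integrity check passed and no PROT-/PROC- error line is present.
ocr_healthy() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'Cluster registry integrity check succeeded' &&
        ! printf '%s\n' "$out" | grep -q -e '^PROT-' -e '^PROC-'
}

# Usage (run as root from the Grid home bin directory):
#   ./ocrcheck | ocr_healthy && echo "OCR looks fine" || echo "OCR check failed"
```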