11gR2 Clusterware and Grid Home - What You Need To Know (Doc ID 1053147.1)
Type: BULLETIN
Status: PUBLISHED
Last Major Update: Oct 30, 2014
Last Update: Oct 30, 2014

In this Document:
  Purpose
  Scope
  Details
    11gR2 Clusterware Key Facts
    Clusterware Startup Sequence
    Important Log Locations
    Clusterware Resource Status Check
    Clusterware Resource Administration
    OCRCONFIG Options

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.1 to 11.2.0.1 [Release 11.2]
Information in this document applies to any platform.
Currency checked on Oct 30 2014

PURPOSE
The 11gR2 Clusterware has undergone numerous changes since the previous release. For information on the previous release(s), see Note: 259301.1 "CRS and 10g/11.1 Real Application Clusters". This document is intended to go over the 11.2 Clusterware, which has some similarities to and some differences from the previous version(s).
DETAILS

11gR2 Clusterware Key Facts
11gR2 Clusterware is required to be up and running prior to installing an 11gR2 Real Application Clusters database.
The GRID home consists of the Oracle Clusterware and ASM. ASM should not be in a separate home.
The 11gR2 Clusterware can be installed in "Standalone" mode for ASM and/or "Oracle Restart" single node support. This clusterware is a subset of the full clusterware described in this document.
The 11gR2 Clusterware can be run by itself or on top of vendor clusterware. See the certification matrix for certified combinations. Ref: Note: 184875.1 "How To Check The Certification Matrix for Real Application Clusters"
The GRID Home and the RAC/DB Home must be installed in different locations.
The 11gR2 Clusterware requires shared OCR and voting files. These can be stored on ASM or a cluster filesystem.
The OCR is backed up automatically every 4 hours to <GRID_HOME>/cdata/<clustername>/ and can be restored via ocrconfig.
The voting file is backed up into the OCR at every configuration change and can be restored via crsctl.
The 11gR2 Clusterware requires at least one private network for inter-node communication and at least one public network for external communication. Several virtual IPs need to be registered with DNS: the node VIPs (one per node) and the SCAN VIPs (three). This can be done manually by your network administrator, or you can optionally configure the "GNS" (Grid Naming Service) in the Oracle Clusterware to handle this for you (note that GNS requires its own VIP).
A SCAN (Single Client Access Name) is provided for clients to connect to. For more information on SCAN see Note: 887522.1 "Grid Infrastructure Single Client Access Name (SCAN) Explained".
The root.sh script at the end of the clusterware installation starts the clusterware stack. For information on troubleshooting root.sh issues see Note: 1053970.1.
Only one set of clusterware daemons can be running per node.
On Unix, the clusterware stack is started via the init.ohasd script referenced in /etc/inittab with "respawn".
A node can be evicted (rebooted) if it is deemed to be unhealthy. This is done so that the health of the entire cluster can be maintained. For more information see Note: 1050693.1 "Troubleshooting 11.2 Clusterware Node Evictions (Reboots)".
Either have vendor time synchronization software (like NTP) fully configured and running, or have it not configured at all and let CTSS handle time synchronization. See Note: 1054006.1 for more information.
If installing DB homes for a lower version, you will need to pin the nodes in the clusterware or you will see ORA-29702 errors. See Note: 946332.1 and Note: 948456.1 for more information.
The clusterware stack can be started by either booting the machine, running "crsctl start crs" to start the clusterware stack on the local node, or running "crsctl start cluster" to start the clusterware on all nodes. Note that crsctl is in the <GRID_HOME>/bin directory and that "crsctl start cluster" will only work if ohasd is running.
The clusterware stack can be stopped by either shutting down the machine, running "crsctl stop crs" to stop the clusterware stack on the local node, or running "crsctl stop cluster" to stop the clusterware on all nodes.
Killing clusterware daemons is not supported.
The instance is now part of the .db resource in "crsctl stat res -t" output; there is no separate .inst resource for an 11gR2 instance.
Note that it is also a good idea to follow the RAC Assurance best practices in Note: 810394.1
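As a quick orientation to the facts above, a few of the commands they reference (run from <GRID_HOME>/bin; the stack-control and ocrconfig commands are run as root). The "query css votedisk" arguments are standard 11.2 usage, shown here as an illustration rather than quoted from this note:

# crsctl check crs               -- is the stack up on this node?
# crsctl start crs               -- start the stack on this node
# ocrconfig -showbackup          -- list automatic (and manual) OCR backups
# crsctl query css votedisk      -- list the voting files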
Clusterware Startup Sequence

The following is the Clusterware startup sequence (image from the "Oracle Clusterware Administration and Deployment Guide"):
Don't let this picture scare you too much. You aren't responsible for managing all of these processes, that is the Clusterware's
job!
Short summary of the startup sequence: INIT spawns init.ohasd (with respawn), which in turn starts the OHASD process (Oracle High Availability Services Daemon). This daemon spawns 4 processes: the cssdagent (which spawns CSSD), an orarootagent, an oraagent, and the cssdmonitor (which monitors CSSD and node health along with the cssdagent). The OHASD orarootagent in turn brings up CRSD, the primary daemon responsible for managing cluster resources, and CRSD spawns its own pair of agents:
orarootagent - Agent responsible for managing all root owned crsd resources.
oraagent - Agent responsible for managing all oracle owned crsd resources.
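A quick way to confirm this wiring on Linux is to check the inittab entry and the running daemon. The inittab line shown is the typical 11.2 Linux entry; it can differ slightly by platform:

$ grep ohasd /etc/inittab
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
$ ps -ef | grep -v grep | grep ohasd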
Important Log Locations

Clusterware daemon logs are all under <GRID_HOME>/log/<nodename>. Structure under <GRID_HOME>/log/<nodename>:
./evmd:
./gipcd:
./gnsd:
./gpnpd:
./mdnsd:
./ohasd:
./racg:
./racg/racgeut:
./racg/racgevtf:
./racg/racgmain:
./srvm:
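Note in particular the clusterware alert log, which in 11.2 sits at the top of this directory tree and is usually the first file to check (the file name embeds the node name):

$ tail -f <GRID_HOME>/log/<nodename>/alert<nodename>.log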
The cfgtoollogs directories under <GRID_HOME> and $ORACLE_BASE contain other important logfiles, specifically those for rootcrs.pl and for configuration assistants like ASMCA.
The diagcollection.pl script under <GRID_HOME>/bin can be used to automatically collect important files for support. Run this
as the root user.
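A minimal sketch of a collection run, using the --collect option (run as root):

# cd <GRID_HOME>/bin
# ./diagcollection.pl --collect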
Clusterware Resource Status Check

The following command will display the status of all cluster resources:
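For example, from <GRID_HOME>/bin (the same command referenced in the key facts above):

$ ./crsctl stat res -t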
Clusterware Resource Administration

Srvctl and crsctl are used to manage clusterware resources. The general rule is to use srvctl for whatever resource management you can; crsctl should only be used for things that you cannot do with srvctl (like starting the cluster). Both have a help feature to see the available syntax.
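For example, a database resource is managed with srvctl, while the stack itself is managed with crsctl. The syntax is taken from the listings below, with <db_unique_name> as a placeholder:

$ srvctl status database -d <db_unique_name>
$ srvctl stop database -d <db_unique_name> -o immediate
# crsctl stop crs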
Note that the following only shows the available srvctl syntax. For additional explanation on what these commands do, see
the Oracle Documentation.
Srvctl syntax:
$ srvctl -h
Usage: srvctl [-V]
Usage: srvctl add database -d <db_unique_name> -o <oracle_home> [-m <domain_name>] [-p <spfile>] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY}] [-s <start_options>] [-t <stop_options>] [-n <db_name>] [-y {AUTOMATIC | MANUAL}] [-g "<serverpool_list>"] [-x <node_name>] [-a "<diskgroup_list>"]
Usage: srvctl config database [-d <db_unique_name> [-a] ]
Usage: srvctl start database -d <db_unique_name> [-o <start_options>]
Usage: srvctl stop database -d <db_unique_name> [-o <stop_options>] [-f]
Usage: srvctl status database -d <db_unique_name> [-f] [-v]
Usage: srvctl enable database -d <db_unique_name> [-n <node_name>]
Usage: srvctl disable database -d <db_unique_name> [-n <node_name>]
Usage: srvctl modify database -d <db_unique_name> [-n <db_name>] [-o <oracle_home>] [-u <oracle_user>] [-m <domain>] [-p <spfile>] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY}] [-s <start_options>] [-t <stop_options>] [-y {AUTOMATIC | MANUAL}] [-g "<serverpool_list>" [-x <node_name>]] [-a "<diskgroup_list>"|-z]
Usage: srvctl remove database -d <db_unique_name> [-f] [-y]
Usage: srvctl getenv database -d <db_unique_name> [-t "<name_list>"]
Usage: srvctl setenv database -d <db_unique_name> {-t <name>=<val>[,<name>=<val>,...] | -T <name>=<val>}
Usage: srvctl unsetenv database -d <db_unique_name> -t "<name_list>"
Usage: srvctl add listener [-l <lsnr_name>] [-s] [-p "[TCP:]<port>[, ...][/IPC:<key>][/NMP:<pipe_name>][/TCPS:<s_port>] [/SDP:<port>]"] [-o <oracle_home>] [-k <net_num>]
Usage: srvctl config listener [-l <lsnr_name>] [-a]
Usage: srvctl start listener [-l <lsnr_name>] [-n <node_name>]
Usage: srvctl stop listener [-l <lsnr_name>] [-n <node_name>] [-f]
Usage: srvctl status listener [-l <lsnr_name>] [-n <node_name>]
Usage: srvctl enable listener [-l <lsnr_name>] [-n <node_name>]
Usage: srvctl disable listener [-l <lsnr_name>] [-n <node_name>]
Usage: srvctl modify listener [-l <lsnr_name>] [-o <oracle_home>] [-p "[TCP:]<port>[, ...][/IPC:<key>][/NMP:<pipe_name>][/TCPS:<s_port>] [/SDP:<port>]"] [-u <oracle_user>] [-k <net_num>]
Usage: srvctl remove listener [-l <lsnr_name> | -a] [-f]
Usage: srvctl getenv listener [-l <lsnr_name>] [-t <name>[, ...]]
Usage: srvctl setenv listener [-l <lsnr_name>] -t "<name>=<val> [,...]" | -T "<name>=<value>"
Usage: srvctl unsetenv listener [-l <lsnr_name>] -t "<name>[, ...]"
Usage: srvctl add srvpool -g <pool_name> [-l <min>] [-u <max>] [-i <importance>] [-n "<server_list>"]
Usage: srvctl config srvpool [-g <pool_name>]
Usage: srvctl status srvpool [-g <pool_name>] [-a]
Usage: srvctl status server -n "<server_list>" [-a]
Usage: srvctl relocate server -n "<server_list>" -g <pool_name> [-f]
Usage: srvctl modify srvpool -g <pool_name> [-l <min>] [-u <max>] [-i <importance>] [-n "<server_list>"]
Usage: srvctl remove srvpool -g <pool_name>
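A hypothetical server pool example (the pool and server names are invented for illustration): relocate two servers into a pool, then verify its membership:

$ srvctl relocate server -n "node3,node4" -g mypool
$ srvctl status srvpool -g mypool -a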
Crsctl Syntax (for further explanation of these commands see the Oracle Documentation)
$ ./crsctl -h
Usage: crsctl add - add a resource, type or other entity
crsctl check - check a service, resource or other entity
crsctl config - output autostart configuration
crsctl debug - obtain or modify debug state
crsctl delete - delete a resource, type or other entity
crsctl disable - disable autostart
crsctl enable - enable autostart
crsctl get - get an entity value
crsctl getperm - get entity permissions
crsctl lsmodules - list debug modules
crsctl modify - modify a resource, type or other entity
crsctl query - query service state
crsctl pin - Pin the nodes in the nodelist
crsctl relocate - relocate a resource, server or other entity
crsctl replace - replaces the location of voting files
crsctl setperm - set entity permissions
crsctl set - set an entity value
crsctl start - start a resource, server or other entity
crsctl status - get status of a resource or other entity
crsctl stop - stop a resource, server or other entity
crsctl unpin - unpin the nodes in the nodelist
crsctl unset - unset a entity value, restoring its default
For more information on each command, run "crsctl <command> -h".
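For example, two read-only checks (run from <GRID_HOME>/bin). The "cluster -all" and "crs activeversion" arguments are standard 11.2 usage, shown as an illustration rather than quoted from the help above:

$ ./crsctl check cluster -all
$ ./crsctl query crs activeversion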
OCRCONFIG Options:
Note that the following only shows the available ocrconfig syntax. For additional explanation on what these commands do,
see the Oracle Documentation.
$ ./ocrconfig -help
Name:
ocrconfig - Configuration tool for Oracle Cluster/Local Registry.
Synopsis:
ocrconfig [option]
option:
[-local] -export <filename>
- Export OCR/OLR contents to a file
[-local] -import <filename> - Import OCR/OLR contents from a file
[-local] -upgrade [<user> [<group>]]
- Upgrade OCR from previous version
-downgrade [-version <version string>]
- Downgrade OCR to the specified version
[-local] -backuploc <dirname> - Configure OCR/OLR backup location
[-local] -showbackup [auto|manual] - Show OCR/OLR backup information
[-local] -manualbackup - Perform OCR/OLR backup
[-local] -restore <filename> - Restore OCR/OLR from physical backup
-replace <current filename> -replacement <new filename>
- Replace a OCR device/file <filename1> with
<filename2>
-add <filename> - Add a new OCR device/file
-delete <filename> - Remove a OCR device/file
-overwrite - Overwrite OCR configuration on disk
-repair -add <filename> | -delete <filename> | -replace <current filename>
-replacement <new filename>
- Repair OCR configuration on the local node
-help - Print out this help information
Note:
* A log file will be created in
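For example, to take a manual OCR backup and then list the manual backups, using the options above (run as root):

# ocrconfig -manualbackup
# ocrconfig -showbackup manual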
OLSNODES Options
Note that the following only shows the available olsnodes syntax. For additional explanation on what these commands do,
see the Oracle Documentation.
$ ./olsnodes -h
Usage: olsnodes [ [-n] [-i] [-s] [-t] [<node> | -l [-p]] | [-c] ] [-g] [-v]
where
-n print node number with the node name
-p print private interconnect address for the local node
-i print virtual IP address with the node name
<node> print information for the specified node
-l print information for the local node
-s print node status - active or inactive
-t print node type - pinned or unpinned
-g turn on logging
-v Run in debug mode; use at direction of Oracle Support only.
-c print clusterware name
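For example, to print every node with its node number, active/inactive status, and pinned/unpinned state:

$ olsnodes -n -s -t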
CLUVFY Options

Note that the following only shows the general cluvfy usage. For additional explanation on what these commands do, see the Oracle Documentation.
Component Options:
USAGE:
cluvfy comp <component-name> <component-specific options> [-verbose]
Stage Options:
USAGE:
cluvfy stage {-pre|-post} <stage-name> <stage-specific options> [-verbose]
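For example (the component name "ocr" and the stage name "crsinst" are standard cluvfy arguments, though not listed in this note):

$ cluvfy comp ocr -n all -verbose
$ cluvfy stage -post crsinst -n all -verbose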
To discuss this topic further with Oracle experts and industry peers, we encourage you to review, join, or start a discussion in the My Oracle Support Database - RAC/Scalability Community.
REFERENCES
NOTE:810394.1 - RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent)
NOTE:1050693.1 - Troubleshooting 11.2 Clusterware Node Evictions (Reboots)
NOTE:1053970.1 - Troubleshooting 11.2 Grid Infrastructure root.sh Issues
NOTE:1054006.1 - CTSSD Runs in Observer Mode Even Though No Time Sync Software is Running
NOTE:184875.1 - How To Check The Certification Matrix for Real Application Clusters
NOTE:259301.1 - CRS and 10g/11.1 Real Application Clusters
NOTE:887522.1 - Grid Infrastructure Single Client Access Name (SCAN) Explained
NOTE:946332.1 - Unable To Create 10.1 or 10.2 or 11.1 (< 11gR2) ASM RAC Databases (ORA-29702) Using Brand New 11gR2 Grid Infrastructure Installation
Attachments
11.2 Clusterware (58.88 KB)
cwadd004.gif (29.17 KB)
Related Products
Oracle Database Products > Oracle Database Suite > Oracle Database > Oracle Database - Enterprise Edition > Clusterware > Miscellaneous Issues
Keywords
11GR2; CLUSTER; CLUSTERWARE; CRSCTL; GRID; GRID INFRASTRUCTURE; REAL APPLICATION CLUSTERS; SCAN
Errors
ORA-29702
Copyright (c) 2014, Oracle. All rights reserved.