RAC - Cheatsheet

Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

RAC Cheatsheet

This is a quick and dirty cheatsheet on Oracle RAC 10g, as my experience with RAC grows I will update this section, below is a beginners guide on the commands and information that you will require to administer Oracle RAC. Acronyms
Acronyms GCS GES GRD GRM Global Cache Services Global Enqueue Services Global Resource Directory Global Resource Manager in memory database containing current locks and awaiting locks, also known as PCM coordinates the requests of all global enqueues uses the GCS, also known as non-PCM all resources available to the cluster (formed and managed by GCS and GES), see GRD for more details helps to coordinate and communicate the locks requests between Oracle processes runs on each node with one GSD process per node. The GSD coordinates with the cluster manager to receive requests from clients such as the DBCA, EM, and the SRVCTL utility to execute administrative job tasks such as instance startup or shutdown. The GSD is not an Oracle instance background process and is therefore not started with the Oracle instance formly know as (integrated) Distributed Lock Manager, its another name for GCS it is a identifiable entity it basically has a name or a reference, it can be a area in memory, a disk file or an abstract entity a resource that can be accessed by all the nodes within the cluster examples would be the following Resource (Global) n/a Data Buffer Cache Block Transaction Enqueue Database Data Structures contains a small amount of data regarding the lock controls the DLM traffic between instances (messaging tickets)

GSD

Global Services Daemon

PCM (IDLM) Resource

Parallel Cache Management n/a

LVB TRFC

Lock Value Block Traffic Controller

Files and Directories


Files and Directories $ORA_CRS_HOME/cdata/<cluster_name> $ORA_HOME/log/<hostname>/client/ocrconfig_<pid>.log $ORA_CRS_HOME/crs/log $ORA_CRS_HOME/crs/init $ORA_CRS_HOME/css/log $ORA_CRS_HOME/css/init $ORA_CRS_HOME/evm/log $ORA_CRS_HOME/evm/init $ORA_CRS_HOME/srvm/log $ORA_CRS_HOME/log OCR backups (default location) OCR command log file contains trace files for the CRS resources contains trace files for the CRS daemon during startup, a good place to start contains cluster reconfigurations, missed check-ins, connects and disconnects from the client CSS listener. Look here to obtain when reboots occur contains core dumps from the cluster synchronization service daemon (OCSd) logfiles for the event volume manager and eventlogger daemon pid and lock files for EVM logfiles for Oracle Cluster Registry (OCR) log fles for Oracle clusterware which contains diagnostic messages at the Oracle cluster level

Useful Views/Tables
GCS and Cache Fusion Diagnostics v$cache v$cache_transfer v$instance_cache_transfer v$cr_block_server v$current_block_server v$gc_element v$lock v$ges_blocking_enqueue v$enqueue_statistics v$resource_limits v$locked_object v$ges_statistics v$ges_enqueue contains information about every cached block in the buffer cache contains information from the block headers in SGA that have been pinged at least once contains information about the transfer of cache blocks through the interconnect contains statistics about CR block transfer across the instances contains statistics about current block transfer across the instances contains one-to-one information for each global cache resource used by the buffer cache GES diagnostics contains information about locks held within a database and outstanding requests for locks and latches contains information about locks that are being blocked or blocking others and locks that are known to the lock manager contains details about enqueue statistics in the instance display enqueue statistics contains information about DML locks acquired by different transactions in databases with their mode held contains miscellaneous statistics for GES contains information about all locks known to the lock manager

v$ges_convert_local v$ges_convert_remote v$ges_resource v$ges_misc v$ges_traffic_controller v$hvmaster_info v$gcshvmaster_info v$gcspfmaster_info

contains information about all local GES operations contains information about all remote GES operations contains information about all resources known to the lock manager contains information about messaging traffic information contains information about the message ticket usage Dynamic Resource Remastering contains information about current and previous master instances of GES resources in relation to hash value ID of resource the same as above but globally conatins information about current and previous masters about GCS resources belonging to files mapped to a particular master, including the number of times the resource has remastered Cluster Interconnect contains information about interconnects that are being used for cluster communication same as above but also contains interconnects that AC is aware off that are not being used Miscellanous services running on an instance display LMS daemon statistics display LMD daemon statistics

v$cluster_interconnects v$configured_interconnects v$service x$kjmsdp x$kjmddp

Useful Parameters
Parameters cluster_interconnects _gcs_fast_config _lm_master_weight _gcs_resources _lm_tickets _lm_ticket_active_sendback _db_block_max_cr_dba _fairness_threshold _gc_affinity_time _gc_affinity_limit _gc_affinity_minimum _lm_file_affinity _lm_dynamic_remastering _gc_defer_time _lgwr_async_broadcast specify a specific IP address to use for the inetrconnect enables fast reconfiguration for gcs locks (true|false) controls which instance will hold or (re)master more resources than others controls the number of resources an instance will master at a time controls the number of message tickets controls the number of message tickets (aggressive messaging) limits the number of CR copies per DBA on the buffer cache (see grd) used when too many CR requested arrive for a particular buffer and the block becomes disowned (see grd) specifies interval minutes for reamstering defines the number of times a instance access the resource before remastering defines the minimum number of times a instance access the resource before remastering disables dynamic remastering for the objects belonging to those files enable or disable remastering define the time by which an instance deferred downgrading a lock (see Cache Fusion) change the SCN boardcast method (see troubleshooting)

Processes

Oracle RAC Daemons and Processes OPROCd Process Monitor provides basic cluster integrity services

EVMd OCSSd CRSd

Event Management Cluster Synchronization Services Cluster Ready Services

spawns a child process event logger and generates callouts basic node membership, group services, basic locking resource monitoring, failover and node recovery this is the cache fusion part, it handles the consistent copies of blocks that are tranferred between instances. It receives requests from LMD to perform lock requests. I rools back any uncommitted transactions. There can be upto ten LMS processes running and can be started dynamically if demand requires it. they manage lock manager service requests for GCS resources and send them to a service queue to be handled by the LMSn process. It also handles global deadlock detection and monitors for lock conversion timeouts.

LMSn

Lock Manager Server process - GCS

LMON

Lock Monitor Process - GES

this process manages the GES, it maintains consistency of GCS memory in case of process death. It is also responsible for cluster reconfiguration and locks reconfiguration (node joining or leaving), it checks for instance deaths and listens for local messaging. A detailed log file is created that tracks any reconfigurations that have happened.

LMD LCK0

Lock Manager Daemon - GES Lock Process - GES

this manages the enqueue manager service requests for the GCS. It also handles deadlock detention and remote resource requests from other instances. manages instance resource requests and cross-instance call operations for shared resources. It builds a list of invalid lock elements and validates lock elements during recovery. This is a lightweight process, it uses the DIAG framework to monitor the healt of the cluster. It captures information for later diagnosis in the event of failures. It will perform any neccessary recovery if an operational hang is detected.

DIAG

Diagnostic Daemon

General Administration
Managing the Cluster starting stopping
/etc/init.d/init.crs start crsctl start crs /etc/init.d/init.crs stop crsctl stop crs /etc/init.d/init.crs enable /etc/init.d/init.crs disable crsctl enable crs crsctl disable crs

enable/disable at boot time

Managing the database configuration with SRVCTL


srvctl start database -d <database> -o <option> Note: starts listeners if not already running, you can use the -o option to specify startup/shutdown options force open mount nomount srvctl stop database -d <database> -o <option> Note: the listeners are not stopped, you can use the -o option to specify startup/shutdown options immediate abort normal transactional srvctl [start|stop] database -d <database> -i <instance>,<instance> srvctl config database srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl status status status status status stop stop stop stop stop start start start start start database -d <database> instance -d <database> -i <instance>,<instance> service -d <database> nodeapps -n <node> asm -n <node>

start all instances

stop all instances

start/stop particular instance display the registered databases

status

stopping/starting

database -d <database> instance -d <database> -i <instance>,<instance> service -d <database> -s <service>,<service> -i <instance>,<instance> nodeapps -n <node> asm -n <node> database -d <database> instance -d <database> -i <instance>,<instance> service -d <database> -s <service>,<service> -i <instance>,<instance> nodeapps -n <node> asm -n <node>

srvctl add database -d <database> -o <oracle_home>

adding/removing

srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl srvctl

add add add add

instance -d <database> -i <instance> -n <node> service -d <database> -s <service> -r <preferred_list> nodeapps -n <node> -o <oracle_home> -A <name|ip>/network asm -n <node> -i <asm_instance> -o <oracle_home>

remove database -d <database> -o <oracle_home> remove instance -d <database> -i <instance> -n <node> remove service -d <database> -s <service> -r <preferred_list> remove nodeapps -n <node> -o <oracle_home> -A <name|ip>/network asm remove -n <node>

OCR utilities log file checking


$ORA_HOME/log/<hostname>/client/ocrconfig_<pid>.log ocrcheck Note: will return the OCR version, total space allocated, space used, free space, location of each device and the result of the integrity check ocrdump -backupfile <file> Note: by default it dumps the contents into a file named OCRDUMP in the current directory ocrconfig -export <file> ocrconfig -restore <file> # show backups ocrconfig -showbackup # to change the location of the backup, you can even specify a ASM disk ocrconfig -backuploc <path|+asm> # perform a backup, will use the location specified by the -backuploc location ocrconfig -manualbackup # perform a restore ocrconfig -restore <file> # delete a backup orcconfig -delete <file> Note: there are many more option so see the ocrconfig man page ## add/relocate the ocrmirror file to the specified location ocrconfig -replace ocrmirror '/ocfs2/ocr2.dbf'

dump contents export/import

backup/restore

add/remove/replace

## relocate an existing OCR file ocrconfig -replace ocr '/ocfs1/ocr_new.dbf' ## remove the OCR or OCRMirror file ocrconfig -replace ocr ocrconfig -replace ocrmirror

CRS Administration
CRS Administration
## Starting CRS using Oracle 10g R1 not possible ## Starting CRS using Oracle 10g R2 $ORA_CRS_HOME/bin/crsctl start crs ## Stopping CRS using Oracle 10g R1 srvctl stop database -d <database> srvctl stop asm -n <node> srvctl stop nodeapps -n <node> /etc/init.d/init.crs stop ## Stopping CRS using Oracle 10g R2 $ORA_CRS_HOME/bin/crsctl stop crs ## use to stop CRS restarting after a reboot

starting

stopping

disabling/enabling

## Oracle 10g R1 /etc/init.d/init.crs [disable|enable] ## Oracle 10g R2 $ORA_CRS_HOME/bin/crsctl [disable|enable] crs $ORA_CRS_HOME/bin/crsctl $ORA_CRS_HOME/bin/crsctl $ORA_CRS_HOME/bin/crsctl $ORA_CRS_HOME/bin/crsctl $ORA_CRS_HOME/bin/crsctl check check check check check crs evmd cssd crsd install -wait 600

checking

Resource Applications (CRS Utilities) status create profile register/unregister application Start/Stop an application Resource permissions
$ORA_CRS_HOME/bin/crs_stat $ORA_CRS_HOME/bin/crs_profile $ORA_CRS_HOME/bin/crs_register $ORA_CRS_HOME/bin/crs_unregister $ORA_CRS_HOME/bin/crs_start $ORA_CRS_HOME/bin/crs_stop $ORA_CRS_HOME/bin/crs_getparam $ORA_CRS_HOME/bin/crs_setparam $ORA_CRS_HOME/bin/crs_relocate

Relocate a resource member number/name local node name activates logging display delete set
olsnodes -n olsnodes -l olsnodes -g

Nodes

Oracle Interfaces
oifcfg getif oicfg delig -global oicfg setif -global <interface name>/<subnet>:public oicfg setif -global <interface name>/<subnet>:cluster_interconnect

Global Services Daemon Control starting stopping status create a new configuration
gsdctl start gsdctl stop gsdctl status

Cluster Configuration (clscfg is used during installation)


clscfg -install

upgrade or downgrade and existing clscfg -upgrade clscfg -downgrade configuration add or delete a node from the configuration create a special single-node configuration for ASM brief listing of terminology used in the other nodes used for tracing help
clscfg -add clscfg -delete clscfg -local

clscfg -concepts clscfg -trace clscfg -h

Cluster Name Check print cluster name print the clusterware version
cemulto -n Note: in Oracle 9i the ulity was called "cemutls" cemulto -w Note: in Oracle 9i the ulity was called "cemutls"

Node Scripts Add Node Delete Node


addnode.sh Note: see adding and deleting nodes deletenode.sh Note: see adding and deleting nodes

Enqueues
SQL> SQL> SQL> SQL> column column column column current_utilization heading current max_utilization heading max_usage initial_allocation heading initial resource_limit format a23;

displaying statistics

SQL> select * from v$resource_limit;

Messaging (tickets) ticket usage


select local_nid local, remote_nid remote, tckt_avail avail, tckt_limit limit, snd_q_len send_queue, tckt_wait waiting from v$ges_traffic_controller; SQL> oradebug setmypid SQL> oradebug unlimit SQL> oradebug lkdebug -t

dump ticket information

Lighwork Rule and Fairness Threshold


select cr_requests, light_works, data_requests, fairness_down_converts from v$cr_block_server;

downconvert

Note: lower the _fairness_threshold if the ratio goes above 40%, set to 0 if the instance is a query only instance.

Remastering
## Obtain the OBJECT_ID form the below table SQL> select * from v$gcspfmaster_info;

force dynamic remastering (DRM)

## Determine who masters it SQL> oradebug setmypid SQL> oradebug lkdebug -a <OBJECT_ID> ## Now remaster the resource SQL> oradebug setmypid SQL> oradebug lkdebug -m pkey <OBJECT_ID>

GRD, SRVCTL, GSD and SRVCONFIG Tracing Enable tracing


$ export SRVM_TRACE=true

Disable tracing

$ export SRVM_TRACE=""

Voting Disk
adding deleting querying
crsctl add css votedisk <file> crsctl delete css votedisk <file> crsctl query css votedisk

You might also like