Sun Cluster Commands

File locations

                              cluster 3.0/3.1                      cluster 3.2
sccheck logs                  /var/cluster/sccheck/report.<date>   /var/cluster/logs/cluster_check
CCR files                     /etc/cluster/ccr                     /etc/cluster/ccr/<zone name>
Cluster infrastructure file   /etc/cluster/ccr/infrastructure      /etc/cluster/ccr/<zone name>/infrastructure
SCSI Reservations (cluster 3.0/3.1 and 3.2)

Display reservation keys     scsi2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
                             scsi3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2
Determine the device owner   scsi2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
                             scsi3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2
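A quick usage sketch for the reservation checks above, assuming a hypothetical DID device d4 on a two-node cluster (pick a real device from scdidadm -L; d4 is only an example):

    # Map the DID device to its physical paths first (hypothetical device d4)
    scdidadm -L | grep d4

    # List the SCSI-3 registration keys currently on the device
    /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2

    # Show which node currently holds the reservation on the same device
    /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2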
Cluster information

                                   cluster 3.0/3.1    cluster 3.2
Quorum info                        scstat -q          clquorum show
Cluster components                 scstat -pv         cluster show
Resource/Resource group status     scstat -g          clrg show / clrs show
IP Networking Multipathing         scstat -i          clnode status -m
Status of all nodes                scstat -n          clnode show
Disk device groups                 scstat -D          cldg show
Transport info                     scstat -W          clintr show
Detailed resource/resource group   scrgadm -pv        clrs show -v / clrg show -v
Cluster configuration info         scconf -p          cluster show -v
Installation info                  scinstall -pv      clnode show-rev -v
Cluster Configuration

                                      cluster 3.0/3.1                       cluster 3.2
Integrity check                       sccheck                               sccheck
Configure the cluster (add nodes,
  add data services, etc)             scinstall                             scinstall
Cluster configuration utility
  (quorum, data services, resource
  groups, etc)                        scsetup                               clsetup
Add a node                            scconf -a -T node=<host>              claccess allow -h <host>
Remove a node                         scconf -r -T node=<host>              claccess deny -h <host>
Prevent new nodes from entering       scconf -a -T node=.                   claccess deny-all
Put a node into maintenance state     scconf -c -q node=<node>,maintstate   clquorum disable <node>
    Note: use scstat -q (3.1) or clquorum status (3.2) to verify that the node is in maintenance
    mode; the vote count should be zero for that node.
Get a node out of maintenance state   scconf -c -q node=<node>,reset        clquorum reset
    Note: use scstat -q (3.1) or clquorum status (3.2) to verify that the node is out of
    maintenance mode; the vote count should be one for that node.
Admin Quorum Device

Quorum devices are nodes, disk devices, and quorum servers, so the total quorum will be all nodes
and devices added together.

Adding a device to the quorum
    cluster 3.0/3.1: scconf -a -q globaldev=d11
                     Note: if you get the error message "unable to scrub device", use scgdevs
                     (3.1) or cldevice populate (3.2) to add the device to the global device
                     namespace.
    cluster 3.2:     clquorum add d11
Removing a device from the quorum
    cluster 3.0/3.1: scconf -r -q globaldev=d11
    cluster 3.2:     clquorum remove d11
Removing the last quorum device
    cluster 3.0/3.1: ## Evacuate all nodes
                     ## Put cluster into maint mode
                     scconf -c -q installmode
                     ## Remove the quorum device
                     scconf -r -q globaldev=d11
                     ## Check the quorum devices
                     scstat -q
Resetting quorum info (this will bring all offline quorum devices online)
    cluster 3.0/3.1: scconf -c -q reset
    cluster 3.2:     clquorum reset
Bring a quorum device into maintenance mode
    cluster 3.0/3.1: ## obtain the device number
                     scdidadm -L
                     scconf -c -q globaldev=<device>,maintstate
Bring a quorum device out of maintenance mode
    cluster 3.0/3.1: scconf -c -q globaldev=<device>,reset
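A minimal worked sequence for the rows above, assuming a hypothetical shared DID device d11 that all nodes can see (substitute your own device):

    # Confirm the DID device exists and is visible from the nodes (hypothetical device d11)
    scdidadm -L | grep d11

    # Add it as a quorum device (3.1 syntax), then verify the vote counts
    scconf -a -q globaldev=d11
    scstat -q

    # 3.2 equivalent
    clquorum add d11
    clquorum status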
"evice Configuration
                                                  cluster 3.0/3.1                   cluster 3.2
Lists all the configured devices including
  paths across all nodes                          scdidadm -L                       cldevice list -v
List all the configured devices including
  paths on node only                              scdidadm -l                       cldevice list -n <node>
Reconfigure the device database, creating
  new instances numbers if required               scdidadm -r                       cldevice refresh
Lists all the configured devices including
  paths & fencing                                 N/A                               cldevice show
Rename a did instance                             N/A                               cldevice rename
Clearing no longer used did                       scdidadm -C                       cldevice clear
Perform the repair procedure for a particular
  path (use this when a disk gets replaced)       scdidadm -R <c0t0d0s0> - device   cldevice repair
                                                  scdidadm -R 2 - device id
Configure the global device namespace             scgdevs                           cldevice populate
Status of all disk paths                          scdpm -p all:all                  cldevice status
                                                  Note: (<host>:<disk>)
Monitor device path                               scdpm -m <node:disk path>         cldevice monitor
Unmonitor device path                             scdpm -u <node:disk path>         cldevice unmonitor
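A minimal repair sketch after a physical disk swap, assuming a hypothetical DID instance 6 behind device c1t2d0 (take the real values from scdidadm -L):

    # Identify the DID instance that maps to the replaced disk (hypothetical values)
    scdidadm -L | grep c1t2d0

    # Re-run the repair procedure so the DID database picks up the new device ID (3.1)
    scdidadm -R 6

    # 3.2 equivalent, then re-check the disk paths
    cldevice repair d6
    cldevice status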
"evice grou#
Adding/Registering
    cluster 3.0/3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true
    cluster 3.2:     cldg create -t <type> ... <devicegroup>
Removing
    cluster 3.0/3.1: scconf -r -D name=<device group>
    cluster 3.2:     cldg delete <devicegroup>
Removing a single node
    cluster 3.0/3.1: scconf -r -D name=<device group>,nodelist=<host>
    cluster 3.2:     cldg remove-node -n <node> <devicegroup>
Switch
    cluster 3.0/3.1: scswitch -z -D <device group> -h <host>
    cluster 3.2:     cldg switch -n <host> <devicegroup>
Transport cable
Resource Groups

                            cluster 3.0/3.1                                       cluster 3.2
Adding                      scrgadm -a -g <res_group> -h <host>,<host>            clrg create -n <host>,<host> <res_group>
Removing                    scrgadm -r -g <res_group>                             clrg delete <res_group>
changing properties         scrgadm -c -g <res_group> -y <property=value>         clrg set -p <name>=<value> <res_group>
Listing                     scstat -g                                             clrg show
Detailed List               scrgadm -pv -g <res_group>                            clrg show -v <res_group>
Display mode type
  (failover or scalable)    scrgadm -pv -g <res_group> | grep "Res Group mode"    clrg show -v <res_group>
Offlining                   scswitch -F -g <res_group>                            clrg offline <res_group>
Onlining                    scswitch -Z -g <res_group>                            clrg online <res_group>
Unmanaging                  scswitch -u -g <res_group>                            clrg unmanage <res_group>
                            Note: all resources in the group must be disabled
Managing                    scswitch -o -g <res_group>                            clrg manage <res_group>
Switching                   scswitch -z -g <res_group> -h <host>                  clrg switch -n <host> <res_group>
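A minimal sketch of creating and failing over a resource group, assuming hypothetical node names node1/node2 and a hypothetical group name app-rg:

    # 3.1: create a failover resource group hosted on two nodes, bring it online, then switch it
    scrgadm -a -g app-rg -h node1,node2
    scswitch -Z -g app-rg
    scswitch -z -g app-rg -h node2
    scstat -g

    # 3.2 equivalent
    clrg create -n node1,node2 app-rg
    clrg manage app-rg
    clrg online app-rg
    clrg switch -n node2 app-rg
    clrg status app-rg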
Resources

Adding failover network resource
    cluster 3.0/3.1: scrgadm -a -L -g <res_group> -l <logicalhost>
    cluster 3.2:     clreslogicalhostname create -g <res_group> <logicalhost>
Adding shared network resource
    cluster 3.0/3.1: scrgadm -a -S -g <res_group> -l <logicalhost>
    cluster 3.2:     clressharedaddress create -g <res_group> <logicalhost>
adding a failover apache application and attaching the network resource
    cluster 3.0/3.1: scrgadm -a -j apache_res -g <res_group> \
                       -t SUNW.apache -y Network_resources_used=<logicalhost> \
                       -y Scalable=False -y Port_list=80/tcp \
                       -x Bin_dir=/usr/apache/bin
adding a shared apache application and attaching the network resource
    cluster 3.0/3.1: scrgadm -a -j apache_res -g <res_group> \
                       -t SUNW.apache -y Network_resources_used=<logicalhost> \
                       -y Scalable=True -y Port_list=80/tcp \
                       -x Bin_dir=/usr/apache/bin
Creating a HAStoragePlus failover resource
    cluster 3.0/3.1: scrgadm -a -g <res_group> -j <resource> -t SUNW.HAStoragePlus \
                       -x FilesystemMountPoints=/oracle/data01 -x AffinityOn=true
    cluster 3.2:     clresource create -g <res_group> -t SUNW.HAStoragePlus \
                       -p FilesystemMountPoints=/oracle/data01 -p AffinityOn=true <resource>
Removing (must disable the resource first)
    cluster 3.0/3.1: scrgadm -r -j <resource>
    cluster 3.2:     clresource delete <resource>
changing properties
    cluster 3.0/3.1: scrgadm -c -j <resource> -y <property=value>
    cluster 3.2:     clresource set -p <property>=<value> <resource>
List
    cluster 3.0/3.1: scstat -g
    cluster 3.2:     clresource list
Detailed List
    cluster 3.0/3.1: scrgadm -pv -j <resource>
                     scrgadm -pvv -j <resource>
    cluster 3.2:     clresource show -v <resource>
Disable resource monitor
    cluster 3.0/3.1: scswitch -n -M -j <resource>
    cluster 3.2:     clresource unmonitor <resource>
Enable resource monitor
    cluster 3.0/3.1: scswitch -e -M -j <resource>
    cluster 3.2:     clresource monitor <resource>
Disabling
    cluster 3.0/3.1: scswitch -n -j <resource>
    cluster 3.2:     clresource disable <resource>
Enabling
    cluster 3.0/3.1: scswitch -e -j <resource>
    cluster 3.2:     clresource enable <resource>
Clearing a failed (STOP_FAILED) resource
    cluster 3.0/3.1: scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED
    cluster 3.2:     clresource clear -f STOP_FAILED <resource>
Find the network of a resource
    cluster 3.0/3.1: scrgadm -pvv -j <resource> | grep -I network
Removing a resource and its resource group
    cluster 3.0/3.1: ## offline the group
                     scswitch -F -g <res_group>
                     ## remove the resource
                     scrgadm -r -j <resource>
                     ## remove the resource group
                     scrgadm -r -g <res_group>
    cluster 3.2:     ## remove the resource
                     clresource delete <resource>
                     ## remove the resource group
                     clresourcegroup delete <res_group>
Resource Types

                      cluster 3.0/3.1                          cluster 3.2
Adding/Registering    scrgadm -a -t <resource type>            clrt register <resource type>
                      e.g. SUNW.HAStoragePlus
Deleting              scrgadm -r -t <resource type>            clrt unregister <resource type>
Listing               scrgadm -pv | grep "Res Type name"       clrt list
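A short registration sketch, assuming the SUNW.HAStoragePlus packages are already installed on all nodes:

    # 3.1: register the HAStoragePlus resource type, then confirm the cluster knows about it
    scrgadm -a -t SUNW.HAStoragePlus
    scrgadm -pv | grep "Res Type name"

    # 3.2 equivalent
    clrt register SUNW.HAStoragePlus
    clrt list -v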
Sun Cluster 3.1 Cheat Sheet
Daemons
clexecd       This is used by cluster kernel threads to execute userland commands (such as the
              run_reserve and dofsck commands). It is also used to run cluster commands remotely
              (like the cluster shutdown command). This daemon registers with failfastd so that a
              failfast device driver will panic the kernel if this daemon is killed and not
              restarted in 30 seconds.
cl_ccrad      This daemon provides access from userland management applications to the CCR. It is
              automatically restarted if it is stopped.
cl_eventd     The cluster event daemon registers and forwards cluster events (such as nodes
              entering and leaving the cluster). There is also a protocol whereby user
              applications can register themselves to receive cluster events. The daemon is
              automatically respawned if it is killed.
cl_eventlogd  The cluster event log daemon logs cluster events into a binary log file. At the time
              of writing for this course, there is no published interface to this log. It is
              automatically restarted if it is stopped.
failfastd     This daemon is the failfast proxy server. The failfast daemon allows the kernel to
              panic if certain essential daemons have failed.
rgmd          The resource group management daemon, which manages the state of all cluster-unaware
              applications. A failfast driver panics the kernel if this daemon is killed and not
              restarted in 30 seconds.
rpc.fed       This is the fork-and-exec daemon, which handles requests from rgmd to spawn methods
              for specific data services. A failfast driver panics the kernel if this daemon is
              killed and not restarted in 30 seconds.
rpc.pmfd      This is the process monitoring facility. It is used as a general mechanism to
              initiate restarts and failure action scripts for some cluster framework daemons (in
              Solaris 9 OS), and for most application daemons and application fault monitors (in
              Solaris 9 and 10 OS). A failfast driver panics the kernel if this daemon is stopped
              and not restarted in 30 seconds.
pnmd          Public management network service daemon manages network status information received
              from the local IPMP daemon running on each node and facilitates application
              failovers caused by complete public network failures on nodes. It is automatically
              restarted if it is stopped.
scdpmd        Disk path monitoring daemon monitors the status of disk paths, so that they can be
              reported in the output of the cldev status command. It is automatically restarted if
              it is stopped.
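A quick way to confirm these daemons are present on a node (a sketch only; the exact process list varies by release and update):

    # List the main cluster framework daemons running on this node
    ps -ef | egrep 'clexecd|cl_ccrad|cl_eventd|rgmd|rpc.fed|rpc.pmfd|pnmd|scdpmd' | grep -v grep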
File locations
man pages                    /usr/cluster/man
log files                    /var/cluster/logs
                             /var/adm/messages
sccheck logs                 /var/cluster/sccheck/report.<date>
CCR files                    /etc/cluster/ccr
Cluster infrastructure file  /etc/cluster/ccr/infrastructure
SCSI Reservations
Display reservation keys     scsi2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
                             scsi3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2
Determine the device owner   scsi2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
                             scsi3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2
Cluster information
Quorum info                        scstat -q
Cluster components                 scstat -pv
Resource/Resource group status     scstat -g
IP Networking Multipathing         scstat -i
Status of all nodes                scstat -n
Disk device groups                 scstat -D
Transport info                     scstat -W
Detailed resource/resource group   scrgadm -pv
Cluster configuration info         scconf -p
Cluster Configuration
Integrity check                                 sccheck
Configure the cluster (add nodes, add data
  services, etc)                                scinstall
Cluster configuration utility (quorum, data
  services, resource groups, etc)               scsetup
Add a node                                      scconf -a -T node=<host>
Remove a node                                   scconf -r -T node=<host>
Prevent new nodes from entering                 scconf -a -T node=.
Put a node into maintenance state               scconf -c -q node=<node>,maintstate
    Note: use the scstat -q command to verify that the node is in maintenance mode;
    the vote count should be zero for that node.
Get a node out of maintenance state             scconf -c -q node=<node>,reset
    Note: use the scstat -q command to verify that the node is out of maintenance mode;
    the vote count should be one for that node.
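A minimal maintenance-window sketch for the two rows above, assuming a hypothetical node name node2:

    # Put node2's quorum vote into maintenance state and confirm its vote count drops to zero
    scconf -c -q node=node2,maintstate
    scstat -q

    # After the work is done, restore the vote and confirm it is back to one
    scconf -c -q node=node2,reset
    scstat -q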
Admin Quorum Device

Quorum devices are nodes and disk devices, so the total quorum will be all nodes and devices added
together. You can use the scsetup GUI interface to add/remove quorum devices or use the below
commands.

Adding a device to the quorum         scconf -a -q globaldev=d11
                                      Note: if you get the error message "unable to scrub device",
                                      use scgdevs to add the device to the global device namespace.
Removing a device from the quorum     scconf -r -q globaldev=d11
Removing the last quorum device       ## Evacuate all nodes
                                      ## Put cluster into maint mode
                                      scconf -c -q installmode
                                      ## Remove the quorum device
                                      scconf -r -q globaldev=d11
                                      ## Check the quorum devices
                                      scstat -q
Resetting quorum info                 scconf -c -q reset
                                      Note: this will bring all offline quorum devices online
Bring a quorum device into            ## obtain the device number
  maintenance mode                    scdidadm -L
                                      scconf -c -q globaldev=<device>,maintstate
Bring a quorum device out of          scconf -c -q globaldev=<device>,reset
  maintenance mode
Device Configuration
Lists all the configured devices including
  paths across all nodes                        scdidadm -L
List all the configured devices including
  paths on node only                            scdidadm -l
Reconfigure the device database, creating
  new instances numbers if required             scdidadm -r
Perform the repair procedure for a particular
  path (use this when a disk gets replaced)     scdidadm -R <c0t0d0s0> - device
                                                scdidadm -R 2 - device id
Configure the global device namespace           scgdevs
Status of all disk paths                        scdpm -p all:all
                                                Note: (<host>:<disk>)
Monitor device path                             scdpm -m <node:disk path>
Unmonitor device path                           scdpm -u <node:disk path>
Disk group

Adding/Registering             scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true
Removing                       scconf -r -D name=<disk group>
adding single node             scconf -a -D type=vxvm,name=appdg,nodelist=<host>
Removing single node           scconf -r -D name=<disk group>,nodelist=<host>
Switch                         scswitch -z -D <disk group> -h <host>
Put into maintenance mode      scswitch -m -D <disk group>
take out of maintenance mode   scswitch -z -D <disk group> -h <host>
onlining a disk group          scswitch -z -D <disk group> -h <host>
offlining a disk group         scswitch -F -D <disk group>
Resync a disk group            scconf -c -D name=<disk group>,sync
Resource Groups

Adding                      scrgadm -a -g <res_group> -h <host>,<host>
Removing                    scrgadm -r -g <res_group>
changing properties         scrgadm -c -g <res_group> -y <property=value>
Listing                     scstat -g
Detailed List               scrgadm -pv -g <res_group>
Display mode type
  (failover or scalable)    scrgadm -pv -g <res_group> | grep "Res Group mode"
Offlining                   scswitch -F -g <res_group>
Onlining                    scswitch -Z -g <res_group>
Unmanaging                  scswitch -u -g <res_group>
                            Note: all resources in the group must be disabled
Managing                    scswitch -o -g <res_group>
Switching                   scswitch -z -g <res_group> -h <host>
Resources
Adding failover network resource        scrgadm -a -L -g <res_group> -l <logicalhost>
Adding shared network resource          scrgadm -a -S -g <res_group> -l <logicalhost>
adding a failover apache application
  and attaching the network resource    scrgadm -a -j apache_res -g <res_group> \
                                          -t SUNW.apache -y Network_resources_used=<logicalhost> \
                                          -y Scalable=False -y Port_list=80/tcp \
                                          -x Bin_dir=/usr/apache/bin
adding a shared apache application
  and attaching the network resource    scrgadm -a -j apache_res -g <res_group> \
                                          -t SUNW.apache -y Network_resources_used=<logicalhost> \
                                          -y Scalable=True -y Port_list=80/tcp \
                                          -x Bin_dir=/usr/apache/bin
Creating a HAStoragePlus resource       scrgadm -a -g rg_oracle -j hasp_data01 -t SUNW.HAStoragePlus \
                                          -x FilesystemMountPoints=/oracle/data01 -x AffinityOn=true
Removing                                scrgadm -r -j res-ip
                                        Note: must disable the resource first
changing properties                     scrgadm -c -j <resource> -y <property=value>
List                                    scstat -g
Detailed List                           scrgadm -pv -j res-ip
                                        scrgadm -pvv -j res-ip
Disable resource monitor                scswitch -n -M -j res-ip
Enable resource monitor                 scswitch -e -M -j res-ip
Disabling                               scswitch -n -j res-ip
Enabling                                scswitch -e -j res-ip
Clearing a failed resource              scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED
Find the network of a resource          scrgadm -pvv -j <resource> | grep -I network
Removing a resource and its
  resource group                        ## offline the group
                                        scswitch -F -g rgroup-1
                                        ## remove the resource
                                        scrgadm -r -j res-ip
                                        ## remove the resource group
                                        scrgadm -r -g rgroup-1
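A minimal end-to-end sketch that ties the pieces above together, assuming hypothetical names web-rg and web-lh (a hostname resolvable on all nodes) and hypothetical nodes node1/node2:

    # Register the apache resource type if it is not already registered
    scrgadm -a -t SUNW.apache

    # Create the failover resource group and its logical hostname resource
    scrgadm -a -g web-rg -h node1,node2
    scrgadm -a -L -g web-rg -l web-lh

    # Add the apache resource and attach it to the network resource
    scrgadm -a -j web-apache-res -g web-rg \
      -t SUNW.apache -y Network_resources_used=web-lh \
      -y Scalable=False -y Port_list=80/tcp \
      -x Bin_dir=/usr/apache/bin

    # Bring everything online and check the state
    scswitch -Z -g web-rg
    scstat -g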
Resource Types

Adding        scrgadm -a -t <resource type>         e.g. SUNW.HAStoragePlus
Deleting      scrgadm -r -t <resource type>
Listing       scrgadm -pv | grep "Res Type name"

scrgadm ----- Manage registration and unregistration of resource types, resources, and resource groups -----

Action         Object                        Options
-a (add)       -j (resource)                 -p[vv] (print)  v = state of resources
-c (change)    -t (resource type)                            vv = parameter value
-r (remove)    -g (resource group)
-p (print)     -L (for LogicalHostnames)     -g <group>, -l <hostname list>, -n netiflist
               -S (for Shared Addresses)     -g <group>, -l <hostname list>, -n netiflist,
                                             -X auxnodelist
Examples
Register a resource type named MyDatabase:
# scrgadm -a -t MyDatabase
Create a failover resource group named MyDatabaseRG:
# scrgadm -a -g MyDatabaseRG
Create a scalable resource group named MyWebServerRG:
# scrgadm -a -g MyWebServerRG \
  -y Maximum_primaries=integer \
  -y Desired_primaries=integer
Create a resource of a given type in a resource group:
# scrgadm -a -j resource-name -t resource-type-name -g RG-name

See the rg_properties(5) man page for a description of the resource group properties.
See the r_properties(5) man page for a description of the resource properties.

scconf ----- Update the Sun Cluster software configuration -----

Action        Object                   What
-a (add)      -C (cluster option)      cluster=clustername
-c (change)   -A (transport adapter)   add: trtype=type,name=name,node=node[,other-options]
-r (remove)                            change: name=name,node=node[,state=state][,other-options]
                                       remove: name=name,node=node
              -B (transport junction)  add: type=type,name=name[,other-options]
                                       change: name=name[,state=state][,other-options]
                                       remove: name=name
              -m (transport cable)     add: endpoint=[node:]name[@port],endpoint=[node:]name[@port][,noenable]
                                       change: endpoint=[node:]name[@port],state=state
                                       remove: endpoint=[node:]name[@port]
              -P (private hostname)    add: node=node[,privatehostname=hostalias]
              -q (quorum device)       add: globaldev=devicename[,node=node,node=node[,...]]
                                       change: node=node,{maintstate | reset}
                                       change: globaldev=devicename,{maintstate | reset}
                                       change: reset
                                       change: installmode
                                       remove: globaldev=devicename
              -D (devicegroup)         add: type=type,name=name[,nodelist=node[:node]...]
                                            [,preferenced={true | false}][,failback={enabled | disabled}][,other-options]
                                       change: name=name[,nodelist=node[:node]...]
                                            [,preferenced={true | false}][,failback={enabled | disabled}][,other-options]
                                       remove: name=name[,nodelist=node[:node]...]
              -T (authentication)      add: node=nodename[,...][,authtype=authtype]
                                       change: authtype=authtype
                                       remove: {node=nodename[,...] | all}
              -h (nodes)
Examples
Register a new disk group:
# scconf -a -D type=vxvm,name=new-disk-group,nodelist=nodex:nodex
Synchronize device group information after adding a volume:
# scconf -c -D name=diskgroup,sync
Add a shared quorum device to the cluster:
# scconf -a -q globaldev=devicename
Clear "installmode":
# scconf -c -q reset
Configure a second set of cluster transport connections:
# scconf -a \
  -A trtype=transport,name=ifname1,node=nodename1 \
  -A trtype=transport,name=ifname2,node=nodename2 \
  -m endpoint=nodename1:ifname1,endpoint=nodename2:ifname2
Secure the cluster against other machines that might attempt to add themselves to the cluster:
# scconf -a -T node=.

scswitch ----- Perform ownership/state change of resource groups and disk device groups in Sun Cluster configurations -----

Action                       Object                 Who / Special
-z (bring online)            -g (resource group)    -h (target host)
                             -D (device group)      -h "" (no receiver) takes the resource group offline
                                                    no -g brings all resource groups online
-S (evacuate a node)                                -h (losing host)
                                                    -K # specifies the number of seconds to keep
                                                    resource groups from switching back onto a node
                                                    after that node has been successfully evacuated.
                                                    Default is 60 seconds, and can be set up to 65535
                                                    seconds. Starting with Sun Cluster 3.1 Update 3.
-R (restart all RG)          -g (resource group)    -h (target host)
-m (set maintenance mode)    -D (device group)
-u (unmanage RG)             -g (resource group)
-o (online RG)               -g (resource group)
-n (disable) / -e (enable)   -j (resource)          -M disables/enables monitoring only
-c (clear error flag)        -j (resource)          -f (flag name), -h (target host)
Examples
Switch over resource-grp-2 to be mastered by node1:
# scswitch -z -h node1 -g resource-grp-2
Switch over resource-grp-3, a resource group configured to have multiple primaries, to be mastered
by node1, node2, node3:
# scswitch -z -h node1,node2,node3 -g resource-grp-3
Switch all managed resource groups online on their most preferred node or nodes:
# scswitch -z
Quiesce resource-grp-2. Stops the RG from continuously bouncing around from one node to another in
the event of the failure of a START or STOP method:
# scswitch -Q -g resource-group-2
Switch over all resource groups and disk device groups from node1 to a new set of primaries:
# scswitch -S -h node1
Restart some resource groups on specified nodes:
node1# scswitch -R -h node1,node2 -g resource-grp-1,resource-grp-2
Disable some resources:
# scswitch -n -j resource-1,resource-2
Enable a resource:
# scswitch -e -j resource-1
Take resource groups to the unmanaged state:
# scswitch -u -g resource-grp-1,resource-grp-2
Take resource groups to the managed state:
# scswitch -o -g resource-grp-1,resource-grp-2
Switch over device-group-1 to be mastered by node2:
# scswitch -z -h node2 -D device-group-1
Put device-group-1 into maintenance mode:
# scswitch -m -D device-group-1
Move all resource groups and disk device groups persistently off of a node:
# scswitch -S -h iloveuamaya -K 120
This situation arises when resource groups attempt to switch back automatically when strong
negative affinities have been configured (with RG_affinities).

scstat ----- Monitor the status of Sun Cluster -----

Options
-D        shows status for all disk device groups
-g        shows status for all resource groups
-h host   shows status of all components related to the specified host
-i        shows status for all IPMP groups and public network adapters
-n        shows status for all nodes
-p        shows status for all components in the cluster
-q        shows status for all device quorums and node quorums
-W        shows status for the cluster transport path
Examples
Show status of all resource groups followed by the status of all components related to node1:
# scstat -g -h node1
or:
# scstat -g
and
# scstat -h node1

scdpm (available starting in Sun Cluster 3.1 update 3)
----- Disk-path monitoring administration command -----

Action                                                          What
-m (monitor the new disk path that is specified by              [node] [node:all] [node:/dev/did/rdsk/dN]
   node:disk path. All is the default option)
-u (unmonitor a disk path. The daemon on each node stops        [node] [node:all] [node:/dev/did/rdsk/dN]
   monitoring the specified path. All is the default option)
-p [-F] (print the current status of a specified disk path      [node] [node:all] [node:/dev/did/rdsk/dN]
   from all the nodes that are attached to the storage.
   All is the default option. The -F option prints only
   faulty disk paths)
-f filename (read the list of disk paths to monitor or
   unmonitor from the specified file name)
Examples
Force daemon to monitor all disk paths in the cluster infrastructure:
# scdpm -m all
Monitor a new path on all nodes where the path is valid:
# scdpm -m /dev/did/dsk/d3
Monitor new paths on just node1:
# scdpm -m node1:d4 -m node1:d5
Print all disk paths in the cluster and their status:
# scdpm -p all:all
Print all failed disk paths:
# scdpm -p -F all
Print the status of all disk paths from node1:
# scdpm -p node1:all
Also, all the commands in version 3.1 are available in version 3.2.

Daemons and Processes

At the bottom of the installation guide I listed the daemons and processes running after a fresh
install; now is the time to explain what these processes do. I have managed to obtain information
on most of them but am still looking for others.
Versions 3.1 and 3.2

clexecd       This is used by cluster kernel threads to execute userland commands (such as the
              run_reserve and dofsck commands). It is also used to run cluster commands remotely
              (like the cluster shutdown command). This daemon registers with failfastd so that a
              failfast device driver will panic the kernel if this daemon is killed and not
              restarted in 30 seconds.
cl_ccrad      This daemon provides access from userland management applications to the CCR. It is
              automatically restarted if it is stopped.
cl_eventd     The cluster event daemon registers and forwards cluster events (such as nodes
              entering and leaving the cluster). There is also a protocol whereby user
              applications can register themselves to receive cluster events. The daemon is
              automatically respawned if it is killed.
cl_eventlogd  The cluster event log daemon logs cluster events into a binary log file. At the time
              of writing, there is no published interface to this log. It is automatically
              restarted if it is stopped.
failfastd     This daemon is the failfast proxy server. The failfast daemon allows the kernel to
              panic if certain essential daemons have failed.
rgmd          The resource group management daemon, which manages the state of all cluster-unaware
              applications. A failfast driver panics the kernel if this daemon is killed and not
              restarted in 30 seconds.
rpc.fed       This is the fork-and-exec daemon, which handles requests from rgmd to spawn methods
              for specific data services. A failfast driver panics the kernel if this daemon is
              killed and not restarted in 30 seconds.
rpc.pmfd      This is the process monitoring facility. It is used as a general mechanism to
              initiate restarts and failure action scripts for some cluster framework daemons (in
              Solaris 9 OS), and for most application daemons and application fault monitors (in
              Solaris 9 and 10 OS). A failfast driver panics the kernel if this daemon is stopped
              and not restarted in 30 seconds.
pnmd          Public management network service daemon manages network status information received
              from the local IPMP daemon running on each node and facilitates application
              failovers caused by complete public network failures on nodes. It is automatically
              restarted if it is stopped.
scdpmd        Multi-threaded DPM (disk path monitoring) daemon that runs on each node. It is
              automatically started by an rc script when a node boots and monitors the
              availability of the logical paths that are visible through the various multipath
              drivers (MPxIO, HDLM, PowerPath, etc). The status of disk paths is reported in the
              output of the cldev status command. Automatically restarted by rpc.pmfd if it dies.

Version 3.2 only

qd_userd                This daemon serves as a proxy whenever any quorum device activity requires
                        execution of some command in userland (i.e. a NAS quorum device).
cl_execd
ifconfig_proxy_serverd
rtreg_proxy_serverd
cl_pnmd                 A daemon for the public network management (PNM) module. It is started at
                        boot time and starts the PNM service. It keeps track of the local host's
                        IPMP state and facilitates inter-node failover for all IPMP groups.
scprivipd               This daemon provisions IP addresses on the clprivnet0 interface, on behalf
                        of zones.
sc_zonesd               This daemon monitors the state of Solaris 10 non-global zones so that
                        applications designed to fail over to zones can react appropriately to
                        zone booting failure.
cznetd                  Used for reconfiguring and plumbing network interfaces for zones when they
                        are created; also see the cznetd.xml file.
rpc.fed                 This is the "fork and exec" daemon which handles requests from rgmd to
                        spawn methods for specific data services. Failfast will hose the box if
                        this is killed and not restarted in 30 seconds.
scqdmd                  The quorum server daemon; this possibly used to be called "scqsd".
pnm_mod_serverd
File locations
Both Versions (3.1 and 3.2)

man pages                                  /usr/cluster/man
log files                                  /var/cluster/logs
                                           /var/adm/messages
Configuration files (CCR, eventlog, etc)   /etc/cluster
Cluster and other commands                 /usr/cluster/bin

Version 3.1 only

sccheck logs                               /var/cluster/sccheck/report.<date>
Cluster infrastructure file                /etc/cluster/ccr/infrastructure

Version 3.2 only

sccheck logs                               /var/cluster/logs/cluster_check
Cluster infrastructure file                /etc/cluster/ccr/infrastructure
Command Log                                /var/cluster/logs/commandlog
SCSI Reservations
Display reservation keys     scsi2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
                             scsi3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2
Determine the device owner   scsi2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
                             scsi3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2
Command shortcuts

In version 3.2 there are a number of shortcut command names, which I have detailed below. I have
left the full command name in the rest of the document so it is obvious what we are performing.
All the commands are located in /usr/cluster/bin.

Full command            Shortcut
cldevice                cldev
cldevicegroup           cldg
clinterconnect          clintr
clnasdevice             clnas
clquorum                clq
clresource              clrs
clresourcegroup         clrg
clreslogicalhostname    clrslh
clresourcetype          clrt
clressharedaddress      clrssa
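The long and short forms are interchangeable; for example (using a hypothetical resource group name app-rg):

    # These two commands are equivalent
    clresourcegroup status app-rg
    clrg status app-rg

    # As are these
    clresource list -v
    clrs list -v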
Shutting down the entire cluster

3.1    ## other nodes in the cluster
       scswitch -S -h <host>
       shutdown -i5 -g0 -y

       ## last remaining node
       scshutdown -g0 -y

3.2    cluster shutdown -g0 -y
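A related sketch for taking just one node down cleanly, assuming a hypothetical node name node2 and that you want the node powered off (-i5); adjust the shutdown options to your site:

    # 3.1: move everything off node2, then shut only that node down
    scswitch -S -h node2
    shutdown -i5 -g0 -y

    # 3.2 equivalent
    clnode evacuate node2
    shutdown -i5 -g0 -y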
Cluster information
                        3.1              3.2
Cluster                 scstat -pv       cluster list -v
                                         cluster show
                                         cluster status
Nodes                   scstat -n        clnode list -v
                                         clnode show
                                         clnode status
Devices                 scstat -D        cldevice list
                                         cldevice show
                                         cldevice status
Quorum                  scstat -q        clquorum list -v
                                         clquorum show
                                         clquorum status
Transport info          scstat -W        clinterconnect show
                                         clinterconnect status
Resources /
Resource Groups         scstat -g        clresource list -v
                                         clresource show
                                         clresource status
                                         clresourcegroup list -v
                                         clresourcegroup show
                                         clresourcegroup status
Resource Types          scrgadm -pv      clresourcetype list -v
                                         clresourcetype list-props -v
                                         clresourcetype show
IP Networking
Multipathing            scstat -i        clnode status -m
Installation info       scinstall -pv    clnode show-rev -v
Cluster Configuration
                                         3.1                                   3.2
Release                                  cat /etc/cluster/release              cat /etc/cluster/release
Integrity check                          sccheck                               sccheck
Configure the cluster (add nodes,
  add data services, etc)                scinstall                             scinstall
Cluster configuration utility (quorum,
  data services, resource groups, etc)   scsetup                               clsetup
List                                     scconf -p                             cluster list
Detailed List                                                                  cluster show
Status                                                                         cluster status
Reset the cluster private network
  settings                                                                     cluster restore-netprops
Place the cluster into install mode      scconf -c -q installmode
Add a node                               scconf -a -T node=<host>              claccess allow -h <host>
Remove a node                            scconf -r -T node=<host>              claccess deny -h <host>
Prevent new nodes from entering          scconf -a -T node=.                   claccess deny-all
Put a node into maintenance state        scconf -c -q node=<node>,maintstate
    Note: use the scstat -q command to verify that the node is in maintenance mode;
    the vote count should be zero for that node.
Get a node out of maintenance state      scconf -c -q node=<node>,reset
    Note: use the scstat -q command to verify that the node is out of maintenance mode;
    the vote count should be one for that node.
Node Configuration

Add a node to the cluster
Remove a node from the cluster
    3.2: ## Make sure you are on the node you want to remove
         clnode remove
Evacuate a node from the cluster
    3.1: scswitch -S -h <node>
    3.2: clnode evacuate <node>
Cleanup the cluster configuration (used after removing nodes)
    3.2: clnode clear <node>
List nodes
    3.2: ## Standard list
         clnode list [+ | <node>]
         ## Detailed list
         clnode show [+ | <node>]
Admin Quorum Device

Quorum devices are nodes and disk devices, so the total quorum will be all nodes and devices added
together. You can use the scsetup (3.1) / clsetup (3.2) interface to add/remove quorum devices or
use the below commands.

Adding a SCSI device to the quorum
    3.1: scconf -a -q globaldev=d11
         Note: if you get the error message "unable to scrub device", use scgdevs to add the
         device to the global device namespace.
    3.2: clquorum add d11
Adding a NAS device to the quorum
    3.1: n/a
    3.2: clquorum add -t netapp_nas ...
Adding a Quorum Server
    3.1: n/a
    3.2: clquorum add -t quorum_server ...
Removing a device from the quorum
    3.1: scconf -r -q globaldev=d11
    3.2: clquorum remove d11
Removing the last quorum device
    3.1: ## Evacuate all nodes
         ## Put cluster into maint mode
         scconf -c -q installmode
         ## Remove the quorum device
         scconf -r -q globaldev=d11
         ## Check the quorum devices
         scstat -q
List
    3.1: scstat -q
    3.2: clquorum list -v
         clquorum show
         clquorum status
Bring a quorum device into maintenance mode (3.2: known as disabled)
    3.1: ## obtain the device number
         scdidadm -L
         scconf -c -q globaldev=<device>,maintstate
    3.2: clquorum disable <device>
Bring a quorum device out of maintenance mode (3.2: known as enabled)
    3.1: scconf -c -q globaldev=<device>,reset
    3.2: clquorum enable <device>
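A short 3.2-style check-and-add sketch, assuming a hypothetical shared disk d11:

    # See what currently provides quorum votes
    clquorum status

    # Add the shared disk d11 as a quorum device and confirm it
    clquorum add d11
    clquorum show d11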
Device Configuration
                                                 3.1                               3.2
Lists all the configured devices including
  paths across all nodes                         scdidadm -L                       cldevice list -v
List all the configured devices including
  paths on node only                             scdidadm -l                       cldevice list -n <node>
Reconfigure the device database, creating new
  instances numbers if required                  scdidadm -r                       cldevice refresh
Perform the repair procedure for a particular
  path (use this when a disk gets replaced)      scdidadm -R <c0t0d0s0> - device   cldevice repair
                                                 scdidadm -R 2 - device id
Check device                                                                       cldevice check
Remove all devices from node                                                       cldevice clear
Monitoring                                       scdpm -m <node:disk path>         cldevice monitor
Rename                                                                             cldevice rename
Status                                           scdpm -p all:all                  cldevice status
Device group

Create a device group
    3.1: n/a
    3.2: cldg create -t <type> -n <host>,<host> <devicegroup>
Remove a device group
    3.1: n/a
    3.2: cldg delete <devicegroup>
Adding/Registering
    3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true
Removing
    3.1: scconf -r -D name=<disk group>
Set a property
    3.2: cldg set -p <name>=<value> <devicegroup>
List
    3.1: scstat -D
    3.2: cldg list -v
Status
    3.1: scstat -D
    3.2: cldg status
adding single node
    3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>
    3.2: cldg add-node -n <host> <devicegroup>
Removing single node
    3.1: scconf -r -D name=<disk group>,nodelist=<host>
    3.2: cldg remove-node -n <host> <devicegroup>
Switch
    3.1: scswitch -z -D <disk group> -h <host>
    3.2: cldg switch -n <host> <devicegroup>
Put into maintenance mode
    3.1: scswitch -m -D <disk group>
    3.2: n/a
take out of maintenance mode
    3.1: scswitch -z -D <disk group> -h <host>
    3.2: n/a
onlining a disk group
    3.1: scswitch -z -D <disk group> -h <host>
    3.2: cldg online <devicegroup>
offlining a disk group
    3.1: scswitch -F -D <disk group>
    3.2: cldg offline <devicegroup>
Resync a disk group
    3.1: scconf -c -D name=<disk group>,sync
    3.2: cldg sync <devicegroup>

Transport / Interconnect

Standard and detailed list
    3.1: scstat -W
    3.2: clinterconnect show [-n <node>]
         clinterconnect status [-n <node>]
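A minimal lifecycle sketch for the device group commands above, assuming a hypothetical VxVM disk group appdg already registered and hypothetical nodes node1/node2:

    # 3.2: check the device group, switch it to node2, then resync after a volume change
    cldg status appdg
    cldg switch -n node2 appdg
    cldg sync appdg

    # 3.1 equivalents
    scstat -D
    scswitch -z -D appdg -h node2
    scconf -c -D name=appdg,sync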
Resource Groups

Adding (failover)
    3.1: scrgadm -a -g <res_group> -h <host>,<host>
    3.2: clrg create -n <host>,<host> <res_group>
Adding (scalable)
    3.1: scrgadm -a -g <res_group> -y Maximum_primaries=<n> -y Desired_primaries=<n>
    3.2: clrg create -n <host>,<host> -p Maximum_primaries=<n> -p Desired_primaries=<n> <res_group>
Adding a node to a resource group
    3.2: clrg add-node -n <node> <res_group>
Removing
    3.1: scrgadm -r -g <res_group>
    3.2: clrg delete <res_group>
Removing a node from a resource group
    3.2: clrg remove-node -n <node> <res_group>
changing properties
    3.1: scrgadm -c -g <res_group> -y <property=value>
    3.2: clrg set -p <name>=<value> <res_group>
Status
    3.1: scstat -g
    3.2: clrg status
Listing
    3.1: scstat -g
    3.2: clrg list
Detailed List
    3.1: scrgadm -pv -g <res_group>
    3.2: clrg show -v <res_group>
Display mode type (failover or scalable)
    3.1: scrgadm -pv -g <res_group> | grep "Res Group mode"
    3.2: clrg show -v <res_group>
Offlining
    3.1: scswitch -F -g <res_group>
    3.2: clrg offline <res_group>
Onlining
    3.1: scswitch -Z -g <res_group>
    3.2: clrg online <res_group>
Evacuate all resource groups from a node (used when shutting down a node)
    3.1: scswitch -S -h <node>
    3.2: clrg evacuate -n <node> +
Unmanaging (all resources in the group must be disabled)
    3.1: scswitch -u -g <res_group>
    3.2: clrg unmanage <res_group>
Managing
    3.1: scswitch -o -g <res_group>
    3.2: clrg manage <res_group>
Switching
    3.1: scswitch -z -g <res_group> -h <host>
    3.2: clrg switch -n <host> <res_group>
Remaster (move the resource group/s to their preferred node)
    3.1: n/a
    3.2: clrg remaster <res_group>
Restart a resource group (bring offline then online)
    3.1: n/a
    3.2: clrg restart <res_group>
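A maintenance-flow sketch using the 3.2-only commands above, assuming a hypothetical node name node1 and that the "+" operand selects all resource groups:

    # Before patching node1, push its resource groups elsewhere and check the result
    clrg evacuate -n node1 +
    clrg status

    # When node1 is back, move the groups onto their preferred nodes again
    clrg remaster +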
Resources
Adding failover network resource
    3.1: scrgadm -a -L -g <res_group> -l <logicalhost>
    3.2: clreslogicalhostname create -g <res_group> <logicalhost>
Adding shared network resource
    3.1: scrgadm -a -S -g <res_group> -l <logicalhost>
    3.2: clressharedaddress create -g <res_group> <logicalhost>
adding a failover apache application and attaching the network resource
    3.1: scrgadm -a -j apache_res -g <res_group> \
           -t SUNW.apache -y Network_resources_used=<logicalhost> \
           -y Scalable=False -y Port_list=80/tcp \
           -x Bin_dir=/usr/apache/bin
adding a shared apache application and attaching the network resource
    3.1: scrgadm -a -j apache_res -g <res_group> \
           -t SUNW.apache -y Network_resources_used=<logicalhost> \
           -y Scalable=True -y Port_list=80/tcp \
           -x Bin_dir=/usr/apache/bin
Creating a HAStoragePlus failover resource
    3.1: scrgadm -a -g rg_oracle -j hasp_data01 -t SUNW.HAStoragePlus \
           -x FilesystemMountPoints=/oracle/data01 -x AffinityOn=true
    3.2: clresource create -g rg_oracle -t SUNW.HAStoragePlus \
           -p FilesystemMountPoints=/oracle/data01 -p AffinityOn=true hasp_data01
Removing (must disable the resource first)
    3.1: scrgadm -r -j res-ip
    3.2: clresource delete res-ip
changing properties
    3.1: scrgadm -c -j <resource> -y <property=value>
    3.2: clresource set -p <property>=<value> <resource>
List
    3.1: scstat -g
         scrgadm -pv -j res-ip
    3.2: clresource list
Detailed List
    3.1: scrgadm -pvv -j res-ip
    3.2: clresource show -v <resource>
Status
    3.1: scstat -g
    3.2: clresource status
Disable resource monitor
    3.1: scswitch -n -M -j res-ip
    3.2: clresource unmonitor res-ip
Enable resource monitor
    3.1: scswitch -e -M -j res-ip
    3.2: clresource monitor res-ip
Disabling
    3.1: scswitch -n -j res-ip
    3.2: clresource disable res-ip
Enabling
    3.1: scswitch -e -j res-ip
    3.2: clresource enable res-ip
Clearing a failed resource
    3.1: scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED
    3.2: clresource clear -f STOP_FAILED <resource>
Find the network of a resource
    3.1: scrgadm -pvv -j <resource> | grep -I network
Removing a resource and its resource group
    3.1: ## offline the group
         scswitch -F -g rgroup-1
         ## remove the resource
         scrgadm -r -j res-ip
         ## remove the resource group
         scrgadm -r -g rgroup-1
    3.2: ## remove the resource
         clresource delete res-ip
         ## remove the resource group
         clresourcegroup delete rgroup-1
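A minimal 3.2 counterpart of the failover-service build-out, assuming hypothetical names ora-rg, ora-lh (resolvable on all nodes) and hasp-ora, and that SUNW.HAStoragePlus is already registered:

    # Create the group, its logical hostname and a HAStoragePlus resource, then bring it online
    clrg create -n node1,node2 ora-rg
    clreslogicalhostname create -g ora-rg ora-lh
    clresource create -g ora-rg -t SUNW.HAStoragePlus \
      -p FilesystemMountPoints=/oracle/data01 -p AffinityOn=true hasp-ora
    clrg manage ora-rg
    clrg online ora-rg
    clresource status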
Resource Types

Adding (register in 3.2)
    3.1: scrgadm -a -t <resource type>     e.g. SUNW.HAStoragePlus
    3.2: clresourcetype register <resource type>
Register a resource type to a node
    3.1: n/a
    3.2: clresourcetype add-node -n <node> <resource type>
Deleting (remove in 3.2)
    3.1: scrgadm -r -t <resource type>
    3.2: clresourcetype unregister <resource type>
Deregistering a resource type from a node
    3.1: n/a
    3.2: clresourcetype remove-node -n <node> <resource type>
Listing
    3.1: scrgadm -pv | grep "Res Type name"
    3.2: clresourcetype list [-v]
Listing resource type properties
    3.2: clresourcetype list-props -v
Show resource types
    3.2: clresourcetype show
Set properties of a resource type
    3.2: clresourcetype set -p <name>=<value> <resource type>
How to Stop Monitoring a Heavily Used Resource in a Sun Cluster 3.x Environment

Karan Grover, December 2007

This tech tip provides information about how to stop monitoring a heavily used resource in a Sun
Cluster 3.x resource group when you see a high load on that particular resource. This scenario can
be seen, for instance, when batch jobs are running and pushing database applications to the
maximum extent, which causes the databases to "go numb" and the monitors to initiate a failover.

In order to stop the resource group from failing over to other nodes and causing a loss in
connectivity, you can stop the monitoring on the resource for the duration of the high load by
using the following procedure. If the high load occurs predictably at known times, you could even
automate the process by using cron (a sketch follows at the end of this tip).

The commands to disable the monitoring for a resource in a particular resource group are as
follows:

1. Run the following command to stop monitoring the resource:

   scswitch -M -n -j <resource-name>

   Note: You need solaris.cluster.resource.admin RBAC authorization to use the scswitch command
   with the -e or -n option. See the scswitch(1M) man page for more information.

   For example, if the listener of a particular Oracle database instance is initiating a failover,
   you can stop the monitoring on that resource as follows, where <oralisten-serv-res> is the name
   of the resource you are trying to disable:

   scswitch -M -n -j <oralisten-serv-res>

   Note: I tried this on Oracle 9.2, but it should work with Oracle 9i or any version of Oracle.

2. To re-enable monitoring on the resource after the load period is over, use the following
   command:

   scswitch -M -e -j <resource-name>

This command initiates monitoring on the resource, so that in the future, a failover can be
initiated when necessary.
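A minimal cron sketch for the automation idea above, assuming a hypothetical resource name oralisten-serv-res and a nightly batch window from 01:00 to 04:00 (adjust the resource name and times to your site):

    # root crontab entries (edit with: crontab -e)
    # Stop monitoring the resource just before the batch window starts
    55 0 * * * /usr/cluster/bin/scswitch -M -n -j oralisten-serv-res
    # Re-enable monitoring once the batch window is over
    5 4 * * * /usr/cluster/bin/scswitch -M -e -j oralisten-serv-res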