Cisco UCSM Plugin and Addon: For Nagios Core
User Guide
October 8, 2015
Table of Contents
1. OVERVIEW
1.1 ACRONYMS AND ABBREVIATIONS
1.2 SYSTEM REQUIREMENTS
2. DEPLOYING THE SOLUTION
2.1 INSTALL PATHS
2.2 INSTALL NAGIOS MONITORING PLUGIN
2.3 AUTO DISCOVERY NAGIOS ADD-ON
3. FEATURES
3.1 MAPS VIEW
3.2 SERVICE VIEW
3.3 DETAIL FAULT VIEW
3.4 FAULT FOR OPERATIONAL POWER STATE
4. MONITORING PLUGIN
4.1 PLUGIN SCRIPT
4.2 PLUGIN CLI EXAMPLE
5. AUTO DISCOVERY ADDON
5.1 WORKING WITH AUTO DISCOVERY
5.2 ADD SERVICE
6. CUSTOMIZING MONITORING PLUGIN
6.1 CUSTOMIZE INVENTORY INFORMATION
6.2 CUSTOMIZE STATISTICS INFORMATION
6.3 CUSTOMIZE FAULT INFORMATION
6.4 SKIPPING FAULTS
7. UNINSTALL
8. KNOWN CAVEATS
8.1 FREQUENT SERVICE TIMEOUTS
List of Tables
Table 1 : Acronyms and Abbreviations
Table 2 : CSV to CLI parameter mapping
Table 3 : Cisco UCSM Fault Severity to Nagios State Mapping
Table 4 : Plugin Argument Parameters
Table 5 : Host and Service Mapping
Table 6 : Auto discovery CLI options
Table 7 : Auto discovery CFG file options
List of Figures
Figure 1 : UCS Domain Map in Nagios
Figure 2 : UCS Service Overview in Nagios
Figure 3 : UCS Service View in Nagios
Figure 4 : UCS Fault Details View in Nagios
Figure 5 : UCS Fault View for Operation Power State in Nagios
Figure 6 : Custom Service under domain host
Figure 7 : Custom Service with its own host
Figure 8 : Custom Inventory Information
Figure 9 : Performance Data
Figure 10 : Graph plotted using performance data
Figure 11 : Custom fault details
User Guide Overview
1. Overview
Data center administrators have used Nagios for more than a decade, and it has emerged as one of the favorite open source tools for data center monitoring.
Nagios is an open source application for monitoring computer systems, networks and infrastructure. It offers monitoring and alerting services for servers, switches, applications, and services.
The solution provides the end user with two primary components. The first is the Nagios monitoring plugin script, which lets the end user monitor UCS domains.
The second is an add-on to Nagios, which lets the end user auto discover UCS domains.
Abbreviation Translation
User Guide Deploying the solution
Listed below are typical install locations and directories for different Linux distributions.
For Debian/SUSE
Appending 'images/logos' to the value of the above variable gives the logos directory path for Nagios.
The cgi.cfg file can be found in NAGIOS_ETC_DIR.
NAGIOS_LOGOS_DIR=/usr/share/nagios3/htdocs/images/logos/
NAGIOS_HOME=/usr/local/nagios
NAGIOS_ETC_DIR=/usr/local/nagios/etc
NAGIOS_PLUGIN_DIR=/usr/local/nagios/libexec
NAGIOS_LOGOS_DIR=/usr/local/nagios/share/images/logos/
b. Now run the installer, which should be present in the extracted folder.
# ./installer.py
c. The installer auto-detects the various install paths and prompts with default options for installing this plugin.
d. The installer also updates the configuration files that this plugin requires. It prompts for and creates backups of all the files that will be modified in this process.
e. By default, the installer installs the monitoring plugin along with the auto discovery scripts. To install only the monitoring plugin, use the '--plugin' option.
# ./installer.py --plugin
The servers defined in this CSV/CLI will be discovered and added to Nagios for monitoring.
Example CSV:
<HostName>,<User>,<Password>,<Port>,<NoSSL(True/False)>,<Proxy URL>
10.65.183.10,admin,password,80,True
10.65.183.5,admin,password,80,True,http://proxy.ip.com:8080
10.65.183.5-10,admin,password
The HostName, User and Password fields are mandatory for auto discovery to discover the UCS domain.
The user can provide an IP range in the hostname. The Auto-Discovery script allows a range definition by passing "-" in the fourth octet. For all IPs in that range, the connection parameters are the same, i.e. the username, password, port, SSL and proxy data if applicable.
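The CSV layout and the fourth-octet range rule can be illustrated with a short Python sketch. This is not the add-on's own code: the function names `expand_host_range` and `parse_rows` and the field keys are hypothetical, and the sketch assumes that every address in a range inherits the remaining fields of its row.

```python
import csv
import io

# Field order taken from the example CSV header; the key names are illustrative.
FIELDS = ["host", "user", "password", "port", "no_ssl", "proxy"]


def expand_host_range(host):
    """Expand '10.65.183.5-10' into single addresses; anything else passes through."""
    prefix, _, last = host.rpartition(".")
    if not prefix or "-" not in last:
        return [host]
    start, end = last.split("-", 1)
    if not (start.isdigit() and end.isdigit()):
        return [host]  # e.g. an FQDN that merely contains a hyphen
    return [f"{prefix}.{n}" for n in range(int(start), int(end) + 1)]


def parse_rows(text):
    """Return one dict per target host; rows lacking a mandatory field are skipped."""
    targets = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # HostName, User and Password are mandatory
        entry = dict(zip(FIELDS, (cell.strip() for cell in row)))
        for host in expand_host_range(entry["host"]):
            targets.append({**entry, "host": host})
    return targets


if __name__ == "__main__":
    sample = "10.65.183.10,admin,password,80,True\n10.65.183.5-10,admin,password\n"
    for target in parse_rows(sample):
        print(target["host"], target["user"])
```

Optional fields that are absent from a row are simply missing from the resulting dict, so a caller would apply its own defaults for port, SSL and proxy.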
Note:
10.65.183.16,admin,"My_password"
#./NagiosAutoDiscoveryUCS.py
Note:
CLI parameters take precedence over the CSV file, i.e. if the Host parameter is given via the CLI, the script skips reading the CSV file.
User Guide Features
3. Features
Once the installation is complete and auto discovery has been executed, the user can see the components of the discovered Cisco UCS domains.
NAGIOS_CRITICAL=critical|major
NAGIOS_WARNING=minor|warning
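As an illustration of how the two settings above could translate UCSM fault severities into a Nagios state, here is a minimal Python sketch. It is not the plugin's implementation. The function name `nagios_state` is hypothetical, and the sketch assumes that any severity not listed in either setting (for example info) leaves the service in the OK state. The exit codes are the standard Nagios plugin return codes.

```python
import re

# Patterns copied from the two configuration values shown above.
SEVERITY_PATTERNS = {
    "CRITICAL": re.compile(r"critical|major"),
    "WARNING": re.compile(r"minor|warning"),
}

# Standard Nagios plugin return codes.
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def nagios_state(severities):
    """Return the worst Nagios state implied by a list of UCSM fault severities."""
    state = "OK"  # assumption: unlisted severities do not raise the state
    for severity in severities:
        name = severity.lower()
        if SEVERITY_PATTERNS["CRITICAL"].fullmatch(name):
            return "CRITICAL"
        if SEVERITY_PATTERNS["WARNING"].fullmatch(name):
            state = "WARNING"
    return state


if __name__ == "__main__":
    state = nagios_state(["minor", "major"])
    print(state, EXIT_CODES[state])
```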
User Guide Monitoring Plugin
4. Monitoring Plugin
4.1 Plugin Script
Following Nagios conventions, the Cisco UCS Nagios monitoring plugin takes multiple standard inputs, such as host information, connection information and service status criteria. The plugin is named "cisco_ucs_nagios" and accepts the following CLI inputs.
Example
"ucs-QALAB\admin"
Example
--proxy http://<Proxy IP>:<Port>
--proxy http://user:pass@<Proxy IP>:<Port>
-R / --inHierarchical (Optional)
  If specified, provides a hierarchical overall health status for all the elements under the given class or dn. The information returned by this option can be controlled via the CLASS_FILTER_LIST parameter defined in the plugin configuration file.
--verbose (Optional)
  If specified, works with the inHierarchical flag and provides detailed status information for all the subcomponents under the given dn or class.
-F / --faultDetails (Optional)
  If specified, works with the inHierarchical flag and looks for fault details under the given class or dn. It is a quick way of checking the overall status of the given dn or class.
-f / --filter (Optional)
  A filter string in the format attribute:value. This filter is valid only for type class and applies to the class provided in the CLI. The user can provide a wildcard filter, which uses standard regular expression syntax.
  Example
  --filter dn:sys/chassis-1/blade-1
  --filter dn:sys/chassis-1/blade-1/*
  --filter dn:^sys/chassis-1$
--debug (Optional)
  If defined, prints the XML communication between the plugin and the UCS domain. It is also helpful for detailed debugging of any error that occurs while using this CLI. Use this for debugging purposes only.
-S / --useSharedSession (Optional)
  If specified, the plugin tries to reuse an existing UCSM connection for the specified user. If the connection does not exist, a session is created and left open for other processes to reuse. If not specified, the plugin creates a new user session on every run and destroys it afterwards.
--getPerfStats (Optional)
  If specified, provides performance data that Nagios can use to plot graphs. This flag can only be used together with the -a option.
--version (Optional)
  If specified, prints the current Cisco UCSM Nagios Monitoring plugin version.
  NOTE: Any other options will be ignored.
-h / --help (Optional)
  Prints the help content and the supported plugin input arguments.
The script can work in several ways. In the conventional way, for example, the user provides a range for warning or critical values, and the plugin script decides the service state based on the given values.
By default, the script uses Cisco UCSM faults as the basis for the returned service state. The user passes a dn or class as the query, and the plugin script returns CRITICAL, WARNING or OK according to the faults found on that dn or class.
Based on the query, the plugin fetches all the related faults. If any of them is a critical fault, the plugin script returns the service state as CRITICAL.
If there is no fault for the query passed, the plugin script fetches the relevant inventory information and displays it on the Nagios web UI or CLI.
Note: When the '--inHierarchical' flag is combined with the '--verbose' flag, the output can contain a large amount of information, depending on the query passed.
To limit the output to the required information, the plugin provides a filter variable named CLASS_FILTER_LIST, in which the end user lists the names of the subclasses to report on.
For example, the ComputeBlade class has a number of subclasses, such as BiosVfConsoleRedirection, ComputeBoard, MemoryArray, MemoryUnit, BiosUnit, MgmtController and AdaptorHostEthIfFsm, to name some.
If the user is interested only in ComputeBoard, MemoryArray and MemoryUnit, these classes can be defined in the filter list, and the plugin then displays status information only for these three classes.
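The effect of CLASS_FILTER_LIST can be sketched in a few lines of Python. The sketch is illustrative only. It assumes the property holds a comma-separated list of class names (the value format is not shown in this guide) and that an empty list means no filtering. The names `parse_class_filter` and `filter_components` are hypothetical.

```python
def parse_class_filter(value):
    """Split a comma-separated class list into a set (assumed value format)."""
    return {name.strip() for name in value.split(",") if name.strip()}


def filter_components(components, class_filter):
    """Keep only (class_name, dn, status) tuples whose class is in the filter."""
    if not class_filter:
        return list(components)  # assumption: an empty filter shows everything
    return [item for item in components if item[0] in class_filter]


if __name__ == "__main__":
    components = [
        ("ComputeBoard", "sys/chassis-1/blade-2/board", "OK"),
        ("BiosUnit", "sys/chassis-1/blade-2/bios", "OK"),
        ("MemoryArray", "sys/chassis-1/blade-2/board/memarray-1", "OK"),
    ]
    wanted = parse_class_filter("ComputeBoard, MemoryArray, MemoryUnit")
    for class_name, dn, status in filter_components(components, wanted):
        print(f"{class_name} ({dn})- {status}")
```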
CLI (class as input): This provides the status of all chassis objects in the given UCS domain.
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t "class" -q "EquipmentChassis"
Output
sys/chassis-1:CRITICAL -
CRITICAL - sys/chassis-1-Power state on chassis 1 is redundancy-failed
sys/chassis-2:CRITICAL -
CRITICAL - sys/chassis-2-Power state on chassis 2 is redundancy-failed
==== Fault # 1 ====
Dn : sys/chassis-1/fault-F0408
Descr : Power state on chassis 1 is redundancy-failed
severity : major
Cause : power-problem
Type : environmental
Created : 2013-09-16T23:42:17.258
CLI (with -a, -w and -c): The end user can provide a warning and a critical value for a given attribute. Based on these inputs, the plugin returns the service status for the attribute value.
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t "class" -q "EquipmentFanStats" -a SpeedAvg -w 3600 -c 3700
Output
Overall Status : CRITICAL -
WARNING - sys/chassis-1/fan-module-1-1/fan-1/stats - SpeedAvg : 3696
CRITICAL - sys/chassis-1/fan-module-1-1/fan-2/stats - SpeedAvg : 3828
OK - sys/chassis-1/fan-module-1-2/fan-1/stats - SpeedAvg : 3520
CRITICAL - sys/chassis-1/fan-module-1-2/fan-2/stats - SpeedAvg : 3740
WARNING - sys/chassis-1/fan-module-1-3/fan-1/stats - SpeedAvg : 3608
WARNING - sys/chassis-1/fan-module-1-3/fan-2/stats - SpeedAvg : 3696
OK - sys/chassis-1/fan-module-1-4/fan-1/stats - SpeedAvg : 3564
CRITICAL - sys/chassis-1/fan-module-1-4/fan-2/stats - SpeedAvg : 3740
WARNING - sys/chassis-1/fan-module-1-5/fan-1/stats - SpeedAvg : 3652
CRITICAL - sys/chassis-1/fan-module-1-5/fan-2/stats - SpeedAvg : 3828
OK - sys/chassis-1/fan-module-1-6/fan-1/stats - SpeedAvg : 3476
WARNING - sys/chassis-1/fan-module-1-6/fan-2/stats - SpeedAvg : 3696
WARNING - sys/chassis-1/fan-module-1-7/fan-1/stats - SpeedAvg : 3608
CRITICAL - sys/chassis-1/fan-module-1-7/fan-2/stats - SpeedAvg : 3784
OK - sys/chassis-1/fan-module-1-8/fan-1/stats - SpeedAvg : 3520
OK - sys/chassis-1/fan-module-1-8/fan-2/stats - SpeedAvg : 3564
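The per-object states in the output above are consistent with a simple "greater than" comparison against the -w and -c values. The following Python sketch reproduces that behaviour. It is an assumption, not the plugin's code: the function names are hypothetical, and the strict comparison follows the usual Nagios convention that a plain threshold N alerts when the value exceeds N.

```python
STATE_ORDER = ["OK", "WARNING", "CRITICAL"]


def threshold_state(value, warn, crit):
    """Classify a numeric attribute value against warning and critical limits."""
    if value > crit:
        return "CRITICAL"
    if value > warn:
        return "WARNING"
    return "OK"


def overall_state(states):
    """The overall status is the worst individual state."""
    return max(states, key=STATE_ORDER.index, default="OK")


if __name__ == "__main__":
    speeds = [3696, 3828, 3520]  # first three SpeedAvg values from the output above
    states = [threshold_state(speed, warn=3600, crit=3700) for speed in speeds]
    print(states, overall_state(states))
```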
CLI (with -a and -r): The end user can provide a regular expression for a given attribute. Based on these inputs, the plugin returns whether the service status is OK or CRITICAL.
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t "dn" -q sys/switch-A/slot-1/switch-ether/port-7 -a operState -r up
Output
CRITICAL - sys/switch-A/slot-1/switch-ether/port-7 - operState : sfp-not-present
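The regular expression check can be pictured with the Python sketch below. It assumes search semantics, where the pattern may match anywhere in the attribute value. Whether the plugin itself searches or requires a full match is not stated in this guide, and `regex_state` is a hypothetical name.

```python
import re


def regex_state(value, pattern):
    """Return OK when the attribute value matches the pattern, else CRITICAL."""
    return "OK" if re.search(pattern, value) else "CRITICAL"


if __name__ == "__main__":
    # Attribute value and pattern taken from the example above.
    print(regex_state("sfp-not-present", "up"))
```

Under search semantics a value such as "link-up" would also satisfy the pattern "up". Anchoring the expression (for example "^up$") restricts it to an exact value.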
CLI (with -a and --getPerfStats): The end user can provide the getPerfStats flag together with the attribute option. When this flag is set, the CLI returns the performance data appended to the rest of the output after a pipe character "|".
# cisco_ucs_nagios -u "admin" -p "Nbv12345" -H "10.65.183.5" -t "dn" -q "sys/switch-A/fan-module-1-2/fan-2/stats" -a "speed" --getPerfStats
CLI (with -a, -w, -c and --getPerfStats): The end user can provide a warning and a critical value for a given attribute. Based on these inputs, the plugin returns the service status for the attribute value. With the getPerfStats flag, the attribute value and the warning and critical values are also used to return the performance data.
# cisco_ucs_nagios -u "admin" -p "Nbv12345" -H "10.65.183.5" -t "dn" -q "sys/switch-A/fan-module-1-2/fan-2/stats" -a "speed" -w 10000 -c 12000 --useSharedSession --getPerfStats
CLI (with --inHierarchical): This provides a hierarchical overview of the health status for the given query.
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t "dn" -q "sys/chassis-1" --inHierarchical
Output
Overall Health Status:CRITICAL -
EquipmentChassis (sys/chassis-1) - CRITICAL - Power state on chassis 1 is redundancy-failed
*** Hierarchical Fault Filtering ON ***
Please Check CLASS_FILTER_LIST property.
CLI (with --inHierarchical and --verbose): This provides a detailed status for a given query. The output details can be controlled via the CLASS_FILTER_LIST property, as detailed in the Section 5 note.
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t "dn" -q "sys/chassis-1/blade-2" --inHierarchical --verbose
Output
Overall Health Status:OK -
ComputeBlade (sys/chassis-1/blade-2)- OK
LsbootDef (sys/chassis-1/blade-2/boot-policy)- OK
ComputeBoard (sys/chassis-1/blade-2/board)- OK
MemoryArray (sys/chassis-1/blade-2/board/memarray-1)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-12)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-11)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-10)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-9)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-8)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-7)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-6)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-5)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-4)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-3)- OK
MemoryUnit (sys/chassis-1/blade-2/board/memarray-1/mem-2)- OK
… (text truncated)
*** Hierarchical Fault Filtering ON ***
CLI (with --filter): With this option, the user can provide a class attribute as a filter string to the plugin CLI. This option helps reduce the monitoring scope of the plugin.
For example, to monitor 'processorUnit' health for all the blades in chassis 1, the plugin CLI can be defined as
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t class -q processorUnit --filter dn:sys/chassis-1/*
Output
sys/chassis-1/blade-1/board/cpu-1:OK - Cores : 4,Model : Intel(R) Xeon(R) CPU E5520 @
2.27GHz,CPU Speed(Mhz) : 2.266000
sys/chassis-1/blade-6/board/cpu-2:OK - Cores : 4,Model : Intel(R) Xeon(R) CPU E5520 @
2.27GHz,CPU Speed(Mhz) : 2.266000
sys/chassis-1/blade-6/board/cpu-1:OK - Cores : 4,Model : Intel(R) Xeon(R) CPU E5520 @
2.27GHz,CPU Speed(Mhz) : 2.266000
The filter uses wildcard filtering, so the user can provide standard regular expression syntax to fetch the desired results.
Another example is fetching the health status of chassis 1.
Because a UCS domain can contain more than one chassis, a simple filter like "dn:sys/chassis-1" may also match chassis-10, chassis-11 … chassis-19.
In such cases it is recommended to anchor the filter between ^ and $, as in "dn:^sys/chassis-1$". This filter matches only chassis-1.
# cisco_ucs_nagios -u <username> -p <password> -H <UCSM IP/FQDN> -t class -q equipmentChassis --filter dn:^sys/chassis-1$
Output
sys/chassis-1:CRITICAL -
CRITICAL - sys/chassis-1-Current connectivity for chassis 1 does not match discovery policy:
unsupported-connectivity
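The difference between the unanchored and the anchored filter can be checked with a short Python sketch. It assumes the filter value is applied as a regular expression search against each dn, which is what the matching behaviour described above implies. The helper name `match_dns` and the sample dns are illustrative.

```python
import re


def match_dns(dns, pattern):
    """Return the dns in which the filter pattern is found (search semantics assumed)."""
    regex = re.compile(pattern)
    return [dn for dn in dns if regex.search(dn)]


if __name__ == "__main__":
    dns = ["sys/chassis-1", "sys/chassis-2", "sys/chassis-10", "sys/chassis-11"]
    print(match_dns(dns, "sys/chassis-1"))    # also picks up chassis-10 and chassis-11
    print(match_dns(dns, "^sys/chassis-1$"))  # only chassis-1
```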
User Guide Auto Discovery Addon
Host / Services / Service Details

UCS Domain
  Ping UCS Domain virtual IP: This service checks the ping to the given UCS domain.
Chassis
  Check Chassis: This service checks the overall fault status of the UCS chassis and returns the Nagios state accordingly. If there are no faults on the UCS chassis, it displays the inventory information for the same.
  Check PSU: This service checks just the faults that may have occurred on the chassis PSU(s) and returns the Nagios state accordingly. If there are no faults on the chassis PSU(s), it displays the inventory information for the same.
Fex
  Check Fex: This service checks the overall fault status of the UCS FEX and its subcomponents and returns the Nagios state accordingly. If there are no faults on the UCS FEX, it displays the inventory information for the same.
FI
  Check FI: This service checks the overall fault status of the UCS FI switch and its subcomponents and returns the Nagios state accordingly. If there are no faults on the UCS FI switch, it displays the inventory information for the same.
IOM
  Check IOM: This service checks the overall fault status of the UCS chassis IOM module and its subcomponents and returns the Nagios state accordingly. If there are no faults on the UCS chassis IOM module, it displays the inventory information for the same.
Blade Server
  Check Blade Server: This service checks the overall fault status of the UCS blade server and its subcomponents and returns the Nagios state accordingly. If there are no faults on the UCS blade server, it displays the inventory information for the same.
  Check CPU: This service checks all the faults that may have occurred on the blade CPU(s) and returns the Nagios state accordingly. If there are no faults on the blade CPU(s), it displays the inventory information for the same.
  Check Memory: This service checks the overall fault status of the memory array units and their subcomponents and returns the Nagios state accordingly. If there are no faults on the memory array unit, it displays the inventory information for the same.
Rack Server
  Check Rack Server: This service checks the overall fault status of the UCSM managed C-series rack server and its subcomponents and returns the Nagios state accordingly. If there are no faults on the UCSM managed rack server, it displays the inventory information for the same.
  Check CPU: This service checks all the faults that may have occurred on the rack CPU(s) and returns the Nagios state accordingly. If there are no faults on the rack CPU(s), it displays the inventory information for the same.
  Check Memory: This service checks the overall fault status of the memory array units and their subcomponents and returns the Nagios state accordingly. If there are no faults on the UCS rack memory array, it displays the inventory information for the same.
  Check PSU: This service checks just the faults that may have occurred on
Note: If the user is a domain user, the user field should be defined as
“ucs-<Domain>\<User>”
The domain must be prefixed with "ucs-".
Example
"ucs-QALAB\admin"
If the script is invoked without the "-r / --restartService" CLI option, then at the end of the discovery process it prompts the end user for input on restarting the Nagios service.
This can be done by editing 'NagiosAutoDiscoveryUCS.cfg' and updating the service variables, which are named 'Service_' followed by the class name of the component.
For example:
Service_EquipmentChassis
Service_ComputeBlade
Service_EquipmentFex
Service_EquipmentNetworkElementFanStats
Here the user can provide a custom service name and a service class or dn, which is restricted to subclasses or dns of the classes listed above. Optionally, the user can also provide CLI options to pass to the monitoring plugin script.
For example, the user can update NagiosAutoDiscoveryUCS.cfg with a custom service entry such as
Service_EquipmentNetworkElementFanStats=Check Fan Stats:EquipmentNetworkElementFanStats:"--useSharedSession -a speed -w 10000 -c 12000 --getPerfStats"
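To make the three colon-separated parts of a "Service_" value concrete, the Python sketch below splits such a value into a service name, a class or dn, and a list of plugin options. It is only an illustration of the format shown above. `parse_service_entry` is a hypothetical helper, and the sketch assumes that neither the service name nor the class or dn contains a colon.

```python
import shlex


def parse_service_entry(value):
    """Split 'name:class-or-dn:"options"' into its three parts."""
    parts = value.split(":", 2)
    name = parts[0].strip()
    target = parts[1].strip() if len(parts) > 1 else ""
    options = []
    if len(parts) > 2:
        # Drop the surrounding double quotes, then tokenize like a shell would.
        options = shlex.split(parts[2].strip().strip('"'))
    return name, target, options


if __name__ == "__main__":
    entry = ('Check Fan Stats:EquipmentNetworkElementFanStats:'
             '"--useSharedSession -a speed -w 10000 -c 12000 --getPerfStats"')
    name, target, options = parse_service_entry(entry)
    print(name)
    print(target)
    print(options)
```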
For these services to be discovered by the Auto-Discovery add-on, the class name must be added to the "DISCOVERY_CLASS_LIST" parameter in the same configuration file. This entry is needed whenever the user adds a new "Service_<class_name>" entry and wants those services to be discovered.
#This List defines the Class list for which Services need to be discovered.
#DISCOVERY_CLASS_LIST=EquipmentChassis,ComputeBlade,EquipmentFex,…,<New_Class>
DISCOVERY_CLASS_LIST=EquipmentChassis,ComputeBlade,...,EquipmentNetworkElementFanStats
User Guide Customizing Monitoring Plugin
By default, all the services created via a new "Service_" entry are kept under the domain host.
When auto discovery is executed again, the following list of services appears in the Nagios web UI.
If these new services should have their own host instead of being placed under the domain, an entry for the class must be added to the "HOST_CLASS_LIST" parameter in the same configuration file.
#This List defines the Class Names which can have Nagios Host Created.
HOST_CLASS_LIST=EquipmentChassis,ComputeBlade,EquipmentFex,...,<New_Class>
On re-running the auto-discovery process, these new services are placed under a new host of that class.
To fetch inventory attributes from a class, define a variable named "Inv_" followed by the class name, with the list of attributes as its value.
The configuration property string has the following form
Inv_<class id>=<AttributeName>,<AttributeName>:<UserGivenName>,<AttributeName>
The user can also customize the attribute name shown on the Nagios web UI. For example, if the attribute name is 'OperPower' and the user wants it shown as 'Power(W)', define the attribute first, followed by a colon ":" and the name to show on the Nagios web UI, like OperPower:Power(W)
A complete entry for the class 'ComputeRackUnit' may look like
Inv_ComputeRackUnit=Serial,Uuid,Model,Vendor,OperPower:Power(W),TotalMemory:Memory(MB),NumOfCores:Cores,NumOfCpus:CPUs
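The "Inv_" value format can be illustrated with a small Python sketch that turns the attribute list into (attribute, display label) pairs and renders them in the "Label : value" style seen in the plugin output earlier in this guide. The helper names and the sample values are hypothetical, and this is not the plugin's own parser.

```python
def parse_inventory_entry(value):
    """Turn 'Attr,Attr:Label,...' into (attribute, label) pairs."""
    pairs = []
    for item in value.split(","):
        attribute, _, label = item.strip().partition(":")
        pairs.append((attribute, label or attribute))  # label defaults to the attribute
    return pairs


def render_inventory(pairs, values):
    """Render the attributes that are present as 'Label : value' items."""
    return ",".join(f"{label} : {values[attr]}" for attr, label in pairs if attr in values)


if __name__ == "__main__":
    entry = ("Serial,Uuid,Model,Vendor,OperPower:Power(W),"
             "TotalMemory:Memory(MB),NumOfCores:Cores,NumOfCpus:CPUs")
    pairs = parse_inventory_entry(entry)
    # Made-up attribute values, for illustration only.
    print(render_inventory(pairs, {"NumOfCores": 8, "NumOfCpus": 2}))
```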
Performance stats for a specific "Class" can be obtained by adding an entry for it in this configuration file with at least one of its attributes.
<Class Name>: The name of the class for which the stats need to be generated. The plugin looks for this entry and reads the given parameters.
<Attribute Name>: This is a MANDATORY parameter. It must be one of the valid attributes of the selected class, and it should return a numeric value, because graphs are plotted against numeric values only.
An optional name can be given to this attribute by writing it after ":". If given, the optional name is shown as the label instead of the attribute name. Below is an example.
Stats_MemoryArray=CurrCapacity:"Current Capacity(MBs)"
<class name> = <attribute name> : <attribute optional name>
<UOM>: Unit of measurement, the unit associated with the value of the attribute. This field is OPTIONAL and can be left blank. It can have one of the following values.
a. no unit specified - assume a number (int or float) of things (eg, users,
processes, load averages)
b. s - seconds (also us, ms)
c. % - percentage
d. B - bytes (also KB, MB, TB)
e. c - a continuous counter (such as bytes transmitted on an interface)
Note:
The allowed units of measurement are controlled by the "STATS_UOM_LIST" parameter in the configuration file. The user can update the list as needed.
#User can append more "Unit Of Measurements" which they want to allow in getting performance statistics.
STATS_UOM_LIST =%,s,us,ms,c,B,KB,MB,TB
Warn: Sets the warning threshold used when graphing the stats for that attribute.
Crit: Sets the critical threshold used when graphing the stats for that attribute.
Min: Sets the minimum possible value for the selected attribute.
Max: Sets the maximum possible value for the selected attribute.
Below are a few possible ways to write a Stats class definition in the configuration file.
When a Nagios service is called and uses one of these Stats classes, the plugin returns the listed attribute as performance stats along with the normal inventory data.
Here the attribute "Speed" after '|' is the performance stat for this service. When such a service runs in Nagios, this performance stat is stored in the historical information database, which a third-party graphing tool then uses to populate graphs.
In the Nagios GUI, the "Performance Data" field of the service is populated when this service runs.
Note:
The plugin follows Nagios generic guidelines for generating performance data.
User can install any third party graphing plugin from Nagios Communities to
populate graphs by using performance data.
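For reference, the Nagios plugin guidelines define a performance data item as 'label'=value[UOM];[warn];[crit];[min];[max]. The Python sketch below builds such an item from the fields described in this section. It is a sketch of the general format, not of the plugin's own code, and the function name `format_perfdata` and the sample values are illustrative.

```python
def format_perfdata(label, value, uom="", warn=None, crit=None, minimum=None, maximum=None):
    """Build one Nagios performance data item: 'label'=value[UOM];warn;crit;min;max."""
    fields = ["" if field is None else str(field) for field in (warn, crit, minimum, maximum)]
    while fields and fields[-1] == "":
        fields.pop()  # trailing empty fields may be dropped
    quoted = f"'{label}'" if " " in label else label  # labels with spaces need single quotes
    return ";".join([f"{quoted}={value}{uom}"] + fields)


if __name__ == "__main__":
    # Sample values are made up; only the format matters here.
    perfdata = format_perfdata("speed", 9800, warn=10000, crit=12000)
    print(f"OK - speed : 9800 | {perfdata}")
```

Everything after the "|" in the plugin output is what Nagios stores in the "Performance Data" field of the service.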
For example, the user can list the following attributes to be shown as part of the fault details
FaultInst=Dn,Descr,severity,Cause,Type,Created
Example
SKIP_FAULT_LIST=Lc:suppressed,Type:fsm,Severity:info,Severity:condition
Note:
Skipping faults based on the "Description" field is not advisable, because the description may contain special characters that prevent the comparison from matching, so the fault would not be skipped.
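One way to read the SKIP_FAULT_LIST example is as a list of attribute:value pairs, where a fault is skipped as soon as any pair matches. The Python sketch below follows that reading. The OR semantics, the case-insensitive attribute names and the exact value comparison are all assumptions, and the helper names are hypothetical.

```python
def parse_skip_list(value):
    """Parse 'Attr:value,Attr:value' into (attribute, value) rules."""
    rules = []
    for item in value.split(","):
        attribute, separator, expected = item.partition(":")
        if separator:
            rules.append((attribute.strip().lower(), expected.strip()))
    return rules


def should_skip(fault, rules):
    """Skip the fault if any rule matches one of its attributes (assumed OR semantics)."""
    lowered = {key.lower(): val for key, val in fault.items()}
    return any(lowered.get(attribute) == expected for attribute, expected in rules)


if __name__ == "__main__":
    rules = parse_skip_list("Lc:suppressed,Type:fsm,Severity:info,Severity:condition")
    fault = {"Dn": "sys/chassis-1/fault-F0408", "severity": "major", "Type": "environmental"}
    print(should_skip(fault, rules))
```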
7. Uninstall
To uninstall the Cisco UCSM Nagios integration, follow the steps below.
8. Known Caveats
8.1 Frequent Service timeouts
If a large UCSM domain with more than 600 services sees frequent service timeouts, it is recommended to first check for network-related issues, for example ping timeouts or high network latencies.
If that does not help, the user may try tuning the following parameters in the Nagios configuration file 'nagios.cfg' to check whether this resolves the issue.
Format: service_check_timeout=<seconds>
Example: service_check_timeout=600
This is the maximum number of seconds that Nagios allows service checks to run. If the network is slow or UCSM is slow in responding to the XML requests, it is recommended to increase this value and check the results.
Increasing this value alone may not help. In that case, it is recommended to use it in conjunction with the 'max_concurrent_checks' option.
Format: max_concurrent_checks=<max_checks>
Example: max_concurrent_checks=50
If the Nagios host or network is slow, or UCSM responds slowly to the XML requests, it is recommended to keep this value to a minimum. This limits the number of service checks that run concurrently on the Nagios host, which stabilizes the system and in turn addresses the frequent service timeout issue.
More details on the above Nagios configuration options can be found at the following link
http://nagios.sourceforge.net/docs/3_0/configmain.html