Array Setup and Networking

srevel131
New Member

Monitoring with Nagios

Hi,

We're looking to set up monitoring for our Nimble CS500 via Nagios. I've been following the "Integrate Nagios and Nimble" guide, which looks promising. Things like the Nimble load, volume, and array info should work OK, but I have a question about how to properly monitor disk info following that doc. The problem is that a check like:

sleep 39 && ssh adminuser@nimble disk --info 7 >/tmp/nimble/ssd1

returns something like:

Shelf location #: 0
Slot #: 8
HBA: 0
Port: 8
Serial: BTWL345200PV600TGN
Type: SSD
Disk size: 600.127 GB
Disk state: in use
Raid state: N/A
Percent resynchronized: N/A
Current resync speed: N/A
Average resync speed: N/A
Model: INTEL SSDSC2BB60
Vendor: Nimble
Firmware version: 0355
Smart Data Attributes:

But we have a head shelf and two expansion shelves, and the disk slot numbers repeat (1-16) on each expansion shelf. How can I pull data from the command line for a unique disk? (E.g., in the example above we would be missing the two SSD drives we have in slot 8 of our expansion shelves.) I don't see a way to specify both shelf # and slot # on the command line when doing a check. Any help is appreciated.

2 REPLIES
rugby0134
Esteemed Contributor
Solution

Re: Monitoring with Nagios

When you add shelves, each one gets its own shelf location, so you should be able to pull the shelf location together with the slot number to identify a disk uniquely.

Nimble OS $ disk --list
------+--------------------+----+---------+-------+---------------+---------+-----
Slot # Serial #             Type Disk Size Disk    RAID            Shelf     Shelf
                                 (GB)      State   Status          Serial    Loca-
                                                                             -tion
------+--------------------+----+---------+-------+---------------+---------+-----
     1 WD-WCAW31997974      HDD    1000.20 in use  okay            AA-100453 A.0
     2 WD-WCAW31995695      HDD    1000.20 in use  okay            AA-100453 A.0
     3 WD-WCAW31976057      HDD    1000.20 in use  okay            AA-100453 A.0
     4 WD-WCAW31994675      HDD    1000.20 in use  okay            AA-100453 A.0
     5 WD-WCAW31994827      HDD    1000.20 in use  okay            AA-100453 A.0
     6 WD-WCAW31995709      HDD    1000.20 in use  okay            AA-100453 A.0
     7 PEPR2433004W300EGN   SSD     300.07 in use  N/A             AA-100453 A.0
     8 BTPR242402Y3300EGN   SSD     300.07 in use  N/A             AA-100453 A.0
     9 PEPR247600XD300EGN   SSD     300.07 in use  N/A             AA-100453 A.0
    10 PEPR2475008K300EGN   SSD     300.07 in use  N/A             AA-100453 A.0
    11 WD-WCAW31996833      HDD    1000.20 in use  okay            AA-100453 A.0
    12 WD-WCAW31996122      HDD    1000.20 in use  okay            AA-100453 A.0
    13 WD-WCAW31995578      HDD    1000.20 in use  okay            AA-100453 A.0
    14 WD-WCAW31997996      HDD    1000.20 in use  okay            AA-100453 A.0
    15 Z1W0L44Y00009424PQGS HDD    1000.20 in use  okay            AA-100453 A.0
    16 WD-WCAW31994824      HDD    1000.20 in use  okay            AA-100453 A.0
     1 22K0A04EFM16         HDD    1000.20 in use  okay            AC-102190 A.1
     2 22L0A009FM16         HDD    1000.20 in use  okay            AC-102190 A.1
     3 22L0A04IFM16         HDD    1000.20 in use  okay            AC-102190 A.1
     4 22M0A00HFM16         HDD    1000.20 in use  okay            AC-102190 A.1
     5 22K0A01MFM16         HDD    1000.20 in use  okay            AC-102190 A.1
     6 22L0A01PFM16         HDD    1000.20 in use  okay            AC-102190 A.1
     7 22K0A02LFM16         HDD    1000.20 in use  okay            AC-102190 A.1
     8 CVPR215400Y0160DGN   SSD     160.04 in use  N/A             AC-102190 A.1
     9 22K0A011FM16         HDD    1000.20 in use  okay            AC-102190 A.1
    10 22L0A03OFM16         HDD    1000.20 in use  okay            AC-102190 A.1
    11 22M0A00YFM16         HDD    1000.20 in use  okay            AC-102190 A.1
    12 22L0A033FM16         HDD    1000.20 in use  okay            AC-102190 A.1
    13 22L0A04SFM16         HDD    1000.20 in use  okay            AC-102190 A.1
    14 22L0A04OFM16         HDD    1000.20 in use  okay            AC-102190 A.1
    15 22L0A036FM16         HDD    1000.20 in use  okay            AC-102190 A.1
    16 1220A04PFRC5         HDD    1000.20 in use  okay            AC-102190 A.1
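
For anyone scripting this, here is a minimal Python sketch (not taken from any Nimble documentation) of how the disk --list output above could be parsed so that each disk is keyed by shelf location plus slot number. That keeps slot 8 in A.0 and slot 8 in A.1 distinct. The names Disk and parse_disk_list are illustrative only, and the sketch assumes the RAID status column is always a single word ("okay", "N/A"), as it is in the sample output.

```python
from typing import NamedTuple


class Disk(NamedTuple):
    shelf_location: str
    slot: int
    serial: str
    disk_type: str
    size_gb: float
    state: str
    raid_status: str
    shelf_serial: str


def parse_disk_list(text: str) -> dict[tuple[str, int], Disk]:
    """Parse `disk --list` output into a dict keyed by (shelf location, slot)."""
    disks = {}
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with a numeric slot and have at least 8 tokens;
        # header and separator rows do not, so they are skipped.
        if len(tokens) < 8 or not tokens[0].isdigit():
            continue
        slot, serial, disk_type, size = tokens[:4]
        shelf_serial, shelf_location = tokens[-2:]
        # Assumption: the RAID status is one word, so the last middle token
        # is the RAID status and everything before it is the disk state.
        middle = tokens[4:-2]
        state, raid_status = " ".join(middle[:-1]), middle[-1]
        disks[(shelf_location, int(slot))] = Disk(
            shelf_location, int(slot), serial, disk_type,
            float(size), state, raid_status, shelf_serial,
        )
    return disks


# A few rows copied from the output above, including header lines.
SAMPLE = """\
------+--------------------+----+---------+-------+---------------+---------+-----
Slot # Serial #             Type Disk Size Disk    RAID            Shelf     Shelf
------+--------------------+----+---------+-------+---------------+---------+-----
     1 WD-WCAW31997974      HDD    1000.20 in use  okay            AA-100453 A.0
     8 BTPR242402Y3300EGN   SSD     300.07 in use  N/A             AA-100453 A.0
     8 CVPR215400Y0160DGN   SSD     160.04 in use  N/A             AC-102190 A.1
"""

for (shelf, slot), disk in sorted(parse_disk_list(SAMPLE).items()):
    print(shelf, slot, disk.serial, disk.state, disk.raid_status)
```

Keying on the pair rather than the slot alone is the whole point: a Nagios service can then be defined per (shelf, slot) without one shelf's disks hiding another's.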

swilson120
Advisor

Re: Monitoring with Nagios

Hi,

We are new to Nimble but not to Nagios. We used Nagios for well over 12 years and have since moved to Centreon, which is a fork of Nagios with a much better UI.

I created a Nagios/Centreon plugin that uses the REST API to report the health of the disks, pool free space, and snapshot count. I plan to add more checks as soon as I can move to version 4 of the OS; I am on version 3.8, and some of the hardware checks are not available in this version. I am hopeful that HPE will put more volume, snapshot, and hardware info in the SNMP MIB, since that is easier to work with, but the REST API does work too.

The Nagios check uses Perl and a REST client.

You can download it from here.

You need to create a read-only ID on the SAN and either put the ID and password in the script or modify the script to accept them as arguments. Enjoy.

Scott
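
The plugin itself is not reproduced in this thread, and it is written in Perl. As a rough Python sketch of the reporting half of such a check, the function below turns a list of disk records into a Nagios status line and exit code. The record keys (shelf, slot, state, raid) are hypothetical and do not come from the Nimble REST API. Treating anything other than state "in use" with RAID status "okay" or "N/A" as a problem is also an assumption, based only on the healthy values shown in the disk --list output earlier in the thread. The exit codes themselves are the standard Nagios plugin ones.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: these are the only "healthy" values, taken from the sample output.
HEALTHY_STATE = "in use"
HEALTHY_RAID = ("okay", "N/A")


def evaluate_disks(disks):
    """Return (exit_code, status_line) for a list of disk records.

    Each record is a dict with hypothetical keys:
    "shelf", "slot", "state", and "raid".
    """
    if not disks:
        # No data usually means the query failed, not that all is well.
        return UNKNOWN, "DISKS UNKNOWN - no disk data returned"

    bad = [
        d for d in disks
        if d["state"] != HEALTHY_STATE or d["raid"] not in HEALTHY_RAID
    ]
    if bad:
        details = ", ".join(
            f"{d['shelf']}/slot {d['slot']}: {d['state']} ({d['raid']})"
            for d in bad
        )
        return CRITICAL, f"DISKS CRITICAL - {len(bad)} of {len(disks)} unhealthy: {details}"

    return OK, f"DISKS OK - {len(disks)} disks healthy"
```

In an actual plugin you would print the status line and exit with the returned code (for example via sys.exit), since Nagios and Centreon read the service state from the process exit status.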