Array Setup and Networking

Nimble storage monitoring with check_mk or snmp

SOLVED
rampp109
Occasional Visitor

Hello everyone,

How can I monitor Nimble storage with check_mk?

Is there a plugin for check_mk, or is SNMP the only option?

Can I query more than just interfaces via SNMP, for example volumes?

Best regards,

Manuel

11 REPLIES
wezelboy54
Occasional Advisor

Re: Nimble storage monitoring with check_mk or snmp

I would love to see an answer to this.

InfoSight is okay, but I would prefer to get that (and more) telemetry into check_mk where I can really work some magic on it.

Probably the only course is to write a custom check_mk snmp plugin. I haven't had time to look at the MIBs.

-P

rvvliet78110
Valued Contributor

Re: Nimble storage monitoring with check_mk or snmp

Hi Manuel,

You can monitor per-volume performance (IOPS and so on) as well as the overall performance of the Nimble array itself.

I use Zabbix and have never worked with check_mk, but with Zabbix I can check all volumes using this OID: .1.3.6.1.4.1.37447.1.2.1.3.

In Zabbix you can create a discovery rule that uses the result of this OID as an index to collect all the other data. I'm not sure whether check_mk can do that too; otherwise you can list all your volumes by running an snmpwalk and working out which OID belongs to which volume.
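The "run an snmpwalk and work out which OID is which volume" step could be scripted roughly as follows. This is only a sketch: it assumes the walk was done with numeric OIDs (`snmpwalk -On`) and that volume names come back as `STRING:` values; the sample lines and volume names are made up for illustration, and `map_volume_indexes` is a hypothetical helper, not part of check_mk or Zabbix.

```python
import re

# Volume-name column of the table mentioned above.
VOL_NAME_OID = ".1.3.6.1.4.1.37447.1.2.1.3"

# Assumed line shape from `snmpwalk -On`: <oid>.<index> = STRING: "<name>"
LINE_RE = re.compile(r'^(?P<oid>[.\d]+)\s*=\s*STRING:\s*"?(?P<name>[^"]*)"?\s*$')


def map_volume_indexes(walk_lines):
    """Return {row_index: volume_name} for lines under VOL_NAME_OID."""
    volumes = {}
    prefix = VOL_NAME_OID + "."
    for line in walk_lines:
        match = LINE_RE.match(line.strip())
        if not match or not match.group("oid").startswith(prefix):
            continue  # other columns or non-string values are skipped
        index = match.group("oid")[len(prefix):]
        volumes[index] = match.group("name")
    return volumes


# Made-up sample output, for illustration only.
sample = [
    '.1.3.6.1.4.1.37447.1.2.1.3.1 = STRING: "vmware-ds01"',
    '.1.3.6.1.4.1.37447.1.2.1.3.2 = STRING: "sql-data"',
    '.1.3.6.1.4.1.37447.1.2.1.4.1 = Gauge32: 2097152',
]
print(map_volume_indexes(sample))
```

The index suffix it returns is what you would append to the other column OIDs to read the remaining values for the same volume.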

Hope it helps.

Rick.

rohanfallon16
Occasional Visitor
Solution

Re: Nimble storage monitoring with check_mk or snmp

Hi Manuel, this is the custom check I wrote to monitor two Nimble units with check_mk. I have included a perfometer as well. Hope that helps.

  
    #
    # Nimble Volume Check (Supports inventory and performance data)
    #
    # Author: Rohan Fallon
    #
    # FileName: nimble
    # Location: ~/local/share/check_mk/checks
    # Usage:  cmk --checks  nimble -II your_nimble
    #
    #
    
    nimble_default_values = (95.0, 98.0)
    
    
    def inventory_nimble(info):
       # Debug: let's see what the data we get looks like
       # print info
       # return []
       inventory = []
       for vol, state, connections, volsize, volusage in info:
           if state == "1":
              inventory.append( (vol, nimble_default_values) )
       return inventory
    
    def check_nimble(item, params, info):
       # unpack check parameters
       warn, crit = params
    
       for vol, state, connections, volsize, volusage in info:
          if vol == item:
             if state == "1":
                size_gb = int(volsize) / 1024.0
                usage_gb = int(volusage) / 1024.0
                usage_percent = float(volusage) / float(volsize) * 100.0
                perfdata = [ ( "percent", usage_percent, warn, crit ) ]
                if usage_percent > crit:
                   return (2, "Critical - Volume online - iSCSI Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{:6.2f}".format(usage_percent)+ "%", perfdata )
                elif usage_percent > warn:
                   return (1, "Warn - Volume online - iSCSI Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{:6.2f}".format(usage_percent)+ "%" , perfdata)
                else:
                   return (0, "OK - Volume online - iSCSI Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{:6.2f}".format(usage_percent)+ "%", perfdata)
             else:
                return (2, "CRITICAL - Volume %s offline " % vol)
       return (3, "UNKNOWN - Volume not found")
    
    
    check_info["nimble"] = {
        'check_function':            check_nimble,
        'inventory_function':        inventory_nimble,
        'service_description':       'Nimble Volume %s',
        'has_perfdata':              True,
    }
    
    snmp_info["nimble"] = (".1.3.6.1.4.1.37447.1.2.1" , [ "3", "10", "11", "4", "6" ] )
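For anyone who wants to sanity-check the threshold logic without a check_mk site, here is a minimal standalone sketch. It only mirrors the comparisons in the check above; `classify_usage` is a hypothetical helper name, and the sample numbers are invented. Like the check, it does not guard against a volume size of zero.

```python
def classify_usage(volsize, volusage, warn=95.0, crit=98.0):
    """Return (state, usage_percent) using the same comparisons as the check.

    state follows the usual monitoring convention: 0 = OK, 1 = WARN, 2 = CRIT.
    volsize and volusage may be strings, as they arrive from SNMP.
    """
    usage_percent = float(volusage) / float(volsize) * 100.0
    if usage_percent > crit:
        return 2, usage_percent
    if usage_percent > warn:
        return 1, usage_percent
    return 0, usage_percent


# Invented sample values: (volsize, volusage) in the same unit.
for size, used in [("1000", "500"), ("1000", "960"), ("1000", "990")]:
    state, percent = classify_usage(size, used)
    print("state=%d usage=%.2f%%" % (state, percent))
```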

Code for Perfometer

#
# Perf-o-meter for Nimble Volume Check
#
# Author: Rohan Fallon
#
# FileName: nimble.py
# Location: ~/local/share/check_mk/web/plugins/perfometer
# Note:  "Service check command" is the key when registering in the perfometers dictionary
#
#

def perfometer_nimble(row, check_command, perf_data):

    used = float(perf_data[0][1])
    warn = float(perf_data[0][3])
    crit = float(perf_data[0][4])
    if used > crit:
        color = "#ff0000"
    elif used > warn:
        color = "#ffff00"
    else:
        color = "#00ff00"

    return "%.0f%%" % used, perfometer_linear(used, color)

perfometers['check_mk-nimble'] = perfometer_nimble




atwemlow45
Occasional Visitor

Re: Nimble storage monitoring with check_mk or snmp

Rick,

We also use Zabbix.

Would you be willing to share your template with me or upload it to Zabbix Share (Zabbix Share - Directory: Recently Added)?

Andrew

rvvliet78110
Valued Contributor

Re: Nimble storage monitoring with check_mk or snmp

Kravotir
Occasional Visitor

Re: Nimble storage monitoring with check_mk or snmp

This really helped us, thanks for sharing!

wezelboy54
Occasional Advisor

Re: Nimble storage monitoring with check_mk or snmp

Thanks for sharing this Rohan!

FWIW, I had to change the usage_percent format strings to "{0:6.2f}" to get it to work with python 2.6.
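For context, Python 2.6 requires an explicit field index inside the braces; the empty `{}` / `{:...}` form only arrived in later versions. A small sketch of the two spellings that work everywhere (the value is just an example):

```python
usage_percent = 97.5

# Explicit field index: accepted by Python 2.6 and by every later version.
explicit = "{0:6.2f}".format(usage_percent)

# Old-style % formatting is another spelling that works on all versions.
percent_style = "%6.2f" % usage_percent

print(explicit + "%")
```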

rohanfallon16
Occasional Visitor

Re: Nimble storage monitoring with check_mk or snmp

Thanks Patrick,

Sorry for mucking you around. I will update my own checks; I was obviously a bit fast and loose when I did the first cut.

wezelboy54
Occasional Advisor

Re: Nimble storage monitoring with check_mk or snmp

No worries. It was time well spent for me.

wezelboy54
Occasional Advisor

Re: Nimble storage monitoring with check_mk or snmp

Hi Rohan-

I made some changes to your check_mk plugin that add more per volume performance metrics.

It seems to work, but I think there may be some kind of undocumented unit mismatch because my numbers appear to be off by at least a factor of 1000.

My question to the nimble developers is...

Is it possible that this MIB entry-

volStatTimeEpochSeconds OBJECT-TYPE
    SYNTAX      Counter64
    MAX-ACCESS  read-only
    STATUS      current
    DESCRIPTION
    "Time at which the sample was taken, measured in seconds since UNIX epoch."
    ::= { volEntry 12 }

should actually be milliseconds instead of seconds?

I was thinking maybe byte counters might actually be kbyte counters, but the rates are low for IOPS as well.

It's also possible that I screwed something up and can't see it.

Here's the code...

-------------------------------------------------------------------------------------

#
# Nimble Volume Check (Supports inventory and performance data)
#
# Author: Rohan Fallon and Patrick Gavin
#
# FileName: nimble_vol
# Location: ~/local/share/check_mk/checks
# Usage:  cmk --checks  nimble_vol -II rgtnimbleprod
#
#

nimble_default_values = (95.0, 98.0)


def inventory_nimble_vol(info):
   # Debug: let's see what the data we get looks like
   # print info
   # return []
   for vol, volsize, volusage, state, connections, stat_time, read_ops, read_bytes, write_ops, write_bytes in info:
       if state == "1":
          yield (vol, nimble_default_values)

def check_nimble_vol(item, params, info):
   # unpack check parameters
   warn, crit = params
   for vol, volsize, volusage, state, connections, stat_time, read_ops, read_bytes, write_ops, write_bytes in info:
      if vol == item:
         if state == "1":
            size_gb = int(volsize) / 1024.0
            usage_gb = int(volusage) / 1024.0
            usage_percent = float(volusage) / float(volsize) * 100.0
            # stat_time is passed to get_rate as reported, i.e. treated as seconds
            read_iops = get_rate("read_ops.%s" % item, int(stat_time), int(read_ops))
            write_iops = get_rate("write_ops.%s" % item, int(stat_time), int(write_ops))
            read_bw = get_rate("read_bytes.%s" % item, int(stat_time), int(read_bytes))
            write_bw = get_rate("write_bytes.%s" % item, int(stat_time), int(write_bytes))
            perfdata = [ ("percent", usage_percent, warn, crit ),
                        ("connections", connections, 0, 0),
                        ("read_iops", read_iops, 0, 0),
                        ("write_iops", write_iops, 0, 0),
                        ("read_bandwidth", read_bw, 0, 0),
                        ("write_bandwidth", write_bw, 0, 0),
                     ]
            if usage_percent > crit:
               return (2, "CRITICAL - Volume online - Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{0:6.2f}".format(usage_percent)+ "%", perfdata )
            elif usage_percent > warn:
               return (1, "WARN - Volume online - Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{0:6.2f}".format(usage_percent)+ "%" , perfdata)
            else:
               return (0, "OK - Volume online - Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{0:6.2f}".format(usage_percent)+ "%", perfdata)
         else:
            return (2, "CRITICAL - Volume %s offline " % vol)
   return (3, "UNKNOWN - Volume not found")


check_info["nimble_vol"] = {
    'check_function'       : check_nimble_vol,
    'inventory_function'   : inventory_nimble_vol,
    'service_description'  : 'Nimble Volume %s',
    'has_perfdata'         : True,
    'snmp_info'            : (".1.3.6.1.4.1.37447.1.2.1" , [ "3", "4", "6", "10", "11", "12", "13", "15", "34", "36" ] )
}

wezelboy54
Occasional Advisor

Re: Nimble storage monitoring with check_mk or snmp

Here's the code that assumes milliseconds instead of seconds. The values it produces are consistent.

#
# Nimble Volume Check (Supports inventory and performance data)
#
# Author: Rohan Fallon and Patrick Gavin
#
# FileName: nimble_vol
# Location: ~/local/share/check_mk/checks
# Usage:  cmk --checks  nimble_vol -II rgtnimbleprod
#
#

nimble_default_values = (95.0, 98.0)


def inventory_nimble_vol(info):
   # Debug: let's see what the data we get looks like
   # print info
   # return []
   for vol, volsize, volusage, state, connections, stat_time, read_ops, read_bytes, write_ops, write_bytes in info:
       if state == "1":
          yield (vol, nimble_default_values)

def check_nimble_vol(item, params, info):
   # unpack check parameters
   warn, crit = params
   for vol, volsize, volusage, state, connections, stat_time, read_ops, read_bytes, write_ops, write_bytes in info:
      if vol == item:
         if state == "1":
            # treat the reported sample time as milliseconds and convert to seconds
            stat_secs = int(stat_time) / 1000
            size_gb = int(volsize) / 1024.0
            usage_gb = int(volusage) / 1024.0
            usage_percent = float(volusage) / float(volsize) * 100.0
            read_iops = get_rate("read_ops.%s" % item, stat_secs, int(read_ops))
            write_iops = get_rate("write_ops.%s" % item, stat_secs, int(write_ops))
            read_bw = get_rate("read_bytes.%s" % item, stat_secs, int(read_bytes))
            write_bw = get_rate("write_bytes.%s" % item, stat_secs, int(write_bytes))
            perfdata = [ ("percent", usage_percent, warn, crit ),
                        ("connections", connections, 0, 0),
                        ("read_iops", read_iops, 0, 0),
                        ("write_iops", write_iops, 0, 0),
                        ("read_bandwidth", read_bw, 0, 0),
                        ("write_bandwidth", write_bw, 0, 0),
                     ]
            if usage_percent > crit:
               return (2, "CRITICAL - Volume online - Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{0:6.2f}".format(usage_percent)+ "%", perfdata )
            elif usage_percent > warn:
               return (1, "WARN - Volume online - Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{0:6.2f}".format(usage_percent)+ "%" , perfdata)
            else:
               return (0, "OK - Volume online - Connections = " + connections  + " - Size = " + str(size_gb) + "Gb - Usage = " + "{0:6.2f}".format(usage_percent)+ "%", perfdata)
         else:
            return (2, "CRITICAL - Volume %s offline " % vol)
   return (3, "UNKNOWN - Volume not found")


check_info["nimble_vol"] = {
    'check_function'       : check_nimble_vol,
    'inventory_function'   : inventory_nimble_vol,
    'service_description'  : 'Nimble Volume %s',
    'has_perfdata'         : True,
    'snmp_info'            : (".1.3.6.1.4.1.37447.1.2.1" , [ "3", "4", "6", "10", "11", "12", "13", "15", "34", "36" ] )
}
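
To illustrate why the seconds-versus-milliseconds question matters, here is a small worked sketch of a counter rate calculation. It is not check_mk's `get_rate`; `rate_per_second` is a hypothetical helper, and the timestamps and counter values are invented. It only shows that reading millisecond timestamps as seconds makes every rate come out 1000 times too low, which matches the "off by a factor of 1000" symptom described earlier in the thread.

```python
def rate_per_second(prev_time, prev_count, cur_time, cur_count, ticks_per_second=1):
    """Counter delta divided by elapsed seconds.

    ticks_per_second says how many timestamp units make one second:
    1 if the timestamps are seconds, 1000 if they are milliseconds.
    """
    elapsed = (cur_time - prev_time) / float(ticks_per_second)
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (cur_count - prev_count) / elapsed


# Invented samples taken 60 real seconds apart, timestamps in milliseconds.
t0, ops0 = 1700000000000, 1000000
t1, ops1 = 1700000060000, 1030000

wrong = rate_per_second(t0, ops0, t1, ops1)                         # read as seconds
right = rate_per_second(t0, ops0, t1, ops1, ticks_per_second=1000)  # read as ms

print("as seconds: %.1f ops/s, as milliseconds: %.1f ops/s" % (wrong, right))
```

Whether the array really reports milliseconds is still an open question here; the sketch only shows the arithmetic behind the symptom.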