Insight Control for Linux

extending supermon/mond monitoring in ICE-Linux (or NRPE)

craig mandelblit
Occasional Contributor

I'm new to ICE-Linux but have some Nagios experience.

How do I extend mon monitoring to perform additional threshold checks? I saw a warning not to use ssh or NRPE, which are the standard Nagios solutions.

For example, I want the functionality provided by check_procs, where I can monitor for processes that are using too much virtual memory (--metric=VSZ).

Is there documentation on supermon? Does collectl play into this?
2 REPLIES
Donna Firkser
Regular Advisor

Re: extending supermon/mond monitoring in ICE-Linux (or NRPE)

Collectl is not integrated with IC-Linux so collectl won't help here.

If you run "shownode metrics" on your CMS and invoke it with the available options, do you see the information you're looking for?

e.g.
[root@mercury ~]# shownode metrics mem
Timestamp           |Node   |Total   |Free   |Buffer |Shared |TotalHigh |TotalFree |Cached
-------------------------------------------------------------------------------------------
2010-05-27 11:17:17 |icelx1 |4046316 |236528 |161460 |0      |0         |0         |2315320
2010-05-27 11:17:17 |icelx2 |3922876 |159148 |133552 |0      |0         |0         |2220792
2010-05-27 11:17:17 |icelx4 |2054292 |113736 |143084 |0      |0         |0         |682600
2010-05-27 11:17:17 |icelx5 |1021932 |243328 |130168 |0      |0         |0         |425020
2010-05-24 17:05:09 |icelx7 |503784  |379124 |11084  |0      |0         |0         |51948

You can also run /opt/hptc/supermon/usr/bin/get_stats on your CMS. This shows all the metrics collected by supermon.

Currently there is no way to extend/customize the data collected by Nagios/supermon.

If you need to collect additional data you will need to create a new Nagios plug-in.
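
As a rough illustration of what such a plug-in could look like, here is a minimal sketch for the per-process virtual memory case. It is not part of ICE-Linux: the function names (classify_vsz, max_vsz, check_vsz) and the thresholds are hypothetical, and it assumes a ps that accepts -eo vsz=,comm= (as procps on Linux does). It follows the usual Nagios plug-in convention of exit codes 0=OK, 1=WARNING, 2=CRITICAL.

```shell
#!/bin/sh
# Hypothetical sketch, not shipped with ICE-Linux or Nagios.
# Reports the largest VSZ (in KiB) among processes with a given command name.

# Map one VSZ value to a Nagios state, given warning and critical limits.
classify_vsz() {
    vsz=$1; warn=$2; crit=$3
    if [ "$vsz" -ge "$crit" ]; then
        echo CRITICAL; return 2
    elif [ "$vsz" -ge "$warn" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

# Print the largest VSZ among processes whose command name equals $1.
# Prints 0 when no process matches, which this sketch treats as OK.
max_vsz() {
    ps -eo vsz=,comm= |
        awk -v name="$1" '$2 == name && $1 > max { max = $1 } END { print max + 0 }'
}

# Emit one status line with performance data and return the matching exit code.
check_vsz() {
    name=$1; warn=$2; crit=$3
    vsz=$(max_vsz "$name")
    rc=0
    state=$(classify_vsz "$vsz" "$warn" "$crit") || rc=$?
    echo "$state - $name max VSZ $vsz KiB | vsz=${vsz}KB;${warn};${crit}"
    return $rc
}

# Run as a plug-in only when called with: <command name> <warn KiB> <crit KiB>
if [ "$#" -eq 3 ]; then
    check_vsz "$@"
    exit $?
fi
```

Saved under a name of your choosing and called with, for example, httpd 500000 1000000, it would warn once any httpd process reaches 500000 KiB of virtual memory and go critical at 1000000 KiB.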

Does this help?

And where did you see the warning about not using NRPE or SSH?

Thanks,
Donna
craig mandelblit
Occasional Contributor

Re: extending supermon/mond monitoring in ICE-Linux (or NRPE)

Thanks for getting back to me. The supermon data (shownode metrics mem) is great for overall memory stats on the blade.

I was looking for data on particular processes. I am trying to customize our monitoring/ICE install and this was the first case.

In a non-ICE Nagios install I have this example in place (called by check_by_ssh):

check_ps.sh -p http -w 2 -c 3 -t mem

OK - Process: http, User: backup, CPU: 0.0%, RAM: 0.0%, Start: 12:14, CPU Time: 0 min | 'cpu'=0.0 'memory'=0.0 'cputime'=0 ]

So I can modify nrpe_local.cfg and add a line like:

command[check_http_process]=/opt/hptc/nagios/libexec/check_ps.sh -p http -w 2 -c 3 -t mem

but is that the best way to do it?