ESX disk monitoring with SIM

 
Superfly.nn
Advisor

ESX disk monitoring with SIM

Hi all,

I would like to monitor all storage (local and SAN) attached to my VMware ESX (3.5 U1) hosts using HP SIM (Build version: C.05.01.00.02). I have configured snmpd.conf and tested it with some traps. So far so good. I have set the disk thresholds at 95% for critical and 85% for major, and I have set up an Automated Event Handling email alert so the admin team gets threshold breach alerts. However, the emails are never sent. I have loaded a datastore to 96% and nothing happened.
How can I get this working, and how can I check how often the agent polls the server for alerts?

Thanks,
Niall.
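
For reference, the two-level threshold described above can be written as a small Python sketch. The function name `classify_usage` is made up for illustration, and treating a value equal to the threshold as a breach is an assumption; the HP agent may compare differently.

```python
def classify_usage(used_pct, major=85.0, critical=95.0):
    """Map a disk usage percentage to a severity label.

    Defaults mirror the 85% major / 95% critical thresholds from the post.
    Assumption: a value equal to a threshold counts as a breach.
    """
    if used_pct >= critical:
        return "critical"
    if used_pct >= major:
        return "major"
    return "normal"


def describe(used_pct, **thresholds):
    """Build a one-line summary suitable for an email subject or log line."""
    return f"{used_pct}% used -> {classify_usage(used_pct, **thresholds)}"
```

With the defaults, `describe(96)` reports a critical breach, which is the alert the test datastore at 96% was expected to raise.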
18 REPLIES
David Claypool
Honored Contributor

Re: ESX disk monitoring with SIM

Track the chain:

You say the email doesn't get sent, but is there in fact an event that HP SIM has received?

Did a trap get generated? There should be an entry on the ESX host in /var/log/messages corresponding to cciss and one corresponding to cmaidad if it's for local storage.

What is the 'datastore' that you loaded up? Do you see the size at 96% on the System Management Homepage?

Default frequency is 1 minute. See http://www.hp.com/go/proliantlinux --> More Linux Documentation --> "Managing ProLiant Servers with Linux" for information on setting the polling interval. (The ESX agents are based on the Linux agents)
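
A minimal Python sketch of the log check suggested above: it looks for lines mentioning cciss or cmaidad in a copy of /var/log/messages. The helper names are hypothetical, the exact wording of the agent's log entries is not assumed, and the script is meant to be run wherever a suitable Python interpreter and a copy of the log are available.

```python
def find_agent_entries(lines, keywords=("cciss", "cmaidad")):
    """Group log lines by the keyword they mention (case-insensitive)."""
    hits = {key: [] for key in keywords}
    for line in lines:
        lowered = line.lower()
        for key in keywords:
            if key in lowered:
                hits[key].append(line.rstrip("\n"))
    return hits


def scan_file(path="/var/log/messages"):
    """Read a log file and return the matching lines per keyword."""
    with open(path, errors="replace") as handle:
        return find_agent_entries(handle)
```

If `scan_file()` returns empty lists for both keywords around the time the threshold was crossed, no trap was logged on the host, and the problem sits before HP SIM in the chain.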
marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

niall,

the agents on the esx server should send traps back to hp sim when they occur, the auto event handler then picks them up and actions whatever.
are you getting an event appearing in sim for the threshold breach ? if so, have you checked the last run time on your auto task ? if it has run when or after the event appeared in sim, then check your e-mail settings.
assume snmp strings are configured ok in smh on esx and in sim.


hope this helps

Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Claypool

We don't appear to have an event for HP SIM to respond to.

I have checked the logs and I see no snmp entry corresponding to the disk threshold in /var/log/messages with cciss, as you highlighted.

The datastore is a normal LUN that is presented as datastore_7. I have had to remove some data from that datastore which is now at 76% capacity. So I have moved the thresholds to 60% for major and 70% for critical on that LUN only, using the SMH webpage.

So if the ESX host is not creating the log for the event then it is likely to be a problem with the Agent or snmp config???

I've attached a graphic of the LUN and the thresholds.

Thanks for the help.
marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

niall,

in smh settings is the data source set to snmp ? if not set that and use the test indication to send a trap back to sim.

Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Hi Mark,

The ESX servers have recently been rebooted and we received email notification from the specific auto event handler that I set up (all major and critical events to be emailed to my team's email distro list), so the email is OK and the snmp strings config seems OK. But the specific disk threshold breaches are not getting emailed to us.

Forgot to mention that the LUN (datastore_7) we are using for this test is FC SAN based.

To give an overall idea of what we are trying to achieve:
We have a disk monitor script from the URL below which emails the team an overall disk report first thing every morning:
http://www.yellow-bricks.com/2008/01/21/checking-the-diskspace-on-your-vmfs-volumes/#

But we need to be alerted by email to disk capacity changes during business hours.
marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

niall,

that's not necessarily true, ping from the default availability task would have brought up a system unavailable event.

Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Mark,

I've opened smh -> Settings but I don't have an option to change the source. I assume it is snmp already as we have an snmpd.conf configured. The snmp config is attached. Maybe you can spot something incorrect in the configuration.

Your help is appreciated and points are being applied.
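
As a rough aid for reviewing an snmpd.conf like the one attached, here is a Python sketch that lists the trap destinations and community directives in the file. It only looks at the standard Net-SNMP directives `trapsink`, `trapcommunity`, `rocommunity` and `rwcommunity`; the function names and the SIM host name used in the usage note are placeholders, not values from this thread.

```python
def summarize_snmpd_conf(text):
    """Collect the arguments of trap and community directives from snmpd.conf text."""
    summary = {"trapsink": [], "trapcommunity": [], "rocommunity": [], "rwcommunity": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        directive = parts[0].lower()
        if directive in summary:
            summary[directive].append(parts[1:])
    return summary


def sends_traps_to(summary, sim_host):
    """True if any trapsink line names sim_host as its destination."""
    return any(args and args[0] == sim_host for args in summary["trapsink"])
```

Calling `sends_traps_to(summarize_snmpd_conf(text), "sim.example.com")` with the real SIM server name shows whether the host is configured to send traps to that server at all. It says nothing about whether the storage agent actually generates a trap for the threshold.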
marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

niall,

the data source defaults to wbem as far as i know.

Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Mark,

I have no option to change the data source in the settings page of SMH for the server in question.
I have never seen an option to change the data source for ESX host agents and have never read about this in the doco from HP.
Attached is a pic of what I do see.
Do I have to edit a config file to change the default data source to snmp? Are you referring to a feature in Windows based agents?
David Claypool
Honored Contributor

Re: ESX disk monitoring with SIM

Mark, WBEM is not available for ESX hosts. The data source option is only for Windows hosts with the optional WBEM (WMI) providers loaded.
Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Mark,

I'm pretty sure that WBEM default data source is only a Windows feature as per the following:
https://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1285577

Keep the ideas coming though as this is the most info/feedback I have been able to get on this issue.

Any thoughts on the snmp config I posted?
Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Also,

I am not using VMM - only the HP ESX agent (Version 8.1.1)
marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

david,niall,

didn't know that about the esx agents, thanks.

niall,

the snmp config looks fine, the traps you sent were they from smh ? and received ok on sim ?

marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

niall,

also check event filter settings in hp sim just in case.

Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

Mark,

We sent test traps from SMH and we received emails in relation to them. When I check the events in SIM server we are able to see the test traps listed.
There are no event filters setup in HP SIM.
I'm starting to think the storage agent on the host is the source of the problem. Possibly a firmware upgrade for the HBAs and the local SCSI controller is needed. I know it's one of the first things HP support will ask me to do if I call them.
marsh_1
Honored Contributor

Re: ESX disk monitoring with SIM

niall,

agreed , they always do - latest firmware latest patch :-). apart from running event configuration from sim to the esx server there really doesn't seem to be much left it could be or to check !
Olivier Masse
Honored Contributor

Re: ESX disk monitoring with SIM

I've personally run into a few bugs with the SIM ESX agents which I won't detail here. The ESX agents are ported from the Linux agents, so maybe they know how to monitor ext3/reiserfs/xfs/etc but don't know about vmfs at all.

Yet assuming you eventually make your disk monitoring work, I wouldn't trust it anyway... you'll have to test it each time you update SIM, the agents or ESX itself.

If you feel up to it, you can save yourself some worries and try to run from SIM a script that uses the VI toolkit to report the space. For example, there is a sample "diskfree" perl script available. See here: http://communities.vmware.com/thread/137364

In my case I have a job that does an SSH to the service console to run a homemade shell script to monitor my vmfs filesystems but it's not a long term solution. In the future (ESXi) we'll all have to use the toolkit anyway.

Good luck
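
A minimal Python sketch of the script-based approach described above, assuming the job already has df-style text in hand (a header row, then one row per filesystem with a "NN%" column followed by the mount point). How that text is collected (SSH, the VI toolkit, or something else) is left out, and the function names are made up for illustration.

```python
def parse_df_output(text):
    """Return {mount_point: percent_used} from df-style output.

    Assumes the first line is a header and that each data row contains
    a token like '96%' immediately followed by the mount point.
    """
    usage = {}
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        for idx, token in enumerate(parts):
            if token.endswith("%") and token[:-1].isdigit():
                mount = " ".join(parts[idx + 1:])
                if mount:
                    usage[mount] = int(token[:-1])
                break
    return usage


def over_limit(usage, limit_pct):
    """List mount points whose usage is at or above limit_pct, sorted by name."""
    return sorted(mount for mount, pct in usage.items() if pct >= limit_pct)
```

The list returned by `over_limit` could then be mailed out or turned into an event by whatever job runs the script. Unlike the agent thresholds, this depends only on the text the command prints.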
Superfly.nn
Advisor

Re: ESX disk monitoring with SIM

While we did not fix the problem head on, we did manage to get around it by moving all our ESX hosts to an alternative SIM server.
We uninstalled and reinstalled the agents (8.11) on each ESX host and set up a trust. So far we have managed to set disk thresholds and are getting the alerts as expected. So our problem was either with the SIM server or with the agent that was installed on each host. My guess is the SIM server.