Server Management - Systems Insight Manager

HP Insight Manager Email Alerts

 
snaus
Occasional Visitor

Hi Guys

Something which I hoped would be fairly simple to implement has turned out to be a source of incredible frustration, and I hope to get your help.

I have installed and configured HP Insight Manager in the hope that it could provide me with email alerts if any hardware fails on ProLiant servers, e.g. a disk drive. Too often, failed disks are only noticed when someone walks past the physical hardware. With more and more servers located in data centres, this is obviously an issue.

I have followed the installation guide and installed HP SIM on a server and it performed a full discovery of the local network and added the local servers. I created a new Automatic Event Handling Task and chose the Important Proliant Events collection. I selected the "All Servers" system collection and then in the following step configured the email addresses for sending and receiving the email and confirmed the correct SMTP server settings. I then confirmed that the following categories are part of the monitored events:

Event category is Proliant Storage Events and subcategory is Logical Drive status
Event category is Proliant Storage Events and subcategory is Physical Drive Status

Now, I know that SMTP sending is working ok because I am receiving email notifications when systems are turned on or turned off. I even get email notifications when I log on or off HP SIM. I can take or leave those alerts, but ironically the really important messages relating to hardware failures do not seem to work.

I have simulated disk failures by pulling out disks in some of our servers and nothing happens. The HP agents on the server know about the event, and this is reflected on the HP Systems Management Homepage on the server itself, but HP SIM sees nothing: no Critical or Major event is logged and consequently no email is sent. The only Critical events logged are ones relating to "Systems being unreachable". I have confirmed that the "Hardware Status Polling for Servers" task is enabled and configured to run every five minutes.

Can somebody please explain the exact steps to follow to get this kind of alerting going or whether I am missing something really obvious?

Many thanks
JM

2 REPLIES
snaus
Occasional Visitor

Re: HP Insight Manager Email Alerts

Both ESX and XenServer SIM / SNMP agents are installed and functioning. I had a look, and unfortunately I do not think either Nagios or Splunk can easily tap into the HP SNMP agents, specifically those related to the disk / storage system.

This is a bit frustrating because the whole point of HP SIM is to provide you with hardware monitoring / alerting capabilities and I am just curious if anybody with HP servers out there is actually using this product?

FYI, I have again done a drive failure simulation and have some screenshots to share which may provide a clue as to what may be wrong. 

The following screenshot clearly shows the failed disk:

 

HP-SYS-HOMEPAGE.JPG

 

Yet in HP SIM these are the only events listed for the server in question:

HP-SIM-EVENTS.JPG

 

As far as alerts are concerned, as you can see the following screenshot seems to indicate that the automatic alerts have not worked since earlier in the month. Not that it would have made any difference because HP SIM did not detect any critical events anyway:

 

HP-EVENT-HANDLING.JPG

 

So I am sure I am missing something obvious somewhere but just can't put my finger on it. Hence I would really appreciate it if anybody out there can provide me with just a few simple bullet-point steps on how to configure alerting for disk failures. I am not even bothered about other hardware failures for now; once the disk alerts are happening I think the rest will follow.

Thanks again
JM

jim goodman
Trusted Contributor

Re: HP Insight Manager Email Alerts

There are a lot of things to consider, but the first is: set up a task with no filters at all, then pull a drive and see if you get the alert. Also consider that there may be a problem in the SNMP configuration at either end. Since the task has not run at all for a while, perhaps it is the luck of the draw, or something changed in the environment.

 

Does the SMH reflect the failed drive status?

 

Does SIM HW status reflect the status of the SMH correctly?

 

Some things to consider as you try to chase it down. I would think an HDD failure, an Array Status Change, and the like would be Major events that would trigger in any HP-managed environment.
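
One rough way to check the SNMP side is to see whether any trap datagram reaches the management host at all when a drive is pulled. The sketch below is only an illustration, not anything HP SIM provides: it waits for a single UDP datagram on the standard SNMP trap port (162) and reports who sent it, without decoding the trap. The function name `wait_for_datagram` and its parameters are made up for this example. It assumes the port is free, so on a host where a trap receiver is already listening you would need to stop that service first, or run this on a spare machine added as an extra trap destination. Binding port 162 normally needs administrator rights.

```python
import socket


def wait_for_datagram(port=162, timeout=300.0, host="0.0.0.0"):
    """Wait for one UDP datagram on (host, port).

    Returns (sender_ip, byte_count), or None if nothing arrives
    within `timeout` seconds. The payload is not parsed; this only
    shows that something was delivered to the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        sock.settimeout(timeout)
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return sender_ip, len(data)


# Example use (start it, then pull the drive):
#   result = wait_for_datagram()
#   print("nothing received" if result is None else result)
```

If nothing arrives within the timeout, the trap destination or community settings on the managed server, or a firewall in between, would be the first things to look at. If a datagram does arrive from the server's address, the problem is more likely in how SIM receives or filters the event.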