Systems Insight Manager

Automatic Event Handling no e-mails

Bob Zabaga
Advisor

Automatic Event Handling no e-mails

I can't figure out why I don't get e-mail alerts for systems that have major or critical events (or even minor ones). I have a few servers with problems, and the hardware problems (and their severity) are displayed correctly in SIM, but I never get e-mails about them. If I go to "Options -> Events -> Automatic Event Handling" I can manage my automated task, check my e-mail settings and successfully send myself a test e-mail. So it appears that both pieces work: my servers' status is reported to SIM correctly and the test e-mail works. But I never get e-mails when a problem occurs, and I can't figure out why. The default "hardware polling" on all servers is 5 minutes...and as I said, SIM has no problem obtaining the status of my servers. Any suggestions would be welcome! Thanks, BZ
36 REPLIES
Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Hello,

 

What you probably forgot is to configure your servers to send SNMP or WBEM events in case of an error such as a broken disk.

 

What you see is that polling of the servers shows that something is broken, but polling is not what triggers e-mails in SIM. You only get e-mails when a trap for an error is received. What type of servers do you have, and do you use SNMP or WBEM? You can select your servers and run a repair to set a WBEM subscription or configure the SNMP settings. If you need more pointers, let me know.

Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Thanks for replying. I inherited this configuration recently; I didn't originally set it up. But it looks completely set up to me. Some servers are monitored over SNMP and some over WBEM (some both, too). The server I'm using as an example has a memory issue and a bad hard drive. WBEM monitoring shows that, but I don't get an e-mail. Now I just pulled a 2nd redundant power cable on a spare server and I got that alert....hmmm...that server is only monitored over SNMP. I think I'd prefer WBEM; there's more detailed information. Anyway, I don't understand why I wouldn't get the e-mail if the HP SIM server can tell I have a memory & disk problem. The servers are anywhere from ProLiant G5s to ProLiant G9s. I can see that in a case where WBEM doesn't obtain ANY server information at all, it could be the age of the HP software/agent on it. But I'm confused, since WBEM finds my bad memory & drive but never sends an alert. What can I check to resolve that?

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Hello, if you restart your agent, the event should be resent if a subscription is set. If you are not sure, select your server, go to the menu Configure > Manage Communications and wait. Next you will get a screen that tells you whether SIM can identify the server and whether it can receive events from it. If you have a lot of servers that use WBEM, you should create an automatic job that creates these subscriptions once a week. Go to the menu option Options > Events > Subscribe to WBEM Events and select all servers as the destination. Select Next and then Schedule, and set this to Periodically to run once a week on any day. This way any newly added servers get a subscription within a week.
Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Hi. Those areas you listed check out fine - that's not where my problem lies. I can successfully collect server information via SNMP and/or WBEM. I don't get e-mails when alerts occur. That's what I can't figure out. -Bob
Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

I know what you mean,

 

Polling over SNMP or WBEM (pull) is a whole other thing than pushing events using SNMP or WBEM. The pushed events (traps) generate events in SIM. Do you see events under your server(s) or under All Events?

 

If the status changes, it changes because of polling; events occur when traps are received. You can test this by sending a test trap from one of your servers. You should get an event if the traps are configured correctly.
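For anyone who wants to confirm that a test trap actually leaves the server and reaches the network, here is a minimal sketch of a UDP listener. It is not part of SIM; the function name `wait_for_datagram` is made up for illustration, and it only reports that a datagram arrived without decoding the SNMP payload. The standard SNMP trap port is 162, which usually needs admin rights to bind and will already be in use on the SIM server itself, so this assumes you run it on a spare test host that you temporarily set as the trap destination.

```python
import socket


def wait_for_datagram(port, timeout=30.0, host="0.0.0.0"):
    """Wait for one UDP datagram on (host, port).

    Returns (sender_ip, size_in_bytes), or None if nothing arrives
    before the timeout. The payload is not parsed as SNMP.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        sock.settimeout(timeout)
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return sender_ip, len(data)


# Hypothetical usage on a test host set as the trap destination:
#   result = wait_for_datagram(162, timeout=60.0)
#   print("no trap received" if result is None else result)
```

If this prints a sender address after you send the test trap, the server side is pushing traps and the remaining question is what SIM does with them.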

 

If you do see events, check whether you are using the right event collection for the e-mail forwarding.

Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Thanks for your input. I appreciate it. I'll include an attachment (screenshot).

Under the System Status, I show "health status" for "All systems". I do see correct output here.

It shows 2 with a critical error, 4 major, 4 minor. But only a few of these events have triggered an e-mail alert.

for example, in my screenshot, I got an e-mail when one server had an issue but not the other - same severity too.

As a side note, on an older machine I'm using to test with, I pulled a drive in the array to see the resulting message and it falls under a "minor" event....wow, how is that minor?!? But that's a separate problem altogether. Our e-mails (I think) just send out for critical or major...so a bad/missing drive being 'minor' is really irritating.

 

I'm not so sure how cleared/uncleared/deleted events affect the e-mailing either, i.e. if I clear or delete a warning and the problem still exists, will I get further warnings and e-mails too? I figure if the problem still exists, I'll keep getting trap messages. I believe I understand what you're saying about the SNMP traps....in that each server has to send them to my SIM server, as opposed to SIM polling the machine to discover a fault - is that right? All of that does seem configured correctly here. SNMP settings on each monitored server look good.

 

Because the servers are of varying age, they may or may not have completely updated HP management software/drivers/wbem on them.  So it may be better to stick with SNMP. I know all have SNMP configured/working.

 

Well anyway, to some degree I have to learn more about SIM - - - I'm plodding through the super long PDF documentation. Since this was all set up by a predecessor, it's a little cumbersome to go through. I'd call HP tech support, but I don't think they'd help me, based on my past experience of using their servers & 'support' for 10+ years.

 

I'll have to check YouTube to see if anybody has posted good tidbits for configuring/troubleshooting.

 

Thanks again for your posts.

Bob

 

 

 

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Let's start from the start.

 

Do you see the Customize link at the top left? Click on it and select the events as well. Now you see both the hardware status and the received events. Also look at the green ES status on the server. This is the event status for your server; it changes if there is an uncleared event. Please add a screenshot of the Events for both servers. When I create a rule that sends e-mail, I also select the option to clear the event. For test purposes you can use the All Events collection for the forwarding. You should then get a lot of e-mails. The collection you choose dictates which events you get e-mailed. I made my own collection of the events I want. These can be created by hand.
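To illustrate why the chosen collection matters, here is a small sketch of the filtering idea. It is a conceptual model only, not how SIM is implemented; the names `Event`, `IMPORTANT_SEVERITIES` and `events_to_email` are hypothetical, and the severity set assumes a collection limited to critical and major events, as Bob describes for his e-mails.

```python
from dataclasses import dataclass

# Assumption: a collection that only contains critical and major events.
IMPORTANT_SEVERITIES = {"critical", "major"}


@dataclass
class Event:
    system: str
    severity: str
    description: str


def events_to_email(events, severities):
    """Return only the events whose severity is part of the collection."""
    return [e for e in events if e.severity.lower() in severities]


# Hypothetical usage: a pulled drive reported as "minor" is filtered out.
#   received = [Event("server-a", "Minor", "drive removed"),
#               Event("server-b", "Major", "power supply lost")]
#   events_to_email(received, IMPORTANT_SEVERITIES)
```

With a collection like this, an event that arrives with severity "minor" never reaches the e-mail action, no matter how serious the underlying fault looks.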

Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Okay. See screenshot.

 

Both of those systems show the "Health Status" on the main SYSTEM tab with an error. My screenshot shows that the one server had a network card problem. I included the EVENTS tab too on the same screenshot - but there it shows no error. So that's confusing me. Why is there conflicting information? I think that error is just because that server has multiple NICs and only 2 are connected. * I think *

 

Same situation with server #2 - so I didn't take a screenshot: SYSTEM tab shows error, EVENTS tab is clean. It had the 2nd redundant power supply cord pulled accidentally. It shows an error on the main SYSTEM tab under Health status, but if I pick the Events tab, it is also empty. I got an e-mail when that power cord was pulled and the alert triggered. The power cord was put back in shortly after - I *guess* you don't get an e-mail when a problem is resolved?? That would be nice but I'm not sure that's a feature.

 

Thanks for your thoughts. I really appreciate it!!!

Bob

 

 

 

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

I'll get back to you tomorrow, local time here is 22:40 so talk to you tomorrow!
Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Okay, thanks! I'm 6 hours behind you! ....East Coast U.S....have a good night.

 

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

The info seems conflicting, but as mentioned before you can have a server that has an error in the hardware status and no event. If the event was cleared and the error was not solved, polling shows that the error remains. That is one possibility; the other is that you don't have traps configured at all. If your server uses WBEM, you have to tell the server where to send its events.

So let's start with one server. Select your server in SIM. Go to Options > Events > Subscribe to WBEM Events. Now select Next and then Run Now. Let this job run with an account that has admin rights on the destination. Once this job has finished, you have told the server where to send its events.

Now log into the server that you just configured using the System Management Homepage and go to Settings > Test Indications > Send Test Indications. Select the type of event that you want to test (1, 2 or 3) and push the Next button. This should send an event of the type you just selected. Now look in SIM and click on the ES link for the server you just created the event(s) for. By the way, this should not be green if the test event(s) have already arrived. There you should see a test event that is not cleared. If the event is cleared, there should be an Automatic Event Handling job that has cleared it. After the event is cleared, the ES (Event Status) color should go back to green.

SIM has built-in jobs that delete events older than 90 days. So if the event is deleted and the problem is not fixed, the event is gone but the hardware status stays degraded.

One small note for the near future: Gen 8 and above should be monitored through the iLO and Agentless Management Software. But just for now, to understand how SIM works, let's stick with WBEM monitoring. This is the best way for Gen 7 and below on a Windows or ESXi 5.5 platform.
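The difference between the polled hardware status and the event status can be sketched as a toy model. This is only an illustration of the behaviour described in this post, not SIM's actual internals; the class `ManagedSystem` and all of its fields are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedSystem:
    name: str
    hardware_ok: bool = True        # the real state of the hardware
    polled_status: str = "normal"   # what the last hardware poll saw
    uncleared_events: list = field(default_factory=list)  # filled only by traps

    def poll(self):
        """Pull: the management server asks the system for its status."""
        self.polled_status = "normal" if self.hardware_ok else "degraded"

    def receive_trap(self, description):
        """Push: the system sends an event to the management server."""
        self.uncleared_events.append(description)

    def clear_events(self):
        self.uncleared_events.clear()

    @property
    def event_status(self):
        return "alert" if self.uncleared_events else "normal"


def demo():
    """A fault with no trap configured: polling sees it, the event list does not."""
    server = ManagedSystem("example-server")
    server.hardware_ok = False
    server.poll()
    return server.polled_status, server.event_status
```

In this model `demo()` returns a degraded hardware status next to a normal event status, which is the same "conflicting" picture as a SYSTEM tab with an error and an empty EVENTS tab. Since e-mail rules act on events, no trap means no e-mail.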
Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Hello.  Okay, I will investigate and get back to you. Thanks for your suggestions.

In the meantime, I can tell you I made some amount of progress overnight. I reconfigured SNMP on one old test server (where I can pull a drive). Then I restarted SNMP and the dependent HP services on that server. NOW - that machine shows both the memory and drive errors on SIM's main status page AND on the EVENTS tab for that server. That is new; it didn't work before. BUT - no e-mail was sent. I'll attach 2 screenshots. I believe I'm monitoring this server with both SNMP and WBEM.

 

So let me get to your e-mail above and try that stuff and I'll let you know what the results are. 

 

take care,

Bob

 

 

 

Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Hi again.  Aside from my prior post, I went ahead and ran your test.

From the HP System Mgmt Homepage on my test server "eddydev03" I sent 2 warnings.

Then I went to HP SIM and saw the events had appeared. 

But my original problem remains that I don't get an e-mail - for the test events or the bad memory or pulled drive.

I do get some emails when servers report errors but mostly no. These e-mails come from HP SIM, correct, not from the individual servers I'm monitoring...

 

You can see by my screenshot that my test alert sent from my test server (eddydev03) was picked up by HP SIM.

Just no e-mail.

 

-Bob

 

 

Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Hey, one more thing! I see what you mean when you said Gen8 and Gen9 servers should use HP Agentless management. I read that section in the SIM guide. In iLO, there's an entire SNMP section. My question is: do I use this in place of the standard SNMP service on the server or in conjunction with it?? Having SNMP in my iLO and configured in Windows services seems redundant. So, between iLO SNMP, the regular Windows SNMP service and HP WBEM, should I use all or some combination thereof? Wow, I feel like I should apologize for the stream of e-mails. I owe you at least a tremendous THANKS!
Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Hello again,

Yes, for Gen8 and higher you should use AMS in combination with the AMS iLO settings. If needed I can tell you more about this and how to set it up. For other servers you can use the method we are using at the moment: agents using SNMP or WBEM traps. One thing to keep in mind: use either AMS or the WBEM agent, not both. Using both will give problems such as data collection issues. You can also have problems when you use both WBEM and SNMP agents on the same server. WBEM is preferred, as this is what HP keeps supporting in favour of the older SNMP agents on Windows servers. Gen8 and above have agents for Windows, ESX and Linux. The real advantage of AMS monitoring is that you don't need OS credentials, so it is more secure, as you only need SNMP read rights.

To get back to the e-mails you don't get: look at the Automatic Event Handling tasks and look for an e-mail event handler that was triggered when the alarm was received. Can you see whether that job uses e-mail forwarding and whether all settings seem OK? I always add the clear event action so I can see whether the event was handled or not.

 

Did you create your own e-mail forwarding rule? Please write the settings for this rule in your post. If the e-mail server is configured, you should also be able to send a test e-mail from the e-mail settings.

 

We are almost there ! And don't worry about the number of messages i just like to help as you might have seen in this forum.  

Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Hi Andrew.

 

I get emails sometimes. When I go to Options -> events -> automatic event handling -> "e-mail settings" and pick "send test e-mail" it works every time. I don't see why SIM shows critical and major alerts that I don't get e-mails for...Automatic Event Handling is configured to collect on "Important Events" (which is critical, major)....and set to collect on "All Servers" & send an e-mail with no time constraints. All looks good to me. I even deleted the event handling settings and re-created myself to be sure. But the prior settings looked fine.

 

I thought AMS was just in iLO. So with GEN8 & 9 I have 3 choices? AMS, SNMP, WBEM? I guess I can individually configure servers to use one polling method or another - not sure where to set these individually but I can go thru the documentation. On one of the general config page settings, there's a spot where you configure ALL agent monitors - that's set. But per server, I'm not sure where to do that just yet. I think I do have some that are set for both SNMP and WBEM.

 

If you view my screenshot you'll see what I find extremely confusing. This is just one of many examples. I think I showed this in a prior post. The main HP SIM page shows a major alert on a machine called sgesx01. If I click on the machine in SIM, on the system tab I see the "health status" network alert (orange "!"). On the events tab it doesn't show as an alert. If I go to iLO on that server and examine the NICs, there's no problem. So why is SIM showing that alert on the main SIM page and on the system tab for that machine? I can't use SIM to manage my hardware if I'm getting false alerts or mixed messages like that. Could this be why I'm not getting e-mails for alerts? My assumption is that if SIM alerts me on the main system status page and shows that NIC alert again on that server's system tab, I should get an e-mail. But on the other hand, the same alert isn't shown on the events tab or in iLO on the server itself. So is it a problem? Should I be getting an e-mail? Who knows?!

 

I have a lot of servers I need to monitor, automated as much as possible and it's just not working for me. I'm really discouraged with this product! 

Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Hey again....just adding an example to my last e-mail. See screenshot.

 

On FRI 18th, I added a new server "sgesx15" to be monitored. That seemed to work. I think it's working....?

 

I'm confused because when I view "uncleared events" I see an alert for sgesx15 that says "data collection task could not find any data to collect....target system may not be running snmp, wbem, etc".

 

That target system does have SNMP and WBEM running on it.  And if I click on the server in HP SIM you can see it reports back "Health status" as being monitored by WBEM and shows all hardware in green.

 

you'll see this in the screenshot. For me, that's sending another mixed message....Why would I get that warning on SAT 19th about "no data to collect" yet it seems to be using WBEM just fine?

 

-Bob

 

 

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Let's try to make things clear.

On a Gen 7 and lower you should always use WBEM on Windows servers, SNMP on Linux, and WBEM for ESX 5.5 and above. On a Gen8 or higher you should use the SNMP/AMS setup. Disable any other agents, including the SNMP agents and the SNMP service on the operating system; AMS only works right when all other agents and OS SNMP are disabled. When you configure the iLO, be sure to set the SNMP read and trap community, set the trap destination to the SIM server, and set it to send traps using the hostname instead of the iLO name.

If you get data collection errors, it's probably a Gen8 or higher server using something other than the AMS/iLO setup. Data collection is a job that collects information about the servers in SIM; it has nothing to do with the monitoring functions. The two things SIM does are receiving traps (SNMP or WBEM) and collecting status and hardware inventory data. Where do you want to focus right now? If monitoring is the main thing (I would imagine so), ignore the data collection timeouts for now; we can fix those as soon as the e-mail function is working.

If you look at the left side under the event collections, you should find the collection used for e-mailing. All events found in this collection should trigger an e-mail. I made my own collection for Minor, Major and Critical events.

The screenshot you posted shows a NIC problem on a server. Go to the server's SMH and see what is wrong. The most confusing thing HP has done is create SNMP, WBEM and iLO events that don't always match. So stick with the protocols mentioned at the top and only use one protocol at a time for each server type.
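The protocol advice at the top of this post can be written down as a small lookup, which may help when sorting a mixed inventory. This is just a sketch of the rule of thumb given here, not an official HP support matrix; the function name `recommended_protocol` and its parameters are hypothetical, and ESX versions below 5.5 on older hardware are left undecided because the post does not cover them.

```python
def recommended_protocol(generation, os_family, esx_version=None):
    """Return the single monitoring protocol suggested in this thread.

    generation  -- ProLiant generation as an integer, e.g. 5 for G5
    os_family   -- "windows", "linux" or "esx"
    esx_version -- (major, minor) tuple, only used for ESX hosts
    """
    if generation >= 8:
        # Gen8 and higher: AMS through the iLO, all OS agents disabled.
        return "AMS/iLO SNMP"
    family = os_family.strip().lower()
    if family == "windows":
        return "WBEM"
    if family == "linux":
        return "SNMP"
    if family == "esx":
        if esx_version is not None and tuple(esx_version) >= (5, 5):
            return "WBEM"
        return None  # not covered by the advice in this post
    raise ValueError(f"unknown OS family: {os_family!r}")


# Hypothetical usage:
#   recommended_protocol(9, "windows")        # a G9 Windows server
#   recommended_protocol(6, "esx", (5, 0))    # an older ESXi 5.0 host
```

Whatever the result, the point of the post stands: pick one protocol per server and disable the others.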
Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Okay, thanks, I'll get back to you. That machine with the NIC warning is an ESXi 5.0 host. I'm either monitoring ESXi 5.0 hosts or Windows 2003/2008R2 machines.....G9 down to G5 maybe. I'd love to see those ESXi hosts upgraded to at least 5.5 but that's out of my control.
Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Hello Bob, NIC events are usually caused by NICs that are enabled but have no cable. So the adapter is fine, but SIM sees the disconnected NIC as a major error. If the ESX server has an SNMP agent, you should still have the SMH, where you can check the status. If the server uses the later 5.5 WBEM agents, you no longer have an SMH. In your case you should still be able to check the SMH for the error. Be sure to clear the ESX server before changing NIC settings.
Kind regards,

Andrew
mrigendra
Occasional Visitor

Re: Automatic Event Handling no e-mails

I am facing exactly the same issue in a mixed environment.

 

Keep this thread going, guys; I may get a resolution here.

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

Hello there, the main problem is that there are too many ways to get event and hardware information. All these conflicting ways of monitoring interfere with each other and lead to errors. I will try to explain where to use which protocol; always stick with only one protocol per server. I must say that with Gen8 and above with AMS things are getting better, but then you need to remove all other protocols on the OS. All communication goes through the iLO. This is a great improvement over the root and admin credentials needed in the past. Don't forget to set your credentials in the discovery jobs instead of in the global settings. If you don't, the credentials you deleted for a Gen8 or higher server will get added again and will lead to data collection errors and conflicting events/statuses.
Kind regards,

Andrew
Bob Zabaga
Advisor

Re: Automatic Event Handling no e-mails

Okay, thanks. I have successfully changed over a few G9s to use AMS/iLO SNMP. I had to do "configure -> configure or repair agents" then pick the AMS option at the top. I also had to set up the credentials for the iLO. I'd like to do this globally but that didn't work for me. After setting up and running my repair agents task, it failed with bad creds. So I entered the credentials in that specific system's iLO settings. Then it worked. Not a big deal at the moment...that's the least of my problems.

 

Hey - on the G9 server's iLO management screen, next to the SNMP, there's an Alertmail tab. See screenshot. I don't think I need to do this - because HP SIM is sending the alert e-mail when it gets an snmp trap. If I set up email on iLO, I'd probably get double the alerts...I assume if you don't have HP SIM, you'd then use that to send your alerts. Does that sound right?

Andrew_Haak
Honored Contributor

Re: Automatic Event Handling no e-mails

 Hello Bob,

 

On the second part, you are correct: I would only set the SNMP part. That is enough. For the iLO part, you can use PowerShell for mass configuration of the iLOs; look here for more info: http://h20566.www2.hpe.com/hpsc/swd/public/readIndex?sp4ts.oid=5440658&lang=en&cc=us

Kind regards,

Andrew