Server Management - Systems Insight Manager

Hardware Status Polling Question

 
Advisor

By default, the Hardware Status Polling task in SIM checks every 5 minutes to see whether a server/device is up. With the default settings, we get alerted a LOT more than we want: every time a server is rebooted, there's a decent chance we'll get an alert. Ideally, I'd like to configure it to check a second time before sending out an alert.

I can't find a way to configure exactly that, but I got it to sort of behave that way by changing the Global Protocol Settings for ping to a timeout of 60 seconds and 5 retries. This seems to behave as we would like. The downside is that the automated Discovery also uses these settings. Since we inherited a messy network, we've got a range of almost 9000 IP addresses that it has to check, using the same 60 second timeout and 5 retries. That means it takes around 450 hours for the Discovery to finish. Not a practical situation.

I've tried changing the timeout and retries on just the Hardware Status Polling task, but it seems to ignore those settings for some reason. With them set, it still whipped through the polling in less than a minute, even with a test server powered off (and, of course, it sent out an alert right away).

Does anyone know of a way to get this alerting/polling behavior without negatively impacting the Discovery?
3 REPLIES
Honored Contributor

Re: Hardware Status Polling Question

The best thing to do is to use Suspend Monitoring when you are planning on rebooting a server. That will suppress Automatic Event Handling during the time it is suspended.
Advisor

Re: Hardware Status Polling Question

Yeah, I hear ya. I'm part of a team that doesn't do that reliably, though. And this isn't limited to reboots. There can be odd hiccups on the network that make a ping fail even though the server is fine.

I'm trying to make SIM a reliable system for alerting us to issues. As it is running currently, we get so many alerts that everyone on the team just ignores them.
Honored Contributor

Re: Hardware Status Polling Question

I did a quick search and couldn't find the original post, but our old friend Rob Buxton approached this problem with a custom script that Automatic Event Handling calls for notification instead of sending an email. His script starts a counter when an offline status poll is encountered, and fires off an email using a CLI email tool only once the device has remained offline for x counts. It's worth searching the forum for his post.
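
To give an idea of the counting logic, here is a minimal Python sketch. It is not Rob's actual script (which I couldn't find); the function names, the JSON state file, and the default threshold of 2 are all made up for illustration. How the script gets told about each poll result, and how it learns the device is back up, depends on how the event handling task is set up, so that part is left out, as is the call to the CLI email tool.

```python
import json
from pathlib import Path


def load_counts(path):
    """Return the saved {device: consecutive_offline_polls} map, or {} if none."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text())


def save_counts(path, counts):
    """Persist the counters so they survive between script invocations."""
    Path(path).write_text(json.dumps(counts))


def record_poll(counts, device, is_offline, threshold=2):
    """Update the counter for one device and report whether to notify.

    Hypothetical helper: returns True only on the poll where the device
    reaches `threshold` consecutive offline results, so a sustained
    outage produces a single alert rather than one per poll.
    """
    if not is_offline:
        # Device answered: forget any earlier failures.
        counts.pop(device, None)
        return False
    counts[device] = counts.get(device, 0) + 1
    return counts[device] == threshold
```

The wrapper script would call load_counts, then record_poll with the device name it was handed, then save_counts, and run the email tool only when record_poll returns True. With a threshold of 2 and the default 5 minute polling interval, a single missed ping or a quick reboot would not generate an email.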