Server Management - Systems Insight Manager

Rob Buxton
Honored Contributor

HPSIM 4.2 Polling Sensitivity Question

Hi All,
I updated our HPSIM 4.1 SP1 to 4.2 on Friday. All seemed well.
This morning I've about a hundred Server outage notifications.
Nothing untoward has happened, so my question is: is HPSIM 4.2 more sensitive?
Just a note: I split the Hardware Polling Task for Servers. I removed Ping from the default task and added a new Server Polling task that pings every minute.
But, it was Servers and Non-Servers that were reported as being non-contactable.
Jon Ward
Trusted Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

Have you tried to increase the timeout and retry values for SNMP within the Global Protocol Settings?

Was anything else changed on the OS level during the hp Systems Insight Manager 4.2 installation?
Rob Buxton
Honored Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

Jon,
I think I did make changes to the SNMP Timeout values under 4.1. Will these have been overridden by the upgrade to 4.2?
No, there were no other OS changes to any part of the infrastructure. And it was a reasonably typical weekend.
The Devices reported as being unavailable were both local and over the WAN.
Rob Buxton
Honored Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

I've further reviewed things. The Outages are, I think, being picked up by my Ping Only Polling Task that runs every minute. Certainly they're flagged as okay a minute later.
The settings for this are:
Default 4 Sec. Timeout with Custom 2 Retries.
This was increased to prevent issues with the WAN Sites.
The only other change has been enabling the WBEM Interface, with associated accounts, to allow a test of Vulnerability & Patch Management.
But I cannot see how that would upset the Ping Task.
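For reference, here is a minimal Python sketch of what a timeout-plus-retries setting is normally expected to mean: a device should only be flagged as down when the first attempt and every retry have all failed. This is not HP SIM's code. The names `poll_device`, `worst_case_seconds` and `make_flaky_probe` are hypothetical, and treating "2 retries" as three attempts in total is an assumption.

```python
from typing import Callable


def poll_device(probe: Callable[[float], bool], timeout_s: float, retries: int) -> bool:
    """Return True if any attempt succeeds.

    Assumption: "retries" means extra attempts after the first one,
    so retries=2 gives up to three attempts in total.
    """
    for _attempt in range(1 + retries):
        if probe(timeout_s):
            return True
    return False


def worst_case_seconds(timeout_s: float, retries: int) -> float:
    """Longest time one poll can take before the device is declared down."""
    return timeout_s * (1 + retries)


def make_flaky_probe(failures_before_success: int) -> Callable[[float], bool]:
    """Hypothetical stand-in for a ping: fails N times, then succeeds."""
    remaining = [failures_before_success]

    def probe(timeout_s: float) -> bool:
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return probe


if __name__ == "__main__":
    # Two lost pings followed by a reply: still "up" with 2 retries.
    print(poll_device(make_flaky_probe(2), timeout_s=4.0, retries=2))
    # The same device with no retries would be reported as an outage.
    print(poll_device(make_flaky_probe(2), timeout_s=4.0, retries=0))
    print(worst_case_seconds(4.0, 2))
```

Under these assumptions, a 4 second timeout with 2 retries gives a device up to 12 seconds to answer. If the configured values were being ignored in favour of shorter defaults, a slow WAN link could be flagged as an outage even though a retry would have succeeded.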

Re: HPSIM 4.2 Polling Sensitivity Question

We have seen the same thing after upgrading this weekend.

Our ping settings are set to
Default Timeout: 15 seconds
Retries: 2

Most of our systems are being polled over the WAN. But we haven't seen this before the upgrade.
Michel van Verk
Occasional Advisor

Re: HPSIM 4.2 Polling Sensitivity Question

Same situation here. We have about 180 servers in HPSIM and it seems that some of the servers are more sensitive to these fake outage alerts than others. I have my time-out on 30 sec and 3 retries, both on SNMP and the polling tasks, but HPSIM even reports switches in the LAN as down. When I ping the equipment myself in a separate window, there are no time-outs.

It does indeed seem that HPSIM 4.2 is a lot more sensitive, in fact oversensitive, to outages. Maybe there is a bug in the software that causes the system to report an outage even when the retries are successful?
David Claypool
Honored Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

For those experiencing this, please post whether you are using ICMP ping or TCP port 80 ping.

There is some thought around here that HP SIM may be ignoring changes to the timeout and retries settings on the Global Protocol Settings page. To see if it makes any difference, do the following:

Select Options --> System Protocol Settings, choose 'All systems in the list' with the list set to 'All Systems', and then click [Next]. Check 'Use values specified below', change the numbers in the fields to what you want, and click [Run now].

Please report back if it makes any difference.
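To cross-check a reported outage from outside HP SIM, a TCP port 80 ping can be approximated with a plain connection attempt. The sketch below uses only the Python standard library. The function name `tcp_ping` and its default values are assumptions for illustration and have nothing to do with how HP SIM implements its own check.

```python
import socket


def tcp_ping(host: str, port: int = 80, timeout_s: float = 4.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection raises OSError (including timeouts and
        # refused connections) when the target cannot be reached.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replace with the address of a device HP SIM has flagged as down.
    print(tcp_ping("127.0.0.1", 80, timeout_s=2.0))
```

If this reports the device as reachable at the moment HP SIM raises an outage for it, that points at the polling settings rather than the network.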
Rob Buxton
Honored Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

David,
In reply to the first part, ICMP Ping.
I'll make the changes and report back.
As others have noted, the issues here are on Switches, Servers over WANs, and older, slower Servers, as well as some Sun Boxes. But when things got a bit busy, a whole raft of Servers on the Local Network were reported as well.
Michel van Verk
Occasional Advisor

Re: HPSIM 4.2 Polling Sensitivity Question

David,

I have used both ICMP and TCP pings, both with the same false outages. I will apply the changes as you suggested and let you know.

Michel
Michel van Verk
Occasional Advisor

Re: HPSIM 4.2 Polling Sensitivity Question

David,

I applied the settings you suggested this morning at 9:00 CET. It's now 16:30 CET and the machine hasn't reported any false outages at all. It seems that HPSIM doesn't use the global settings. I have tested whether it still reports servers that are shut down, and it does. Everything works fine now.

Michel
Mike Strako
Trusted Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

I too had the same problem after upgrading from 4.1 to 4.2 yesterday. I monitor 407 servers, and my inbox was full and my BlackBerry battery nearly dead. David, I did what you said and adjusted my numbers. Life is quiet again... Thank you very much.

Mike.
Rob Buxton
Honored Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

David,
The suggested fix seems to have worked here as well. The weekend is the best judge though.
I did get one outage reported, at 05:30, and thought the fix hadn't worked. Then I found out a Server really had died!

If you could post any follow-up we might expect to see, that would be good, and I'll assign points to flag a solution.
David Claypool
Honored Contributor
Solution

Re: HPSIM 4.2 Polling Sensitivity Question

Continue to use the workaround for the time being. Not sure if there will be a hotfix released for this or if it will be in a rollup patch release. Will advise.
Rob Buxton
Honored Contributor

Re: HPSIM 4.2 Polling Sensitivity Question

Workaround provided. We'll wait to see what HP comes up with in the future to fix the problem.