Systems Insight Manager

SIM 7.4 - problem with alerts on Physical Drive Status Change

smamm
Contributor

Strange problem. I just got a new installation of SIM 7.4 up and running, and I noticed that a bad disk didn't generate an email alert, even though all of our other events do.

 

I checked SNMP Trap Settings, and confirmed that the below event type was already enabled and set to Critical:

Mib Name:   cpqida.mib

Trap Name: cpqDa7PhyDrvStatusChange 

Event Type:   (SNMP) Physical Drive Status Change (3046)

Enable Trap Handling:  Yes

Category:  Proliant Storage Events

Severity:  Critical

 

We have automatic event handling activated for the "Important Events" default collection, which includes the "Proliant Storage Events" category.

 

Yet I never got an email alert for this event:

 

(SNMP) Physical Drive Status Change (3046)

 

The system has been fully discovered, and other email alerts have been generated for it successfully in recent weeks. HOWEVER, the above alert was categorized as a "Warning", which contradicts the Critical severity assigned to it in the trap settings above. I've got to be missing something here. Does anyone have an idea as to what may be wrong?

1 REPLY
LGentile
Trusted Contributor

Re: SIM 7.4 - problem with alerts on Physical Drive Status Change

smamm,

 

Please see this thread; it may help:  http://h30499.www3.hp.com/t5/ITRC-HP-Systems-Insight-Manager/change-the-severity-of-an-event-in-HP-SIM-7-4/td-p/6680762#.VK6c63tlyYs

 

As a workaround for storage "status change" events, you may want to set up a specific condition or collection that sends you all non-informational events of this type. In short, and without repeating the thread I linked, these are special events whose underlying severity you cannot change. For example, a predictive failure may always be a warning, yet it uses the same event ID (3046) as an actual failure condition, which would be critical. There are a few events of this type; another is the status of the RAID cache batteries.

 

A further example:

 

HDD failure = failure code 7, event ID 3046 - Status change for a physical drive (critical)

HDD PFA = failure code 9, event ID 3046 - Status change for a physical drive (warning)

HDD online = failure code 0, event ID 3046 - Status change for a physical drive (informational)

 

These are not the correct failure codes, but you get the idea: it's all the same event number. I am not an SNMP/MIB expert, so I can't fully explain this behavior; I just know that you cannot change it in SIM.
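
To make the idea concrete, here is a rough Python sketch of the behavior described above. It is not SIM code and does not use any SIM API. The status codes (7, 9, 0) are the illustrative ones from my example, not the real values in cpqida.mib, and the names Event, effective_severity, and should_notify are made up for this sketch. It only shows why one event ID can arrive with different severities, and what a "send me everything non-informational" rule would look like.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    CRITICAL = "critical"


# Illustrative status codes taken from the example in this post.
# They are NOT the real values defined in cpqida.mib.
ILLUSTRATIVE_STATUS_SEVERITY = {
    7: Severity.CRITICAL,       # drive failed
    9: Severity.WARNING,        # predictive failure (PFA)
    0: Severity.INFORMATIONAL,  # drive back online
}

PHYSICAL_DRIVE_STATUS_CHANGE = 3046


@dataclass
class Event:
    event_id: int     # same ID for every physical drive status change
    status_code: int  # hypothetical field carrying the drive status


def effective_severity(event: Event) -> Severity:
    """Severity follows the status code in the event, not the event ID."""
    return ILLUSTRATIVE_STATUS_SEVERITY[event.status_code]


def should_notify(event: Event) -> bool:
    """Workaround rule: alert on any non-informational drive status change."""
    return (
        event.event_id == PHYSICAL_DRIVE_STATUS_CHANGE
        and effective_severity(event) is not Severity.INFORMATIONAL
    )
```

With a rule like this, both the failure and the predictive failure would trigger a notification, while the "drive online" event would not, regardless of the single severity configured for event 3046 in the trap settings.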

 

Hope this helps!