Server Management - Systems Insight Manager

Richard Colebeck
Occasional Advisor

No Alert for Drive Failure

Hello,

When we experience a drive failure in our RAID 5 array, we do not get a critical alert (SNMP trap) sent to SIM. We do get the informational alert about a status change of the array, which at least clues us in that something changed, but it's not enough. In the "Integrated Management Log" of the "System Management Homepage" on the server with the failed drive there is an event "Drive Array Device Failure (Slot 1, Bus 2, Bay 8)". Why does this event not get sent to our SIM server?

The server in question is an HP DL380G3 with a baseline of PSP 7.40B and the RAID is running off of a 6400 controller.

How do I get a critical alert sent to SIM for a failed array drive member?

Thanks,
Richard...
6 REPLIES
Albert Austin
Esteemed Contributor

Re: No Alert for Drive Failure

From Automatic Event Handling, create a task for hard drives: choose the servers you would like it to monitor, and then under event type choose "ProLiant Storage Events".

This should help you monitor your hard drives in more depth.
David Claypool
Honored Contributor

Re: No Alert for Drive Failure

Richard, 3 possible reasons:

- trap destination not set properly
- trap community string not correct
- dropped SNMP packet

To fix the first two automatically, HP SIM 4.2 and later has 'Configure or Repair agents...', which will correct those problems and other communication issues.

SNMP runs over UDP, so there is no guaranteed delivery. A trap can be lost along the way, though that would not happen every time. That's why we also do polling, as a backup measure.
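To rule out the "dropped packet" case, one option is to check whether any trap datagrams reach the management host at all. The following is only a minimal sketch, not part of HP SIM: the function names are made up for illustration, it does not decode SNMP, and it assumes UDP port 162 (the standard trap port) is free, which will not be the case while a trap receiver service is already running on that machine.

```python
import socket


def open_trap_listener(host="0.0.0.0", port=162):
    """Bind a UDP socket. Port 162 is the standard SNMP trap port and
    usually needs administrator rights; pass another port for testing."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect_datagrams(sock, max_packets=10, quiet_timeout=30.0):
    """Return (sender_ip, byte_count) for each datagram received.

    Stops after max_packets datagrams, or as soon as quiet_timeout
    seconds pass without any datagram arriving.
    """
    sock.settimeout(quiet_timeout)
    seen = []
    try:
        while len(seen) < max_packets:
            data, (sender_ip, _sender_port) = sock.recvfrom(65535)
            seen.append((sender_ip, len(data)))
    except socket.timeout:
        pass  # nothing more arrived within the quiet period
    return seen
```

Calling `collect_datagrams(open_trap_listener())` and then pulling a drive would show whether anything from the server's address arrives. If datagrams do arrive but the event still shows up with the wrong severity, the problem is on the receiving side rather than in the network.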
Richard Colebeck
Occasional Advisor

Re: No Alert for Drive Failure

Thanks for the replies,

Albert, I looked into creating another Event Handling monitor but this just created duplicate email alerts since we already have a general Event Handler task that sends an email for each event.

David, I checked over the trap settings and they seem OK, since we do get some events sent to the SIM server from the server with failed drives.

I find it odd that there are lots of Microsoft System event log entries with regard to the failed drive but only 2 events registered in SIM.

To reproduce my results, take a RAID 5 array with a spare and pull out one of the active drives.

Running an HP DL380G(x) with Windows 2003 configured to send traps to another HP DL380G(x) with Windows 2003 running HP SIM 5.00.00, I get only 2 events, as follows, when I pull out a drive:

Event Name: Logical Drive Status Change (3034)
Event originator: cokcvma1
Event Severity: Informational
Event received: 06-Jul-2006, 15:26:41

Event description: Logical Drive Status Change. This trap signifies that the agent has detected a change in the status of a drive array logical drive. The variable cpqDaLogDrvStatus indicates the current logical drive status.

Location: Slot 2
Board Status: rebuilding


Event Name: Logical Drive Status Change (3034)
Event originator: cokcvma1
Event Severity: Informational
Event received: 06-Jul-2006, 15:26:41

Event description: Logical Drive Status Change. This trap signifies that the agent has detected a change in the status of a drive array logical drive. The variable cpqDaLogDrvStatus indicates the current logical drive status.

Location: Slot 2
Board Status: readyForRebuild


Please confirm, if you can, whether I should be getting more information/events logged on my SIM server than that when I pull a drive out of an array.

Thanks

Richard...
G Edwards
Advisor
Solution

Re: No Alert for Drive Failure

Make sure you have applied the 7.40 MIB Update (upd740mib.zip). It is available from the HP SIM Download page.

I had the same issue, where I was only getting minor alerts when pulling drives. Applying the MIB update changed them to major (or critical?).
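A rough way to picture why a MIB update can change what you see: the management station looks up each incoming trap in definitions loaded from the MIB files, and the notification rule only fires for certain severities. The sketch below is purely illustrative; the table contents, the trap ID 9999, and the function names are hypothetical and are not taken from HP SIM or from any real HP MIB.

```python
# Hypothetical trap ID, used only for illustration.
DRIVE_FAILURE_TRAP = 9999

# Severity assumed when a trap is not in the loaded definitions.
DEFAULT_SEVERITY = "Informational"


def classify(trap_id, severity_table):
    """Look up the severity the station would assign to a trap."""
    return severity_table.get(trap_id, DEFAULT_SEVERITY)


def should_email(trap_id, severity_table, notify_on=("Major", "Critical")):
    """True if a rule limited to the notify_on severities would fire."""
    return classify(trap_id, severity_table) in notify_on


# Definitions before and after a (hypothetical) MIB update.
before_update = {DRIVE_FAILURE_TRAP: "Minor"}
after_update = {DRIVE_FAILURE_TRAP: "Major"}

print(should_email(DRIVE_FAILURE_TRAP, before_update))  # False
print(should_email(DRIVE_FAILURE_TRAP, after_update))   # True
```

The trap itself is identical in both cases; only the receiving side's definition of it differs, which would explain why the agents and trap settings can look fine while the alert still never fires.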
Richard Colebeck
Occasional Advisor

Re: No Alert for Drive Failure

Thanks for the input everyone, it was appreciated.

G Edwards, your answer was exactly what was needed to fix the issue.

Take care,

Richard...
Andrew_346
Regular Advisor

Re: No Alert for Drive Failure

I had been running the 7.50 MIB update after installing 7.50 agents on all of my servers. I have been beating my head over why I was not receiving email alerts when a drive failed in an array with a hot spare.

I've recently finished updating to the latest agents and installed the 7.60 MIB update, but I've also changed the default event handling configuration to send emails on "Important" and "Warning" events.