Operating System - Linux

How to monitor HDD failure alert

 
joe tam
Occasional Contributor

Dear all,

I am new to HP SIM (Systems Insight Manager). I have installed the software, but I don't know how to set up a rule to detect HDD failure alerts.

Please help.
Steven E. Protter
Exalted Contributor

Re: How to monitor HDD failure alert

Shalom,

You can use SNMP if you wish.

You can also write a script to detect failures in /var/log/messages or in the dmesg output.

Either one contains specific strings that can be detected and that will let you know when a disk has failed.
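
A minimal sketch of such a script, in Python. The patterns below are only illustrative assumptions; the exact strings depend on your kernel, driver and controller, so check what your own logs show for a bad disk and adjust the list. The function names are made up for this example.

```python
import re

# Hypothetical patterns: examples of kernel messages that often accompany
# disk trouble. They are assumptions, not a complete or authoritative list.
FAILURE_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"medium error", re.IGNORECASE),
    re.compile(r"hard resetting link", re.IGNORECASE),
]


def find_disk_errors(lines):
    """Return every line that matches at least one failure pattern."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]


def scan_log(path="/var/log/messages"):
    """Open a log file and return the suspicious lines found in it."""
    with open(path, errors="replace") as handle:
        return find_disk_errors(handle)
```

Calling scan_log() from cron and mailing the result when it is not empty would be one way to use it. For dmesg you could pass the command output, split into lines, to find_disk_errors() instead.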

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
joe tam
Occasional Contributor

Re: How to monitor HDD failure alert

Dear sir,

I have already set up HP SIM with the ProLiant Support Pack, and SNMP is enabled as well.

How do I use SNMP? Can you give me more hints?

Or, for the script, can you give me more details?

Regards,
Joe
Ralph Grothe
Honored Contributor

Re: How to monitor HDD failure alert

Hi Joe,

If your drives are [P|S]-ATA, smartmontools should work very well.

http://sourceforge.net/projects/smartmontools

They claim that SCSI drives are also supported, but I haven't tried it on them yet.
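
As a rough sketch of how smartmontools could be polled from a script: the Python below runs `smartctl -H` and looks for the health line. The wording of that line ("test result: PASSED" for ATA, "SMART Health Status: OK" for SCSI) is what smartctl usually prints, but treat it as an assumption and compare it with the output of your installed version. The function names and the device path are placeholders.

```python
import subprocess


def parse_health(output):
    """Interpret `smartctl -H` output.

    Returns True if the drive reports healthy, False if it reports a
    failure, and None if no health line was found.
    """
    for line in output.splitlines():
        # ATA drives: "SMART overall-health self-assessment test result: PASSED"
        # SCSI drives: "SMART Health Status: OK"
        if ("overall-health self-assessment test result" in line
                or "SMART Health Status" in line):
            verdict = line.split(":", 1)[1].strip()
            return verdict in ("PASSED", "OK")
    return None


def check_drive(device):
    """Run smartctl -H on a device (needs root and smartmontools installed)."""
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

For example, check_drive("/dev/sda") returning False would be the case to raise an alert on. A None result likely means SMART data could not be read, which is typical for drives hidden behind a hardware RAID controller.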

If you have the MIB for the disk (RAID) controller that you use, you could also set up SNMP traps in net-snmp, or extend it with your own custom plugin script, etc.

Without a MIB for your controller I wouldn't know how to access the SMART states from your drives.
Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: How to monitor HDD failure alert

I forgot: if you are using Linux software RAID (MD) on your disks, monitoring their state is extremely easy (that's what we do on our Linux boxes, i.e. LVM volumes on top of RAID1 MD devices).
You simply need to set the MAILADDR directive in /etc/mdadm.conf to a valid mail address where you wish to receive notifications if an array becomes faulty. It is read when mdadm --monitor starts up (e.g. after a reboot).
If you wish to send a trap to any sort of management system, simply supply a working script to the PROGRAM directive in mdadm.conf.
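
A minimal sketch of what such a script could look like, in Python. According to the mdadm man page, the program is called with the event name, the array device and sometimes a component device as arguments. The script path, the function name and the choice of which events count as critical are assumptions for illustration. The print call is only a placeholder for whatever actually forwards the alert to your management system.

```python
#!/usr/bin/env python3
# Assumed mdadm.conf entry (the path is hypothetical):
#   PROGRAM /usr/local/bin/md-alert.py
import sys

# Events treated as critical here; this selection is an assumption.
CRITICAL_EVENTS = {"Fail", "FailSpare", "DegradedArray", "DeviceDisappeared"}


def format_alert(args):
    """Build an alert line from mdadm's arguments: event, array, [component]."""
    event, array = args[0], args[1]
    component = args[2] if len(args) > 2 else "n/a"
    severity = "CRITICAL" if event in CRITICAL_EVENTS else "INFO"
    return f"{severity}: mdadm event={event} array={array} component={component}"


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Placeholder: replace print with the call that sends your trap or message.
    print(format_alert(sys.argv[1:]))
```

The script can be tried by hand before relying on it, e.g. by running it with the arguments Fail /dev/md0 /dev/sdb1.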
Madness, thy name is system administration