
Nick Gushlow
Advisor

DL360's and Smartd

Hi,

Is there any way I can get smartd to work with the RAID controller in my DL360?

I can't seem to get it to play nice.
Steven E. Protter
Exalted Contributor

Re: DL360's and Smartd

Shalom,

The DL360 server, as with all HP/Compaq servers, comes with a CD, which can also be downloaded.

You boot off the CD, set up the storage with the GUI, and then you can boot off the Linux CD and use the storage.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Vitaly Karasik_1
Honored Contributor
Solution

Re: DL360's and Smartd

As far as I understand, Nick was asking not about SmartCD but about smartd, the daemon for hard disk health monitoring.

HP provides software for monitoring their RAID controllers: http://h18004.www1.hp.com/support/files/server/us/locate/1116_5561.html#76
Nick Gushlow
Advisor

Re: DL360's and Smartd

Indeed I did ask about smartd.

I've installed the ProLiant Support Pack (PSP), which I believe contains the array diagnostics utility (ADU), but I can't seem to find much in the way of helpful documentation.

Running hpadu -start seems to work, but I can't see a way of querying the service or setting it up to notify me in the event of problems with the array or disks.

I see that I can run hpaducli -f filename to get a dump of the info, but what I really want is a) notification of problems and b) the ability to monitor the disks' current status, temperatures, etc.

Looks like this might do the job; I could just do with a few more pointers.

I think I might remove the complete PSP and just install the ADU. I don't think there's anything else I really need.

Vitaly Karasik_1
Honored Contributor

Re: DL360's and Smartd

I'm not 100% sure, but IMHO you should install the hpasm package:
http://h18004.www1.hp.com/support/files/server/us/download/22893.html (for RHEL4)
Nick Gushlow
Advisor

Re: DL360's and Smartd

Hmmm, maybe it'll be simpler just to keep the PSP installed.

I've now figured out there is a web interface, but that just seems to be a nice view of the same output.

There's no man page or docs that I can find to tell me if I can do anything else.

Is there a way to configure ADU to email alerts on problems (or developing problems)?

Nick Gushlow
Advisor

Re: DL360's and Smartd

I've just stumbled across Insight Manager.

I'm guessing I need this as well to have notification on problems.
Trever Furnish
Regular Advisor

Re: DL360's and Smartd

I for one would appreciate it if you took the time to update this thread with any new info you work out. I have the same issue (although I'm using DL380s): I've installed the PSP, but I'm not seeing how that really helps.

Under HPUX, a variety of mechanisms are automatically set up to let root know when hardware is failing. I'm hoping to at least find that something in the PSP can be configured to behave like ESM under HPUX, emailing messages to root whenever hardware fails a diagnostic.

In my case on one of my servers I even already know that a disk has failed, so I have a great test case -- but nothing seems to be firing off to let me know about it.
Hockey PUX?
Nick Gushlow
Advisor

Re: DL360's and Smartd

Haven't figured out anything yet. I've kind of back-burnered this issue until I finish building all my new servers and have them in our data-centre.

My current plan, when I come back to this in a week or so, is to write scripts that call the CLI commands and parse the output, and then write a plugin for Nagios (which I already use to monitor other things).

I don't have a server with a failed disk in it, so if you could post the output of hpaducli that would help me.
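
For anyone wanting a starting point, here is a minimal sketch in Python of the kind of Nagios check described above. It does not call hpaducli itself; it scans a report file that was already dumped with hpaducli -f filename. The keyword lists are pure assumptions (no real failing-disk report was available when writing this), and the script name check_adu.py is hypothetical. Only the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check that scans a saved hpaducli report."""
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# ASSUMPTION: these keywords are guesses, not taken from real hpaducli
# output. Adjust them once a report from a failing disk is available.
CRITICAL_WORDS = ("failed",)
WARNING_WORDS = ("predictive failure", "rebuilding", "degraded")


def classify(report_text):
    """Return (exit_code, message) for the given report text."""
    lowered = report_text.lower()
    if not lowered.strip():
        return UNKNOWN, "UNKNOWN - report is empty"
    # Plain substring matching: a line such as "failed drives: 0" would
    # also match, so a real check needs field-aware parsing.
    for word in CRITICAL_WORDS:
        if word in lowered:
            return CRITICAL, "CRITICAL - report mentions '%s'" % word
    for word in WARNING_WORDS:
        if word in lowered:
            return WARNING, "WARNING - report mentions '%s'" % word
    return OK, "OK - no problem keywords found"


def main(path):
    """Read the report at `path`, print one status line, return the code."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
    except OSError as exc:
        print("UNKNOWN - cannot read %s: %s" % (path, exc))
        return UNKNOWN
    code, message = classify(text)
    print(message)
    return code


# Only runs when exactly one argument (the report path) is given.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

The idea would be to dump a fresh report first (hpaducli -f /tmp/adu.txt) and then run the check against that file (python3 check_adu.py /tmp/adu.txt). Keeping the parsing in classify() separate from the file handling makes it easy to test against sample reports.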
Trever Furnish
Regular Advisor

Re: DL360's and Smartd

Actually I may have spoken too soon. On the system with the failing drive I hadn't yet installed the PSP. Installed it last night, but the error count on the drive hasn't increased yet, so it could be that the only reason I'm not getting traps via email is because nothing has triggered them yet.

I have a replacement disk on the way - I'll update this post with what happens when I yank out the disk for replacement.

The disk in question hasn't actually failed yet; it's a SMART predicted failure. Note the "hard read" field in the attached file, which is the output of hpaducli. Hope it helps.

Not sure if you'd noticed, but there's an entry in /opt/compaq/cma.conf that seems to be meant to make the Insight Manager agents send traps via email to root:

trapemail /bin/mail -s 'HP Insight Management Agents Trap Alarm' root

If you write a Nagios check for this, I'd be interested in seeing the code. I run one "system health" check via ssh from a Nagios system, and that check in turn checks everything else I'm concerned about: filesystem space, NIC speed and duplex, free memory, free swap, the presence of critical processes, overall process count, etc. I may just add a check to that script for changes in the output of hpaducli.
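
A rough sketch of that "changes in the output" idea, in Python: hash the report text and compare it with the hash stored on the previous run. The function names and the state-file path are made up for illustration. It also assumes the report is identical between runs when nothing is wrong; if the real hpaducli output contains timestamps or counters that always move, those lines would need filtering out before hashing.

```python
import hashlib
import os


def digest(text):
    """Return the SHA-256 hex digest of the report text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def report_changed(report_text, state_path):
    """Compare the report with the last recorded one.

    Returns True only if a previous digest exists and differs from the
    current one. The first run just records a baseline and returns False.
    The stored digest is always updated, so each change is flagged once.
    """
    new = digest(report_text)
    old = None
    if os.path.exists(state_path):
        with open(state_path, encoding="ascii") as handle:
            old = handle.read().strip()
    with open(state_path, "w", encoding="ascii") as handle:
        handle.write(new + "\n")
    return old is not None and old != new
```

In a health-check script this could be called with the contents of the file written by hpaducli -f filename and a state path such as /var/tmp/adu.sha256 (a hypothetical location), raising a warning whenever it returns True.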
Hockey PUX?
Ryan Hobbs_2
Advisor

Re: DL360's and Smartd

I too would be interested in a Nagios script that checks the output of the various HP utilities.
Nick Gushlow
Advisor

Re: DL360's and Smartd

Well, I'm coming up to the end of our server move; hopefully this means I'll be able to start on the Nagios scripts shortly.
Trever Furnish
Regular Advisor

Re: DL360's and Smartd

I still haven't replaced my disk yet, but I did bump into this page describing Nagios plugins specifically for using the output of the HP ProLiant tools:

http://gwfl.daimonic.org/index.pl?p=nagiosplugs

Haven't gotten to play with them yet, but they look very promising.
Hockey PUX?