
DL360 failed disk drive

 
SOLVED
Jeff Hoevenaar
Frequent Advisor

DL360 failed disk drive

Is there a way to determine if a disk drive has failed in a HW RAID of a DL360 from the Linux OS? Currently the only way I know a disk is failed is by looking at the front of the server.
20 REPLIES
MattJ123
Frequent Advisor

Re: DL360 failed disk drive

Not 100% sure about that model, but I've had some success with SNMP.
Rick Garland
Honored Contributor

Re: DL360 failed disk drive

Do you have RH Enterprise loaded on that ProLiant DL? If so, do you have CIM installed and configured?

This can provide hardware monitoring and send alerts if you want it to.

Comes with the ProLiant/Linux package.
Stuart Browne
Honored Contributor

Re: DL360 failed disk drive

The SNMP traps should show that a disk failed (Matthew is correct), but if you want a prettier way, use the ACU-XE.

This provides a web interface to the major funky functions of the machine, including the RAID management utilities.

It should be part of the Linux tools as Rick said.
One long-haired git at your service...
Jeff Hoevenaar
Frequent Advisor

Re: DL360 failed disk drive

Thanks for all the info.

I am not familiar with CIM or ACU-XE. I do not have a Proliant/Linux Package. I have RedHat OS CDs. I am also not sure how to get information from SNMP traps.

Any info would be great.

Thanks.
Stuart Browne
Honored Contributor
Solution

Re: DL360 failed disk drive

Start by installing the Linux ProLiant software packages, available from http://h18000.www1.hp.com/support/files/server/us/family/model/6011.html?lang=en&cc=us (listed as 'ProLiant Support Pack for Red Hat' under the appropriate OS grouping).

This will install both the SNMP software and the ACU software.

Once the ACU-XE is installed, you point a web-browser at your server at port 2380 (I think, don't have one with me here), and that will show you the RAID controller and containers, as well as the Proliant event log.
One long-haired git at your service...
Rouchon_2
Occasional Advisor

Re: DL360 failed disk drive

Yes, you must install the Compaq software. Without those packages the kernel will never know that a disk has failed, so nothing will be reported through the syslog facility. The most important tool is cmastor, a kernel module that communicates with the Smart Array card.

Ross Minkov
Esteemed Contributor

Re: DL360 failed disk drive

Jeff,

As Stuart said -- install the ProLiant Support Pack for Linux and you'll be all set.

Stuart,

The port is 2301 (HTTP) or 2381 (HTTPS):

http://server:2301
or
https://server:2381
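
If the page does not load, a quick first check is whether anything accepts TCP connections on those two ports. The following is only a minimal Python sketch, assuming the default ports listed above; `port_is_open` and `check_management_ports` are illustrative names and are not part of the PSP.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_management_ports(host, ports=(2301, 2381)):
    """Map each port (assumed defaults: 2301 HTTP, 2381 HTTPS) to its reachability."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    # "server" is a placeholder hostname, as in the URLs above.
    for port, is_open in check_management_ports("server").items():
        print(port, "open" if is_open else "closed or filtered")
```

A port reported as closed from your workstation but listening on the server itself usually points at something in between, such as a firewall.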

HTH,
Ross
Rick Garland
Honored Contributor

Re: DL360 failed disk drive

The ProLiant Support Pack (PSP) will do a lot more for you than just hardware monitoring. It can also do filesystem monitoring, log monitoring, etc.

It should be with your ProLiant Linux install.
As Ross said, install it, open a browser at http://server:2301, and you will see some management screens you can work with.
Jeff Hoevenaar
Frequent Advisor

Re: DL360 failed disk drive

Will PSP automatically monitor HW issues and send them to the syslog.log, or do I still need to configure the Insight Manager stuff?

I just want to get an email when a disk drive (or some other HW) goes bad.
Rick Garland
Honored Contributor

Re: DL360 failed disk drive

You have your choice. You can set up SNMP traps, send output to logs, or send output to email.

Ross Minkov
Esteemed Contributor

Re: DL360 failed disk drive

The PSP agents will send an email to root when there are hardware problems with your server. You do not need to configure HP Systems Insight Manager. HP SIM is free, btw, so if you have lots of ProLiant servers to manage it will prove helpful.
Ross Minkov
Esteemed Contributor

Re: DL360 failed disk drive


Here is something I wrote for my customers to give them a quick overview of the PSP for Linux.

---------------------------------------

ProLiant Support Packs (PSPs) represent operating system specific bundles of ProLiant optimized drivers, utilities, and management agents. These bundles of software are tested together to ensure proper installation and functionality. Each PSP consists of a deployment utility, setup and software maintenance tools designed to provide an efficient way to manage routine software maintenance tasks. These deployment utilities remotely deploy driver and management agent updates to network attached servers. PSPs provide customers with a ProLiant system software baseline that has already been tested and verified by HP.

Part of the PSP for Linux is the hp Server Management Drivers and Agents (hpasm, which stands for hp Advanced System Management) package. hpasm is a collection of drivers and tools which enable monitoring of fans, power supplies, temperature and other management events. This package includes the basic server support. ProLiant servers are equipped with hardware and firmware to monitor certain abnormal conditions such as abnormal temperature readings, fan failures, ECC memory errors, etc. The Management Drivers and Agents monitor these conditions and notify the system administrator of abnormal conditions.

The following is a list of some features supported by hpasm:

- Monitoring abnormal temperature conditions
- Monitoring fan failures
- Monitoring the system Fault Tolerant Power Supply
- Monitoring ECC memory errors
- Automatic Server Recovery (ASR)

Other packages (such as the hprsm package) provide additional functionality. The hprsm package, for instance, provides HP ProLiant Remote "Lights Out" management for ProLiant servers equipped with the Integrated Lights Out (iLO) ASIC or with the ProLiant Remote Insight "Lights Out Edition" (RILOE) add-on adapter.

For more information visit the PSP website at http://www.hp.com/servers/psp.

---------------------------------

HTH,
Ross
Jeff Hoevenaar
Frequent Advisor

Re: DL360 failed disk drive

I am now receiving emails from PSP on RHES3 and RH7.3.

I cannot get the web page on RH7.3 to work. Should there be a web page at http://server:2301 or https://server:2381?

Thanks for all the help!
Rick Garland
Honored Contributor

Re: DL360 failed disk drive

Do you have the http processes running?

Jeff Hoevenaar
Frequent Advisor

Re: DL360 failed disk drive

There is no http process running on the RHES3 or the RH7.3 box.

Here is the output from netstat -a on the RHES3 box. These listeners are not present on RH7.3.

tcp 0 0 localhost.localdomain:2381 *:* LISTEN
tcp 0 0 chpadm98:2381 *:* LISTEN
tcp 0 0 localhost.localdomain:2301 *:* LISTEN
tcp 0 0 chpadm98:2301 *:* LISTEN
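
To compare the two boxes without reading the output by eye, the listening ports can be pulled out of the netstat text. This is a rough Python sketch that assumes the column layout shown above (protocol first, local address fourth, state last); `listening_ports` is a made-up helper name. Ports that netstat prints as service names are skipped, so running netstat with `-n` gives more complete results.

```python
def listening_ports(netstat_output):
    """Return the set of numeric TCP ports in LISTEN state found in netstat text."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local address, foreign address, state.
        if len(fields) < 6:
            continue
        if not fields[0].startswith("tcp") or fields[-1] != "LISTEN":
            continue
        # The port is whatever follows the last colon of the local address.
        _, _, port = fields[3].rpartition(":")
        if port.isdigit():
            ports.add(int(port))
    return ports


if __name__ == "__main__":
    sample = """\
tcp 0 0 localhost.localdomain:2381 *:* LISTEN
tcp 0 0 chpadm98:2381 *:* LISTEN
tcp 0 0 localhost.localdomain:2301 *:* LISTEN
tcp 0 0 chpadm98:2301 *:* LISTEN
"""
    print(sorted(listening_ports(sample)))
```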
Stuart Browne
Honored Contributor

Re: DL360 failed disk drive

Check the system's firewall to make sure that TCP port 2381 is allowed through the INPUT chain from your workstation.
One long-haired git at your service...
Gopi Sekar
Honored Contributor

Re: DL360 failed disk drive


PSP listens on port number 2381,

so try https://server:2381

Another possibility is that the firewall on that system is blocking the port. Try stopping iptables and see whether you are able to access it.

It will also ask for authentication. If you have installed the host-based authentication method, you can use the login password of the root user on that system; otherwise you can create your own administrator account.
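
To see from a script whether the management page answers at all, something like the Python sketch below could be used. It assumes the server presents a self-signed certificate, so certificate verification is switched off (only sensible on a trusted network); `http_status` is an illustrative name. A login challenge would show up as a 401 or 403 status rather than as an error, and very old agents may offer SSL/TLS versions that a current Python refuses to negotiate.

```python
import ssl
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no HTTP response arrives."""
    # Assumption: the management agent uses a self-signed certificate,
    # so hostname and certificate checks are disabled for this probe.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 403.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, TLS failure, or host not found.
        return None


if __name__ == "__main__":
    # "server" is a placeholder hostname, as in the URL above.
    print(http_status("https://server:2381/"))
```

A result of None with the port listening locally suggests the firewall; any numeric status means the web agent itself is reachable.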

Regards,
Gopi
Never Never Never Giveup
Ross Minkov
Esteemed Contributor

Re: DL360 failed disk drive

Did you reboot after the PSP for Linux installation?
Jeff Hoevenaar
Frequent Advisor

Re: DL360 failed disk drive

I got it figured out. I had to go in and manually install the hpasm rpm and configure it. I guess on the newer version it configures during the install.

Thanks.
Jeff Hoevenaar
Frequent Advisor

Re: DL360 failed disk drive

got it working