BERTRAND_7
Frequent Advisor

HPUX rx3600 SmartArray P400 detect disk failure script

Hello

We installed a brand new rx3600 including a SmartArray P400 with HPUX 11iv3.

We have eight 146 GB disks in the rx3600.

We defined a first logical drive using 2 disks in RAID 1 through the Option ROM Configuration for Arrays (ORCA) before the operating system installation.
After the operating system installation, through the 'saconfig' command we defined a second logical drive using the 6 remaining disks in RAID 5 for the data.

The question is: how can a disk failure be detected from the command line?
I had a look at the sautil and saconfig man pages, but there seems to be no specific option to get the status of a logical drive, including any failing disk.
Is the only solution to parse the output of 'sautil' with 'awk'?

Any suggestion is welcome. Thanks
Regards

Bruno
Torsten.
Acclaimed Contributor

Re: HPUX rx3600 SmartArray P400 detect disk failure script

The diagnostics will take care of your drives. If a drive fails, you will get a message (mail) to root and an entry in syslog.

You can check with sautil:

# sautil ...
...
---- SAS/SATA DEVICE SUMMARY -------------------------------------------------

Location  Ct  Enc  Bay  WWID                Type  Capacity  Status

internal  1I  1    6    0x5000c50000123456  DISK  36.4 GB   OK
internal  2I  1    1    0x500000e017123456  DISK  36.4 GB   OK
internal  2I  1    2    0x5000c50000123456  DISK  36.4 GB   OK
internal  2I  1    4    0x5000c50000123456  DISK  36.4 GB   OK


If a disk has a problem, its status will show something other than "OK".
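A minimal sketch of how that check could be scripted, assuming the device summary keeps the column order shown above (Location, Ct, Enc, Bay, WWID, Type, Capacity, Status). The function name `failed_disks` is hypothetical, and treating only "OK" and "UNASSIGNED" as healthy is an assumption based on the statuses that appear in sample output in this thread; adjust the list to your needs.

```shell
#!/bin/sh
# Hypothetical helper: reads sautil output on stdin and prints one line per
# physical disk whose status is not considered healthy.
# Assumption: only "OK" and "UNASSIGNED" count as healthy.
failed_disks() {
    awk '
        # Section headers start with "---- "; remember whether we are
        # inside the SAS/SATA DEVICE SUMMARY section.
        /^[ \t]*---- / {
            in_summary = (index($0, "SAS/SATA DEVICE SUMMARY") > 0)
            next
        }
        # Data rows carry a WWID such as 0x5000c5... in the fifth field.
        in_summary && $5 ~ /^0x/ {
            # Capacity is either "N/A" (one field) or "<number> <unit>" (two).
            first = ($7 == "N/A") ? 8 : 9
            status = ""
            for (i = first; i <= NF; i++)
                status = status (i > first ? " " : "") $i
            if (status != "OK" && status != "UNASSIGNED") {
                printf "%s:%s:%s %s\n", $2, $3, $4, status
                bad++
            }
        }
        # Exit status 1 when at least one disk was reported.
        END { exit (bad > 0) }
    '
}
```

It would be used as `sautil <device_file> | failed_disks`; a non-zero exit status means at least one disk was reported.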

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
BERTRAND_7
Frequent Advisor

Re: HPUX rx3600 SmartArray P400 detect disk failure script

Yes, I understand that root will receive an e-mail in any case.

However, this platform is mainly used by a third-party application.
I would like to automate the detection of a disk failure so that a message can be sent to that application.

Unfortunately, 'sautil' has no specific option to get a brief status of all logical drives and of the physical disks in each logical drive.

If there is no other suggestion, I think I will write an awk program to parse the full 'sautil' output.
Torsten.
Acclaimed Contributor

Re: HPUX rx3600 SmartArray P400 detect disk failure script

What is your third-party application?

Monitoring?

You can make the diagnostics send SNMP or WBEM notifications.

Hope this helps!
Regards
Torsten.

BERTRAND_7
Frequent Advisor

Re: HPUX rx3600 SmartArray P400 detect disk failure script

No, it is a dedicated application that already manages some industrial equipment using a specific protocol.
Torsten.
Acclaimed Contributor

Re: HPUX rx3600 SmartArray P400 detect disk failure script

I'm not sure what that means or how it works. Anyway, you can just grep the sautil output for the following states:

Logical Drive State Definitions

OK          All physical disks in the logical drive are operational.

FAILED      Some possible causes:
            1) Multiple physical disks in a fault-tolerant (RAID 1,
               1+0, 5, ADG) logical drive have failed.
            2) One or more disks in a RAID 0 logical drive have
               failed.
            3) Cache data loss has occurred.
            4) Array expansion was aborted.
            5) The logical drive is temporarily disabled because
               another logical drive on the controller had a missing
               disk at power-up.

USING INTERIM RECOVERY MODE
            Also known as "degraded" state. A physical disk in a fault
            tolerant logical drive has failed. For RAID 1, 1+0 or 5,
            data loss may result if a second disk should fail. For
            RAID ADG, data loss may result if two additional disks
            should fail.

READY FOR RECOVERY OPERATION
            A replacement disk is present, but rebuild hasn't started
            yet (another logical drive may be currently rebuilding).
            The logical drive will also return to this state if the
            rebuild had been aborted due to unrecoverable read errors
            from another disk.

RECOVERING  One or more physical disks in this logical drive are being
            rebuilt.

WRONG PHYSICAL DISK WAS REPLACED
            While the logical drive was in a degraded state, the
            system was powered off and a disk other than the failed
            disk was replaced. Shut off the system and replace the
            correct (failed) disk.

PHYSICAL DISK(S) NOT PROPERLY CONNECTED
            While the system was off, one or more disks were removed.
            Note: the other logical drives are held in a temporary
            "failed" state when this occurs.

EXPANDING   The data in the logical drive is being reorganized
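
The states above can be turned into a simple check. This is only a sketch: it assumes the state string appears verbatim after "Logical Drive Status" in each "---- LOGICAL DRIVE n ----" section of the sautil output. The function name `ld_status` and the grouping into ok / warning / critical are my own assumptions, not something sautil defines.

```shell
#!/bin/sh
# Hypothetical helper: reads sautil output on stdin, prints
# "<drive number> <severity> <state>" for each logical drive, and exits
# with the worst severity seen (0 = ok, 1 = warning, 2 = critical).
# Assumption: RECOVERING, EXPANDING and READY FOR RECOVERY OPERATION are
# warnings; any other non-OK or unknown state is treated as critical.
ld_status() {
    awk '
        BEGIN {
            name[0] = "ok"; name[1] = "warning"; name[2] = "critical"
            worst = 0; ld = "?"
        }
        # Remember the number of the logical drive section we are in.
        /^[ \t]*---- LOGICAL DRIVE [0-9]+ / { ld = $4 }
        /Logical Drive Status\.+/ {
            # Keep only the text after the dotted leader.
            state = $0
            sub(/^.*Logical Drive Status\.+[ \t]*/, "", state)
            sub(/[ \t]+$/, "", state)
            if (state == "OK")
                level = 0
            else if (state == "RECOVERING" || state == "EXPANDING" ||
                     state == "READY FOR RECOVERY OPERATION")
                level = 1
            else
                level = 2
            if (level > worst) worst = level
            printf "%s %s %s\n", ld, name[level], state
        }
        END { exit worst }
    '
}
```

It would be called as `sautil <device_file> | ld_status`, with the exit status used to decide whether to notify the application.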

Hope this helps!
Regards
Torsten.

mits
Respected Contributor

Re: HPUX rx3600 SmartArray P400 detect disk failure script

Here is an example of how you can catch a disk failure in a logical drive. Drive 1I:1:11 was removed as a test. You can run your own test by removing a disk and checking the sautil output.

# sautil /dev/ciss5

******************************************************************************
****                                                                      ****
****             S A U T I L   S u p p o r t   U t i l i t y              ****
****                                                                      ****
****             for the HP SmartArray RAID Controller Family             ****
****                                                                      ****
****                           version A.02.11                            ****
****                                                                      ****
****  (C) Copyright 2003-2006 Hewlett-Packard Development Company, L.P.  ****
******************************************************************************


---- DRIVER INFORMATION ------------------------------------------------------

Driver State........................ READY

---- CONTROLLER INFORMATION --------------------------------------------------

Controller Product Number........... P400
Controller Product Name............. HP PCIe SmartArray P400
Hardware Path....................... 0/6/0/0/0/0/1/0/0/0
Serial Number....................... PA5360BBFSW2ON
Device File......................... /dev/ciss5
Hardware Revision................... 'B'
Boot Block Revision................. 0.02
Firmware Revision (running)......... 2.02
Firmware Revision (in ROM).......... 2.02
Firmware Revision (inactive)........ 1.96
# of Logical Drives................. 2
# of Physical Disks Configured...... 5
# of Physical Disks Detected........ 7
Logical Drive Rebuild Priority...... 118 (high)
Array Capacity Expansion Priority... 0 (low)
Auto-Fail Missing Disks at Boot..... enabled
SCSI Transfer Detection Mode........ Auto Detect


---- ARRAY ACCELERATOR (CACHE) INFORMATION -----------------------------------

Array Accelerator Board Present?.... yes
Cache Configuration Status.......... cache enabled
Cache Ratio......................... 100% Read / 0% Write
Total Cache Size (MB)............... 208
Read Cache........................ 208
Write Cache....................... 000
Transfer Buffer................... 000
Battery Pack Count.................. 1
Battery Status (pack #1)............ ok


---- LOGICAL DRIVE SUMMARY ---------------------------------------------------

#   RAID  Size      Status

0   1+0   34700 MB  OK
1   0     69401 MB  OK


---- SAS/SATA DEVICE SUMMARY -------------------------------------------------

Location  Ct  Enc  Bay  WWID                Type  Capacity  Status

internal  1I  1    12   0x500000e01117c732  DISK  36.4 GB   OK
N/A       1I  1    11   0x0000000000000000  N/A   N/A       FAILED
internal  1I  1    10   0x5000c5000032b839  DISK  36.4 GB   SPARE (activated)
internal  1I  1    9    0x5000c5000030b0c5  DISK  36.4 GB   OK
internal  2I  1    16   0x500000e011213482  DISK  36.4 GB   OK
internal  2I  1    15   0x5000c500002084c9  DISK  73.4 GB   UNASSIGNED
internal  2I  1    14   0x5000c5000030b9c9  DISK  36.4 GB   UNASSIGNED
internal  2I  1    13   0x500000e01118a7a2  DISK  36.4 GB   UNASSIGNED


---- SAS/SATA ENCLOSURE SUMMARY ----------------------------------------------

Location  Ct  Enc  Expander_count  Bay_count  SEP_count

internal  1I  1    0               4          1
internal  2I  1    0               4          1


---- LOGICAL DRIVE 0 ---------------------------------------------------------

Logical Drive Device File........... c5t0d0
Fault Tolerance Mode................ RAID 1+0 (Disk Mirroring)
Logical Drive Size.................. 34700 MB
Logical Drive Status................ OK
# of Participating Physical Disks... 2

Participating Physical Disk(s)...... Ct:Enc:Bay:WWID
                                     1I:1:12:0x500000e01117c732
                                     1I:1:11:0x0000000000000000 <-- NOT RESPONDING

Participating Spare Disk(s)......... Ct:Enc:Bay:WWID
                                     1I:1:10:0x5000c5000032b839 <-- activated
                                       for 1I:1:11:0x0000000000000000

Stripe Size......................... 128 KB
Logical Drive Cache Status.......... cache enabled
Configuration Signature............. 0xA00148CC
Media Exchange Detected?............ no
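
Note that in this output the Logical Drive Status is still OK because the spare took over, so a script that only looks at the logical drive status would miss the failed disk. A minimal sketch that picks up the "<--" markers on member disks instead, assuming disk lines keep the Ct:Enc:Bay:WWID form shown above; the function name `flagged_members` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads sautil output on stdin and prints
# "LD <n> <Ct:Enc:Bay:WWID> <note>" for every member or spare disk that
# sautil annotates with "<--" (for example NOT RESPONDING or activated).
# Assumption: any annotated disk is worth reporting.
flagged_members() {
    awk '
        BEGIN { ld = "?"; n = 0 }
        # Remember the number of the logical drive section we are in.
        /^[ \t]*---- LOGICAL DRIVE [0-9]+ / { ld = $4 }
        # Disk lines start with Ct:Enc:Bay:WWID, e.g. 1I:1:11:0x...
        $1 ~ /^[0-9]+[A-Z]:[0-9]+:[0-9]+:0x/ {
            pos = index($0, "<--")
            if (pos > 0) {
                note = substr($0, pos + 3)
                gsub(/^[ \t]+|[ \t]+$/, "", note)
                printf "LD %s %s %s\n", ld, $1, note
                n++
            }
        }
        # Exit status 1 when at least one disk was flagged.
        END { exit (n > 0) }
    '
}
```

For the controller in this example it would be run as `sautil /dev/ciss5 | flagged_members`.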