Disk array disk offline diagnosis: Help needed

 
kenny chia
Regular Advisor

Hi all,
For background on this case, please refer to http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1005265

I have solved the problem, but I need to know why the disks went offline.
----- server spec ---
Server: rp5430
Disk : DS2300
OS : HPUX 11.00
---------------------

Attached are the syslog and the irdiag -v output from when this incident occurred.

Some findings:
1. All the disks that went offline belong to lun0
2. No amber light on any of the disks

What could have happened? Are there any more log files that I can check?
All Your Bases Are Belong To Us!
Arunvijai_4
Honored Contributor

Re: Disk array disk offline diagnosis: Help needed

Hello,

What does # ioscan -fnC disk say? Do you see the disks in the CLAIMED state? Also, if you see no amber light on the disks, check the hardware.
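
As a rough sketch of that check, the filter below prints the hardware path and state of every disk entry whose S/W State column is not CLAIMED. The function name unclaimed_disks and the sample lines are made up for illustration (they are not output from this server), and the sketch assumes the usual ioscan -f column order of Class, I, H/W Path, Driver, S/W State.

```shell
#!/bin/sh
# unclaimed_disks: reads "ioscan -fnC disk" style text on stdin and prints
# "<H/W path> <S/W state>" for each disk entry that is not CLAIMED.
# Header and device-file lines are skipped because field 1 is not "disk".
unclaimed_disks() {
    awk '$1 == "disk" && $5 != "CLAIMED" { print $3, $5 }'
}

# Illustrative sample only (hypothetical paths and descriptions).
sample='Class     I  H/W Path       Driver S/W State   H/W Type     Description
disk      0  0/0/1/1.2.0    sdisk  CLAIMED     DEVICE       HP DISK
                           /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
disk      1  0/0/2/0.1.0    sdisk  NO_HW       DEVICE       HP DISK'

printf '%s\n' "$sample" | unclaimed_disks
```

On the server itself this would be run as ioscan -fnC disk | unclaimed_disks; no output means every disk is CLAIMED.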

-Arun
"A ship in the harbor is safe, but that is not what ships are built for"
Devender Khatana
Honored Contributor

Re: Disk array disk offline diagnosis: Help needed

Hi,

The error clearly indicates a hardware fault. Four of the nine disks are showing as failed, which leads me to suspect not the disks themselves but something else, such as a cable, terminator, or controller. If the system is under support, I would not take a chance and would call the vendor to diagnose this.

I would be really interested to know how you recovered from the situation. The process described in the earlier thread only covers recovery at the OS level, not at the hardware level. After a LUN goes offline due to disk failures, you should have been required to recreate the LUNs, but the earlier thread says that two file systems were recovered successfully using fsck. I would still be suspicious of the array and would not rely on it until the vendor provides a detailed explanation.

HTH,
Devender
Impossible itself mentions "I m possible"
kenny chia
Regular Advisor

Re: Disk array disk offline diagnosis: Help needed

Hi Devender,
I did the following:

1) Checked the server; no amber light found
2) Used the irm utility to bring online all the disks that were in the "failed" state
3) Unmounted all the affected filesystems: /users, /users1, /users2
4) Ran fsck on lvol1 (/users), lvol2 (/users1), and lvol3 (/users2)
5) Mounted /users and /users1
6) Could not mount /users2, so ran mkfs on it and mounted it
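
Steps 3 to 5 above could be sketched roughly as the script below. The volume group name vg01 and the vxfs filesystem type are assumptions (neither is stated in this thread), and the irm step and the mkfs on /users2 are deliberately left out. By default it only prints the commands; set RUN to an empty value to actually run them.

```shell
#!/bin/sh
# Dry-run sketch of steps 3-5: unmount, fsck, then mount again.
VG=${VG-vg01}            # ASSUMPTION: hypothetical volume group name
FSTYPE=${FSTYPE-vxfs}    # ASSUMPTION: use hfs if the volumes are HFS
RUN=${RUN-echo}          # prints commands by default; RUN= executes them

# "lvol:mountpoint" pairs taken from steps 3 and 4
PAIRS="lvol1:/users lvol2:/users1 lvol3:/users2"

recover_all() {
    # Step 3: unmount every affected filesystem first
    for p in $PAIRS; do $RUN umount "${p#*:}"; done
    # Step 4: check each logical volume through its raw device
    for p in $PAIRS; do $RUN fsck -F "$FSTYPE" "/dev/$VG/r${p%%:*}"; done
    # Step 5: mount again; review the fsck output before trusting this
    for p in $PAIRS; do $RUN mount "/dev/$VG/${p%%:*}" "${p#*:}"; done
}

recover_all
```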

I do not think it is a cable problem, as the affected disks do not belong to any disk channel (see my irdiag -v output attached at the top of the thread).
All Your Bases Are Belong To Us!