how to identify the failed disk

 
newunix
Frequent Advisor

how to identify the failed disk

Suppose a disk fails in the storage array. How can we find that particular failed disk from the command prompt?

When one disk failed, the others simply removed the failed disk from the SAN and inserted a new one, then used Command View to watch the leveling. Once the leveling was done, they said there was no problem.

But my doubt is: how can we identify the particular failed disk, since it might be used by any server and might have a lot of VGs on it?

I can't run ioscan on every server (there are too many servers).
Kapil Jha
Honored Contributor

Re: how to identify the failed disk

I think there are several issues here: you are asking about the SAN but looking on the UNIX box.
HP-UX does not know how the LUNs presented to it are built, how many disks those LUNs span, etc.

As for how disks are checked or replaced on the SAN, I am not sure how they do it, but disks at the SAN level are generally redundant, so the storage team will have its own procedures for replacing faulty disks.


BR,
Kapil+
I am in this small bowl, I wane see the real world......
Johnson Punniyalingam
Honored Contributor

Re: how to identify the failed disk

>>how to identify the failed disk<<

dmesg
/var/adm/syslog/syslog.log
/var/opt/resmon/log/event.log
ioscan -fnC disk -> look for disks in NO_HW state
dd if=/dev/dsk/cxtxdx of=/dev/null bs=128k -> reads the whole disk as a health check
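
The ioscan check above can be scripted so you do not have to read the output by eye. This is only a rough sketch: the function name no_hw_disks is made up for illustration, and it assumes the usual ioscan -f column order (Class, Instance, H/W Path, Driver, S/W State, H/W Type, Description), so verify it against your own output first.

```shell
#!/bin/sh
# Sketch: print the hardware path of every disk whose S/W State is NO_HW.
# Reads `ioscan -fnC disk` output on stdin.
# Assumption: S/W State is the 5th whitespace-separated column.
no_hw_disks() {
    awk '$1 == "disk" && $5 == "NO_HW" { print $3 }'
}

# Usage on an HP-UX host:
#   ioscan -fnC disk | no_hw_disks
```

No output means ioscan did not report any disk as NO_HW on that host.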

>>When one disk failed, the others simply removed the failed disk from the SAN and inserted a new one, then used Command View to watch the leveling. Once the leveling was done, they said there was no problem.<<

Yes, of course. You will not see any impact, because the storage provides redundancy (RAID) underneath.

>>But my doubt is: how can we identify the particular failed disk, since it might be used by any server and might have a lot of VGs on it?<<

Very hard to answer. Since you are new to UNIX: it is part of the sysadmin's job to check that your organisation has set up proper alert notifications, such as email alerts, SMS alerts, Patrol alerts, etc. This is called "monitoring".

>>I can't run ioscan on every server (there are too many servers).<<

If the storage is connected to several servers, for example LUNs accessed by a 4-node cluster, you have no choice: you need to run ioscan on every server to check that a disk shown as NO_HW is CLAIMED again after you replace the faulty disk. Sometimes, depending on the storage technology, a hot spare disk kicks in, so you won't notice any impact.
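
If the number of servers is the problem, the per-server ioscan can be looped over a host list. A minimal sketch, assuming password-less ssh to every host and a plain text file with one hostname per line; the function name scan_hosts and the file name in the usage line are hypothetical, and the awk filter assumes S/W State is the 5th column of ioscan -f output.

```shell
#!/bin/sh
# Sketch: run `ioscan -fnC disk` on every host listed in the file "$1"
# and print "host: hardware-path" for each disk reported as NO_HW.
# Assumptions: key-based ssh works for every host; one hostname per line.
scan_hosts() {
    while read -r host; do
        [ -n "$host" ] || continue
        # -n stops ssh from consuming the rest of the host list on stdin
        ssh -n "$host" 'ioscan -fnC disk' 2>/dev/null |
            awk -v h="$host" '$1 == "disk" && $5 == "NO_HW" { print h ": " $3 }'
    done < "$1"
}

# Usage (hypothetical file name):
#   scan_hosts /tmp/hpux_hosts.txt
```

Hosts that cannot be reached simply produce no lines here, so a real script would probably want to report ssh failures separately.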
Problems are common to all, but attitude makes the difference
sujit kumar singh
Honored Contributor

Re: how to identify the failed disk

Hi,

If this is a local disk, i.e. not a SAN disk presented to the server, a failed disk of this type can show:

1) lbolt errors in syslog
2) if it is part of a VG, it can show as unavailable in vgdisplay -v for that VG
3) diskinfo can show a size of 0 bytes
4) ioscan can show it as NO_HW
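
For point 2, the vgdisplay -v output can be filtered for physical volumes whose status is unavailable. A rough sketch only: the function name unavailable_pvs is invented, and it assumes the "PV Name" / "PV Status" line layout that vgdisplay -v normally prints, with alternate links marked "Alternate Link" on their own "PV Name" line.

```shell
#!/bin/sh
# Sketch: print every PV that `vgdisplay -v` reports as unavailable.
# Reads vgdisplay -v output on stdin.
# Assumption: each "PV Status" line follows the "PV Name" line(s) it
# belongs to, and alternate-link lines carry the words "Alternate Link".
unavailable_pvs() {
    awk '
        $1 == "PV" && $2 == "Name" && $4 != "Alternate" { pv = $3 }
        $1 == "PV" && $2 == "Status" && $3 == "unavailable" { print pv }
    '
}

# Usage on an HP-UX host:
#   vgdisplay -v | unavailable_pvs
```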


But if it is a SAN LUN, it is basically a virtual disk presented to the server as a LUN. The Vdisk is taken care of by the SAN architecture (RAID level, redundancy, failure protection, etc.), which is managed entirely by the storage. What the server sees is the LUN, not the underlying disk architecture. So nothing has happened on your side: if a disk in the SAN storage fails, HP-UX might not be able to know about it; it will be an error/alert in the SAN storage only.

Regards
Sujit
amithp
Frequent Advisor

Re: how to identify the failed disk

This cannot be identified from the servers unless you run ioscan for disks (in case they are unavailable) on each server in turn, or look at the events in syslog (or in /var/opt/resmon/log/event.log), which will specify the hardware path of the failed disk.

But I think the best way is to find the trouble from the storage first. The LUN associated with the disk will be presented/mapped to a server. This can be seen from Command View. You can also check the SAN switch zoning information. Later on you can look at the logs (syslog, event.log) on the server(s), which will show the troubled disk.
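
For the log check on the server side, a simple grep is usually enough to start with. The sketch below only searches for the word "lbolt", which was mentioned earlier in this thread as a typical sign in syslog; the function name disk_log_errors is made up, and the exact message text on your system may differ, so treat the pattern as a starting point.

```shell
#!/bin/sh
# Sketch: print lines containing "lbolt" (case-insensitive, with line
# numbers) from the given log files. With no arguments it falls back to
# the HP-UX syslog path mentioned in this thread.
disk_log_errors() {
    [ $# -gt 0 ] || set -- /var/adm/syslog/syslog.log
    grep -in 'lbolt' "$@"
}

# Usage on an HP-UX host:
#   disk_log_errors
#   disk_log_errors /var/adm/syslog/syslog.log /var/adm/syslog/OLDsyslog.log
```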
AVV
Super Advisor

Re: how to identify the failed disk

Hi,

Normally a disk failure at the storage end does not affect the server, because LUNs are allocated with redundancy. That is the normal procedure, and it is why we use storage techniques.

You can identify failed disks from your HP-UX system in many ways:

1. syslog
2. dmesg
3. OVO tools, if installed, send a notification
4. vgdisplay -v
5. diskinfo
6. ioscan
7. dd
subodhbagade
Regular Advisor

Re: how to identify the failed disk

Hi,

Follow these steps:

(1) # ioscan -fnC disk      (lists all the disks)

(2) # diskinfo /dev/rdsk/cxtxdx

If the size is 0 KB, the disk is malfunctioning.

(3) The following shows an unsuccessful read of the whole disk:
# dd if=/dev/rdsk/c1t3d0 of=/dev/null bs=1024k
dd read error: I/O error
0+0 records in
0+0 records out
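
Steps (2) and (3) can be wrapped into one small function per device. This is a sketch under assumptions, not a tested tool: the name check_disk is invented, and it relies on diskinfo printing a "size: N Kbytes" line. Note that the dd pass reads the entire disk, so it can take a long time on large devices.

```shell
#!/bin/sh
# Sketch: check one raw device, e.g. /dev/rdsk/c1t3d0.
# Step (2): fail if diskinfo reports no size or a size of 0.
# Step (3): fail if dd cannot read the whole device.
# Assumption: diskinfo output contains a line like "size: 35566480 Kbytes".
check_disk() {
    dev=$1
    size=$(diskinfo "$dev" 2>/dev/null | awk '$1 == "size:" { print $2 }')
    if [ -z "$size" ] || [ "$size" -eq 0 ]; then
        echo "$dev: diskinfo reports no usable size"
        return 1
    fi
    # A non-zero exit status from dd means at least one read failed.
    dd if="$dev" of=/dev/null bs=1024k 2>/dev/null || {
        echo "$dev: dd read error"
        return 1
    }
    echo "$dev: OK ($size Kbytes)"
}

# Usage on an HP-UX host:
#   check_disk /dev/rdsk/c1t3d0
```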


Regards,
Subodh.

Torsten.
Acclaimed Contributor

Re: how to identify the failed disk

If you say "SAN" and "leveling" and "command view" - is it an EVA?

You cannot see failed disks from HP-UX, only from Command View.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!
