
Lost a disk drive

 
Don Spare
Regular Advisor

Lost a disk drive

I have an HP 9000 L2000 with 4 internal 9GB disk drives which are set up as 2 mirrored pairs - one for vg00 and one for vg01. vg00 has the Unix installation and vg01 has the Oracle software and other Oracle-related stuff.
The filesystems on vg01 will not mount and fsck complains that there is no logical volume. When I look at the drives, there are no error lights and they seem to be running properly but the system cannot see them.

What is the best/easiest way to recover this? Where do I look for error messages about what the problem is? Fortunately this is not a production machine but the developers are getting a bit antsy about not having their server/databases available.
melvyn burnard
Honored Contributor

Re: Lost a disk drive

Perhaps this document may assist you

http://docs.hp.com/en/5991-1236/When_Good_Disks_Go_Bad.pdf
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Rick Garland
Honored Contributor

Re: Lost a disk drive

Did something happen recently?

Are the VGs active? (vgchange)

What does the vgdisplay output show?

What about the 'strings /etc/lvmtab' command? What is the output?

Need some more info...
DCE
Honored Contributor

Re: Lost a disk drive

From the sound of it, you have a problem with either your SCSI controllers or the cables. Look in the syslog file (/var/adm/syslog/syslog.log) for error messages - probably something with an lbolt error.
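
A minimal sketch of that syslog check, assuming the relevant SCSI messages contain the string "lbolt" (the exact wording varies by driver and HP-UX release). The function name count_lbolt is made up for illustration.

```shell
#!/bin/sh
# Count syslog lines that mention "lbolt", ignoring case.
# Usage: count_lbolt [logfile]
# With no argument it reads the HP-UX syslog path named above.
count_lbolt() {
    logfile=${1:-/var/adm/syslog/syslog.log}
    # -c prints the number of matching lines, -i ignores case
    grep -ic 'lbolt' "$logfile"
}
```

To read the most recent matching lines rather than a count, something like `grep -i lbolt /var/adm/syslog/syslog.log | tail` should do.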

Mel Burslan
Honored Contributor

Re: Lost a disk drive

Looks like your lvm headers on vg01 volumes bit the dust. One thing you can try is

vgcfgrestore -n vg01 /dev/rdsk/cXtXdX

for the disks in vg01 and after that

vgchange -a y vg01

Then try running your fsck commands to see if they work. If fsck still complains about not finding the logical volumes, it is time to recall the backup tapes (you have them, don't you?) from the vault.
________________________________
UNIX because I majored in cryptology...
Cem Tugrul
Esteemed Contributor

Re: Lost a disk drive

Hi Don,
If I were you, I would start with this command:

#ioscan -fnCdisk
and look at the S/W State column.
What state are the disks in?
Our greatest duty in this life is to help others. And please, if you can't
Cem Tugrul
Esteemed Contributor

Re: Lost a disk drive

and then follow Rick's suggestions...
Good Luck,
Don Spare
Regular Advisor

Re: Lost a disk drive

vgdisplay shows:
vgdisplay: Volume group not activated.
vgdisplay: Cannot display volume group "/dev/vg01".
VG Name                     /dev/vgt01
VG Write Access             read/write
VG Status                   available
Max LV                      255
Cur LV                      1
Open LV                     1
Max PV                      16
Cur PV                      1
Act PV                      1
Max PE per PV               14080
VGDA                        2
PE Size (Mbytes)            4
Total PE                    14077
Alloc PE                    14077
Free PE                     0
Total PVG                   0
Total Spare PVs             0
Total Spare PVs in use      0

ioscan -fnCdisk shows:
Class  I   H/W Path               Driver  S/W State  H/W Type  Description
================================================================================
disk   1   0/0/1/1.2.0            sdisk   CLAIMED    DEVICE    SEAGATE ST39204LC
           /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
disk   3   0/0/2/0.2.0            sdisk   CLAIMED    DEVICE    SEAGATE ST39204LC
           /dev/dsk/c2t2d0   /dev/rdsk/c2t2d0
disk   4   0/0/2/1.2.0            sdisk   CLAIMED    DEVICE    HP DVD-ROM 304
           /dev/dsk/c3t2d0   /dev/rdsk/c3t2d0
disk   77  0/4/0/0.97.4.19.0.0.0  sdisk   CLAIMED    DEVICE    DGC CX600WDUNB
           /dev/dsk/c12t0d0   /dev/dsk/disk_query   /dev/rdsk/c12t0d0
disk   79  0/4/0/0.97.4.19.0.0.1  sdisk   CLAIMED    DEVICE    DGC CX600WDR5
           /dev/dsk/c12t0d1   /dev/rdsk/c12t0d1
disk   76  0/4/0/0.97.8.19.0.0.0  sdisk   CLAIMED    DEVICE    DGC CX600WDUNB
           /dev/dsk/c13t0d0   /dev/rdsk/c13t0d0
disk   81  0/4/0/0.97.8.19.0.0.1  sdisk   CLAIMED    DEVICE    DGC CX600WDR5
           /dev/dsk/c13t0d1   /dev/rdsk/c13t0d1
disk   80  0/7/0/0.98.4.19.0.0.0  sdisk   CLAIMED    DEVICE    DGC CX600WDUNB
           /dev/dsk/c16t0d0   /dev/rdsk/c16t0d0
disk   82  0/7/0/0.98.4.19.0.0.1  sdisk   CLAIMED    DEVICE    DGC CX600WDR5
           /dev/dsk/c16t0d1   /dev/rdsk/c16t0d1
disk   78  0/7/0/0.98.8.19.0.0.0  sdisk   CLAIMED    DEVICE    DGC CX600WDUNB
           /dev/dsk/c17t0d0   /dev/rdsk/c17t0d0
disk   83  0/7/0/0.98.8.19.0.0.1  sdisk   CLAIMED    DEVICE    DGC CX600WDR5
           /dev/dsk/c17t0d1   /dev/rdsk/c17t0d1
.
.
strings /etc/lvmtab shows:
/dev/vg00
/dev/dsk/c1t2d0
/dev/dsk/c2t2d0
/dev/vg01
/dev/dsk/c1t0d0
/dev/dsk/c2t0d0
/dev/vgt01
/dev/dsk/c12t0d1
/dev/dsk/c13t0d1
/dev/dsk/c16t0d1
/dev/dsk/c17t0d1
.
So I'm thinking the disks are there but somehow the volume group info got lost. Would that be right? Should I try the volume group recovery commands?


Mel Burslan
Honored Contributor

Re: Lost a disk drive

vgcfgrestore -n vg01 /dev/rdsk/c1t0d0
vgcfgrestore -n vg01 /dev/rdsk/c2t0d0

should restore the LVM config information on these disks. Then don't forget to activate the VG with

vgchange -a y vg01

good luck
Rick Garland
Honored Contributor

Re: Lost a disk drive

Looks like all that is needed is to activate the VG

vgchange -a y vg01

Andrew Rutter
Honored Contributor

Re: Lost a disk drive

hi,

The volume group recovery won't work, as the vg01 disks are missing from the ioscan output:

/dev/vg01
/dev/dsk/c1t0d0
/dev/dsk/c2t0d0

Do a full ioscan to scan the whole system, then run
#insf -e
and repeat the ioscan for the disks.

If this still shows the disks as missing, I would try a reboot and reseat the drives.

If they are still missing after that, they are likely faulty and will need replacing.

Can you see the LED on the drive from the front?
It should be green even with no activity.

andy
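
The comparison behind this diagnosis can be sketched as a small shell function. This is only an illustration: the function name missing_disks and the idea of saving both command outputs to files first are assumptions, not something posted in the thread.

```shell
#!/bin/sh
# Print block device files that lvmtab names but ioscan does not list.
# $1: file holding the output of "strings /etc/lvmtab"
# $2: file holding the output of "ioscan -fnCdisk"
missing_disks() {
    awk '
        # First file (ioscan): remember every whitespace-separated field
        FILENAME == ARGV[1] { for (i = 1; i <= NF; i++) seen[$i] = 1; next }
        # Second file (lvmtab): report /dev/dsk paths that ioscan never listed
        /^\/dev\/dsk\// && !($1 in seen) { print $1 }
    ' "$2" "$1"
}
```

For example, after `strings /etc/lvmtab > /tmp/lvmtab.txt` and `ioscan -fnCdisk > /tmp/ioscan.txt`, running `missing_disks /tmp/lvmtab.txt /tmp/ioscan.txt` on the outputs posted above would print /dev/dsk/c1t0d0 and /dev/dsk/c2t0d0, the two vg01 disks.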
melvyn burnard
Honored Contributor

Re: Lost a disk drive

I agree with Andrew, you appear to be missing the relevant disks in the ioscan output.

You need to check that the discs are seated properly, and possibly do a reboot.
Other than that, it may need a hardware call with HP.
Cem Tugrul
Esteemed Contributor

Re: Lost a disk drive

How about these steps?

1- vgchange -a y vg01
If it activates successfully, check /etc/fstab for the mount point:
cat /etc/fstab
2- If there is no entry for the filesystem, mount it explicitly:
mount /dev/vg01/lvol1 /mountdir
If there is an entry, just run:
mount /mountdir

Good Luck,
Cem Tugrul
Esteemed Contributor

Re: Lost a disk drive

Oops, sorry.

Your ioscan -fnCdisk output does not show the vg01 disks:

/dev/vg01
/dev/dsk/c1t0d0
/dev/dsk/c2t0d0

So ignore my steps. Instead, first run
#insf -e

then run
#ioscan

and once more
#ioscan -fnCdisk
If /dev/dsk/c1t0d0 and /dev/dsk/c2t0d0 are still not shown, reboot your server if possible.

good luck,

Don Spare
Regular Advisor

Re: Lost a disk drive

Okay......
The server is back up with all disks showing and the databases up and running. I took it down, removed the power, reseated the cards, put it all back together and brought it back up without attempting to mount the Oracle sw filesystem. I then did an ioscan which showed the missing devices. A mountall then took care of the rest. I'm still not sure what exactly was wrong but for now it is running and I don't need to request a service call on a server that has no service contract.

Thank you all for your suggestions.
Matti_Kurkela
Honored Contributor

Re: Lost a disk drive

An L2000 with internal disk drives that seem to vanish from ioscan and come back when the server is power-cycled? Sounds like a problem we had on several L series servers a year or two ago. Several versions of 9 and 18 GB disks required a firmware upgrade.

The affected disks were these Seagate models with HP firmware:
ST136403LC
ST318203LC
ST318203LW
ST39103LC
ST39103LW

If you have one of these disk types, check the firmware version with the "diskinfo -v" command. If the "rev level" is less than HP04, you will need to upgrade.

Use the ITRC's patch database to find a disk firmware patch named PF_DSEACH3HP04.
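
A rough sketch of that check as a shell function that reads "diskinfo -v" output on standard input. It assumes the output carries "product id:" and "rev level:" lines and that revisions have the form HPnn, so the number can be compared. The function name needs_fw_upgrade and the verdict words are made up for illustration.

```shell
#!/bin/sh
# Read "diskinfo -v" output on stdin and print one verdict:
#   upgrade       affected model with a rev level below HP04
#   ok            affected model at HP04 or later
#   not-affected  model is not in the list given above
#   unknown-rev   rev level is not of the assumed form HPnn
needs_fw_upgrade() {
    awk -F': *' '
        /product id:/ { model = $2 }
        /rev level:/  { rev = $2 }
        END {
            # Affected Seagate models, as listed in this post
            affected = " ST136403LC ST318203LC ST318203LW ST39103LC ST39103LW "
            sub(/ +$/, "", model)
            sub(/ +$/, "", rev)
            if (index(affected, " " model " ") == 0) { print "not-affected"; exit }
            if (rev !~ /^HP[0-9]+$/) { print "unknown-rev"; exit }
            # Compare the numeric part after the HP prefix
            if (substr(rev, 3) + 0 < 4) print "upgrade"
            else print "ok"
        }'
}
```

Usage would look like `diskinfo -v /dev/rdsk/c1t0d0 | needs_fw_upgrade`, repeated for each internal disk.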
MK
Don Spare
Regular Advisor

Re: Lost a disk drive

We have the ST39103LC disks. But... we won't be making any changes. This server is scheduled for shutdown by the end of October and management is unwilling to pay for upgrades to a server that has been running for 5 years and will be gone shortly. Thanks for the info though.


Matti_Kurkela
Honored Contributor

Re: Lost a disk drive

Suit yourself.

However, I seem to recall that the problem in those disks was that the disk's internal diagnostic does not accurately compensate for the aging of some electronic component.

When the disk "vanishes" for the first time, it is a sign that the component is "borderline" according to the (invalid) diagnostic rule. It will pass the diagnostic when cold, but may fail when hot. As time passes, the component will get "worse". Eventually the situation may get so bad that the disk will not start up even when cold.

The catch is: a disk that is in a "failed" state won't take the firmware upgrade, so if you want to use the patch, you must do it before the disk fails completely.

The firmware patch changes the disk's internal diagnostic rules, so that normal aging of the disk will not cause a failure.
You won't need an HP engineer to install the patch: see the patch description for installation instructions.

If you choose not to run the firmware upgrade, I suggest you keep your backups carefully up to date.

I think you might survive till October, but you might have to power-cycle the server once or twice during that time to get the disks back again.

If you lose one of the disks permanently, be assured that the other is not far behind.
MK
Don Spare
Regular Advisor

Re: Lost a disk drive

Matti,
Thank you for your input. Since I am not a trained HP-UX admin, I was not aware of that issue. I will pass this info along.

Don