
Re: Root disk failing but not power-failed. How do I replace?

 
SOLVED
Ray Humpage
Frequent Advisor

Root disk failing but not power-failed. How do I replace?

I have a mirrored root disk which is failing.

In the pvdisplay output it shows as unavailable.

root cadprod:/> pvdisplay /dev/dsk/c1t2d0
--- Physical volumes ---
PV Name /dev/dsk/c1t2d0
VG Name /dev/vg00
PV Status unavailable
Allocatable yes


Looking at one of the LVs, it shows its extents as stale.

00000 /dev/dsk/c1t2d0 00545 stale /dev/dsk/c2t2d0 00545 current
00001 /dev/dsk/c1t2d0 00546 stale /dev/dsk/c2t2d0 00546 current
00002 /dev/dsk/c1t2d0 00547 stale /dev/dsk/c2t2d0 00547 current


But it's still claimed.

disk 0 0/0/1/1.2.0 sdisk CLAIMED DEVICE SEAGATE ST39204LC
/dev/dsk/c1t2d0 /dev/rdsk/c1t2d0


Should I lvreduce all of the vg00 LVs and then vgreduce vg00? Or do I wait for this drive to power-fail, or just simply pull it and replace it?
RAC_1
Honored Contributor

Re: Root disk failing but not power-failed. How do I replace?

lvreduce -m 0 /dev/vg00/lvolx

(Repeat it for all lvols. You may also need to use the -k option.)

vgreduce /dev/vg00 /dev/dsk/cxtxdx

Shut down. Replace the disk. No need to shut down if the disk is hot-swappable.

Once replaced.

vgcfgrestore -n /dev/vg00 /dev/rdsk/cxtxdx
vgchange -a y vg00
vgsync vg00

Anil
There is no substitute to HARDWORK
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Root disk failing but not power-failed. How do I replace?

First make sure that all the extents on the good drive are "current". I'll assume this is a hot-plug drive. Next, we will convert this to a hard failure by pulling the bad drive out a few centimeters and letting it spin down. Wait about 60 seconds and then pull the drive completely out. You can then begin the normal "hot" replacement of a failed boot drive.

Insert the replacement drive and wait about 60 seconds for it to become ready.

vgcfgrestore -n /dev/vg00 /dev/rdsk/c1t2d0
vgchange -a y /dev/vg00
mkboot /dev/rdsk/c1t2d0
mkboot -a "hpux -lq (;0)/stand/vmunix" /dev/rdsk/c1t2d0
lvlnboot -R
vgsync /dev/vg00

That should fix you.
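
The "make sure all the extents on the good drive are current" step can be checked with a small filter instead of reading the extent map by eye. This is only a sketch: the function name `stale_on_pv` is made up for illustration, and it assumes the extent rows look like the ones quoted earlier in this thread (an LE number followed by PV / PE / status triples).

```shell
# Hypothetical helper: count extents on one PV whose status is not "current".
# Reads `lvdisplay -v` output on stdin; $1 is the PV path to check.
stale_on_pv() {
    awk -v pv="$1" '
        $1 ~ /^[0-9]+$/ {                  # extent rows start with the LE number
            for (i = 2; i + 2 <= NF; i += 3)
                if ($i == pv && $(i + 2) != "current") n++
        }
        END { print n + 0 }                # print 0 when nothing matched
    '
}

# Example (run once per LV); the good disk should report 0 before the bad one is pulled:
# lvdisplay -v /dev/vg00/lvol3 | stale_on_pv /dev/dsk/c2t2d0
```

A count of 0 for the surviving disk on every LV suggests it holds a complete, current copy; anything else is a reason to stop before pulling the drive.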
If it ain't broke, I can fix that.
Mel Burslan
Honored Contributor

Re: Root disk failing but not power-failed. How do I replace?

Although it reports as claimed, that does not necessarily mean it is still functional. The unavailable status comes from the disk's failure to communicate with the disk controller. So your drive is, for all practical purposes, dead.
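
Since the ioscan state can still say CLAIMED, the LVM view is the one worth scripting against. A minimal sketch, assuming the `pvdisplay` layout shown in the original question; the function name `pv_status` is hypothetical.

```shell
# Hypothetical helper: print the "PV Status" value from pvdisplay output on stdin.
pv_status() {
    awk '!seen && $1 == "PV" && $2 == "Status" { print $3; seen = 1 }'
}

# Example:
# pvdisplay /dev/dsk/c1t2d0 | pv_status
```

With the output quoted in the question this would print "unavailable", regardless of what ioscan reports for the same device.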

Depending on your particular case, i.e., whether it is hot pluggable or not, the regular procedures need to be followed to replace a bad root disk.

If it is a hot-plug type, just yank the drive out of the cage and put in an identical drive. Then I usually follow my cookbook, as seen below:

DEVICE=c1t2d0
vgcfgrestore -n vg00 /dev/rdsk/${DEVICE}
mkboot -a "hpux -lq (;0)/stand/vmunix" /dev/rdsk/${DEVICE}
mkboot -l /dev/rdsk/${DEVICE}
cd /usr/sbin/diag/lif
mkboot -b updatediaglif2 -p ISL -p AUTO -p HPUX -p LABEL /dev/rdsk/${DEVICE}
lvlnboot -R
lvlnboot -v #(to verify everything looks right)
vgchange -a y /dev/vg00
vgsync vg00

Good luck and cross your fingers :)
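
Before running a sequence like this against a root volume group, it can help to print it for review first. The sketch below only echoes the LVM and boot-area steps of the cookbook above for a given device and runs nothing. The function name `print_replace_plan` is made up, vg00 is assumed as the default volume group, and the diagnostic LIF step (the `cd` and `mkboot -b` lines) is left out for brevity.

```shell
# Hypothetical dry-run helper: print (do not run) the cookbook steps for a device.
# $1 = device name such as c1t2d0, $2 = volume group (assumed default: vg00).
print_replace_plan() {
    dev=/dev/rdsk/$1
    vg=${2:-vg00}
    cat <<EOF
vgcfgrestore -n $vg $dev
mkboot -a "hpux -lq (;0)/stand/vmunix" $dev
mkboot -l $dev
lvlnboot -R
lvlnboot -v
vgchange -a y /dev/$vg
vgsync $vg
EOF
}

# Example: review the plan for the disk in this thread.
# print_replace_plan c1t2d0
```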
________________________________
UNIX because I majored in cryptology...
Ray Humpage
Frequent Advisor

Re: Root disk failing but not power-failed. How do I replace?

I have one lv which is only on the bad disk.
For some reason it never was mirrored.

/dev/vg00/lvlocal 1024000 395765 589708 40% /usr/local

These are programs we created, and they are all backed up, etc.

I'm just wondering: if I remove the disk and this LV has problems, will it then cause problems for the whole of vg00?

Or can I just cause the drive to fail and rebuild and then restore this lv from backup?
Steven E. Protter
Exalted Contributor

Re: Root disk failing but not power-failed. How do I replace?

A disk doing what this is doing must be replaced.

As A. Clay notes, make sure the extents are current. Then see if you can get a make_tape_recovery backup to run, and get that disk out of the system and replaced.

It would be a very bad idea to just pull and replace. You can save yourself a lot of work by trying to get a good backup.

If you already have a recent make_tape_recovery backup then you can pull the drive, replace it and then boot the system off the tape.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Ray Humpage
Frequent Advisor

Re: Root disk failing but not power-failed. How do I replace?

Ok. I got the disk to fail.

disk 0 0/0/1/1.2.0 sdisk NO_HW DEVICE SEAGATE ST39204LC
/dev/dsk/c1t2d0 /dev/rdsk/c1t2d0


Here is what one of the LVs shows:
--- Logical extents ---
LE PV1 PE1 Status 1
00000 /dev/dsk/c1t2d0 01795 current
00001 /dev/dsk/c1t2d0 01796 current
00002 /dev/dsk/c1t2d0 01797 current
00003 /dev/dsk/c1t2d0 01798 current

How can it be current on a disk that doesn't even exist?
A. Clay Stephenson
Acclaimed Contributor

Re: Root disk failing but not power-failed. How do I replace?

Because from an LVM perspective those PEs have not been written to; i.e., there have been no changes to those PEs. If you needed to update those extents, then those on the bad disk would become stale.
If it ain't broke, I can fix that.
Devender Khatana
Honored Contributor

Re: Root disk failing but not power-failed. How do I replace?

Hi,

Steven - As the disk was already mirrored, make_tape_recovery should be the second option. The first option should be the vgcfgrestore method as stated in earlier posts. The biggest advantage here is that no downtime is involved.

Your lvol, which was allocated only on the failed disk and was not mirrored to the second disk, will be recreated and will have no impact on vg00. Once vgcfgrestore and the other procedures required for a bootable disk are complete, you can restore the data. The data can be retrieved from a file system backup or any recent make_tape_recovery tape (as it was a file system in vg00).

Prior to restoring the backup, I suggest mirroring this lvol onto the second (good) disk:
#lvextend -m 1 /dev/vg00/lvlocal /dev/dsk/c2t2d0
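
To find any other logical volumes that were never mirrored, the `lvdisplay` output can be filtered for a mirror count of zero. This is a sketch under assumptions: the function name `unmirrored_lvs` is hypothetical, and it relies on the usual "LV Name" and "Mirror copies" fields in `lvdisplay` output.

```shell
# Hypothetical helper: list LVs whose "Mirror copies" field is 0.
# Reads the output of `lvdisplay` for one or more LVs on stdin.
unmirrored_lvs() {
    awk '
        $1 == "LV" && $2 == "Name"       { name = $3 }
        $1 == "Mirror" && $2 == "copies" { if ($3 == 0) print name }
    '
}

# Example (assumes the LV names in vg00 all start with "lv"):
# lvdisplay /dev/vg00/lv* | unmirrored_lvs
```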

Please note that if for any reason you need to reboot the system before replacing the failed disk, it might not boot properly. This can happen for two reasons; check both on the existing good disk to ensure a trouble-free reboot.

#lifcp /dev/rdsk/cxtydz:AUTO - (on the existing good disk)
(It should display "hpux -lq" or similar, depending on what was set.)
(If "hpux -lq" is not set, then while rebooting, halt at the PDC prompt, interact with IPL, and enter "hpux -lq"; or set "hpux -lq" in the AUTO file now.)

#setboot -a (H/W path of /dev/rdsk/cxtydz)
(Required if your alternate boot path is not set to boot from this drive; confirm with the setboot command.)


This should set your second disk as the alternate boot path if that was not done earlier.

This is important because your mirror drive is intact and the original drive has failed.
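
The AUTO file check can be turned into a yes/no test. A minimal sketch, assuming the `lifcp ... :AUTO -` form shown above writes the AUTO string to standard output; the function name `auto_has_lq` is hypothetical.

```shell
# Hypothetical helper: succeed if the AUTO string on stdin contains the -lq option.
auto_has_lq() {
    grep -q -e '-lq'
}

# Example (cxtydz is the placeholder for the good disk, as in the post above):
# lifcp /dev/rdsk/cxtydz:AUTO - | auto_has_lq || echo "AUTO file lacks -lq"
```

The -lq option matters here because it lets the system boot without LVM quorum, which is the situation when one half of the root mirror is missing.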

Just a thought.
Good Luck.

HTH,
Devender
Impossible itself mentions "I m possible"
Saravanan_15
Advisor

Re: Root disk failing but not power-failed. How do I replace?

Hi Ray,
As our experts said, you can remove the disk and re-mirror the LVs using another PV. If you are not planning to remove the disk now, you can try this.
You can try increasing the PV I/O timeout. By default the PV timeout is 30 seconds; you can increase it to 120 seconds with:

#pvchange -t 120 /dev/dsk/c1t2d0

But the PV will become available only after VG reactivation or a system reboot. Once it has become available, you can synchronize each logical volume with:

# lvsync /dev/vg00/lvolx

Most probably this will help you. Also try installing the latest HE (Hardware Enablement) patches released by HP.