Operating System - HP-UX

Replacing a non-mirrored disk that is a part of a stripe

 
SOLVED
Jerry Friend_1
Frequent Advisor

Help. We have a K100 running HP-UX 11.0. We also have all of our disks in a Jamaica array. Only the OS disk is mirrored. The other disks are striped into a single volume group. One of the disks in vg01, a stripe of 6 disks, had I/O problems last night and it shut down our database. We rebooted the server and everything looks OK now, but I am worried about a hardware failure and would like to replace the disk in question. I am a newbie, so I don't know what I need to do to replace a single disk that is part of a stripe. What would be the best way to accomplish this?
It's not what you know that counts, but your willingness to learn.
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Replacing a non-mirrored disk that is a part of a stripe

Well, since you aren't mirrored there is no magic answer here. You are first going to have to do a backup (or restore from an existing backup).

I assume that multiple filesystems are affected. You can run pvdisplay -v /dev/dsk/cXtYd0 to determine which lvols are contained on the disk in question. You will only need to restore data to the LVOLs displayed by this command; the others should be OK.
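As a rough sketch of that check, the filter below reduces pvdisplay -v output to the distinct logical volume paths it mentions. It assumes the LVs appear in the output as /dev/<vg>/<lv> paths; lvols_on_disk is a hypothetical helper name, not an HP-UX command.

```shell
#!/bin/sh
# Hypothetical helper: read `pvdisplay -v` output on stdin and print each
# distinct logical volume path once.
# Assumption: LVs are shown as /dev/<vg>/<lv>; disk device files under
# /dev/dsk and /dev/rdsk are filtered out.
lvols_on_disk() {
    tr -s ' \t' '\n\n' |
        grep '^/dev/[^/]*/[^/]*$' |
        grep -v '^/dev/r*dsk/' |
        sort -u
}

# Intended use on the HP-UX host (cXtYd0 is the suspect disk):
#   pvdisplay -v /dev/dsk/cXtYd0 | lvols_on_disk
```

Only the logical volumes this prints need a newfs and a restore; the rest of vg01 does not live on the suspect disk.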

Here are your steps:

0) Shut down the database and back up.
1) Shut down and replace the disk.
2) Boot (and expect problems with vg01).
3) vgcfgrestore -n /dev/vg01 /dev/rdsk/cXtYd0
4) vgchange -a y /dev/vg01
5) Make filesystem(s) for the LVOLs affected, e.g. newfs -F vxfs /dev/vg01/rlvol2
6) Mount the filesystem(s).
7) Restore the data.

See the man pages for vgcfgrestore, vgchange, pvdisplay, and newfs for details.
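Steps 3 through 6 could be strung together roughly as below. This is only a sketch: the disk c1t2d0, the volume lvol2, and the mount point /data are made-up values, and restore_striped_disk and run are hypothetical helper names. With DRY_RUN=1 (the default here) each command is printed rather than executed, so the sequence can be reviewed first.

```shell
#!/bin/sh
# Sketch of steps 3-6. DRY_RUN=1 only prints each command.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

restore_striped_disk() {
    vg=$1      # volume group, e.g. /dev/vg01
    disk=$2    # replaced disk, e.g. c1t2d0 (hypothetical)
    lv=$3      # one affected logical volume, e.g. lvol2
    mnt=$4     # its mount point, e.g. /data (hypothetical)

    run vgcfgrestore -n "$vg" "/dev/rdsk/$disk" &&   # step 3
    run vgchange -a y "$vg" &&                       # step 4
    run newfs -F vxfs "$vg/r$lv" &&                  # step 5 (raw device)
    run mount "$vg/$lv" "$mnt"                       # step 6
}

restore_striped_disk /dev/vg01 c1t2d0 lvol2 /data
```

The newfs and mount lines would be repeated for every affected logical volume, and the data restore (step 7) follows once everything is mounted.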


If it ain't broke, I can fix that.
Ted Ellis_2
Honored Contributor

Re: Replacing a non-mirrored disk that is a part of a stripe

You will not be able to just swap this disk out. To replace this disk and keep the same configuration, take a complete backup of everything on that volume group... to tape most likely.

Once you have a clean backup, you will need to shut down the database and anything else running on that volume group.

I am not sure on the K platform, but to be safe... note the logical volume layout on that disk somewhere so you can rebuild it the same way. lvremove the logical volumes, vgremove the volume group, shut down the server, replace the disk, reboot. When the machine is back up, rebuild the volume group, then rebuild the logical volumes using your notes.
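One way to take those notes is to capture the vgdisplay and lvdisplay output before removing anything. A minimal sketch, with lvol1 and lvol2 as made-up volume names and record_layout and run as hypothetical helper names; by default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: capture the layout of a volume group before tearing it down.
# DRY_RUN=1 only prints each command.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

record_layout() {
    vg=$1
    shift
    run vgdisplay -v "$vg"            # VG summary plus its LVs and PVs
    for lv in "$@"; do
        run lvdisplay -v "$vg/$lv"    # per-LV size, stripes, extent map
    done
}

record_layout /dev/vg01 lvol1 lvol2
```

Run with DRY_RUN=0 on the server and redirect the output to a file kept outside vg01, so the notes survive the rebuild.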

You may also be able to use vgchange and vgcfgrestore to swap the disk and rebuild the logical volume information without doing the first bit... the above is just a recommendation on a way to do it safely and cleanly... even if it takes a bit longer.

Ted
Eric Hess
Advisor

Re: Replacing a non-mirrored disk that is a part of a stripe

Replacing a striped drive is similar to replacing any disk. If the disk is
in use (as opposed to one that has failed), stop the volume group I/O by
deactivating it with `vgchange -a n'.

Before the disk fails:
1. Back up the data with fbackup, or something similar.
2. Back up the volume group configuration with vgcfgbackup, if you usually
disable the auto-backup with the `-A n' option and argument.

After the disk fails:
3. Replace the disk.
4. Use vgcfgrestore to restore volume group information to the disk.
5. Activate the volume group with vgchange.
6. Execute newfs on all affected file system logical volumes.
7. Execute mount for all affected file system logical volumes.
8. Restore all data to the file system logical volumes from backup.
Please note: if a raw partition used by an application is affected,
please contact your application support provider after step 5 to determine
how to recover the data.
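The two "before the disk fails" steps might look like the sketch below. The tape device /dev/rmt/0m and the filesystem /data are assumptions for illustration, and pre_failure_backup and run are hypothetical helper names. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch of steps 1-2: data backup, then volume group configuration backup.
# DRY_RUN=1 only prints each command.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

pre_failure_backup() {
    vg=$1      # volume group, e.g. /dev/vg01
    tape=$2    # backup device (assumed /dev/rmt/0m here)
    fs=$3      # filesystem to include (assumed /data here)

    run fbackup -f "$tape" -i "$fs" &&   # step 1: data
    run vgcfgbackup "$vg"                # step 2: LVM configuration
}

pre_failure_backup /dev/vg01 /dev/rmt/0m /data
```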

To see if you have failures run the following command:

pvdisplay -v /dev/dsk/c?t?d? | grep -i stale

If you have stale extents your disk is in the process of failing.
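Keep in mind that grep -i stale will also match the summary line that reports the stale extent count, so a match by itself does not mean any extents are stale. A small sketch that pulls out just the number, assuming the pvdisplay summary contains a line of the form "Stale PE <count>" (stale_pe_count is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: read pvdisplay output on stdin and print the value
# from the "Stale PE" summary line.
# Assumption: the summary has a line shaped like "Stale PE   <count>".
stale_pe_count() {
    awk '$1 == "Stale" && $2 == "PE" { print $3 }'
}

# Intended use on the HP-UX host:
#   pvdisplay /dev/dsk/cXtYd0 | stale_pe_count
```

A result of 0 means no extents are currently flagged as stale, which is not the same as the disk being healthy.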

Good Luck


I didn't do it. He did!
Jerry Friend_1
Frequent Advisor

Re: Replacing a non-mirrored disk that is a part of a stripe

I ran the command to check for stale extents and it returned stale PE 0. Last night the EMS message said that the device reported a mechanical position error. Are there other diagnostics I can run on the disk to determine if it might fail and should be replaced? What else might cause this kind of error, and do you think I should be proactive and replace it now, or wait and see?
It's not what you know that counts, but your willingness to learn.
S.K. Chan
Honored Contributor

Re: Replacing a non-mirrored disk that is a part of a stripe

Replace it NOW! Though you can run the "exerciser" tool in STM to check it, I would not waste my time. If you are seeing I/O errors, it could be an "early" sign of disk failure, and by the time your PE shows "stale" it'll be too late.