Operating System - HP-UX

Re: Problems with /dev/vg00

 
SOLVED
Mike Hassell
Respected Contributor

Problems with /dev/vg00

Fellow HP-UXers,

I'm having problems with my boot disk (c0t6d0) in /dev/vg00. This disk is mirrored to another physical disk in /dev/vg00 and I've just created a make_recovery_tape, so I feel fairly safe at this point.

My question is how to remove this disk properly while keeping the system up and running. This is a production machine and the disk is still functioning; however, every two hours or so I get a CRITICAL error message from EMS reporting problems with this physical disk, and the SCSI bus resets. What are the proper steps in a situation like this? With other disks I would typically break the mirror, replace the disk, and recreate the mirror with no downtime, but since this is the OS disk I assume that isn't possible.

My thoughts are these:

1. Let the errors continue until after normal business hours, as the system still seems reasonably stable (no errors in the Sybase logs).
2. Get a replacement for the problem disk (c0t6d0).
3. Break the mirror between the bad disk (c0t6d0) and the good mirrored copy (c0t5d0).
4. Bring the system down.
5. Replace the bad drive with the new one.
6. Bring the system back up, this time booting from the alternate path, the good disk (c0t5d0).
7. Recreate the mirrors.
8. Reboot and ensure the system is functioning properly.
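
For steps 3 and 7, the commands might look roughly like the dry-run sketch below. It only prints the commands, it does not run them. The lvol names are placeholders (take the real list from vgdisplay -v /dev/vg00), and it assumes a PA-RISC system with legacy device files where the replacement disk comes back at the same path, c0t6d0. A disk that is already failing may need extra options on lvreduce or vgreduce, so treat this as an outline to check against HP's documented procedure for your release, not as a tested recipe.

```shell
#!/bin/sh
# Dry-run sketch: prints LVM commands, executes nothing.
# Assumptions (not from the thread): the lvol list below is hypothetical,
# and the replacement disk reappears at the same device path.
VG=/dev/vg00
BAD=c0t6d0
LVOLS="lvol1 lvol2 lvol3"   # hypothetical; list yours with: vgdisplay -v /dev/vg00

break_mirror() {
    # Step 3: drop the mirror copy held on the failing disk,
    # then remove that disk from the volume group.
    for lv in $LVOLS; do
        echo "lvreduce -m 0 $VG/$lv /dev/dsk/$BAD"
    done
    echo "vgreduce $VG /dev/dsk/$BAD"
}

remirror() {
    # Step 7: make the new disk a bootable PV, add it to the VG,
    # mirror each lvol onto it, then refresh the boot information.
    echo "pvcreate -B /dev/rdsk/$BAD"
    echo "mkboot /dev/rdsk/$BAD"
    echo "vgextend $VG /dev/dsk/$BAD"
    for lv in $LVOLS; do
        echo "lvextend -m 1 $VG/$lv /dev/dsk/$BAD"
    done
    echo "lvlnboot -R $VG"
}

break_mirror
remirror
```

Printing the commands first makes it easy to review the order and the device names before anything touches the production volume group.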

If all else fails I have a good make_recovery_tape that I just created today and another one that I created on Friday. Any thoughts on this one? Thanks.

-Mike
The network is the computer, yeah I stole it from Sun, so what?
5 REPLIES
John Payne_2
Honored Contributor

Re: Problems with /dev/vg00

That sounds like a well-thought-out plan. Are you sure that it is the disk causing the errors? You may want to rule out problems with the SCSI bus, but if you have a good spare disk, swapping it in will settle it: if the errors persist afterwards, the disk was not the cause...
Spoon!!!!
linuxfan
Honored Contributor

Re: Problems with /dev/vg00

Hi Mike,


Look at Jim's answer in the thread.

http://forums.itrc.hp.com/cm/QuestionAnswer/1,11866,0xf890abe92dabd5118ff10090279cd0f9,00.html

-HTH
Ramesh
They think they know but don't. At least I know I don't know - Socrates
linuxfan
Honored Contributor

Re: Problems with /dev/vg00

Hi Mike,


Just make sure you run "vgcfgbackup /dev/vg00" first, otherwise vgcfgrestore won't work.

I would go even further and back up every volume group:
for vg in /dev/vg*
do
    vgcfgbackup "$vg"
done
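
To confirm afterwards that each volume group actually has a backup file, something like the sketch below could be used. It assumes vgcfgbackup wrote to its default location, /etc/lvmconf/<vgname>.conf; the function name and its two optional directory arguments are made up for illustration.

```shell
#!/bin/sh
# Sketch: report which volume groups have a configuration backup file.
# Assumes the vgcfgbackup default output path /etc/lvmconf/<vgname>.conf.
# check_vg_backups is a hypothetical helper, not an HP-UX command.
check_vg_backups() {
    devdir=${1:-/dev}
    confdir=${2:-/etc/lvmconf}
    for vg in "$devdir"/vg*; do
        # Skip anything that is not a volume group directory
        # (also covers the case where the glob matches nothing).
        [ -d "$vg" ] || continue
        name=${vg##*/}
        if [ -f "$confdir/$name.conf" ]; then
            echo "$name: backup found"
        else
            echo "$name: NO backup"
        fi
    done
}

check_vg_backups
```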

-Regards
Ramesh
They think they know but don't. At least I know I don't know - Socrates
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Problems with /dev/vg00

Hi:

There is a white paper, 'Storage: procedure for replacing an LVM disk in HP-UX 10.x and 11.x', Document ID KBAN00000347. If you search for that Doc ID, you can print the document. It covers all the scenarios: boot disk (mirrored and non-mirrored) and non-boot disk (mirrored and non-mirrored). It always serves as my cookbook.

When you are done, make sure that you enable both boot disks to boot without quorum so that the box will boot automatically even with a failed boot disk.
Something like this on both boot disks, using the appropriate raw device in each case:
mkboot -a "hpux -lq (;0)/stand/vmunix" /dev/rdsk/c0t6d0
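
As a dry-run sketch, the loop below prints that mkboot command for each boot disk together with a lifcp command that should display the resulting AUTO file on a PA-RISC system. Nothing is executed. The disk names are the two from this thread, and the function name is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the AUTO-file update for each boot disk,
# followed by a command to display the AUTO file for checking.
# print_quorum_cmds is a hypothetical helper; it executes nothing.
AUTO='hpux -lq (;0)/stand/vmunix'

print_quorum_cmds() {
    for disk in "$@"; do
        echo "mkboot -a \"$AUTO\" /dev/rdsk/$disk"
        echo "lifcp /dev/rdsk/$disk:AUTO -"
    done
}

print_quorum_cmds c0t6d0 c0t5d0
```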

Regards, Clay
If it ain't broke, I can fix that.
Mike Hassell
Respected Contributor

Re: Problems with /dev/vg00

Guys,

Thanks for your responses; they have been very helpful so far. I am currently working with HP to determine whether the problem lies with this specific disk or with the SCSI bus itself. It appears that this disk reports errors first and then the bus resets.

While troubleshooting, I also found out that there are EMC Symmetrix disks attached to this same SCSI bus, which I know is bad practice, so I'd like to troubleshoot further to make sure that isn't causing issues as well. It goes to show that you shouldn't assume a system was configured in the best manner unless you did it yourself. Time to audit all our disk configurations and replace this disk if needed. Thanks again for your responses.

-Mike
The network is the computer, yeah I stole it from Sun, so what?