Couldn't activate volume group after crash

Larry Le · ‎06-12-2001

Hi,
I had a RAID5 (Mylex) application volume group, /dev/vg01, crash. After replacing the bad drive and restarting the hardware rebuilding the new drive, I tried to activate the new RAID5 with vgchange without success. This is what I got:

tmp/> vgchange -a y /dev/vg01
vgchange: Warning: couldn't query physical volume "/dev/dsk/c0t1d0":
The specified path does not correspond to physical volume attached to
this volume group
vgchange: Couldn't query the list of physical volumes.
vgchange: Couldn't activate volume group "/dev/vg01":
Quorum not present, or some physical volume(s) are missing.
Should I re-create the volume group completely and restore the application back from tape? Thanks for your help.
--Larry

Victor BERRIDGE · ‎06-12-2001

Hi,
Have you forgotten the vgcfgrestore -n vgXX /dev/rdsk/cYd0s2 ?

Only after that can you run vgchange -a y vgXX to reactivate.

Good luck
All the best

Victor

James R. Ferguson · ‎06-12-2001

Hi Larry:

You need to do a 'vgcfgrestore' after you have replaced the disk and before you attempt the 'vgchange':

# vgcfgrestore -n /dev/vg01 /dev/rdsk/c0t1d0
# vgchange -a y /dev/vg01

See if this cures your ills.
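A minimal dry-run sketch of those two steps, for anyone who wants to check the device paths before touching the volume group. The function name `print_restore_steps` is hypothetical; it only prints the commands, and it assumes the raw device path is the block device path with `/dev/dsk/` replaced by `/dev/rdsk/`, as in the commands above.

```shell
# Hypothetical dry-run helper: print, but do not run, the restore and
# activate commands for a VG and the block device path vgchange reported.
print_restore_steps() {
    vg=$1
    blk=$2
    # vgcfgrestore is given the raw device (/dev/rdsk/...), as in the post.
    raw=$(printf '%s\n' "$blk" | sed 's|^/dev/dsk/|/dev/rdsk/|')
    echo "vgcfgrestore -n $vg $raw"
    echo "vgchange -a y $vg"
}

print_restore_steps /dev/vg01 /dev/dsk/c0t1d0
```

Printing first makes it easy to confirm that the block device in the vgchange warning and the raw device passed to vgcfgrestore refer to the same disk.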

...JRF...

Larry Le · ‎06-12-2001

I restored, but the problem persists :(
/> vgchange -a n /dev/vg01
vgchange: Volume group "/dev/vg01" has been successfully changed.
/> vgcfgrestore -n /dev/vg01 /dev/rdsk/c0t1d0
Volume Group configuration has been restored to /dev/rdsk/c0t1d0
/> vgchange -a y /dev/vg01
vgchange: Warning: couldn't query physical volume "/dev/dsk/c0t1d0":
The specified path does not correspond to physical volume attached to
this volume group
vgchange: Couldn't query the list of physical volumes.
vgchange: Couldn't activate volume group "/dev/vg01":
Quorum not present, or some physical volume(s) are missing.
Thanks.
--Larry

Roberto Arias · ‎06-12-2001

Please try this:
vgchange -q n -a y

Have you got a disk failure in the VG?

Best regards

The man is your friend

Victor BERRIDGE · ‎06-13-2001

OK then,
I would start all over again.
Check the physical config:
#ioscan -fnC disk
to see the correct vol...
#strings /etc/lvmtab
to look at the config of the volume group it belongs to
#vgdisplay -v vg01|more
to check that the info exists
#ll /etc/lvmconf
Go into LVM maintenance mode and try again:
#shutdown -hy 0
ISL>hpux -lm

#vgcfgrestore -n /dev/vg01 /dev/rdsk/c0t1d0
#vgchange -a y /dev/vg01

Tell us what result you get this time.

Good luck
Victor

Vincenzo Restuccia · ‎06-13-2001

Boot into single-user mode and override the quorum, like this:

ISL> hpux -is -lq /stand/vmunix

# vgcfgrestore -n /dev/vg01 /dev/rdsk/c0t1d0

Bill McNAMARA_1 · ‎06-13-2001

looks like c0t1d0 is a failed disk..
ioscan -fnCdisk
make sure you see your
ext_bus 0
tgt 1
disk 0

all in a state of CLAIMED

If you don't see them at all, or even one of them is not present, there is hardware failure.
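A minimal sketch of that check, assuming `ioscan -fnC disk` output in the usual column order (Class, I, H/W Path, Driver, S/W State, H/W Type, Description). The function name `unclaimed_disks` is hypothetical; it reads ioscan output on stdin and prints the hardware path and state of every disk that is not CLAIMED. It only covers the disk class, so ext_bus and tgt entries still need a look by eye.

```shell
# Hypothetical helper: list disks whose S/W state is not CLAIMED.
# Reads `ioscan -fnC disk` output on stdin; assumes the usual column order
# (Class, I, H/W Path, Driver, S/W State, H/W Type, Description).
# Header and device-file lines are skipped because field 1 is not "disk".
unclaimed_disks() {
    awk '$1 == "disk" && $5 != "CLAIMED" { print $3, $5 }'
}

# On the affected host (not run here):
#   ioscan -fnC disk | unclaimed_disks
```

No output means every disk line reported CLAIMED; a disk missing from ioscan altogether will not show up here either, so compare the list against what you expect to be attached.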

Your syslog might report errors relating to the failure and in some cases a hang on io can trigger a panic.

If you don't see the disk, there is nothing you can do other than replace it, vgcfgrestore the LVM headers, activate, recreate your filesystems and restore from backup. (Unless you have a mirror disk, in which case: vgcfgrestore, vgchange to activate, and the mirrors will resync.)

You can partially activate, ignoring quorum (normally more than half of the disks in the VG must be available), with vgchange -a y -q n vgname
mount -a to see the damage.
You will lose all LV data on the failed disk.
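To make the quorum arithmetic concrete, here is a small sketch. It assumes the usual LVM rule that activation needs more than half of the volume group's physical volumes to be available, and that `-q n` skips that check. `has_quorum` is a hypothetical name, not an LVM command.

```shell
# Hypothetical helper, not an LVM command: succeed if `present` PVs out of
# `total` satisfy an activation quorum of "more than half" (assumed rule).
has_quorum() {
    present=$1
    total=$2
    [ $((present * 2)) -gt "$total" ]
}

# A one-PV volume group whose only PV cannot be queried: 0 of 1 present.
if has_quorum 0 1; then
    echo "quorum met"
else
    echo "quorum not met"
fi
```

With a single PV in the VG, as appears to be the case here, the only possible states are full quorum or none, which is why overriding quorum alone cannot help if that one PV is unreadable.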

If it is just a question of LVM header corruption, the vgcfgrestore will repair it.
But it will not fix any data corruption on the filesystems.

Later,
Bill

It works for me (tm)

KapilRaj · ‎06-13-2001

Hi ,

I DO NOT AGREE WITH ANY OF THE ABOVE REPLIES; if I am wrong, you experts may correct me.

The author talks about a VG on a RAID-5 box, so cXtYdZ is not a physical disk. It may be a LUN, or an alternate path to a LUN, in the RAID box.

You need not perform any vgcfgrestore operations, because the configuration is already there. (It's held by the RAID controller, not on the physical hard disks.)

Tell me, was the rebuild of this RAID-5 array after the crash successful or not? If not, correct that first.

Tell me whether the RAID array is addressed through a single SCSI bus or whether an alternate link exists. If one exists, you can use vgchange -q n to activate the volume group. Then you may proceed with restoring the pvlink.

There are some RAID boxes with two redundant controllers; the cause may be a controller failure. Confirm with vgchange -q n. In such boxes each of the controllers presents a device file such as c0t6d0...

Hope this helps you,

Kaps

Nothing is impossible

Bill McNAMARA_1 · ‎06-13-2001

If it was an alternate link lun 0, it still would vgchange activate.
RAID 5 is N+1 so if a disk failed, the /dev/dsk/ would still be okay in an N state.. the OS wouldn't even know about it.

The dsk cannot be queried, which with RAID5 means two disks may have failed and the data is lost.

Quorum is not achieved, which suggests that there is no alternate link defined and a controller or cable failed, in which case the disk may be okay... I suppose he could look for an alternate link, but if he hasn't configured it in LVM we'll never know.

We do need an ioscan.
and a vgdisplay when quorum is overridden
and strings /etc/lvmtab
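For the lvmtab part, a minimal sketch of pulling out just the physical volumes recorded for one volume group. It assumes `strings /etc/lvmtab` prints each volume group path followed by the `/dev/dsk/...` paths that belong to it; the function name `pvs_in_vg` is hypothetical.

```shell
# Hypothetical helper: print the PV paths listed under one VG.
# Reads `strings /etc/lvmtab` output on stdin; assumes each VG path line
# is followed by the /dev/dsk/... paths that belong to it.
pvs_in_vg() {
    awk -v vg="$1" '
        /^\/dev\/dsk\// { if (in_vg) print; next }
        /^\/dev\//      { in_vg = ($0 == vg) }
    '
}

# On the affected host (not run here):
#   strings /etc/lvmtab | pvs_in_vg /dev/vg01
```

Comparing that list with the device files shown by ioscan shows whether LVM expects a path that the hardware no longer presents.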

Later,
Bill

It works for me (tm)

Bill McNAMARA_1 · ‎06-13-2001

The PVRA and VGRA are on the disk, not on the controllers. The only things on the controllers are cache and controller code.

Bill

It works for me (tm)

KapilRaj · ‎06-13-2001

Hi Bill,

I had an experience; let me summarise it.
HP 9000 / HP-UX 10.20 with an AutoRAID 12H (two controllers).

One controller in the RAID failed and the VG would not activate. I could activate it with the -q n option; then, when I replaced the controller, it started working without the -q n option.

kaps

Nothing is impossible

Bill McNAMARA_1 · ‎06-13-2001

He replaced the failed h/w

"After replacing the bad drive and restarting the hardware rebuilding the new drive, I tried to activate the new RAID5 with vgchange without success."

With the autoraid, there is only one PV in your VG even with an alternate link.
vgdisplay tells you this in the first ten lines.. try pulling a controller again!

Bill

It works for me (tm)

Larry Le · ‎06-13-2001

Hi All,
Thanks very much for your help. However, after booting into maintenance single-user mode, I couldn't activate the volume group again, even with quorum turned off. I decided to remove the volume group completely and reinstall the archived data. Even this drastic measure failed: I couldn't remove it with lvremove or vgremove.

#vgremove vg01 (Volume group not activated)
I'm completely lost here. Is there any way to remove
/dev/vg01
/dev/dsk/c0t1d0
from the lvmtab definition so that I can create a new volume group and reinstall the files? This HP volume is crap!!!
Sorry...

Bill McNAMARA_1 · ‎06-13-2001

use the vgcfgrestore as mentioned earlier.
then vgexport gets rid of your vg completely and all /dev/vgwhatever device files

BUT...

Go into hpux -is
vgchange -a y vg00
mount -a

strings /etc/lvmtab
ioscan -fnk
vgchange -a y -q n vgraid
vgdisplay -v vgraid

and post the outputs up here.

What disk array do you have?

Restoring from backup is drastic if it's a disk array problem, as suggested earlier.

Later,
Bill

It works for me (tm)

Larry Le · ‎06-13-2001

Hi Bill,
The RAID5 device is Sigma Trimm Model #SA-H350 with Mylex DAC960S controller. It seems that when I re-created the volume group and logical volume, the drive showed some damaged blocks.
Here is the output of ioscan:
Class  I  H/W Path    Driver  S/W State  H/W Type  Description
=====================================================================
disk   0  8/0.0.0     sdisk   CLAIMED    DEVICE    MYLEX DAC960S 34732T5
          /dev/dsk/c0t0d0   /dev/rdsk/c0t0d0
disk   1  8/0.1.0     sdisk   CLAIMED    DEVICE    MYLEX DAC960S 34732T5
          /dev/dsk/c0t1d0   /dev/rdsk/c0t1d0
disk   2  8/0.6.0     sdisk   CLAIMED    DEVICE    SEAGATE ST32171W
          /dev/dsk/c0t6d0   /dev/rdsk/c0t6d0
disk   3  8/12/5.0.0  sflop   CLAIMED    DEVICE    TEAC FC-1 HF 07
          /dev/floppy/c1t0d0   /dev/rfloppy/c1t0d0
disk   4  8/12/5.2.0  sdisk   CLAIMED    DEVICE    TOSHIBA CD-ROM XM-5401TA
          /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
Thanks very much for your help.

Victor BERRIDGE · ‎06-13-2001

I agree with Bill's last reply...
But what do you have now?
If you want to restore from scratch, you will have to:
# vgexport vg01
Clean all definitions of disks
#ioscan -funC disk
#cd /dev/dsk
#ll
#rmsf /dev/dsk c6t0d0 c6t0d1 c6t0d2...
#rmsf -a -D /dev/rdsk c6t0d0 c6t0d1 c6t0d2...

I would also:
#cd /etc
#rm ioconfig or mv ioconfig ioconfig.bak
#ioscan
#insf -e
and see what I have now:
#ioscan -fnC disk
#mkdir /dev/vg01
#mknod /dev/vg01/group c 64 0x010000
#pvcreate -f /dev/rdsk/c1t0d0 (-f because you may otherwise be blocked by info already on the disk)...
etc...
Create the group with the max disk size you want, e.g.
#vgcreate -e 8000 /dev/vg01 /dev/dsk/c1t0d0
add the other disks
#vgextend /dev/vg01 /dev/dsk/c2t0d1 etc...
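On the mknod step above: the minor number 0x010000 is assumed here to follow the 0xNN0000 convention, where NN is the volume group number in hex and must be unique among the group files on the system. A small sketch that builds it; `vg_group_minor` is a hypothetical name, and the command is only printed, not run.

```shell
# Hypothetical helper: format the group-file minor number for VG number N,
# assuming the 0xNN0000 convention (NN = VG number in hex).
vg_group_minor() {
    printf '0x%02x0000\n' "$1"
}

# Print (do not run) the mknod command for vg01; major number 64 is taken
# from the post above.
echo "mknod /dev/vg01/group c 64 $(vg_group_minor 1)"
```

Listing the existing group files (for example with ll /dev/*/group) before picking a number avoids reusing a minor number that another volume group already holds.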

If you got here with success you could use sam to recreate your logical volumes...

But this brings me to think: BEFORE doing all this , I would try again by vgexport vg01 to a mapfile first, try to get the vg01 once the disks are OK by trying a vgimport

Good luck
All the best
Victor

Larry Le · ‎06-13-2001

Hi Victor,
"BEFORE doing all this , I would try again by vgexport vg01 to a mapfile first, try to get the vg01 once the disks are OK by trying a vgimport..." This failed (see reply to Bill above).
Thanks all for your insightful responses. Guess I need a real tape backup instead of the 4mm DDS drive. Any suggestion???
--Larry

Victor BERRIDGE · ‎06-13-2001

Hi Larry,
I hope you understood vgimport using the mapfile...
As Kapil and Bill mentioned in their own terms, with RAID5 a disk failure and replacement shouldn't affect the system (for the OS all is still OK).
So what happened? Did you have a good look at your syslog? What did stm diagnose?
In syslog you would have an EMS alert report...
DDS is fine, depending on what you have to back up.
I would suggest make_recovery for vg00 and fbackup for the other groups (cheapest solution). If you have more boxes to back up, there are many threads on this forum about good backup software solutions...
For higher availability, on sensitive configurations I usually use dual paths, meaning dual controllers on the disk subsystems with alternate paths, and mirror vg00...
You may have noticed the importance of /etc/lvmtab by now; it is important to have a backup of it, with your disk definitions, for cases like this...

All the best

Victor
