
Re: Failed vg00 LVM mirrored disk (HA Alternate bootpath) - doubts about replacement

 
compiler
Frequent Advisor

Failed vg00 LVM mirrored disk (HA Alternate bootpath) - doubts about replacement


Hi all.

I have an HP Integrity (Itanium) rx6600 system running HP-UX 11.23 (11i v2) with an LVM-mirrored root volume group (vg00), and I've received a "Media failure" on one of the vg00 disks. Fortunately, the failed disk is the "HA Alternate bootpath" and not the "Primary bootpath".

I've been reading previous threads on ITRC and the HP docs, and I would like to validate the process and ask a few questions about it. Any help would be appreciated.

Info Summary:

-------------------------------------------------------
EMS:
Disk at hardware path 1/0/0/3/0.6.0 : Media failure

# ioscan -fnk | grep 1/0/0/3/0.6.0
disk 2 1/0/0/3/0.6.0 sdisk CLAIMED DEVICE HP 300 GST3300655LC
/dev/dsk/c2t6d0 /dev/dsk/c2t6d0s2 /dev/rdsk/c2t6d0 /dev/rdsk/c2t6d0s2
/dev/dsk/c2t6d0s1 /dev/dsk/c2t6d0s3 /dev/rdsk/c2t6d0s1 /dev/rdsk/c2t6d0s3

# vgdisplay -v vg00 | grep /dev/dsk/c2t6d0s2
PV Name /dev/dsk/c2t6d0s2

# lvlnboot -v
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
/dev/dsk/c0t6d0s2 (1/0/0/2/0.6.0) -- Boot Disk
/dev/dsk/c2t6d0s2 (1/0/0/3/0.6.0) -- Boot Disk

#/stand> cat bootconf
l /dev/dsk/c0t6d0s2
l /dev/dsk/c2t6d0s2

#/stand> setboot
Primary bootpath : 1/0/0/2/0.6.0
HA Alternate bootpath : 1/0/0/3/0.6.0
Alternate bootpath : 1/0/0/2/1.2.0
Autoboot is ON (enabled)

I've tested "pvchange -a Y /dev/dsk/c2t6d0s2" to see if I have LVM OLR and
the command ran successfully.
-------------------------------------------------------


My idea of the procedure:


# pvchange -a N /dev/dsk/c2t6d0s2

# ( Replace hotswap disk )

DOUBT: after replacement, HP documentation says something about:

# scsimgr replace_wwid -D /dev/rdisk/disk14
# io_redirect_dsf -d /dev/disk/disk14 -n /dev/disk/disk28

I believe that those commands are only needed for /dev/disk/* because
a new device file appears. In my case, /dev/dsk/c2t6d0 device file
will be reused for the new disk so I don't have to execute them.
Can you confirm my idea is OK?

# ( Create partitions exactly like in /dev/dsk/c0t6d0 )

DOUBT: How do I create the partitions EXACTLY like in c0t6d0?

# vgcfgrestore -n vg00 /dev/rdsk/c2t6d0s2

# pvchange -a y /dev/dsk/c2t6d0s2

# vgchange -a y vg00

# mkboot /dev/rdsk/c2t6d0s2

DOUBT: Over c2t6d0s2 or over c2t6d0?

# mkboot -a "hpux -lq" /dev/rdsk/c2t6d0s2

DOUBT: Over c2t6d0s2 or over c2t6d0?

# lvlnboot -v


Is the above ok? Any missing step?

Thanks a lot for any help.
P Arumugavel
Respected Contributor
Solution

hi,

http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c01911837/c01911837.pdf

You will find nothing better than this doc for disk management in an LVM environment on HP-UX.

Go through it and you will find the way.

Rgds...
Manix
Honored Contributor

>>DOUBT:1
>># ( Replace hotswap disk )

>># scsimgr replace_wwid -D /dev/rdisk/disk14
>># io_redirect_dsf -d /dev/disk/disk14 -n /dev/disk/disk28

>> I believe that those commands are only needed for /dev/disk/* because
>> a new device file appears. In my case, /dev/dsk/c2t6d0 device file
>> will be reused for the new disk so I don't have to execute them.

This is not correct!

Moreover, you are on 11.23 (please note it), not on 11.31. The device file used by the command
"scsimgr replace_wwid -D /dev/rdisk/disk14" is a persistent device file of the 11.31 series,
and the scsimgr command is there to update the new "WWN_ID" on the same persistent device file.

NOTE: in 11.31 the device file is independent of the new HW_PATH; in 11.23 it is not.

The device file may change; you may get a new one on the same path.


>># ( Create partitions exactly like in /dev/dsk/c0t6d0 )

>>DOUBT2: How to do create the partitions EXACTLY like in c0t6d0?

Let vgcfgrestore take care of this!


>># mkboot /dev/rdsk/c2t6d0s2

>>DOUBT3: Over c2t6d0s2 or over c2t6d0?

An IA (Integrity) boot disk has three partitions, s1, s2 and s3: the EFI headers are in s1, LVM resides in s2, and the diagnostics are in s3.

You are performing operations on s2 right now.

Please read about the differences between IA and PA boot disk structures.

Please read about 11.31 persistent device files too.

Right now you have mixed up a few things.

Thanks
Manix
HP-UX been always lovable - Mani Kalra
compiler
Frequent Advisor


>[P Arumugavel]
>
> http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c01911837/c01911837.pdf
>
>
> You could find nothing better than this doc, for disk management on lvm environment in hp-ux.

Yes, I know. I've prepared the steps for the disk replacement following that doc. I still have some doubts after reading it, which is why I was trying to validate the commands I would execute.

>[Manix]
>># ( Create partitions exactly like in /dev/dsk/c0t6d0 )
>>DOUBT2: How to do create the partitions EXACTLY like in c0t6d0?
>
>let vgcfgrestore take care of this !!

Isn't it necessary to create the partitions before this step?

Step 7 of the replacement procedure in the HP document mentioned above ("When Good Disks Go Bad", page 38), which comes before the vgcfgrestore step (step 8), says:

--------------------------------------
7.- (HP Integrity servers only) Partition the replacement disk.
Partition the disk using the idisk command and a partition description file, and create the
partition device files using insf, as described in Mirroring the Root Volume on Integrity Servers.
--------------------------------------

My rx8640 is an HP Integrity server. Should I skip this step and run vgcfgrestore without creating any partitions?


>> # scsimgr replace_wwid -D /dev/rdisk/disk14
>> # io_redirect_dsf -d /dev/disk/disk14 -n /dev/disk/disk28
>>
>> I believe that those commands are only needed for /dev/disk/* because
>> a new device file appears. In my case, /dev/dsk/c2t6d0 device file
>> will be reused for the new disk so I don't have to execute them.
>
> this is not correct !!
>
> Moreover you are on 11.23 ( NOTE IT PLS )
> not on 11.31 the device file used by command "scsimgr replace_wwid -D /dev/rdisk/disk14"
> is persistent device file for 11.31 series & the scsimgr command in to update the new
> "WWN_ID" on the same persistent device file.
>
> NOTE :: in 11.31 the device file is independent of the new HW_PATH not in 11.23.
>
> The device file may change , you may get a new one on the same path.

Maybe I didn't explain myself clearly:

My system was installed by HP representatives and I've inherited its administration (I didn't install it).

On my system (11.23), no /dev/disk/diskXXX files are used; all the device files in use are /dev/dsk/c?t?d?, and no /dev/disk or /dev/rdisk directories or files exist. My guess was that, because of this, I don't need "scsimgr replace_wwid + io_redirect_dsf": when I replace the disk with a new one, it will have the same hardware path as the old one and will be accessible through the same device file as the old one (/dev/dsk/c2t6d0s2).

My idea was that the scsimgr + io_redirect_dsf combo is only needed when you use /dev/disk/disk* device files, because the new disk will get a different /dev/disk/disk* file name than the previous one (e.g. /dev/disk/disk14 -> /dev/disk/disk23), so you need to redirect I/O, like a "symlink" disk14 = disk23.

But in my case, old_devfile = new_devfile = /dev/dsk/c2t6d0s2 and old_hwpath = new_hwpath = 1/0/0/3/0.6.0 (because the disk will be attached to the same SCSI port), so I don't need to do the I/O redirect.

This is my main doubt: I need someone to confirm that with /dev/dsk/* device files I can skip scsimgr + io_redirect_dsf and keep the procedure as follows:


# pvchange -a N /dev/dsk/c2t6d0s2
# ( Replace hotswap disk )
# ( partition the disk with the same exact partitions and sizes as /dev/rdsk/c0t6d0)
# ( it's done with idisk but I don't know how to clone the same parts of other disk)
# vgcfgrestore -n vg00 /dev/rdsk/c2t6d0s2
# pvchange -a y /dev/dsk/c2t6d0s2
# vgchange -a y vg00
# mkboot /dev/rdsk/c2t6d0s2
# mkboot -a "hpux -lq" /dev/rdsk/c2t6d0s2


Thanks a lot for all the help given.


Torsten.
Acclaimed Contributor

As mentioned above, a new disk (even in the same slot) gets a new device file, because it has a different WWID, so you need to run the io_redirect command.


After replacement you also need to run idisk to create the partitions and populate the EFI area.


For this reason I always recommend using hardware RAID for these disks (either with the default SAS chip - Integrated RAID - or with an optional P400 Smart Array).

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
compiler
Frequent Advisor


Thanks for the answers:


So, correcting the procedure:

# pvchange -a N /dev/dsk/c2t6d0s2
# ( Replace hotswap disk )
# ( partition the disk with the same exact partitions and sizes as /dev/rdsk/c0t6d0 )
# insf -eC disk
# ls -lrt /dev/disk
( get the new device file, new_cxtxdxs2, the one with timestamp at current date )
# io_redirect_dsf -d /dev/dsk/c2t6d0s2 -n /dev/dsk/new_cxtxdxs2
# vgcfgrestore -n vg00 /dev/rdsk/new_cxtxdxs2
# pvchange -a y /dev/dsk/new_cxtxdxs2
# vgchange -a y vg00
# mkboot /dev/rdsk/new_cxtxdxs2
# mkboot -a "hpux -lq" /dev/rdsk/new_cxtxdxs2


And the remaining doubts are:

1.- Is the "scsimgr replace_wwid -D /dev/rdisk/old_c2t6d0" still needed before the io_redirect_dsf?
2.- How do I create the exact partition scheme with the exact partition sizes as in the primary boot disk?

I've read that you must create a file with the partition scheme and do:

# cat /tmp/partitionfile
3
EFI 100MB
HPUX 100%
HPSP 400MB

# idisk -wf /tmp/partitionfile /dev/rdsk/new_device_file

Is any additional step needed (efi_cp's, mainly)?

Thanks!


PS:
""" For this reason I always recommend to use a hardware raid for these disks
(either by the default SAS chip - Integrated RAID - or by an optional P400 smartarray)."""

Do Integrity rx8640 and rx6600 systems have hardware RAID available by default?

If so, I don't understand why the HP engineer used an LVM mirrored-disk solution on a "critical" system instead of hardware RAID.

Torsten.
Acclaimed Contributor

J6369-90071.pdf has the procedure.

Google should find it.

Hope this helps!
Regards
Torsten.

compiler
Frequent Advisor


Thanks a lot, Torsten, I'm downloading the PDF file right now to read it.

What do you think about my doubts?

1.- Is the "scsimgr replace_wwid -D /dev/rdisk/old_c2t6d0" still needed before the io_redirect_dsf?

2.- How do I create the exact partition scheme with the exact partition sizes as in the primary boot disk? Is there a way to "extract" the partition sizes of the primary boot disk to create the "partition file" with the right sizes? (i.e. the EFI XXXMB + HP-UX 100% + HPSP XXXXMB file).

3.- Is the rest of the procedure I wrote in the last post OK?

Thanks, and sorry for asking the same thing again; I need to be totally sure before doing this replacement. Thanks a zillion for your help.
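
A minimal sketch for question 2, assuming the two fixed sizes (EFI and HPSP, in MB) have already been read off the surviving boot disk by some other means. The function name make_partition_file is hypothetical; it only prints a description file in the same layout as the example quoted earlier in this thread and does not touch any disk.

```shell
# Hypothetical helper: print an idisk partition description file for a
# three-partition Integrity boot disk (EFI, HPUX, HPSP).
# $1 = EFI size in MB, $2 = HPSP size in MB (both are assumptions that
# must match the surviving mirror; HPUX takes the rest of the disk).
make_partition_file() {
    efi_mb=$1
    hpsp_mb=$2
    for n in "$efi_mb" "$hpsp_mb"; do
        case $n in
            ''|*[!0-9]*)
                echo "sizes must be whole numbers of MB" >&2
                return 1
                ;;
        esac
    done
    printf '3\nEFI %sMB\nHPUX 100%%\nHPSP %sMB\n' "$efi_mb" "$hpsp_mb"
}

# Example: print the layout to the terminal.
make_partition_file 500 400
```

The output could be redirected to a file such as /tmp/partitionfile and passed to idisk -wf, as in the command shown in the previous post.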
compiler
Frequent Advisor


Are scsimgr + io_redirect_dsf needed in HP-UX 11.23? I ask because my scsimgr command does not accept "replace_wwid" and io_redirect_dsf does not exist:

# scsimgr
Usage : scsimgr -o old_target_name -n new_target_name -H interface_hw_path

# io_redirect_dsf
bash: io_redirect_dsf: command not found
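
The output above fits what was said earlier in the thread: persistent /dev/disk device files, and the commands that manage them, belong to 11.31. A rough sketch of that decision follows, assuming the release string has the form printed by uname -r (B.11.23, B.11.31). The function name has_persistent_dsfs is hypothetical.

```shell
# Hypothetical helper: succeed when the release string is one where
# persistent /dev/disk device files (and io_redirect_dsf) exist.
has_persistent_dsfs() {
    case $1 in
        B.11.31*) return 0 ;;
        *)        return 1 ;;
    esac
}

# On the real host this would be: release=$(uname -r)
release=B.11.23

if has_persistent_dsfs "$release"; then
    echo "$release: scsimgr replace_wwid and io_redirect_dsf apply"
else
    echo "$release: legacy /dev/dsk files only, those two steps do not apply"
fi
```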
compiler
Frequent Advisor


Hi all.

I finally managed to do it correctly with no major issues. Thanks a lot to everybody involved in this thread.

I'm attaching the full process I followed in case it can help anyone else in this situation. Note that the procedure below shows how to replace an HA Alternate boot disk on Itanium HP-UX 11.23.


Verify that both disks are synced before the replacement:

#/root> vgsync vg00
Resynchronized logical volume "/dev/vg00/lvol1".
Resynchronized logical volume "/dev/vg00/lvol2".
Resynchronized logical volume "/dev/vg00/lvol3".
Resynchronized logical volume "/dev/vg00/lvol4".
Resynchronized logical volume "/dev/vg00/lvol5".
Resynchronized logical volume "/dev/vg00/lvol6".
Resynchronized logical volume "/dev/vg00/lvol7".
Resynchronized logical volume "/dev/vg00/lvol8".
Resynchronized logical volume "/dev/vg00/lvol9".
Resynchronized logical volume "/dev/vg00/lvol10".
Resynchronized volume group "vg00".

#/root> for i in 1 2 3 4 5 6 7 ; do lvdisplay -v /dev/vg00/lvol${i} ; done | grep "LV Stat"
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
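
Note that the loop above checks lvol1 to lvol7, while vgsync reported lvol1 to lvol10. A small sketch that counts stale logical volumes from lvdisplay output on standard input, so that every volume can be piped through it. The function name count_stale_lvs is hypothetical, and the sample text stands in for real lvdisplay output.

```shell
# Hypothetical helper: read lvdisplay output on stdin and print how many
# "LV Status" lines report a stale volume.
count_stale_lvs() {
    awk '/LV Status/ && /stale/ { n++ } END { print n + 0 }'
}

# Sample input in place of real lvdisplay output.
sample='LV Status                   available/syncd
LV Status                   available/stale
LV Status                   available/stale'

printf '%s\n' "$sample" | count_stale_lvs   # prints 2

# On the real host, something like this would cover every volume:
#   for lv in /dev/vg00/lvol*; do lvdisplay "$lv"; done | count_stale_lvs
```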


Deactivate the Physical Volume (-a N = all paths) before extraction:

#/root> pvchange -a N /dev/dsk/c2t6d0s2

Replace the disk. Wait 90-120 seconds after extracting the old disk and again after inserting the new one.

#/root> tail -2 /var/adm/syslog/syslog.log
Mar 6 18:21:46 server vmunix: LVM: VG 64 0x000000: Flushing the deferred attach list.
Mar 6 18:21:46 server vmunix: LVM: VG 64 0x000000: PVLink 31 0x026002 Detached.

"Discover" new disk:

#/root> diskinfo /dev/rdsk/c2t6d0

#/root> ioscan -fnC disk | grep c2t6d0

#/root> insf -eC disk

#/root> for i in 1 2 3 4 5 6 7 ; do lvdisplay -v /dev/vg00/lvol${i} ; done | grep "LV Stat"
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale

Erase partition table and create a new one:

#/root> idisk -Rw /dev/rdsk/c2t6d0

#/root> cat /tmp/partition_file
3
EFI 500MB
HPUX 100%
HPSP 400MB

#/root> idisk -wf /tmp/partition_file /dev/rdsk/c2t6d0

#/root> insf -eC disk

#/root> efi_fsinit -d /dev/rdsk/c2t6d0s1


Restore LVM data and reactivate PV:


#/root> vgcfgrestore -n vg00 /dev/rdsk/c2t6d0s2

#/root> pvchange -a y /dev/dsk/c2t6d0s2

#/root> tail -20 /var/adm/syslog/syslog.log | grep vmunix
Mar 6 19:01:11 server vmunix: LVM: VG 64 0x000000: PVLink 31 0x026002 Recovered.


Create boot data:


#/root> mkboot -e -l /dev/rdsk/c2t6d0

#/root> mkboot -a "boot vmunix -lq" /dev/rdsk/c2t6d0

#/root> lvlnboot -v -R /dev/vg00

#/root> vgchange -a y vg00
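
For reference, the commands used in this post can be collected into a dry-run listing. This is only a sketch: the function name print_replacement_plan is hypothetical, it prints the commands without running them, and it assumes legacy device names (c2t6d0 style) and the partition file path used above.

```shell
# Hypothetical helper: print (do not run) the command sequence from this
# post. $1 = legacy disk name without partition suffix, $2 = volume group.
print_replacement_plan() {
    disk=$1
    vg=$2
    cat <<EOF
pvchange -a N /dev/dsk/${disk}s2
# (swap the disk here, then wait before continuing)
insf -eC disk
idisk -Rw /dev/rdsk/${disk}
idisk -wf /tmp/partition_file /dev/rdsk/${disk}
insf -eC disk
efi_fsinit -d /dev/rdsk/${disk}s1
vgcfgrestore -n ${vg} /dev/rdsk/${disk}s2
pvchange -a y /dev/dsk/${disk}s2
mkboot -e -l /dev/rdsk/${disk}
mkboot -a "boot vmunix -lq" /dev/rdsk/${disk}
lvlnboot -v -R /dev/${vg}
vgchange -a y ${vg}
EOF
}

print_replacement_plan c2t6d0 vg00
```

Printing the list first makes it easy to review each device name before typing anything on a production host.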


Some tests to see the sync process:

#/root> for i in 1 2 3 4 5 6 7 ; do lvdisplay -v /dev/vg00/lvol${i} ; done | grep "LV Stat"
LV Status available/syncd
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale
LV Status available/stale

#/root> lvdisplay -v /dev/vg00/lvol2 | grep stale | wc -l
568

#/root> lvdisplay -v /dev/vg00/lvol2 | grep stale | wc -l
437

#/root> lvdisplay -v /dev/vg00/lvol2 | grep stale | wc -l
325

(... after a while ...)

#/root> for i in 1 2 3 4 5 6 7 ; do lvdisplay -v /dev/vg00/lvol${i} ; done | grep "LV Stat"
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/syncd
LV Status available/stale
LV Status available/stale
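
Instead of re-running the count by hand, the wait could be scripted. A minimal sketch, assuming a command that prints the current number of stale entries. The names wait_until_zero and countdown are hypothetical, and countdown is a stand-in so the example runs without LVM.

```shell
# Hypothetical helper: poll a command until it prints 0.
# $1 = seconds between polls, remaining arguments = command printing a count.
wait_until_zero() {
    interval=$1
    shift
    while :; do
        count=$("$@") || return 1
        if [ "$count" -eq 0 ]; then
            return 0
        fi
        echo "$count stale entries left"
        sleep "$interval"
    done
}

# Stand-in for the real check: prints 2, then 1, then 0 on successive calls.
state=$(mktemp)
echo 2 > "$state"
countdown() {
    n=$(cat "$state")
    echo "$n"
    if [ "$n" -gt 0 ]; then
        echo $((n - 1)) > "$state"
    fi
}

wait_until_zero 0 countdown
rm -f "$state"

# On the real host the polled command might be a function such as:
#   stale_count() { lvdisplay -v /dev/vg00/lvol2 | awk '/stale/ {n++} END {print n+0}'; }
#   wait_until_zero 60 stale_count
```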

Hope this helps anybody in the same situation.