
Re: vPar no more boot

 
SOLVED
Eric SAUBIGNAC
Honored Contributor

vPar no more boot

Good evening,


I am currently migrating some vPars from one NetApp SAN storage cluster to a new one.

The vPars run on an old rp7420 under HP-UX 11i v1 with vPars A.03.02.04.

All the vPars boot from the SAN.

LVM mirroring is installed but not used, because mirroring is done on the storage side.

I have no access to manage the SAN or the storage. Only the client does :-(



To test the procedure, I mirrored the whole vg00 of one vPar to a LUN on the new NetApp storage. Maybe I made a mistake at this step, but I believe the mirroring was done correctly.


Then, to test the new boot paths, I stopped the vPar and, from a second vPar, changed its boot path and alternate boot path to point to the mirror.


Now the modified vPar no longer boots: failed to open 1/0/14/1/0/4/0.6.18.0.0.1.5/stand/vmunix :-(((( Same behaviour with the 3 other paths (0/0/10/1/0/4/0.5.21.0.0.1.5, 1/0/14/1/0/4/0.2.18.0.0.1.5 and 0/0/10/1/0/4/0.1.18.0.0.1.5)


I then tried to restore the original paths, BUT I CAN NO LONGER BOOT from the original LUN :-(((((( [1/0/14/1/0/4/0.6.0.0.0.1.5, 0/0/10/1/0/4/0.5.0.0.0.1.5, 1/0/14/1/0/4/0.2.0.0.0.1.5 and 0/0/10/1/0/4/0.1.0.0.0.1.5]


The original LUN has been presented to another vPar. I imported it there and, with read-only access, checked that /stand is OK. It is.


I have some doubts about the SAN. Some questions:


- On the SAN switches, the ports to which the faulty vPar is connected show as down. I don't know whether, when a vPar is down but the nPar is still running vpmon and some other vPars, the ports of the halted vPar should show as up or not.

- Every weekend all the vPars are rebooted (the nPar is not) and it works. The difference this time is that I stopped the vPar, then booted it. Could it be that the HBAs stay logged in to the fabric when a vPar is rebooted, but not when it is stopped?

- Just before the SAN admin left the office, I saw that the ports of the SAN switches are configured at 4 Gb. But the FC cards in the vPars are A9784-60001, which are 2 Gb HBAs. How can this configuration work?


Tomorrow morning (GMT), I plan to launch an Ignite install to check whether I can see the SAN disks. In the meantime, any ideas?


Thanks in advance

Eric
Torsten.
Acclaimed Contributor

Re: vPar no more boot

Please check the boot disk definition in the vPars configuration again.

In the past I used the "dot only" notation, but I had problems with it. From a certain version on (don't remember exactly), you can use the hardware path (with dots and slashes), like

1/0/14/1/0/4/0.2.0.0.0.1.5:BOOT

instead of

1.0.14.1.0.4.0.2.0.0.0.1.5.0.0.0:BOOT
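
For reference, here is a minimal sketch of going from the hardware-path notation to a dot-only one. `to_dot_path` is a hypothetical helper name, not a vPars command, and it only swaps the slashes for dots. It does not append the trailing `.0.0.0` seen in the dot-only example above, so check the result against what vparstatus reports before using it.

```shell
#!/bin/sh
# Hypothetical helper (not part of vPars): rewrite a hardware path that
# mixes slashes and dots into a dot-only path by replacing every "/".
# Assumption: no trailing components are added or removed.
to_dot_path() {
    printf '%s\n' "$1" | tr '/' '.'
}

# Example with the path quoted above.
to_dot_path "1/0/14/1/0/4/0.2.0.0.0.1.5"
# prints 1.0.14.1.0.4.0.2.0.0.0.1.5
```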

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Torsten.
Acclaimed Contributor
Solution

Re: vPar no more boot

Regarding the questions:

The OS has the control over the HBAs. If the OS is down, the HBA is uninitialized, but powered. AFAIR the card is no longer logged into the fabric once the OS is down.

Depending on the HBA you will see all speed indicator LEDs blinking until the driver takes control.

But if the switch port is fixed at 4Gb and the HBA is 2Gb only, this cannot work IMHO.

On the other hand, if the connection server <-> switch is at 2Gb and the switch <-> storage is at 4Gb, this is not a problem.

Hope this helps!
Regards
Torsten.

Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

Hi Torsten,

Pleased to meet you again.

"... dot notation ..." : in fact I had completely forgotten that it was the original way to designate a vPar resource. So I ran my tests only with hardware path notation. In my defence, I have not worked with vPars for months, probably years! I will try tomorrow.

"... AFAIR the card is no longer logged into the fabric" : OK, but when the vPar tries to access the LUN, I should catch something on the SAN switch. I will try ... if I can get the IP, user and password of the switches :-(

"... you will see all speed indicator ..." : the rooms cannot be accessed without the SAN admins, and they leave early here!

"... this cannot work IMHO. ..." : I think so too. I need to clarify this tomorrow.

Maybe I made a big mistake when I created the mirror copy, but the fact that I can access /stand/system from another vPar makes me think that is not the case. And lvlnboot was OK before I stopped the vPar.
Torsten.
Acclaimed Contributor

Re: vPar no more boot

If you were able to mirror to the new disks, they were accessible somehow.

There are some critical points here:

- Doing the mirroring wrong. It is easy to mix up the disk device files, for example using the whole disk instead of partition 2 in some steps.

- Creating the ...:BOOT and ...:ALTBOOT entries in the vPars configuration. If these don't match, the vPar will not boot.


Maybe the vpmon log has more information.


AFAIR this is "vparstatus -e" - but I'm not sure at the moment.

Hope this helps!
Regards
Torsten.

Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

root@hp22:/>vparstatus -e
1/0/4/0.1.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (0/0/10/1/0/4/0.1.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (0/0/10/1/0/4/0.1.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:25:34 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:25:38 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:25:45 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:25:50 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:25:57 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:26:02 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.2.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.2.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.2.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:26:20 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:26:25 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.2.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.2.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.2.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:26:36 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:26:40 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:29:09 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:29:14 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:30:32 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:30:37 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5.0;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5.0;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5.0;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:31:28 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4
INFO:CPU0:MON:[2010-11-30 16:31:32 UTC] Interact has been stopped on manu
INFO:CPU3:masterp:manu: hard reset
INFO:CPU0:MON:[2010-11-30 16:33:39 UTC] Interact has been stopped on manu
INFO:CPU3:masterp:resetting CPU0
INFO:CPU0:MON:[2010-11-30 16:34:08 UTC] Interact has been stopped on manu
WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
WARNING:CPU0:MON:Unknown filesystem for path (0/0/10/1/0/4/0.5.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (0/0/10/1/0/4/0.5.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (0/0/10/1/0/4/0.5.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:[2010-11-30 16:34:16 UTC] manu load failed
ERROR:CPU0:MON:Could not load manu, error = -4


root@hp22:/>vparstatus -m
Console path: 1.0.0.0.1.0.0.0.0.0.0
Monitor Boot disk path: 1.0.14.1.0.4.0.6.0.0.0.1.5
Monitor Boot filename: /stand/vpmon
Database filename: /stand/vpdb
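
Since the log repeats the same few failures, a small filter can condense it to one line per path. This is only a sketch: `failed_boot_paths` is a hypothetical helper name, and the pattern assumes the `Failed to open (path;)` line format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of vPars): read monitor log text on stdin
# and print each distinct path from the "Failed to open" lines with a count.
# Assumes the "Failed to open (<path>;)<file>" format seen in this thread.
failed_boot_paths() {
    sed -n 's/^ERROR:.*Failed to open (\([^;)]*\);*).*/\1/p' |
        LC_ALL=C sort | uniq -c |
        awk '{ print $2 " failed " $1 " time(s)" }'
}

# On the system itself this would be used as:
#   vparstatus -e | failed_boot_paths
# Here it is fed three lines copied from the log above.
printf '%s\n' \
    'ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix' \
    'ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix' \
    'ERROR:CPU0:MON:Failed to open (0/0/10/1/0/4/0.5.0.0.0.1.5;)/stand/vmunix' |
    failed_boot_paths
```

Only the "Failed to open" lines are counted, so the matching "Failed to load" lines do not double the totals.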
Torsten.
Acclaimed Contributor

Re: vPar no more boot

So we know there is a problem accessing the kernel:

WARNING:CPU0:MON:Unknown filesystem for path (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to open (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix
ERROR:CPU0:MON:Failed to load (1/0/14/1/0/4/0.6.0.0.0.1.5;)/stand/vmunix



When you tried to boot from the "old" disks, did you adjust the ...:BOOT device configuration?


Are you able to boot the system in nPar mode?

Doing this may help to check the paths.

Hope this helps!
Regards
Torsten.

Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

The script I used to mirror vg00 should have run the following commands:

------------

c2t1d5 : source disk
c26t1d5 : mirror disk

pvcreate -fB /dev/rdsk/c26t1d5

mkboot -l /dev/rdsk/c26t1d5

lifcp /dev/dsk/c2t1d5:AUTO /tmp/AUTO.$$
mkboot -a "$(cat /tmp/AUTO.$$)" /dev/rdsk/c26t1d5

mkboot -b /usr/sbin/diag/lif/updatediaglif2 -p HPUX -p ISL -p AUTO -p LABEL -p PAD /dev/rdsk/c26t1d5

vgextend vg00 /dev/dsk/c26t1d5


For each lvol, in the order they appear on the source PV:

lvextend -m 1 $LVOL /dev/dsk/c26t1d5

lvlnboot -R
lvlnboot -v

----------------

I did nothing with setboot because I planned to set the boot paths with vparmodify from a second vPar.

Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

I am going to check the SAN connections in the room. Back in a few minutes.
Cortes Albertino
Trusted Contributor

Re: vPar no more boot

Hello,

Maybe something went wrong in the boot volume information.
You could also try booting in LVM maintenance mode, for example:
# vparboot -p vparname -o '-lm'


Albertino
Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

Bad news,

Everything is OK from the SAN point of view. The switch ports were shown at 4Gb only because they are configured for autonegotiation and were not up. Checked with the SAN admin.

More bad news: I can boot the vPar in maintenance mode from an unused system LUN of an old vPar.

So I must assume that I made a mistake while mirroring the original LUN.

I will now check the boot area of the faulty disk while booted from the old LUN.
Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

@Albertino : I have been trying to boot in maintenance mode ever since I noticed that the vPar had a problem.

@Torsten : you are right, when the vPar is down, the HBAs are no longer logged in to the fabric. And while my confusion about the real speed of the GBIC came from the autonegotiation configuration, I also realized that the Brocade Fabric OS version is pretty old: v5.1.0b. You know, the kind of version whose Java GUI could not correctly show the current configuration of a port when you wanted to modify its settings ... ;-)

@all

Problem solved. I now have to understand which action caused this, or which step I forgot when finalizing the mirroring of vg00. If anyone has a comment on how I built the mirror (see one of my previous posts in this thread), please feel free to post.


Here is how I repaired my mistake.

- Boot another system LUN in maintenance mode. A healthy one ;-)

- Import the faulty disk as vghp21. Remove all the mirroring done previously.

- Then list the boot configuration:

root@unknown:/etc>lvlnboot -v

---> Booted healthy lun from the other vPar

Current path "/dev/dsk/c26t2d1" is an alternate link, skip.
Current path "/dev/dsk/c36t2d1" is an alternate link, skip.
Current path "/dev/dsk/c39t2d1" is an alternate link, skip.
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
/dev/dsk/c29t2d1 (0/0/10/1/0/4/0.5.0.0.0.2.1) -- Boot Disk
/dev/dsk/c26t2d1 (0/0/10/1/0/4/0.1.0.0.0.2.1)
/dev/dsk/c36t2d1 (1/0/14/1/0/4/0.2.0.0.0.2.1)
/dev/dsk/c39t2d1 (1/0/14/1/0/4/0.6.0.0.0.2.1)
Boot: lvol1 on: /dev/dsk/c29t2d1
/dev/dsk/c26t2d1
/dev/dsk/c36t2d1
/dev/dsk/c39t2d1
Root: lvol3 on: /dev/dsk/c29t2d1
/dev/dsk/c26t2d1
/dev/dsk/c36t2d1
/dev/dsk/c39t2d1
Swap: lvol2 on: /dev/dsk/c29t2d1
/dev/dsk/c26t2d1
/dev/dsk/c36t2d1
/dev/dsk/c39t2d1
Dump: lvol2 on: /dev/dsk/c29t2d1, 0


--> LUN this vPar should be able to boot from. Note the "No Boot Logical Volume configured"


Current path "/dev/dsk/c29t1d5" is an alternate link, skip.
Current path "/dev/dsk/c36t1d5" is an alternate link, skip.
Current path "/dev/dsk/c39t1d5" is an alternate link, skip.
Boot Definitions for Volume Group /dev/vghp21:
Physical Volumes belonging in Root Volume Group:
/dev/dsk/c26t1d5 (0/0/10/1/0/4/0.1.0.0.0.1.5) -- Boot Disk
/dev/dsk/c29t1d5 (0/0/10/1/0/4/0.5.0.0.0.1.5)
/dev/dsk/c36t1d5 (1/0/14/1/0/4/0.2.0.0.0.1.5)
/dev/dsk/c39t1d5 (1/0/14/1/0/4/0.6.0.0.0.1.5)
No Boot Logical Volume configured
Root: ??? on: /dev/dsk/c29t1d5
/dev/dsk/c26t1d5
/dev/dsk/c36t1d5
/dev/dsk/c39t1d5




No Swap Logical Volume configured
No Dump Logical Volume configured

- Then correct the boot area with "lvlnboot -b /dev/vghp21/lvol1"



From this point I was able to boot from the faulty disk in maintenance mode and finalize a clean configuration.
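
A quick way to spot this state is to filter the `lvlnboot -v` output for the telltale lines. Sketch only: `missing_boot_defs` is a hypothetical helper name, and the patterns are taken from the output shown above, not from any documented list of messages.

```shell
#!/bin/sh
# Hypothetical helper: read `lvlnboot -v` output on stdin, print every line
# that signals a missing definition, and return 1 if any was found.
# Patterns assumed from this thread: "No ... Logical Volume configured"
# and a "???" placeholder where a logical volume name is expected.
missing_boot_defs() {
    if grep -E 'No .* Logical Volume configured|: \?\?\? on:'; then
        return 1
    fi
    return 0
}

# On the system itself this would be used as:
#   lvlnboot -v | missing_boot_defs || echo "boot definitions incomplete"
# Here it is fed three lines copied from the output above.
printf '%s\n' \
    'Boot Definitions for Volume Group /dev/vghp21:' \
    'No Boot Logical Volume configured' \
    'Root: ??? on: /dev/dsk/c29t1d5' |
    missing_boot_defs || echo "boot definitions incomplete"
```

Running such a check right after the mirroring, and before stopping the vPar, would have caught the problem while the system was still up.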
Torsten.
Acclaimed Contributor

Re: vPar no more boot

I really don't like the

lvlnboot -R

I prefer the single steps

lvlnboot -r|-b|-d ...


for exactly this reason.
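
As an illustration, the single steps could look like the sketch below. It only prints the commands (nothing is executed), `print_lvlnboot_steps` is a hypothetical helper name, and the lvol roles are the ones from the healthy `lvlnboot -v` output earlier in the thread (boot on lvol1, root on lvol3, swap and dump on lvol2). Adjust them to the real layout.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one lvlnboot command per
# definition, followed by a verification step.
# Usage: print_lvlnboot_steps VG BOOT_LV ROOT_LV SWAP_LV DUMP_LV
# Assumption: the order shown here is illustrative, not prescribed.
print_lvlnboot_steps() {
    echo "lvlnboot -b /dev/$1/$2"   # boot logical volume
    echo "lvlnboot -r /dev/$1/$3"   # root logical volume
    echo "lvlnboot -s /dev/$1/$4"   # primary swap
    echo "lvlnboot -d /dev/$1/$5"   # dump
    echo "lvlnboot -v"              # review the result
}

# Layout assumed from the healthy vg00 shown in this thread.
print_lvlnboot_steps vg00 lvol1 lvol3 lvol2 lvol2
```

With one command per definition, a failure is tied to a specific step and the final `lvlnboot -v` shows whether anything is still missing.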


Hope this helps!
Regards
Torsten.

Cortes Albertino
Trusted Contributor

Re: vPar no more boot

Hello,

Regarding the vg00 mirror procedure you used,

I usually run the "vgextend /dev/vg00 ..." command right after the "pvcreate -B ...." and also use "lvlnboot -r/-b/-s/-d ...."

Albertino
Eric SAUBIGNAC
Honored Contributor

Re: vPar no more boot

@Albertino : the point at which the mirror disk is added to vg00 really doesn't matter. As long as it comes after pvcreate -B, of course.

@All : I guess "lvlnboot -R" was the problem. I am not 100% sure, because I remember adding some pvlinks to the mirror disk AFTER the mirroring procedure.

And I have used this "-R" procedure for a while with no problem, so I am not really sure why it didn't succeed this time ... Maybe I should have done some more tests to understand the problem more deeply, but I did not have much time for this. A bit lazy probably :-)

Something else: for those who want to migrate SAN boot disks of ITANIUM vPars, DON'T forget the vPar EFI path database: vparefiutil ;-)

Eric