MSA2000 - Qlogic QMH2462: SAN Boot + Multipath inconsistency

 
Rob_69_1
Frequent Advisor

Hi all,

I'm really puzzled by what's happening on my system. I'll try to keep it short but will provide more specs if you ask me.

Situation:
- Oracle VM 2.1.5 OS (essentially a customized Xen)
- 2 BL460C Blades
- QMH2462 HBA
- Brocade switches
- HP 2012FC SAN Storage

The system already boots from SAN and has multipath configured and working (STABLE).

Each server has its own boot LUN, masked from the other server.

There's another LUN shared amongst the 2 servers (OCFS2 formatted, shared storage).

I repeat: system is STABLE. Now, the problem:

If I create another LUN on the SAN and expose it to a server, I can easily configure it live by re-scanning the HBA, and it works right away. I can mount it, work with it, copy files to it, everything. Path failover works, etc.

I can work with it for hours; it's just STABLE. I tested path failures heavily, all OK.

But AS SOON AS I PERFORM A REBOOT, kernel panic:


invalid partition table on /dev/mapper/mpath0 -- wrong signature 0
device-mapper: table: device 8:48 too small for target
device-mapper: table: 253:1: multipath error getting device
device-mapper: ioctl : error adding target to table
device mapper: reload ioctl failed: Invalid argument
Creating root device.
Mounting root filesystem
mount: could not find filesystem '/dev/root'

It doesn't make sense. How come just presenting a LUN (which already has a primary partition and a filesystem on it) breaks a working configuration?

PLEASE NOTE THAT IF I HIDE THE LUN FROM BEING SEEN BY THE SERVERS, EVERYTHING GOES BACK TO A WORKING AND STABLE SITUATION. WITHOUT ANY OTHER INTERVENTION... (???!)

Are there particular steps to follow to add LUNs to a multipath system?

I want to keep this post short, so I won't post the steps I follow to add the LUN right away. But I would really, really appreciate any help with this... Has any of you had a similar issue?

Thanks...
Rob_69_1
Frequent Advisor

Re: MSA2000 - Qlogic QMH2462: SAN Boot + Multipath inconsistency

After further investigation of the issue, I'm still not able to fix it.

It looks like it is quite common in a SAN environment that, after a new LUN is presented to a host, the order in which the host discovers LUNs changes at the next reboot. The major/minor numbers will almost certainly change as well.
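To make the major/minor point concrete, here is a minimal sketch that decodes the "device 8:48" pair from the boot error into a kernel disk name. It only assumes the standard Linux numbering for the first SCSI disk major (major 8, 16 minors per disk); the function name sd_name is just something I made up for illustration.

```python
import string

SD_MAJOR = 8           # first SCSI disk major on Linux (covers sda..sdp)
MINORS_PER_DISK = 16   # minor 0 of each block is the whole disk, 1..15 are partitions


def sd_name(major: int, minor: int) -> str:
    """Translate a major:minor pair on the first sd major into an sdX[N] name."""
    if major != SD_MAJOR:
        raise ValueError("this sketch only handles major 8 (sda..sdp)")
    if not 0 <= minor < 256:
        raise ValueError("minor must be in the range 0..255")
    disk_index, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk_index]
    # Partition 0 means the whole disk, so no numeric suffix is appended.
    return name + (str(partition) if partition else "")


print(sd_name(8, 48))  # sdd
```

So the "too small for target" message is about whatever disk ends up as sdd at boot time. If the new LUN shifts the discovery order, sdd may no longer be the LUN it was when the initrd was built.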

In my understanding, this means that at boot time the initrd image being loaded in memory will try to activate Device Mapper for "dm-XX" and mount a LUN that's almost certainly different from the LUN it should boot from.

The result is therefore a "wrong signature" error for the partition table, compared with the expected partition table stored in the Master Boot Record by grub.
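One way to check whether the discovery order really moves is to compare persistent names with kernel names before and after presenting the LUN. This is only a sketch, assuming the host populates /dev/disk/by-id through udev (I have not confirmed that this is the case on Oracle VM 2.1.5); stable_names is a made-up helper name.

```python
import os


def stable_names(by_id_dir="/dev/disk/by-id"):
    """Map each persistent symlink name to the kernel device it currently points at."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        # Directory is absent (assumption about udev does not hold): nothing to report.
        return mapping
    for entry in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, entry)
        if os.path.islink(link):
            # Resolve the symlink and keep only the kernel name, e.g. "sdd".
            mapping[entry] = os.path.basename(os.path.realpath(link))
    return mapping


if __name__ == "__main__":
    for persistent, kernel in stable_names().items():
        print(persistent, "->", kernel)
```

Running it once with the new LUN hidden and once with it visible (after a reboot) should show whether the same persistent ID lands on a different sdX name.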

I'm not really sure that what I've described is 100% correct from a technical point of view, but I'm sure the concept is right... So I'm asking once again: could anybody please help out with this?

Even a trick or a workaround would be more than OK. I mean, if a correct initrd image is created at INSTALLATION TIME, WHY does creating a new initrd (using mkinitrd) AFTER HAVING ADDED AND ACTIVATED THE NEW LUN still result in a non-booting system?

It's really frustrating.

Thank you