
Re: EVA4000 BL460c RHEL 5 - LVM broken after update

 
Don Vanco - Linux Ninja
Regular Advisor

EVA4000 BL460c RHEL 5 - LVM broken after update

Just updated a RHEL 5 install, and on reboot, the SAN-attached LVM groups are gone (no devices found)

We upgraded the OS with all the latest updates - this included a kernel update to 2.6.18-52.1.14.el5
Prior to that we were running the 2.6.18-8.1.14.el5 kernel

We upgraded the ProLiant Support Pack to v8

We upgraded the FC HBA driver to hp_qla2x00-2007-10-05 (the latest)
Prior to that we were on the previous release from June 2007

When we reboot the server we see the following 3 errors.
These errors appear when the kernel first starts to load, just after the
"Red Hat nash version 5.19.6 starting"
message appears:

scsi 0:0:1:0 unexpected response from LUN 1 while scaning, scan aborted
scsi 1:0:0:0 unexpected response from LUN 1 while scaning, scan aborted
scsi 0:0:2:0 unexpected response from LUN 1 while scaning, scan aborted

When the system boots, the LVM volumes that were built on our LUNs no longer mount - the system reports that the volume group devices do not exist.

Here is the output from some of the hp_fibreutils commands:
[root@rabldb1 hp_fibreutils]# lssd
sda 0,0,0,1 HP HSV200 6110
sdb 0,0,0,2 HP HSV200 6110
sdc 0,0,0,3 HP HSV200 6110
sdd 0,0,0,4 HP HSV200 6110
sde 0,0,0,5 HP HSV200 6110



[root@rabldb1 hp_fibreutils]# lssg
sg0 0,0,0,0 HP HSV200 6110
sg1 0,0,0,1 HP HSV200 6110
sg2 0,0,0,2 HP HSV200 6110
sg3 0,0,0,3 HP HSV200 6110
sg4 0,0,0,4 HP HSV200 6110
sg5 0,0,0,5 HP HSV200 6110
sg6 0,0,1,0 HP HSV200 6110
sg7 1,0,0,0 HP HSV200 6110
sg8 1,0,2,0 HP HSV200 6110


[root@rabldb1 hp_fibreutils]# ./probe-luns -a
Adding legacy tape devices to /proc/scsi/device_info
Scanning /proc/scsi/qla2xxx/0, target 1, LUN 0
Scanning /proc/scsi/qla2xxx/1, target 2, LUN 0


scsi0 00 00 00 HP HSV200 6110 RAID
scsi0 00 00 01 HP HSV200 6110 Direct-Access
scsi0 00 00 02 HP HSV200 6110 Direct-Access
scsi0 00 00 03 HP HSV200 6110 Direct-Access
scsi0 00 00 04 HP HSV200 6110 Direct-Access
scsi0 00 00 05 HP HSV200 6110 Direct-Access
scsi0 00 01 00 HP HSV200 6110 RAID
scsi1 00 00 00 HP HSV200 6110 RAID
scsi1 00 02 00 HP HSV200 6110 RAID
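
To make the comparison between the lssd and lssg listings above easier to read, here is a minimal Python sketch that reports which sg addresses have no matching sd node. It assumes the second column is host,channel,target,lun (which matches the probe-luns output), and the names `parse` and `sg_without_sd` are made up for this illustration; they are not part of hp_fibreutils.

```python
# Listings copied from the post. Assumed column layout:
# node, "host,channel,target,lun", vendor, model, revision.
LSSD = """\
sda 0,0,0,1 HP HSV200 6110
sdb 0,0,0,2 HP HSV200 6110
sdc 0,0,0,3 HP HSV200 6110
sdd 0,0,0,4 HP HSV200 6110
sde 0,0,0,5 HP HSV200 6110
"""

LSSG = """\
sg0 0,0,0,0 HP HSV200 6110
sg1 0,0,0,1 HP HSV200 6110
sg2 0,0,0,2 HP HSV200 6110
sg3 0,0,0,3 HP HSV200 6110
sg4 0,0,0,4 HP HSV200 6110
sg5 0,0,0,5 HP HSV200 6110
sg6 0,0,1,0 HP HSV200 6110
sg7 1,0,0,0 HP HSV200 6110
sg8 1,0,2,0 HP HSV200 6110
"""


def parse(listing):
    """Map (host, channel, target, lun) -> device node name."""
    table = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        node, address = fields[0], fields[1]
        table[tuple(int(part) for part in address.split(","))] = node
    return table


def sg_without_sd(lssd, lssg):
    """Return sorted (address, sg node) pairs that have no sd counterpart."""
    disks = parse(lssd)
    generic = parse(lssg)
    return sorted((addr, node) for addr, node in generic.items() if addr not in disks)


if __name__ == "__main__":
    for addr, node in sg_without_sd(LSSD, LSSG):
        print(node, "at", ",".join(str(part) for part in addr), "has no sd node")
```

With the listings from this post, the only addresses without an sd node are LUN 0 entries, which probe-luns reports as RAID (the controller) rather than Direct-Access. All five Direct-Access LUNs do have disk nodes.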


Here's the LVM backup so you've an idea of what's missing:
[root@rabldb1 hp_fibreutils]# ls /etc/lvm/backup/
VolG-FtpC VolG-Ftpods VolG-Orator VolGroup00 VolG-Rpms VolG-Rssql


We attempted to go back to the old kernel. That boots, but the problem remains. I then tried to uninstall the new driver and re-install the previous one, and could not: the previous driver complains that it requires Red Hat 5.

Have a call in to HP. I'd like to believe that I can restore the LVM configs - but as I can see no UUID info I am not sure how to proceed...

All the LVM commands simply show the local disk - as though the other PVs/VGs/LVs never existed....
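
On the missing UUID information: an LVM2 physical volume normally carries a label in one of its first four 512-byte sectors, starting with the text LABELONE and followed by a 32-character PV UUID. The sketch below is a hedged, read-only way to look for that label; it is not an LVM tool. The function names are made up for this illustration, and the device path is whatever you pass on the command line (reading a real device needs root).

```python
import struct
import sys

SECTOR = 512
LABEL_ID = b"LABELONE"   # marks an LVM label sector
LVM2_TYPE = b"LVM2 001"  # label type used by LVM2 physical volumes


def format_uuid(raw):
    """Group a 32-character PV UUID as 6-4-4-4-4-4-6, the way LVM prints it."""
    parts, pos = [], 0
    for width in (6, 4, 4, 4, 4, 4, 6):
        parts.append(raw[pos:pos + width])
        pos += width
    return "-".join(parts)


def find_lvm2_label(data):
    """Return (sector_index, pv_uuid) if an LVM2 label is found, else None.

    data is the first 2048 bytes (four sectors) of a device or partition.
    """
    for sector in range(4):
        start = sector * SECTOR
        header = data[start:start + 32]
        if len(header) < 32 or header[:8] != LABEL_ID:
            continue
        # Label header layout: id[8], sector (u64), crc (u32), offset (u32), type[8]
        _sector_no, _crc, offset = struct.unpack_from("<QII", header, 8)
        if header[24:32] != LVM2_TYPE:
            continue
        raw_uuid = data[start + offset:start + offset + 32]
        if len(raw_uuid) < 32:
            continue
        return sector, format_uuid(raw_uuid.decode("ascii", errors="replace"))
    return None


def read_head(path, length=4 * SECTOR):
    """Read the first few sectors of a device or image file (read-only)."""
    with open(path, "rb") as handle:
        return handle.read(length)


if __name__ == "__main__" and len(sys.argv) > 1:
    found = find_lvm2_label(read_head(sys.argv[1]))
    print(found if found else "no LVM2 label in the first four sectors")
```

If a label turns up, its UUID can be compared with the `id = "..."` entries under `physical_volumes` in the files in /etc/lvm/backup.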

TIA
Don
4 REPLIES
Rob Leadbeater
Honored Contributor

Re: EVA4000 BL460c RHEL 5 - LVM broken after update

Hi Don,

When you install the fibre drivers, make sure that it does actually complete and build a new kernel.

In previous versions I know the error handling in the scripts wasn't particularly good, so it would appear to do its thing but you could end up with a kernel without any qla modules.

This may not be your problem, as the output from lssd looks good, but it's worth checking.

Hope this helps,

Regards,

Rob
Don Vanco - Linux Ninja
Regular Advisor

Re: EVA4000 BL460c RHEL 5 - LVM broken after update

Thanks Rob. I actually validated this using modinfo and building my own initrd.

Everything looks fine with the storage sub-system....

As it happens, there's a bit more to the story. The disk devices are there (/dev/sda through /dev/sde) and they have a single partition as expected - but the partition is labeled as type "extended" and NOT LVM as I would expect.

So either something "changed" in the OS so that it no longer treats these as LVM, or they were not configured correctly to start with - but I don't see how LVM could have been built this way.

HP has suggested that I change the partition type. The issue is that you cannot change type to/from extended - you have to delete and re-create the partition, and I am questioning whether or not that will be data destructive....
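
For reference, the partition "type" that fdisk and parted report is a single byte in each 16-byte entry of the MBR partition table (0x05 or 0x0F for extended, 0x8E for Linux LVM). The following is a minimal read-only sketch that lists those bytes from the first sector of a disk, so the type can be checked without touching anything. The function name and the small type table are assumptions made for this illustration, and the disk path is whatever you pass on the command line.

```python
import struct
import sys

# A few well-known MBR partition type bytes (not an exhaustive table).
PARTITION_TYPES = {
    0x05: "Extended",
    0x0F: "Extended (LBA)",
    0x82: "Linux swap",
    0x83: "Linux",
    0x85: "Linux extended",
    0x8E: "Linux LVM",
}


def mbr_partition_types(sector0):
    """Return (number, type_byte, type_name, start_lba, sector_count) per used entry."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    entries = []
    for index in range(4):
        # The four primary entries start at byte 446, 16 bytes each.
        entry = sector0[446 + 16 * index:446 + 16 * (index + 1)]
        type_byte = entry[4]
        if type_byte == 0:
            continue  # unused slot
        start_lba, sector_count = struct.unpack_from("<II", entry, 8)
        name = PARTITION_TYPES.get(type_byte, "unknown")
        entries.append((index + 1, type_byte, name, start_lba, sector_count))
    return entries


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as disk:  # read-only; needs root for a real disk
        first_sector = disk.read(512)
    for number, type_byte, name, start, count in mbr_partition_types(first_sector):
        print(f"partition {number}: type 0x{type_byte:02x} ({name}), "
              f"start {start}, {count} sectors")
```

This only reads the table. It does not answer whether deleting and re-creating the partition is safe, which depends on the new entry having exactly the same start and size.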

Don
Don Vanco - Linux Ninja
Regular Advisor

Re: EVA4000 BL460c RHEL 5 - LVM broken after update

At this point I am considering rolling back LVM, etc - but I'm just not clear what packages I'd really need to roll back - surely lvm, but possibly initscripts, disk-utils, etc, etc, etc
Don Vanco - Linux Ninja
Regular Advisor

Re: EVA4000 BL460c RHEL 5 - LVM broken after update

At present the root cause is unclear, but the issue seems to lie in an interaction between LVM and the partition types.

Somehow, the partitions on the LUNs used as PVs are "extended" and NOT LVM!!

It is not clear how this could be (the customer is trying to validate the method used to generate the partitioning)

The crazy thing is how it was "fixed"....

I split the RAID set (mirror) so that I had a drive in a known state that I could rebuild against.
I booted the secondary drive, and forced a roll-back of LVM to the prior release.

On reboot, this drive came up with all the LVM VGs/LVs intact!

So - assuming I had a fix, I again booted to the primary drive, intending to roll its version of LVM back as well - and it too came up without issue!! (new kernel, new version of LVM)

So - while the issue appears resolved, I am baffled as to _why_ it fixed itself, or how LVM was built, and is functional, against partitions labeled as "extended".

One final, interesting note - while both fdisk and parted see the partitions as extended, this tool:
http://www.cgsecurity.org/wiki/TestDisk
...sees LVM2 partition types!

If anyone cares to offer any kind of conjecture I'd love to hear it....