
Kenneth Platz
Esteemed Contributor

*WEIRD* LVM problem...

This morning as we were preparing to do a BCV refresh from our production system to our development system, we discovered that apparently all of the controller numbers on the disks attached to this server decided to "flip-flop".

We've managed to "un-fubar" the system (I'm imagining it wouldn't reboot correctly) by creating a new lvmtab file (via vgscan) and doing some selective vgexport/vgimports on problem VG's. The original /etc/lvmtab file (or at least strings of it) was:

[/] root@rockwall #strings /etc/lvmtab.orig
/dev/vg00
/dev/dsk/c0t0d0
/dev/dsk/c0t2d0
/dev/dsk/c1t0d0
/dev/dsk/c1t2d0
/dev/vgRABlog1
/dev/dsk/c2t8d0
/dev/vgRABlog2
/dev/dsk/c2t8d1
/dev/vgRABother
/dev/dsk/c2t8d2
/dev/dsk/c4t8d2


And the "new" lvmtab file looks like:
[/var/adm/syslog] root@rockwall #strings /etc/lvmtab
/dev/vg00
/dev/dsk/c4t0d0
/dev/dsk/c4t2d0
/dev/dsk/c5t0d0
/dev/dsk/c5t2d0
/dev/vgRABlog1
/dev/dsk/c0t8d0
/dev/dsk/c2t8d0
/dev/vgRABlog2
/dev/dsk/c0t8d1
/dev/dsk/c2t8d1
/dev/vgRABother
/dev/dsk/c0t8d2
/dev/dsk/c2t8d2
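
A rough sketch for comparing two listings like these, in case it helps anyone reading later: it flattens `strings` output into one "VG PV" line per disk so the old and new files can be run through diff. The function name vg_pv_pairs and the /tmp file names are made up for illustration, and it assumes every PV path starts with /dev/dsk/ and any other /dev/ line is a VG name.

```shell
# Hypothetical helper: turn `strings /etc/lvmtab` output (on stdin) into
# "VG PV" pairs, one per line.
# Assumption: PV paths start with /dev/dsk/; any other /dev/ line is a VG name.
vg_pv_pairs() {
  awk '
    /^\/dev\/dsk\// { if (vg != "") print vg, $1; next }   # a physical volume
    /^\/dev\//      { vg = $1 }                            # a new volume group
  '
}

# Possible usage on the two files from this post:
#   strings /etc/lvmtab.orig | vg_pv_pairs > /tmp/lvmtab.old
#   strings /etc/lvmtab      | vg_pv_pairs > /tmp/lvmtab.new
#   diff /tmp/lvmtab.old /tmp/lvmtab.new
```

Lines that do not start with /dev/ are ignored, so any stray strings pulled out of the binary file should not get in the way.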

Now we've got the problem fixed, but the important question is... what can cause this to happen? I've checked the /stand/ioconfig and /etc/ioconfig files, and they were last modified on Feb 6th (the last time the system was rebooted), but I am assured that nobody explicitly removed them (the only way I can think of that this would happen). Is there any way of determining the *create* time of a file (as opposed to the last modification time)? Any ideas/suggestions on how this could have happened?
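
On the create-time question: as far as I know, the inode on a classic UNIX file system keeps only three timestamps (access, modification, inode change) and no creation time. A minimal sketch to view all three with ls follows; show_times is a made-up name. The ctime also moves on chmod, chown, and writes, so it is only a hint and not a true creation time.

```shell
# Hypothetical helper: print the three timestamps kept for a file.
# $1 = file to inspect
show_times() {
  echo "mtime (contents last modified):"; ls -l  "$1"
  echo "ctime (inode last changed):";     ls -lc "$1"
  echo "atime (last accessed):";          ls -lu "$1"
}

# Possible usage:
#   show_times /stand/ioconfig
#   show_times /etc/ioconfig
```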
I think, therefore I am... I think!
RAC_1
Honored Contributor

Re: *WEIRD* LVM problem...

Are there alternate paths for every PV? If yes, were there any problems with the primary paths?
There is no substitute to HARDWORK
Kenneth Platz
Esteemed Contributor

Re: *WEIRD* LVM problem...

RAC, the alternate paths aren't the problem. For example, take a look at /dev/vg00 -- it goes from being on c0 and c1 to being on c4 and c5. Those are the same physical disks which just suddenly decided they wanted to be on a different controller instance.
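
To make the renumbering explicit, here is a small sketch. In a cXtYdZ device file name, the c number is the interface card instance, t is the target, and d is the device (LUN) number, so pairing old and new names shows which instance became which. The helper name instance_map is hypothetical, and it assumes both files list the same disks in the same order, one per line.

```shell
# Hypothetical helper: print each distinct "old instance -> new instance"
# mapping found by pairing device names line by line.
# $1 = file of old device names, $2 = file of new device names
# Assumption: both files list the same physical disks in the same order.
instance_map() {
  paste "$1" "$2" | awk '
    {
      was = $1; now = $2
      sub(/^.*\/c/, "", was); sub(/t.*$/, "", was)   # keep only the c number
      sub(/^.*\/c/, "", now); sub(/t.*$/, "", now)
      key = "c" was " -> c" now
      if (!(key in seen)) { seen[key] = 1; print key }
    }'
}
```

Fed the four vg00 disks from the old and new lvmtab listings, this prints "c0 -> c4" and "c1 -> c5": the targets and LUNs stay put and only the controller instance moves.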
I think, therefore I am... I think!
RAC_1
Honored Contributor

Re: *WEIRD* LVM problem...

Where are these disks coming from?
There is no substitute to HARDWORK
melvyn burnard
Honored Contributor

Re: *WEIRD* LVM problem...

Was anything changed on the SAN (assuming you have one), and what type of array is in use?
I have heard of this type of issue occurring when WWN changes happen, and I seem to recall one model of array that could occasionally "hiccup" and do this.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Kenneth Platz
Esteemed Contributor

Re: *WEIRD* LVM problem...

The disks I'm *most* concerned with are the disks in /dev/vg00. These disks are standard 36.4G HP disks, connected to two channels of a dual-SCSI combo card (i.e., c4t0d0 and c4t2d0 are on one channel, c5t0d0 and c5t2d0 are on the other, and they are mirrored from one channel to the other). We did not change any WWNs on the Fibre Channel side, and generally when WWNs *do* change, the controller numbers increment -- I've never seen them "flip-flop" like this.

Again, my primary concern is the disks for vg00 -- how the heck can these guys "flip-flop"?
I think, therefore I am... I think!
Denver Osborn
Honored Contributor

Re: *WEIRD* LVM problem...

If you wouldn't mind, can we see what the vg00.conf and vg00.conf.old files say about vg00.


vgcfgrestore -f /etc/lvmconf/vg00.conf -l

vgcfgrestore -f /etc/lvmconf/vg00.conf.old -l


thanks,
-denver

Kenneth Platz
Esteemed Contributor

Re: *WEIRD* LVM problem...

[/etc/lvmconf] root@rockwall #vgcfgrestore -f ./vg00.conf -l
Volume Group Configuration information in "./vg00.conf"
VG Name /dev/vg00
---- Physical volumes : 4 ----
/dev/rdsk/c4t0d0 (Bootable)
/dev/rdsk/c4t2d0 (Non-bootable)
/dev/rdsk/c5t0d0 (Bootable)
/dev/rdsk/c5t2d0 (Bootable)
[/etc/lvmconf] root@rockwall #vgcfgrestore -f ./vg00.conf.old -l
Volume Group Configuration information in "./vg00.conf.old"
VG Name /dev/vg00
---- Physical volumes : 4 ----
/dev/rdsk/c0t0d0 (Bootable)
/dev/rdsk/c0t2d0 (Non-bootable)
/dev/rdsk/c1t0d0 (Bootable)
/dev/rdsk/c1t2d0 (Bootable)

And no, c5t2d0/c1t2d0 should not *really* be bootable, but we had a rookie admin in here who, while he was mirroring the root disks, decided to pvcreate -B everything.
I think, therefore I am... I think!
Kent Ostby
Honored Contributor

Re: *WEIRD* LVM problem...

My experience has been that when you see this, /stand/ioconfig has gotten redone for some reason and "in the meantime", you've added new hardware which the system finds sooner than the root paths.

"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Kenneth Platz
Esteemed Contributor

Re: *WEIRD* LVM problem...

Oz,

Yeah, I had figured that the most likely culprit was that both /stand/ioconfig and /etc/ioconfig got nuked somehow, but I'm *pretty* certain nobody intentionally nuked them.

Do you know of any other way this could have happened, or if there are any processes/jobs that are known to nuke the ioconfig files?

TIA
I think, therefore I am... I think!
Rory R Hammond
Trusted Contributor

Re: *WEIRD* LVM problem...

Ken,

I have some boxes connected via dual McDATA switches to our SAN (IBM). A zoning change on the switch would/could/might cause this problem.


Rory
There are a 100 ways to do things and 97 of them are right

Re: *WEIRD* LVM problem...

Ken,

Well, disk device file names are completely based on hardware paths, so another possibility, rather than ioconfig being reconstructed, is that something changed in the hardware paths. As parts of the hardware path are based on FCIDs in SANs, these *can* change... can you post an ioscan -fn output?
(preferably as an attachment - that way it won't lose its formatting)
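
A quick way to see which hardware path each controller instance is bound to, as a sketch only: the c number in a device file name is the instance number of the matching ext_bus entry in ioscan output, so listing those entries shows the instance-to-path binding. bus_paths is a made-up helper name, and it assumes the usual ioscan -f column order (class, instance, H/W path, driver, ...).

```shell
# Hypothetical helper: reduce `ioscan -f` output (on stdin) to
# "cN hardware-path" lines, one per ext_bus (interface card) entry.
# Assumption: columns are Class, Instance, H/W Path, Driver, ...
bus_paths() {
  awk '$1 == "ext_bus" { print "c" $2, $3 }'
}

# Possible usage:
#   ioscan -fnC ext_bus | bus_paths
```

Saving that output now and again after the next reboot would show whether the hardware paths themselves changed or only the instance numbers assigned to them.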

HTH

Duncan

I am an HPE Employee