Re: WEIRD LVM problem...

Kenneth Platz · ‎02-18-2005

This morning as we were preparing to do a BCV refresh from our production system to our development system, we discovered that apparently all of the controller numbers on the disks attached to this server decided to "flip-flop".

We've managed to "un-fubar" the system (I imagine it wouldn't have rebooted correctly otherwise) by creating a new lvmtab file (via vgscan) and doing some selective vgexport/vgimports on the problem VGs. The original /etc/lvmtab file (or at least strings of it) was:

[/] root@rockwall #strings /etc/lvmtab.orig
/dev/vg00
/dev/dsk/c0t0d0
/dev/dsk/c0t2d0
/dev/dsk/c1t0d0
/dev/dsk/c1t2d0
/dev/vgRABlog1
/dev/dsk/c2t8d0
/dev/vgRABlog2
/dev/dsk/c2t8d1
/dev/vgRABother
/dev/dsk/c2t8d2
/dev/dsk/c4t8d2

And the "new" lvmtab file looks like:
[/var/adm/syslog] root@rockwall #strings /etc/lvmtab
/dev/vg00
/dev/dsk/c4t0d0
/dev/dsk/c4t2d0
/dev/dsk/c5t0d0
/dev/dsk/c5t2d0
/dev/vgRABlog1
/dev/dsk/c0t8d0
/dev/dsk/c2t8d0
/dev/vgRABlog2
/dev/dsk/c0t8d1
/dev/dsk/c2t8d1
/dev/vgRABother
/dev/dsk/c0t8d2
/dev/dsk/c2t8d2
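
To see exactly which paths moved, the two listings can be compared mechanically. The following is only a rough Python sketch, not an HP tool: it assumes the `strings` output contains nothing but volume group and `/dev/dsk/` lines (as in the two listings here), and the function names `parse_lvmtab_strings` and `diff_lvmtab` are made up for illustration.

```python
# Sketch: compare two `strings /etc/lvmtab` listings (data copied from this post).
OLD = """
/dev/vg00 /dev/dsk/c0t0d0 /dev/dsk/c0t2d0 /dev/dsk/c1t0d0 /dev/dsk/c1t2d0
/dev/vgRABlog1 /dev/dsk/c2t8d0
/dev/vgRABlog2 /dev/dsk/c2t8d1
/dev/vgRABother /dev/dsk/c2t8d2 /dev/dsk/c4t8d2
"""
NEW = """
/dev/vg00 /dev/dsk/c4t0d0 /dev/dsk/c4t2d0 /dev/dsk/c5t0d0 /dev/dsk/c5t2d0
/dev/vgRABlog1 /dev/dsk/c0t8d0 /dev/dsk/c2t8d0
/dev/vgRABlog2 /dev/dsk/c0t8d1 /dev/dsk/c2t8d1
/dev/vgRABother /dev/dsk/c0t8d2 /dev/dsk/c2t8d2
"""

def parse_lvmtab_strings(text):
    """Group each /dev/dsk/ path under the volume group name that precedes it."""
    groups = {}
    current = None
    for token in text.split():
        if token.startswith("/dev/dsk/"):
            if current is not None:
                groups[current].append(token)
        elif token.startswith("/dev/"):
            # Assumption: any other /dev/ entry is a volume group name.
            current = token
            groups[current] = []
    return groups

def diff_lvmtab(old, new):
    """Return, per changed volume group, the paths that disappeared and appeared."""
    changes = {}
    for vg in sorted(set(old) | set(new)):
        before, after = set(old.get(vg, [])), set(new.get(vg, []))
        if before != after:
            changes[vg] = {
                "removed": sorted(before - after),
                "added": sorted(after - before),
            }
    return changes

if __name__ == "__main__":
    result = diff_lvmtab(parse_lvmtab_strings(OLD), parse_lvmtab_strings(NEW))
    for vg, change in result.items():
        print(vg, "removed:", change["removed"], "added:", change["added"])
```

Run against the two listings, this reports every vg00 path as replaced, while the vgRAB* groups each gain a c0 path (and vgRABother loses its c4 path).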

Now we've got the problem fixed, but the important question is... what can cause this to happen? I've checked the /stand/ioconfig and /etc/ioconfig files, and they were last modified on Feb 6th (the last time the system was rebooted), but I am assured that nobody explicitly removed them (the only way I can think of that this would happen). Is there any way of determining the *create* time of a file (as opposed to the last modification time)? Any ideas/suggestions on how this could have happened?
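
On the create-time question: a traditional UNIX inode carries three timestamps (access, modification, and inode change), and the "ctime" shown by `ls -lc` is the last inode change, not creation. Whether a given filesystem records a separate birth time depends on the filesystem; I would not assume the ones on this box do. A small Python sketch of what `stat` can and cannot tell you follows (the helper name `file_times` is made up, and Python itself is an assumption here, since it is not part of a stock HP-UX install):

```python
import os

def file_times(path):
    """Return the three classic inode timestamps as seconds since the epoch."""
    st = os.stat(path)
    return {
        "atime": st.st_atime,  # last access
        "mtime": st.st_mtime,  # last modification of the file contents
        "ctime": st.st_ctime,  # last inode change on UNIX, NOT creation time
    }

if __name__ == "__main__":
    # Paths taken from the post; skipped silently on systems that lack them.
    for path in ("/stand/ioconfig", "/etc/ioconfig"):
        if os.path.exists(path):
            print(path, file_times(path))
```

If mtime and ctime on both ioconfig files match the Feb 6th reboot, that is at least consistent with the files having been rewritten at boot, but it cannot prove they were created then.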

I think, therefore I am... I think!

RAC_1 · ‎02-18-2005

Are there alternate paths for every PV? If yes, were there any problems with the primary paths?

There is no substitute to HARDWORK

Kenneth Platz · ‎02-18-2005

RAC, the alternate paths aren't the problem. For example, take a look at /dev/vg00 -- it goes from being on c0 and c1 to being on c4 and c5. Those are the same physical disks which just suddenly decided they wanted to be on a different controller instance.

I think, therefore I am... I think!

RAC_1 · ‎02-18-2005

Where are these disks coming from?

There is no substitute to HARDWORK

melvyn burnard · ‎02-18-2005

Was anything changed on the SAN (assuming you have one), and what type of array is in use?
I have heard of this type of issue occurring when WWN changes happen, and I seem to recall that one model of array could occasionally "hiccup" and do this.

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!

Kenneth Platz · ‎02-18-2005

The disks I'm *most* concerned with are the disks in /dev/vg00. These are standard 36.4G HP disks connected to two channels of a dual-SCSI combo card (i.e., c4t0d0 and c4t2d0 are on one channel, c5t0d0 and c5t2d0 are on the other, and they are mirrored from one channel to the other). We did not change any WWNs on the Fibre Channel side, and generally when WWNs *do* change, the controller numbers increment -- I've never seen them "flip-flop" like this.

Again, my primary concern is the disks in vg00 -- how the heck can these guys "flip-flop"?

I think, therefore I am... I think!

Denver Osborn · ‎02-18-2005

If you wouldn't mind, can we see what the vg00.conf and vg00.conf.old files say about vg00?

vgcfgrestore -f /etc/lvmconf/vg00.conf -l

vgcfgrestore -f /etc/lvmconf/vg00.conf.old -l

thanks,
-denver

Kenneth Platz · ‎02-18-2005

[/etc/lvmconf] root@rockwall #vgcfgrestore -f ./vg00.conf -l
Volume Group Configuration information in "./vg00.conf"
VG Name /dev/vg00
---- Physical volumes : 4 ----
/dev/rdsk/c4t0d0 (Bootable)
/dev/rdsk/c4t2d0 (Non-bootable)
/dev/rdsk/c5t0d0 (Bootable)
/dev/rdsk/c5t2d0 (Bootable)
[/etc/lvmconf] root@rockwall #vgcfgrestore -f ./vg00.conf.old -l
Volume Group Configuration information in "./vg00.conf.old"
VG Name /dev/vg00
---- Physical volumes : 4 ----
/dev/rdsk/c0t0d0 (Bootable)
/dev/rdsk/c0t2d0 (Non-bootable)
/dev/rdsk/c1t0d0 (Bootable)
/dev/rdsk/c1t2d0 (Bootable)

And no, c5t2d0/c1t2d0 should not *really* be bootable, but we had a rookie admin in here who, while he was mirroring the root disks, decided to pvcreate -B everything.
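
For what it's worth, the two listings can be turned into an explicit old-to-new controller map. This is just a sketch in Python under one loud assumption: the physical volumes appear in the same order in vg00.conf and vg00.conf.old, so they can be paired by position. The names `pv_list` and `controller_remap` are made up for illustration.

```python
import re

# Matches block or raw device files such as /dev/rdsk/c4t0d0.
DEV_RE = re.compile(r"/dev/r?dsk/c(\d+)t(\d+)d(\d+)")

OLD_LISTING = """
/dev/rdsk/c0t0d0 (Bootable)
/dev/rdsk/c0t2d0 (Non-bootable)
/dev/rdsk/c1t0d0 (Bootable)
/dev/rdsk/c1t2d0 (Bootable)
"""
NEW_LISTING = """
/dev/rdsk/c4t0d0 (Bootable)
/dev/rdsk/c4t2d0 (Non-bootable)
/dev/rdsk/c5t0d0 (Bootable)
/dev/rdsk/c5t2d0 (Bootable)
"""

def pv_list(listing):
    """Extract (controller, target, lun) tuples in the order they are listed."""
    return [tuple(int(n) for n in m.groups()) for m in DEV_RE.finditer(listing)]

def controller_remap(old_listing, new_listing):
    """Pair PVs by position (assumption) and map old controller -> new controller."""
    old, new = pv_list(old_listing), pv_list(new_listing)
    if len(old) != len(new):
        raise ValueError("listings contain a different number of PVs")
    remap = {}
    for (oc, ot, od), (nc, nt, nd) in zip(old, new):
        if (ot, od) != (nt, nd):
            raise ValueError("target/LUN mismatch; positional pairing is not safe")
        if remap.setdefault(oc, nc) != nc:
            raise ValueError("controller c%d maps to more than one new instance" % oc)
    return remap

if __name__ == "__main__":
    print(controller_remap(OLD_LISTING, NEW_LISTING))
```

With the vg00 listings above, the map comes out as c0 to c4 and c1 to c5, with targets and LUNs unchanged, which matches the idea that only the controller instance numbers moved.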

I think, therefore I am... I think!

Kent Ostby · ‎02-18-2005

My experience has been that when you see this, /stand/ioconfig has gotten redone for some reason and, "in the meantime", you've added new hardware which the system finds sooner than the root paths.

"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"

Kenneth Platz · ‎02-18-2005

Oz,

Yeah, I had figured that the most likely culprit was that both /stand/ioconfig and /etc/ioconfig got nuked somehow, but I'm *pretty* certain nobody intentionally nuked them.

Do you know of any other way this could have happened, or if there are any processes/jobs that are known to nuke the ioconfig files?

TIA

I think, therefore I am... I think!

Rory R Hammond · ‎02-19-2005

Ken,

I have some boxes connected via dual McDATA switches to our SAN (IBM). A zoning change on the switch would/could/might cause this problem.

Rory

There are a 100 ways to do things and 97 of them are right

Duncan Edmonstone · ‎02-19-2005

Ken,

Well, disk device file names are completely based on hardware paths, so another possibility, rather than ioconfig being reconstructed, is that something changed in the hardware paths. As parts of the hardware path are based on FCIDs in SANs, these *can* change... can you post an ioscan -fn output?
(preferably as an attachment - that way it won't lose its formatting)

HTH

Duncan

I am an HPE Employee
