Operating System - HP-UX

When are stale extents not a failed disk?

 
Yvonne Butler
Regular Advisor

I have an unusual problem here. There are two LVs on a mirrored pair of disks, and one of the two disks is showing up in lvdisplay as ??? (see the example below). The disk has been replaced by an HP engineer, but the same problem remains and an lvsync fails with the message:

lvsync: Couldn't re-synchronize stale partitions of the logical volume:
I/O error
lvsync: Couldn't resynchronize logical volume "/dev/vgsg02/lvol6".
lvsync: Couldn't re-synchronize stale partitions of the logical volume:
I/O error
lvsync: Couldn't resynchronize logical volume "/dev/vgsg02/lvol6".


--- Logical volumes ---
LV Name /dev/vgsg02/lvol6
VG Name /dev/vgsg02
LV Permission read/write
LV Status available/stale
Mirror copies 1
Consistency Recovery MWC
Schedule parallel
LV Size (Mbytes) 8000
Current LE 2000
Allocated PE 4000
Stripes 0
Stripe Size (Kbytes) 0
Bad block on
Allocation strict
IO Timeout (Seconds) default

--- Distribution of logical volume ---
PV Name LE on PV PE on PV
/dev/dsk/c4t8d0 2000 2000

--- Logical extents ---
LE PV1 PE1 Status 1 PV2 PE2 Status 2
00000 ??? 00000 stale /dev/dsk/c4t8d0 00000 current
00001 ??? 00001 stale /dev/dsk/c4t8d0 00001 current
00002 ??? 00002 stale /dev/dsk/c4t8d0 00002 current
00003 ??? 00003 stale /dev/dsk/c4t8d0 00003 current
00004 ??? 00004 stale /dev/dsk/c4t8d0 00004 current
00005 ??? 00005 stale /dev/dsk/c4t8d0 00005 current
00006 ??? 00006 stale /dev/dsk/c4t8d0 00006 current
00007 ??? 00007 stale /dev/dsk/c4t8d0 00007 current

Any ideas anyone?
Pete Randall
Outstanding Contributor

Re: When are stale extents not a failed disk?

The I/O error from lvsync leads me to believe you've still got a problem with the disk. Try running dd against it:

dd if=/dev/rdsk/c4t8d0 of=/dev/null bs=1024k

If you get an I/O error from dd, the disk is bad.
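
A minimal sketch of that check wrapped in a shell function, assuming a POSIX shell. The function name check_read is made up for illustration. It only reports whether dd exits cleanly, so a dd that hangs instead of returning an error will not be caught by it.

```shell
# Read a device (or file) end to end and report whether dd hit an error.
# check_read is a hypothetical helper name, not an HP-UX command.
check_read() {
    if dd if="$1" of=/dev/null bs=1024k 2>/dev/null; then
        echo "OK"
    else
        echo "READ ERROR"
    fi
}

# Example call against the raw device from this thread:
# check_read /dev/rdsk/c4t8d0
```

Reading the raw device (/dev/rdsk/...) rather than the block device bypasses the buffer cache, so the read exercises the disk itself.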


Pete

Pete
Robert-Jan Goossens
Honored Contributor

Re: When are stale extents not a failed disk?

Hi Yvonne,

Chances are you are dealing with a failing disk (bad blocks).

Check this doc.

Document description: lvsync: "couldn't re-synchronize" - Device offline/Powerfailed
Document id: KBRC00015370

US
http://www2.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000076534445

Europe
http://www4.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000076534445

Best regards,
Robert-Jan
Patrick Wallek
Honored Contributor

Re: When are stale extents not a failed disk?

What steps were taken when the disk was replaced? What type of disk is it?

What does ioscan show for the disk?

If this disk is in an FC10, then you might have to run fcmsutil with the replace disk option before the OS will see it.

The ??? where the PV name should be still indicates that the VG is not aware of the disk.

Things to do when replacing a disk:

vgcfgrestore to restore the VG config to the disk
vgchange to reactivate the VG
vgsync to sync mirrors
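
As a rough sketch, those three steps could look like the following for the volume group in this thread. The raw device path /dev/rdsk/c6t4d0 is an assumption (use the device file of the replaced disk), and the helper name replaced_disk_steps is made up. The function only prints the commands so they can be reviewed before being run by hand as root.

```shell
# Print (do not run) the LVM steps for a replaced mirror disk.
# $1 = volume group, $2 = raw device file of the new disk (assumed values below)
replaced_disk_steps() {
    vg=$1
    rawdisk=$2
    echo "vgcfgrestore -n $vg $rawdisk"   # restore the saved VG config to the new disk
    echo "vgchange -a y $vg"              # reactivate the VG so it attaches the disk
    echo "vgsync $vg"                     # resynchronize the stale mirror extents
}

replaced_disk_steps /dev/vgsg02 /dev/rdsk/c6t4d0
```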


BFA6
Respected Contributor

Re: When are stale extents not a failed disk?

Hi, I'm the other sysadmin at the site with this problem.

We did think down the hardware route ourselves, and a dd just hung: no errors, no entries in syslog, it just hung. Likewise, a cstm verify hung as well.

ioscan -fnC showed all disks claimed.

The engineer tried two new disks and still got the same problem. He did the fcmsutil replace_dsk, which responded OK, and a diskinfo works on the raw volume, but vgcfgrestore etc. still fail.
vgdisplay says:

vgdisplay: Warning: couldn't query physical volume "/dev/dsk/c6t4d0":
The specified path does not correspond to physical volume attached to
this volume group

It is almost as though the device file path to the disk has changed or the WWN on the disk is different.

We have a software call in with HP, but any ideas are welcome.
Yvonne Butler
Regular Advisor

Re: When are stale extents not a failed disk?

We're now getting an EMS alert, so it might be hardware?!:

Event Time..........: Thu Jul 14 16:00:11 2005
Severity............: SERIOUS
Monitor.............: disk_em
Event #.............: 100172
System..............: logtms02

Summary:
Disk at hardware path 0/2/0/0.8.0.255.1.4.0 : Device connectivity or
hardware failure


Description of Error:

The device was not ready to process requests, but the cause is not
reportable.
generic_1
Respected Contributor

Re: When are stale extents not a failed disk?

Read this document; it was made just for you :).
http://docs.hp.com/en/5991-1236/When_Good_Disks_Go_Bad.pdf

If you did all the steps to restore the disk after replacing it, it looks like you have a second bad disk or a controller problem.
In that case I would get a CE back out and get that puppy fixed before you lose data if the other disk goes.

The document above takes you through it step by step and also covers proper disk replacement, if you need that.
BFA6
Respected Contributor

Re: When are stale extents not a failed disk?

Hi Jeff,

That document is really useful, thanks for the link. Looking through that list, our circumstances are a bit of a mix of symptoms. The oddest thing is the ??? in place of the device file in vgdisplay, while ioscan and diskinfo work fine.

I'm going to re-open the hardware call and HP can sort out whether the cause is software or hardware. I think it's now a combination of the two.
BFA6
Respected Contributor
Solution

Re: When are stale extents not a failed disk?

Well, we've fixed our problem. In the end we went for a reboot first and then a replacement of the disk. At the reboot stage we did get some weird messages about a disk address not being unique.

After the reboot we tried dd on the disk. This time it didn't hang; it just took a long time to time out, which was different. Also, diskinfo came back with a size of 0 instead of reporting normally as it had before.

We were thinking of a backplane problem at this point (we had a spare on site in anticipation of this) but thought we'd try a third hard disk in the slot. It was at this point that we found that the second disk the other HP engineer had tried was the wrong size. It was an 18GB disk and should have been a 36GB one like its partner in the mirror. That was obviously not helping.

We put in the correct size disk and it all synced up fine.
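
For anyone hitting the same thing, a size mismatch like this could be caught up front with a quick comparison of the two disks. The following is only a sketch: the helper names size_kb and big_enough are made up, and it assumes diskinfo prints a line of the form "size: N Kbytes" and that the replacement should be at least as large as its mirror partner.

```shell
# Extract the size in Kbytes from diskinfo output on stdin
# (assumes a line of the form "size: N Kbytes").
size_kb() {
    awk '$1 == "size:" { print $2 }'
}

# Succeeds only if the replacement ($1, Kbytes) is at least as
# large as its mirror partner ($2, Kbytes).
big_enough() {
    [ "$1" -ge "$2" ]
}

# Example use, with the device files from this thread as assumed paths:
# new=$(diskinfo /dev/rdsk/c6t4d0 | size_kb)
# old=$(diskinfo /dev/rdsk/c4t8d0 | size_kb)
# big_enough "$new" "$old" || echo "replacement disk is too small"
```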

So was it just hardware from the start? Well, I don't think so, as the first disk that the first engineer tried was the right size (afaik) and the dd's were hanging rather than timing out with errors.

Anyway, all's well that ends well (kinda), though we will be having words with the first HP engineer we got.

Thanks to everyone who posted suggestions.