Re: Stale extents in VG00

R.K. # · ‎02-10-2009

Hi All,

Lots of stuff for you guys to help me out of the situation :|

I had the same issue two days ago with /dev/vg00/lvol7; I ran lvsync and it worked fine. Now the problem is in lvol6 and lvol8, and lvsync is NOT working.

Here is everything I have:

rp5430, HP-UX 11.23

# vgdisplay -v vg00 |more
--- Volume groups ---
VG Name /dev/vg00
VG Write Access read/write
VG Status available
Max LV 255
Cur LV 8
Open LV 8
Max PV 16
Cur PV 4
Act PV 4

LV Name /dev/vg00/lvol6
LV Status available/stale
LV Size (Mbytes) 512
Current LE 64
Allocated PE 128
Used PV 2

LV Name /dev/vg00/lvol8
LV Status available/stale
LV Size (Mbytes) 4608
Current LE 576
Allocated PE 1152
Used PV 2

All disks are "available"

Problematic LVs:
/dev/vg00/lvol6
/dev/vg00/lvol8

# strings /etc/lvmtab
/dev/vg00
/dev/dsk/c1t2d0
/dev/dsk/c1t0d0
/dev/dsk/c2t0d0
/dev/dsk/c2t2d0

# lvdisplay -v /dev/vg00/lvol6 |more
--- Logical extents ---
LE PV1 PE1 Status 1 PV2 PE2 Status 2
00000 /dev/dsk/c1t0d0 01796 stale /dev/dsk/c2t0d0 01136 current

# lvdisplay -v /dev/vg00/lvol8 |more
--- Logical extents ---
LE PV1 PE1 Status 1 PV2 PE2 Status 2
00000 /dev/dsk/c1t0d0 02372 current /dev/dsk/c2t0d0 01712 current
00001 /dev/dsk/c1t0d0 02373 current /dev/dsk/c2t0d0 01713 current
00002 /dev/dsk/c1t0d0 02374 stale /dev/dsk/c2t0d0 01714 current
00003 /dev/dsk/c1t0d0 02375 current /dev/dsk/c2t0d0 01715 current
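
To see at a glance which disk the stale extents sit on, the extent listing can be tallied per PV. This is a minimal sketch, assuming a two-way mirror (seven columns per extent row, as in the listings above). The function name stale_by_pv is made up for illustration and is not part of LVM.

```shell
# Count stale extents per physical volume.
# Reads `lvdisplay -v` output on stdin; assumes a two-way mirror.
stale_by_pv() {
    awk '
        # Extent rows look like: LE PV1 PE1 Status1 PV2 PE2 Status2
        $1 ~ /^[0-9]+$/ && NF == 7 {
            if ($4 == "stale") n[$2]++
            if ($7 == "stale") n[$5]++
        }
        END { for (pv in n) print pv, n[pv] }
    ' | sort
}
```

Usage would be something like: lvdisplay -v /dev/vg00/lvol8 | stale_by_pv. If every stale extent lands on the same PV, that points at that disk or its path rather than at the LV.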

# ioscan -fnC disk
disk 0 0/0/1/1.0.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
/dev/dsk/c1t0d0 /dev/rdsk/c1t0d0

# diskinfo /dev/rdsk/c1t0d0
It is OK

# dd if=/dev/rdsk/c1t0d0 of=/dev/null bs=1024k
It is also OK

When I try to resync the lvols, I get the following error:
# /usr/sbin/lvsync /dev/vg00/lvol6
lvsync: Couldn't re-synchronize stale partitions of the logical volume:
Device offline/Powerfailed

lvsync: Couldn't resynchronize logical volume "/dev/vg00/lvol6".
#

There is nothing related to disk errors in syslog.

The hardware engineer says the disk is OK.
If it is NOT a disk issue, what can it be?
Can it be a patch issue?

Have a look at two of the LVM patches:
# PHCO_34036 1.0 LVM commands patch
# PHKL_33312 1.0 LVM Cumulative Patch

Thanks in advance.
R.K.

Don't fix what ain't broke

Steven E. Protter · ‎02-10-2009

Shalom,

If lvsync did the job a few days ago on the same problem and now does not work, that tells me the problem is disk related.

It leads me to believe a whole section of the disk is not working and the problem is going to get worse.

"Device offline/Powerfailed" points to a bad disk. It passed the dd read test (good job), but it's not good enough to handle lvsync, which, as you can imagine, is a little disk intensive.

Things to do:

1) Take an Ignite-UX make_tape_recovery or make_net_recovery backup immediately. The system appears to be mirrored, but it's getting worse and you could lose the system.

2) Consult this document for further steps:
http://docs.hp.com/en/T1859-90048/ch04s07.html

I've seen hot-swap disks behave in this manner. You pop it out, pop it back in, lvsync works and everything seems fine. Some time later the symptom returns.

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Avinash20 · ‎02-10-2009

"Device offline/Powerfailed" usually means that a disk drive didn't respond to
a request but such an event should be sent to syslog.log by the drivers. Since
this didn't happen, the real issue is probably an I/O error which is
incorrectly reported by lvsync.
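
To double-check that the drivers really logged nothing, the syslog can be filtered for LVM and SCSI messages. This is a minimal sketch: the function name disk_events and the pattern list are assumptions for illustration, so adjust the patterns to whatever your drivers actually emit.

```shell
# Print syslog lines that mention LVM or SCSI trouble.
# The pattern list is an assumption, not an exhaustive set of driver messages.
# Usage: disk_events [logfile]   (defaults to the HP-UX syslog path)
disk_events() {
    grep -i -E 'lvm|scsi|powerfail|lbolt' "${1:-/var/adm/syslog/syslog.log}"
}
```

An empty result around the time of the failed lvsync would fit the theory that the error is an I/O error reported under the wrong name.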

Here is an example (from a different system) of /stand showing a bad block:

As the read tests worked fine as well, chances are we're dealing with a write
problem to the stale extent on /dev/dsk/c5t6d0.

Using the unsupported bbdir.11 tool we checked the bad block directory
on the suspect drive:

# bbdir.11 -d /dev/rdsk/c5t6d0
Block device "/dev/dsk/c5t6d0" in vg /dev/vg00
VG /dev/vg00 is activated
opening device /dev/rdsk/c5t6d0 block size 1
Using the primary BBDIR
[1] bad block 16489 (PE# 4 data block 13577) alternate 0
There are 1 entries in the BBDIR
No changes being made to the bbdir

Here we clearly see that a block was marked bad but couldn't be relocated
(alternate is 0) because /stand has to be contiguous. Under these
circumstances, read requests might still succeed, but a write request to
the block will immediately return an error.

Replacing disk c5t6d0 and doing another lvsync fixed the problem.

Note: Please contact your local HP Response Center to obtain a copy of
the bbdir.11 tool.

"Light travels faster than sound. That's why some people appear bright until you hear them speak."

Ganesan R · ‎02-10-2009

Hi,

It is possible that the disk is going offline intermittently.

Have you replaced the disk before? I would go with disk replacement first, then look into other causes if the problem persists after the replacement.

To confirm, you can remove the mirror copy on c1t0d0, mirror onto another disk/LUN, and watch the situation. If the same issue appears, then you can check other things.
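
The mirror move can be sketched as a dry run that only prints the commands. The assumptions here: each LV has exactly one mirror copy, the replacement disk has already been added to vg00, and /dev/dsk/cXtYdZ is a placeholder for it. The function name remirror_plan is hypothetical. If c1t0d0 is a boot disk, extra boot-area steps apply that this sketch leaves out.

```shell
# Print (do not run) the commands to move a mirror copy to another disk.
# Usage: remirror_plan OLD_PV NEW_PV LV [LV...]
remirror_plan() {
    old=$1 new=$2
    shift 2
    for lv in "$@"; do
        # Drop the mirror copy that lives on the suspect disk.
        echo "lvreduce -m 0 $lv $old"
    done
    for lv in "$@"; do
        # Re-create the mirror copy on the replacement disk.
        echo "lvextend -m 1 $lv $new"
    done
}
```

For example: remirror_plan /dev/dsk/c1t0d0 /dev/dsk/cXtYdZ /dev/vg00/lvol6 /dev/vg00/lvol8. Review the printed commands before running any of them, and list every LV that has a copy on the old disk.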

Best wishes,

Ganesh.

R.K. # · ‎02-10-2009

Hi All,

I will try with a different disk.

But could it be related in any way to the LVM patches?

-R.K.

Don't fix what ain't broke
