Operating System - HP-UX

disk fails to mirror in one lvol, but not in another

 
Stuart N
Occasional Advisor

K580 HPUX 10.20 FC-AL connected disks in FC1010D enclosures, LVM vxfs.

A 9G disk (c16t6d0) in a mirrored lvol (vg39/lvol1) failed with bad PEs 418, 437, 444.

It has been replaced, but the replacement disk fails to mirror with the same stale PEs!
Rebooted the machine.
I got an engineer to change the disk again (and checked the disk serial number with stm to make sure the same one had not been put back in!).

Put c16t6d0 into a different vg of the same size: mirrored lvol ok, no stale PEs.
Put a different 9G disk in vg39: mirrored lvol ok.

Put c16t6d0 in vg39 and it fails to mirror the lvol, with the same stale PEs!

So c16t6d0 fails with the same PEs stale in vg39/lvol1, but is ok in vg47/lvol1.
c17t14d0 syncs ok in vg39/lvol1.

I know this doesn't make any sense, but I have repeated it several times.
Mohanasundaram_1
Honored Contributor

Re: disk fails to mirror in one lvol, but not in another

Hi Stuart,

At least it makes sense to me. :-)

The disk you are considering good is the one with the problem. That is why you cannot synchronize the LV even after replacing the drive.

So c16t6d0 may not be the failed disk. You can confirm this by running dd on both of the disks in this VG.
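
The reasoning above can be illustrated with a toy model. This is only a sketch of the hypothesis, not how HP-UX LVM is implemented: it assumes a resync copies each PE from the existing copy to the new one, and that a PE stays stale on the target whenever the read from the source fails. The function name `resync` and the figure of 2000 PEs are made up for the illustration.

```python
# Toy model of a mirror resync. NOT the real LVM algorithm, only a sketch
# of the hypothesis that the fault is on the disk being read from.

def resync(source_bad_pes, total_pes):
    """Return the PEs left stale on the target after copying from the source.

    source_bad_pes: PEs assumed to be unreadable on the source copy.
    total_pes: number of PEs in the mirrored lvol (hypothetical value).
    """
    stale = []
    for pe in range(total_pes):
        if pe in source_bad_pes:
            # The read from the source failed, so no valid data reaches
            # the target and its extent stays stale.
            stale.append(pe)
    return stale

# Assumption for illustration: the "good" disk cannot read PEs 418, 437, 444.
bad_on_source = {418, 437, 444}

# Two different replacement disks get the same stale list, because the
# result depends only on the source disk.
first_replacement = resync(bad_on_source, total_pes=2000)
second_replacement = resync(bad_on_source, total_pes=2000)
print(first_replacement)
print(second_replacement)
```

In this model, swapping the target disk never changes which PEs go stale, which would match a replacement disk failing with the same three PEs.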

I have seen this problem myself once. Hope this helps.

With regards,
Mohan.
Attitude, Not aptitude, determines your altitude
Thayanidhi
Honored Contributor

Re: disk fails to mirror in one lvol, but not in another

Hi,
I noticed a problem like this once, a few years ago. Your current disk (c16t6d0) may be bad. Whatever disk you try to mirror, the mirrored extent becomes stale!
I suggest you take a backup and recreate the lv/vg on that disk, or replace the disk.
I saw this kind of problem with 10.20 around 7 years back.

Regds
TT
Attitude (not aptitude) determines altitude.
Devender Khatana
Honored Contributor

Re: disk fails to mirror in one lvol, but not in another

Hi,

There may still be a disk problem. The disk could be having intermittent problems. One possibility is that the disk has trouble when copying a large amount of data; this can be checked by comparing the sizes of the two lvols, the one where the mirror succeeded and the one where it did not.

A good workaround would be to leave another disk in this VG and use this one elsewhere, but continue monitoring it for some time.

HTH,
Devender
Impossible itself mentions "I m possible"
Stuart N
Occasional Advisor

Re: disk fails to mirror in one lvol, but not in another

Originally, when vg39 | c16t6d0 errored I added a 2nd mirror (3rd plex) to vg39, then removed c16t6d0 - so back to one mirror (2 plex).

Put yet another 9G disk in vg39, mirrored lvol ok (see ref to 'c17t14d0 syncs ok in vg39/lvol1').

So c17t14d0 syncs ok in vg39/lvol1, but c16t6d0 does not. Despite the fact that c16t6d0 is now a replaced disk. The new c16t6d0 syncs ok in vg47/lvol1.

dd on plex1 in vg39 = ok
dd on plex2 in vg39 = ok


Thayanidhi
Honored Contributor

Re: disk fails to mirror in one lvol, but not in another

When you mirror on vg47, are you using the full area of the disks (PE size and number of PEs)? In vg39, what is the size of the LV you are trying to mirror to this PV?
When you try to establish the mirror to c16 on vg39, does it complete successfully?
Every time you move the disk between VGs, are you doing pvcreate -f?

Please reply.

Attitude (not aptitude) determines altitude.
Stuart N
Occasional Advisor

Re: disk fails to mirror in one lvol, but not in another

For a new/replacement disk: pvcreate -f to re-establish the LVM headers on the disk and in /etc.
But not each time LVM changes subsequently, as this is not necessary.

c17t14d0 is the same size as c16t6d0, and both are used in their entirety in both vg39 and vg47.

Mirror to c16t6d0 in vg39 always fails with 3 PEs stale.

NB: I changed the primary path on c17t14d0 to be c14t14d0 (previously the alternate link, A/L) because I wondered if the interface was determining which mirror was read for mirror recovery. I then re-mirrored and got a failure on PE 418 (only), then a syslog 'POWERFAIL'. c14t14d0 switched to c17t14d0, which recovered the rest of the lvol and PE 418 as well, so the lvol is completely remirrored.

I have also run dd on all disks on Primary and A/L without error. Maybe the problem is not disk corruption but a PE corruption on one disk, which only shows when the mirror is read from there.
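
To re-read only the suspect extents rather than the whole disk, the PE numbers can be turned into dd offsets. The sketch below makes assumptions that are not stated in this thread: a PE size of 4 MB, and an offset of 0 MB from the start of the disk to the first PE (the real offset is larger and would have to be found for the actual disk). The helper name `dd_read_command` is hypothetical. The stale PE numbers are those reported for c16t6d0; the matching PE numbers on the other copy may differ and would need to be taken from the lvdisplay -v mapping.

```python
# Sketch: build dd commands that read only selected physical extents.
# Assumed values (not from the thread): 4 MB PE size, and the offset of
# the first PE from the start of the disk given as a parameter.

def dd_read_command(device, pe, pe_size_mb=4, first_pe_offset_mb=0):
    """Return a dd command string that reads one PE from a raw device."""
    # With bs=1024k, skip and count are both expressed in megabytes.
    skip_mb = first_pe_offset_mb + pe * pe_size_mb
    return "dd if=%s of=/dev/null bs=1024k skip=%d count=%d" % (
        device, skip_mb, pe_size_mb)

# The three PEs reported stale, against the raw device of c16t6d0.
for pe in (418, 437, 444):
    print(dd_read_command("/dev/rdsk/c16t6d0", pe))
```

The commands are only printed, not run. Repeating such reads several times over the same small region may expose an intermittent fault that a single full-disk pass misses.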

But if so, why does the error get reported on the disk being written to? The finger sometimes points at the source disk and sometimes at the destination disk.

I shall create a new 9G filesystem, do a Unix-level copy of all the directory structures, drop vg39 and re-introduce all the suspect disks (pvcreate -f). I had hoped to identify the fault so that, if it is a suspect disk, I can drop it to reduce the risk.