Re: recover a stale extent

AnthonySN · ‎05-22-2010

We have an MSA30 JBOD with 6 disks and strict distributed mirroring (2 PVGs of 3 disks each), connected to 2 rx6000 servers in an A/P cluster running an Oracle 10g database.
The problem is that one disk has become faulty and another is showing 1 stale extent. There is now one file that we cannot copy because of this bad block:
cp: bad copy to /backup/data/online/card.dbf: read: I/O error

#lvdisplay -v /dev/vgdata/u02 | grep -i stale
LV Status available/stale
00874 /dev/dsk/c0t1d0 00291 stale /dev/dsk/c1t1d0 00291 current

vgsync vgdata
vgsync: Couldn't resynchronize stale partitions of the logical volume:
I/O error
lvsync -T /dev/vgdata/u02
lvsync: Couldn't resynchronize stale partitions of the logical volume:
I/O error
lvsync: Couldn't resynchronize logical volume "/dev/vgdata/u02".

dd if=/dev/rdsk/c0t1d0 of=/dev/null bs=1024k
286102+1 records in
286102+1 records out

dd if=/dev/rdsk/c1t1d0 of=/dev/null bs=1024k
dd read error: I/O error
9333+0 records in
9333+0 records out

How do we replace the disk in this scenario?
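
The dd run against c1t1d0 above gives a rough location for the unreadable area. A minimal sketch of that arithmetic follows. The PE size of 32 MB is an assumption for illustration only (the thread never states it; the "PE Size (Mbytes)" line of vgdisplay reports the real value), the function names are hypothetical, and the calculation ignores the LVM header at the start of the disk, so the extent number is only an estimate.

```shell
#!/bin/sh
# Estimate where a dd read failed, in MB and in physical extents (PEs).

# approx_fail_mb RECORDS_OK BS_MB
#   RECORDS_OK: full records dd read before the error ("9333+0 records in")
#   BS_MB:      dd block size in MB (bs=1024k is 1 MB)
approx_fail_mb() {
    echo $(( $1 * $2 ))
}

# approx_fail_pe FAIL_MB PE_SIZE_MB
#   Integer division; ignores the LVM header, so this is only an estimate.
approx_fail_pe() {
    echo $(( $1 / $2 ))
}

# Values from the dd output in the post. PE_SIZE_MB is a hypothetical value.
PE_SIZE_MB=32
fail_mb=$(approx_fail_mb 9333 1)
fail_pe=$(approx_fail_pe "$fail_mb" "$PE_SIZE_MB")

echo "first unreadable MB on c1t1d0: $fail_mb"
echo "approximate PE (assuming ${PE_SIZE_MB} MB extents): $fail_pe"
```

Under the assumed PE size the estimate is extent 291, the same number lvdisplay reports for the stale/current pair. If the real PE size gives a similar result, that would suggest the only current copy of the extent sits on the unreadable part of c1t1d0, which would explain why the mirror cannot serve the file.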

stephen peng · ‎05-22-2010

SASJ,
Read this document:
http://docs.hp.com/en/5991-1236/When_Good_Disks_Go_Bad_WP.pdf
You will find the best way to fix your problem there. What confuses me is that, since the volume is mirrored, you still cannot read a file on it. Replace the faulty disk first and then check what happens next.

regards

AnthonySN · ‎05-22-2010

Hi Stephen,
I already have the document you linked.
>>replace the fault disk first and then check out what will happen next.

In this case both disks have issues, so which one should be replaced first?

stephen peng · ‎05-23-2010

SASJ,
I think you have to accept that you have encountered data loss. It is not entirely clear what situation you are in: is /dev/dsk/c1t1d0 the faulty disk and /dev/dsk/c0t1d0 the one showing 1 stale extent? If you have a data backup, just re-create the VG and restore the data. There may be an easier way to deal with it, since only one datafile is inaccessible and you may not need to restore the whole VG, but I am not very familiar with Oracle.

Aneesh Mohan · ‎05-23-2010

Hi,

In your case you have two failures (1 disk and 1 lvol):

/dev/dsk/c1t1d0 ----> Faulty disk

/dev/dsk/c0t1d0 ----> A disk with one stale extent (/dev/vgdata/u02)

I would perform the steps below in this situation.

a) Take a complete backup of the /dev/vgdata filesystems (prerequisite).

b) lvremove /dev/vgdata/u02 (you should have a consistent backup before this task so that you can restore the data).

c) Replace the faulty disk following the procedure explained in When_Good_Disks_Go_Bad_WP.pdf.

d) Create lvol /dev/vgdata/u02 with the original size and mirror it.

(You should restore this filesystem from the latest available backup, and your DBA needs to perform recovery on the dbf in it.)

Aneesh

AnthonySN · ‎05-24-2010

Aneesh,
>>Take complete backup of /dev/vgdata filesystems (Pre-requisite)

That is exactly the problem: we are unable to back up one particular dbf file.

R.O. · ‎05-24-2010

Hi,

Before replacing one of the disks, see if you can add a 2nd mirror so you can have a redundant copy of all the logical volumes.

Regards,

"When you look into an abyss, the abyss also looks into you"

Aneesh Mohan · ‎05-24-2010

Okay,
If you have an old backup of that particular dbf file and the complete archive logs, then your DBA can recover that particular tablespace/dbf after your restore.
Steps:
a) Take a cold backup of all dbfs in the stale filesystem (except the problematic one).
b) Take a backup of the archive logs. (Confirm you have the complete chain from the last successful backup of the problematic dbf.)

c) Remove the stale lvol.
d) Proceed with the disk replacement process.
e) Create lvol /dev/vgdata/u02.
f) Restore all dbfs from the last backup, and the problematic one from its last available backup.
g) Ask the DBA to perform tablespace/datafile recovery using the archive logs.
h) Start the database in open mode.
i) Mirror lvol /dev/vgdata/u02.

Aneesh
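
Steps c) to i) above could be sketched roughly as follows. This is a dry run that only prints commands and executes nothing. The disk-replacement commands follow the general detach/restore/reattach pattern from the white paper, but whether pvchange -a N is available depends on the HP-UX release and patch level. The LV size placeholder, the VxFS filesystem type, and the function name are assumptions, since the thread gives none of them.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for the replacement plan, runs nothing.
VG=vgdata
LV=u02
BAD_DISK=c1t1d0                        # faulty disk named in the thread
LV_SIZE_MB=${LV_SIZE_MB:-SIZE_IN_MB}   # hypothetical: original size is not given

print_plan() {
    # c) remove the stale lvol (only after the backups in steps a and b)
    echo "lvremove /dev/$VG/$LV"
    # d) replace the disk: detach, swap hardware, restore LVM header, reattach
    echo "pvchange -a N /dev/dsk/$BAD_DISK"
    echo "vgcfgrestore -n $VG /dev/rdsk/$BAD_DISK"
    echo "pvchange -a y /dev/dsk/$BAD_DISK"
    # e) recreate the lvol unmirrored, with a new filesystem (VxFS assumed)
    echo "lvcreate -L $LV_SIZE_MB -n $LV /dev/$VG"
    echo "newfs -F vxfs /dev/$VG/r$LV"
    # f) to h) restore, Oracle recovery, and opening the database happen here
    # i) add the mirror copy back
    echo "lvextend -m 1 /dev/$VG/$LV"
}

print_plan
```

Printing the plan first makes it easy to review the device names against lvdisplay and ioscan output before anything destructive such as lvremove is typed for real.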

AnthonySN · ‎05-24-2010

RO,
How about this:
add one disk and TRY mirroring disk c1t1d0 onto it;
add one more disk and mirror disk c0t1d0, which has the stale extent.

R.O. · ‎05-24-2010

Hi,

You have to add enough disks to be able to extend all the logical volumes residing on the failing disks. Then, if the 2nd mirroring succeeds, you can reduce the mirrors of the lvols residing on the failing disks, replace the disks, rebuild the mirrors, and finally reduce the temporary 2nd mirror and remove the disks used for it.
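
A rough dry-run sketch of that sequence for one lvol and one failing disk is below. It only prints commands. The added disk name and PVG name are hypothetical placeholders, and the same steps would be repeated for every lvol on the failing disks. Because the volume uses PVG-strict allocation, the sketch assumes the temporary disk goes into its own PVG. Given that vgsync already failed with an I/O error, the first lvextend may fail the same way.

```shell
#!/bin/sh
# Dry-run sketch: prints commands for a temporary third mirror copy.
VG=vgdata
LV=u02
FAILING=c1t1d0                 # disk with read errors, from the thread
TMP_DISK=${TMP_DISK:-cXtYdZ}   # hypothetical name of the added disk
TMP_PVG=${TMP_PVG:-pvg_tmp}    # hypothetical PVG name for the added disk

print_plan() {
    # Add the new disk to the VG in its own PVG.
    echo "pvcreate /dev/rdsk/$TMP_DISK"
    echo "vgextend -g $TMP_PVG $VG /dev/dsk/$TMP_DISK"
    # Try to add a second mirror (third copy) on the new disk.
    echo "lvextend -m 2 /dev/$VG/$LV /dev/dsk/$TMP_DISK"
    # Only if that succeeded: drop the copy on the failing disk.
    echo "lvreduce -m 1 /dev/$VG/$LV /dev/dsk/$FAILING"
    # After the disk swap: rebuild that copy, then remove the temporary one.
    echo "lvextend -m 2 /dev/$VG/$LV /dev/dsk/$FAILING"
    echo "lvreduce -m 1 /dev/$VG/$LV /dev/dsk/$TMP_DISK"
    echo "vgreduce $VG /dev/dsk/$TMP_DISK"
}

print_plan
```

The physical disk swap itself (between the first lvreduce and the second lvextend) is left out here; it would follow the procedure in the white paper.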

Regards,

"When you look into an abyss, the abyss also looks into you"
