Operating System - Tru64 Unix

lsm remirror fails

Michael Schulte zur Sur
Honored Contributor

Hi,

I replaced a failing disk and have now discovered that the remirror fails because of a read error on the remaining disk, which is now the only copy. If I am lucky, no file is affected, but I have no clue how to get the mirror running again.

please help,

thanks,

Michael
7 REPLIES
Michael Schulte zur Sur
Honored Contributor

Re: lsm remirror fails

Hi,

I tried to eliminate the offending block with dd if=/dev/zero of=/dev/... count=1

I hope that procedure did not damage the file system. At least the rebuild worked. :-)

Michael

Abdul Rahiman
Esteemed Contributor

Re: lsm remirror fails

Michael,

Great. I was feeling bad about not seeing any responses to your problem.

Glad to know that you were able to fix it. Hope you can sleep well tonight :-)

regds,
Abdul.
No unix, no fun
Ralf Puchner
Honored Contributor

Re: lsm remirror fails

Abdul,

The forum does not guarantee a response time, so a customer who depends on a running machine should rely on a support contract for that.

By the way, it is very hard to give any hints without knowing anything about the commands used, the volprint output, the devices, the error messages, etc. Guessing is neither an escalation path nor valid troubleshooting!


Help() { FirstReadManual(urgently); Go_to_it;; }
Michael Schulte zur Sur
Honored Contributor

Re: lsm remirror fails

Dear Ralf,

What you say is only partly true. If you had read carefully, and did not depend on the output of standard Unix commands, you could have understood my problem. ;-)

I still have one question. I used
dd if=/dev/zero of=/dev/rdisk/dsk6c oseek=xxxxxx count=1
to zero out the block that was preventing LSM from rebuilding the mirror.
Assuming the block was not in use, did I do any damage to the AdvFS domain?
Is there another command besides verify that can fully check the consistency of a domain?
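
For anyone wanting to see what such a dd call touches before trying it on a disk, here is a minimal sketch that rehearses it on a scratch file. It assumes a 512-byte block size, a made-up block number (3), and the portable operand seek= in place of oseek=; the function names are invented for this illustration. It also shows why the input has to be /dev/zero: /dev/null returns end-of-file at once, so dd writes nothing.

```shell
#!/bin/sh
# Rehearsal on a scratch FILE only; never point this at a real disk.
set -eu

BS=512          # block size for the demo (assumption)
BAD_BLOCK=3     # hypothetical block number to overwrite
IMG=$(mktemp)   # scratch file standing in for the disk
trap 'rm -f "$IMG"' EXIT

# Build an 8-block image filled with 'A' so zeroed bytes are easy to spot.
dd if=/dev/zero bs="$BS" count=8 2>/dev/null | tr '\000' 'A' > "$IMG"

# Overwrite exactly one block with zeros. conv=notrunc keeps the rest of
# the scratch file; a raw disk device has nothing to truncate.
zero_block() {
    dd if=/dev/zero of="$1" bs="$BS" seek="$2" count=1 conv=notrunc 2>/dev/null
}

# Same call with /dev/null as input: dd hits end-of-file and writes nothing.
null_attempt() {
    dd if=/dev/null of="$1" bs="$BS" seek="$2" count=1 conv=notrunc 2>/dev/null
}

# Count the zero bytes in one block of the image.
zeros_in_block() {
    dd if="$1" bs="$BS" skip="$2" count=1 2>/dev/null | tr -d 'A' | wc -c | tr -d ' '
}

null_attempt "$IMG" "$BAD_BLOCK"
echo "after /dev/null: $(zeros_in_block "$IMG" "$BAD_BLOCK") zero bytes in block $BAD_BLOCK"

zero_block "$IMG" "$BAD_BLOCK"
echo "after /dev/zero: $(zeros_in_block "$IMG" "$BAD_BLOCK") zero bytes in block $BAD_BLOCK"
echo "neighbour block: $(zeros_in_block "$IMG" 2) zero bytes"
```

Only the target block changes and the neighbouring blocks keep their contents. Whether that one block belonged to a file or to metadata is a separate question that the rehearsal cannot answer.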

thanks for any comment!

Michael
Ralf Puchner
Honored Contributor

Re: lsm remirror fails

Michael,

There are so many commands and ways to manage LSM, so how can I guess which command you used and which error message you got? Depending on the error message and the command, you must alter the configuration or the command.

AdvFS consistency depends on what the block was used for: if it was part of a metadata structure, an AdvFS inconsistency has occurred; if it was part of file data, only a file is damaged (so the AdvFS structure is OK). This question is very tricky and can only be answered by inference ;-)

Depending on the AdvFS version in use, there are different tools. Why not use "apropos advfs" to get a list of commands? Depending on whether metadata or data is corrupted, choose the right command to check or repair the domain.

By the way, another approach is to analyze binary.errlog and check whether a bad block replacement was made; if so, the data was most likely recovered.


Help() { FirstReadManual(urgently); Go_to_it;; }
Ismail1
Occasional Visitor

Re: lsm remirror fails

Hi Michael/Ralf,

This problem looks like a hardware error. Ralf is right that if you have cleared a metadata block, it will cause an AdvFS inconsistency, but if you have cleared a block that was used by a file, you won't know until the system reads that file again. In a mirrored configuration it is also possible that this particular block can be read from the other disk. It may cause a problem when the other disk fails and you start mirroring from this disk (as the source disk).

I also agree with Ralf that binary.errlog will tell more accurately whether any BBR (bad block replacement) was done. Strictly speaking, BBR is the correct way to clear this type of error, and it is handled by the system and hardware.

Enjoy.
Dmitry Timoshenko
Frequent Advisor

Re: lsm remirror fails

Hi,

There is a way to avoid the read failure on the healthy plex:
- Remove the failed plex from the volume (volume1)
- After the replacement, create a second volume (volume2)

Now you'll have two volumes.

- Execute 'rmvol /dev/vol/somedg/volume1 some_domain'

Here, 'rmvol' will remove volume1 and copy the data from volume1 to volume2.
This operation will be successful because it uses a non-block type of copying.

Now you can create a mirror on volume2 and rename volume2 to volume1 if you like the old name.
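
The order of operations above can be sketched as a dry run that only prints the commands and executes none of them. Several things here are assumptions, not part of the post: the addvol step (volume2 presumably has to be in the AdvFS domain before rmvol has somewhere to migrate the data), and the LENGTH and DISK placeholders, which are hypothetical and must be replaced with real values. Removing the failed plex, freeing the old volume's space and the optional rename are left out. Check each command against the man pages on your system before running anything.

```shell
#!/bin/sh
# Dry-run sketch: prints the planned commands, runs none of them.
set -eu

DG=somedg                 # disk group name, taken from the post
DOMAIN=some_domain        # AdvFS domain name, taken from the post
OLD=volume1               # volume with the unreadable block
NEW=volume2               # new volume on the replacement disk
LENGTH=${LENGTH:-LENGTH}  # hypothetical placeholder: size of the new volume
DISK=${DISK:-DISK}        # hypothetical placeholder: LSM name of the new disk

plan() {
    # 1. Create the second volume on the replacement disk.
    echo "volassist -g $DG make $NEW $LENGTH $DISK"
    # 2. Assumption: add it to the domain so rmvol has a migration target.
    echo "addvol /dev/vol/$DG/$NEW $DOMAIN"
    # 3. Remove the old volume from the domain; AdvFS migrates the files.
    echo "rmvol /dev/vol/$DG/$OLD $DOMAIN"
    # 4. Mirror the new volume once the migration has finished.
    echo "volassist -g $DG mirror $NEW"
}

plan
```

Because rmvol moves data file by file instead of copying the device block by block, a bad block that no file uses should never be read, which is the point of the procedure.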

Best regards,
Dmitry.