Operating System - HP-UX

Re: disk errors; fsck will not fix; lvol will not mount

 
SOLVED
Constance
Advisor

disk errors; fsck will not fix; lvol will not mount

I'm baaaaack...
I have a problem with my primary disk after a failed attempt to mirror the disk (with a great deal of help from several of you good people) on my hpux/OpenView system.
The story is this:
I had a system that was apparently fine, running hpux 10.20 and OpenView NNM and ITO. It has installed in it a second hard disk that was unused by the system, identical to the primary disk. I was tasked with mirroring the production disk and the attempt failed. People on this forum assisted me with problem determination and the upshot was that the previously unused disk was failing and needed to be replaced.
I had stopped all OpenView services while I was doing the system work to mirror the disk and when I tried to start ITO the system crashed.
When it came back up fsck (which ran automagically during the system restart) found errors on lvol10 which was my /var/opt/OV file system. I broke the mirror and took the once unused disk back out of the volume group and the problem did not change. Is there a resolution to this? Does this mean that the primary disk is failing also?
As always, any help I can get is greatly appreciated.
NOTE: below is what I get when I try to run fsck.
# fsck /dev/vg00/lvol10
** /dev/vg00/lvol10
** Last Mounted on /var/opt/OV
** Phase 1 - Check Blocks and Sizes

CANNOT READ: BLK 369664
CONTINUE? n

Program terminated
# fsck /dev/vg00/lvol10
** /dev/vg00/lvol10
** Last Mounted on /var/opt/OV
** Phase 1 - Check Blocks and Sizes

CANNOT READ: BLK 369664
CONTINUE? y


CANNOT READ: BLK 369664
CONTINUE? y


CANNOT READ: BLK 369664
CONTINUE? y

FAILED READ OF BLOCK #369664, RETRIED 2 TIMES

CANNOT READ: BLK 369744
CONTINUE? y


CANNOT READ: BLK 369744
CONTINUE? y


CANNOT READ: BLK 369744
CONTINUE? y

FAILED READ OF BLOCK #369744, RETRIED 2 TIMES
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION (INODE FREE) BAD
BAD CYLINDER GROUPS
FIX? y

** Phase 6 - Salvage Cylinder Groups
3698 files, 0 icont, 512745 used, 512872 free (1288 frags, 63948 blocks)
DISK MEDIA PROBLEMS ENCOUNTERED!
BAD BLOCKS WERE FOUND ON THE DISK.
***** FILE SYSTEM IS NOT CLEAN -- DISK MEDIA PROBLEMS ENCOUNTERED *****

***** FILE SYSTEM WAS MODIFIED *****
#
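Since the same two block numbers repeat all through the transcript, it may help to boil a saved fsck log down to the distinct unreadable blocks. A minimal sketch, assuming the transcript was captured to a file (the name fsck.log is hypothetical) and that the lines look like the ones above:

```shell
#!/bin/sh
# List each distinct block that fsck reported as unreadable.
# Reads an fsck transcript on stdin and relies only on the
# "CANNOT READ: BLK <n>" lines shown in the output above.
bad_blocks() {
    sed -n 's/^CANNOT READ: BLK \([0-9][0-9]*\).*$/\1/p' | sort -n -u
}

# Hypothetical usage, with the transcript saved as fsck.log:
#   bad_blocks < fsck.log
```

A short list of blocks close together, as here, is consistent with a localized media defect, but it does not by itself say how healthy the rest of the disk is.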
10 REPLIES
A. Clay Stephenson
Acclaimed Contributor

Re: disk errors; fsck will not fix; lvol will not mount

Hi:

Almost certainly, you have another failing disk. Now would be an extremely good time to make yourself a mirrored boot disk with the obvious exception of lvol10. You can try to mirror it as well but that is not essential. I would not shut down until you have done a make_tape_recovery (or have one already) or completed your mirroring.

If it ain't broke, I can fix that.
melvyn burnard
Honored Contributor

Re: disk errors; fsck will not fix; lvol will not mount

Well, firstly you should specify the raw lvol to fsck, not the block device:
fsck /dev/vg00/rlvol10

Secondly, this does look like you may be getting a nasty error on your disc.
Try the fsck again with the raw device.
If this fails, do you have a good backup of the file system?
If yes, then newfs the rlvol and recover the file system from backup.
If not, try to force-mount it read-only, back it up, and then do the newfs/recover steps.

HTH
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
S.K. Chan
Honored Contributor

Re: disk errors; fsck will not fix; lvol will not mount

Does lvol10 have stale PEs?
# lvdisplay -v /dev/vg00/lvol10 | more
If it does, your primary disk might be hosed too.
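One quick way to read that output is to count the extent lines whose status says stale. A rough sketch, assuming the per-extent lines of lvdisplay -v carry a lowercase status word such as current or stale (the function name is made up for illustration):

```shell
#!/bin/sh
# Count lines marked "stale" in `lvdisplay -v` output read from stdin.
# Assumption: the status is printed in lowercase on each extent line.
count_stale() {
    # grep exits non-zero when nothing matches; mask that so a
    # count of 0 is not treated as a failure by the caller.
    grep -c 'stale' || true
}

# Hypothetical usage on the affected system:
#   lvdisplay -v /dev/vg00/lvol10 | count_stale
```

A result of 0 only means LVM has not flagged any extent; it does not rule out unreadable blocks like the ones fsck reported.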
Constance
Advisor

Re: disk errors; fsck will not fix; lvol will not mount

One of the things I did before I started this whole adventure was to stop OpenView and make a full backup tape. After the newly mirrored disk failed to sync (because the disk or disks was/were failing) I used Ignite (had to install it) to do a make_recovery. I used the -a option and I did it while the disk was still mirrored, and the message back was that it was successful. I have never done a recovery of so much as a single file, much less an entire system, so I don't have any idea whether what I have created will actually help me or not. Is there a way to "check" the backup or bootable recovery tape?
Also, I find it hard to understand why the system, which had not had a single problem for a year and a half, suddenly has not one but both hard drives failing during a mirror operation.
Any input?
Constance
Advisor

Re: disk errors; fsck will not fix; lvol will not mount

Please note that I broke the mirror (after the crash) and no, nothing shows stale for the entire disk, also lvol10 is not mounted. It will not mount.

Ran fsck as suggested and this is the output:
#
#
# fsck /dev/vg00/rlvol10
** /dev/vg00/rlvol10
** Last Mounted on /var/opt/OV
** Phase 1 - Check Blocks and Sizes

CANNOT READ: BLK 369664
CONTINUE? y


CANNOT READ: BLK 369664
CONTINUE? y


CANNOT READ: BLK 369664
CONTINUE? y

FAILED READ OF BLOCK #369664, RETRIED 2 TIMES

CANNOT READ: BLK 369744
CONTINUE? y


CANNOT READ: BLK 369744
CONTINUE? y


CANNOT READ: BLK 369744
CONTINUE? y

FAILED READ OF BLOCK #369744, RETRIED 2 TIMES
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
SUMMARY INFORMATION (INODE FREE) BAD
BAD CYLINDER GROUPS
FIX? y

** Phase 6 - Salvage Cylinder Groups
3698 files, 0 icont, 512745 used, 512872 free (1288 frags, 63948 blocks)
DISK MEDIA PROBLEMS ENCOUNTERED!
BAD BLOCKS WERE FOUND ON THE DISK.
***** FILE SYSTEM IS NOT CLEAN -- DISK MEDIA PROBLEMS ENCOUNTERED *****

***** FILE SYSTEM WAS MODIFIED *****
#
melvyn burnard
Honored Contributor

Re: disk errors; fsck will not fix; lvol will not mount

I forgot that the force mount option is only available for HFS, not VxFS (sigh....)
At least you have the backups, which is good.
I would guess you should now try to newfs the file system and recover that directory structure; otherwise you may want to get the disc investigated to confirm it is duff.

What backup command did you use?
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Constance
Advisor

Re: disk errors; fsck will not fix; lvol will not mount

I used SAM to do a full backup.
lvol10 is HFS. The only VxFS volume in the system is lvol16, the /backup file system. This is according to the display I get in SAM.
As far as the newfs procedure goes, please don't forget that I know only what you good folks have taught me. Is there a detailed procedure anywhere that you can give me a link to, or can someone give me a list of all the commands I'll use to do this and I'll read the man pages?
Also, is this just spinning my wheels? If the disks are failing and they are going to have to be replaced anyway, should I bother with this stuff? What can I do to determine if the disks are really failing? If they are, and I have to replace them, is there a procedure that tells me how to do that? I have seen some info about how to boot to the bootable recovery disk that I made (assuming it's good) but when I do that will the system come back up to the GUI or will I have to work on the command line? Will I have to restore from the backup tape or can/should I restore from the recovery tape itself?
SO many questions. I'm feeling quite lost and pressured (the "it was fine till you touched it" syndrome).
melvyn burnard
Honored Contributor
Solution

Re: disk errors; fsck will not fix; lvol will not mount

Well, as it is HFS, you could force-mount it using
mount_hfs -f

But as you have a backup, then this is not an issue.

Sadly, there are many people "dropped in it" with little or no knowledge, and it is not easy.
We can try to assist, but I have to say that if something goes wrong, or your backup is not good, then you have a major problem that could get worse.
To check what is on your backup, use SAM and choose Backup and Recovery,
select Interactive,

select the tape drive, and then under Actions choose get List of Files.

This will list what is on the tape.
Command line would be something like:
frecover -f /dev/rmt/0m -v -I /tmp/filelist

This writes the index from the tape into the file /tmp/filelist, which you can then view.
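Before running newfs, it seems worth confirming that the saved index really lists files under the directory you intend to restore. A small sketch, assuming the index is plain text with one file path per line (the helper name index_count is hypothetical, not part of frecover):

```shell
#!/bin/sh
# Report how many lines of a saved backup index mention a given path.
# Assumption: the index is plain text, one file path per line.
index_count() {
    # $1 = index file, $2 = path to look for (matched as a fixed string)
    # grep exits non-zero on zero matches; mask that so 0 is a normal result.
    grep -c -F "$2" "$1" || true
}

# Hypothetical usage after the index has been written to /tmp/filelist:
#   index_count /tmp/filelist /var/opt/OV
```

A count of 0 would suggest that the tape does not hold that file system, in which case wiping it with newfs would leave nothing to restore from.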

As for the file system?
newfs -F hfs /dev/vg00/rlvol10
mount /dev/vg00/lvol10 /var/opt/OV
bdf (to see it is there)
frecover -x -v -f /dev/rmt/0m -i /var/opt/OV
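
Because newfs destroys whatever is on the logical volume, it may be safer to print the exact commands first and read them over before typing any of them. A dry-run sketch, assuming the mount point /var/opt/OV reported by fsck earlier in the thread and frecover's -x (extract) mode; the function name is hypothetical and nothing here touches a disk:

```shell
#!/bin/sh
# Print (do not run) the rebuild steps for one logical volume.
# All three arguments come from the caller; this only writes text.
print_rebuild_plan() {
    vg_lv=$1                 # volume group/logical volume, e.g. vg00/lvol10
    mnt=$2                   # mount point, e.g. /var/opt/OV
    tape=$3                  # tape device, e.g. /dev/rmt/0m
    vg_name=${vg_lv%/*}      # part before the slash
    lv_name=${vg_lv##*/}     # part after the slash
    printf '%s\n' "newfs -F hfs /dev/$vg_name/r$lv_name"
    printf '%s\n' "mount /dev/$vg_name/$lv_name $mnt"
    printf '%s\n' "bdf"
    printf '%s\n' "frecover -x -v -f $tape -i $mnt"
}

print_rebuild_plan vg00/lvol10 /var/opt/OV /dev/rmt/0m
```

Printing rather than executing keeps the destructive step in your hands; check the raw device name and mount point against bdf and the fsck output before running anything.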

I also recommend you look at attending one of the many System Admin courses available from an HP Education Centre to save yourself some future pain.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
S.K. Chan
Honored Contributor

Re: disk errors; fsck will not fix; lvol will not mount

If you have a good backup of /var/opt/OV, what I would do (just my opinion) is install a good disk (say B), duplicate the boot disk (vg00) to B (without lvol10), boot up from B and then restore /var/opt/OV back onto B. If you want to go this route, I can help with the process.
Constance
Advisor

Re: disk errors; fsck will not fix; lvol will not mount

Thanks for all your help. I am going to try the newfs and restore plan AND place a hardware call on both hard drives.