#$^%@# SCSI drive!!
01-02-2002 11:56 AM
vxfs: mesg 021: vx_mountsetup: /home file system validation failure
The boot then continues and finishes up. I then go in, run fsck on /home, and it says there are I/O errors. I fix them, and everything is fine until the next reboot, when I do it all again. This problem has been happening since I upgraded to HP-UX 11.0, though I don't think it started right away; it began several months later. I have just put in the new patches, and it did it both before and after. The interesting thing is that /download mounts fine with no issue whatsoever. In the syslog there is an error with a SCSI reset and lbolt, though I don't remember if that appears every time or not. There aren't any other issues (crashes, performance, etc.). I'd appreciate any ideas as to what's going on. Thanks.
Mark
01-02-2002 12:07 PM
Re: #$^%@# SCSI drive!!
This is not a lot to go on, but:
1) Are the two file systems in different VGs, and thus on separate LUNs? In that case it's possible that the I/O timeouts are set differently on the two PVs. Typically RAID LUNs require timeouts in the range of 120-180 seconds rather than the default value; see the sketch below.
2) You mentioned patches, but have you installed all the LVM, SCSI, and VxFS patches?
3) Don't overlook the obvious: cable length, proper termination, termination power. This could be a case of everything working almost perfectly.
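As a sketch (the disk paths here are only placeholders; substitute your actual PV paths):
pvdisplay /dev/dsk/c2t0d0 | grep -i timeout    # show the current IO timeout on this PV
pvchange -t 180 /dev/dsk/c2t0d0                # raise it to 180 seconds for a RAID LUN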
01-02-2002 12:07 PM
Re: #$^%@# SCSI drive!!
If the lbolt error in your syslog includes something like:
lbolt dev: 1f00500
then the leading 1f is hex for 31, the SCSI disk driver's block major number on HP-UX; in other words, a disk is having hardware problems.
So more info on the lbolt error would help...
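To map a dev number like that back to a device file, something like this should work (assuming the standard /dev/dsk layout):
ll /dev/dsk           # match the major/minor pair shown in the error
ioscan -fnC disk      # check the hardware state of the suspect disk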
Rgrds,
Rita
01-02-2002 12:07 PM
Re: #$^%@# SCSI drive!!
Are you mounting these file systems with the nolog or tmplog option? If so, mount them with log or delaylog instead. The "mount" command will display how they are mounted. I can think of this only if there are no hardware problems with the disk subsystem.
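For example (the device and mount-point names below are just placeholders):
mount -v                                       # shows the options each filesystem is mounted with
# and in /etc/fstab, a logging VxFS entry looks like:
/dev/vgraid/home  /home  vxfs  delaylog  0  2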
-Sri
01-02-2002 12:10 PM
Re: #$^%@# SCSI drive!!
When a VERITAS File System is mounted, the structure is read from disk. If the file system is marked clean, the structure is correct and the first block of the intent log is cleared. If there is any I/O problem or the structure is inconsistent, the kernel sets the VX_FULLFSCK flag and the mount fails. If the error isn't related to an I/O failure, this may have occurred because a user or process has written directly to the device or used fsdb to change the file system.
Action
Check the console log for I/O errors. If the problem is a disk failure, replace the disk. If the problem is not related to an I/O failure, find out how the disk became corrupted. If no user or process is writing to the device, report the problem to your customer support organization. In either case, unmount the file system and use fsck to run a full structural check.
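A sketch of that check, assuming /home lives in /dev/vgraid (adjust the device names to yours):
dmesg | grep -i -e scsi -e lbolt              # recent kernel/console I/O errors
umount /home
fsck -F vxfs -o full -y /dev/vgraid/rhome     # full structural check on the raw device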
Check this out:
http://us-support2.external.hp.com/cki/bin/doc.pl/sid=2d1abe5919d9550f85/screen=ckiDisplayDocument?docId=200000024613750
http://us-support2.external.hp.com/cki/bin/doc.pl/sid=2d1abe5919d9550f85/screen=ckiDisplayDocument?docId=200000050087829
http://us-support2.external.hp.com/cki/bin/doc.pl/sid=2d1abe5919d9550f85/screen=ckiDisplayDocument?docId=200000035869465
HTH,
Shiju
01-02-2002 12:10 PM
Re: #$^%@# SCSI drive!!
Technical Knowledge Base document #GLPKBRC00002290 notes the following:
/begin_quote/
Message: 021
WARNING: msgcnt x: vxfs: mesg 021: vx_mountsetup - mount_point file system validation failure
Explanation
When a VERITAS File System is mounted, the structure is read from disk. If the file system is marked clean, the structure is correct and the first block of the intent log is cleared. If there is any I/O problem or the structure is inconsistent, the kernel sets the VX_FULLFSCK flag and the mount fails. If the error isn't related to an I/O failure, this may have occurred because a user or process has written directly to the device or used fsdb to change the file system.
Action
Check the console log for I/O errors. If the problem is a disk failure, replace the disk. If the problem is not related to an I/O failure, find out how the disk became corrupted. If no user or process is writing to the device, report the problem to your customer support organization. In either case, unmount the file system and use fsck to run a full structural check.
/end_quote/
Thus, make sure no user or process is writing to the filesystem, and make *especially* sure no one is running 'fsdb'.
Since you note 'lbolt' errors in your syslog, it would also be worthwhile to correlate those to the disk(s) that map to this filesystem, although these may be relatively benign.
Regards!
...JRF...
01-02-2002 12:16 PM
Re: #$^%@# SCSI drive!!
scb ->cdb: 28 00 01 ad dc c8 00 00 08 00
scb ->cdb: 4d 00 40 00 00 00 00 04 00 00
SCSI: Resetting SCSI -- lbolt: 55278, bus: 0
SCSI: reset detected -- lbolt: 55278, bus: 0
Secondly, both /download and /home are created in /dev/vgraid. I set the /home timeout high because of past problems, but I don't know what /download is set to. As for patches, I ran the custom patch manager, so it grabbed everything HP thought the server needed. Nothing has changed with the cables, but I would expect that to affect both /home and /download.
Lastly, on the server, mount shows:
/download on /dev/vgraid/download delaylog,nodatainlog (same for /home)
Please let me know if you need anything else. Thanks!
Mark
01-02-2002 12:24 PM
Re: #$^%@# SCSI drive!!
Mark
01-02-2002 12:33 PM
Re: #$^%@# SCSI drive!!
I don't suppose you have any sort of error logging on your RAID? I would try a couple of things at this point in an attempt to separate the hardware problems from filesystem problems.
dd if=/dev/vgraid/rlvol1 (or whatever) bs=256k of=/dev/null - this is a read-only operation and thus safe. If that passes, repeat it for the other lvols in the VG. If those pass, then the underlying I/O is probably okay.
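For example, a read-only sweep over the whole VG might look like this (the raw device names are assumptions; adjust to your layout):
for lv in /dev/vgraid/r*
do
    echo "reading $lv"
    dd if=$lv bs=256k of=/dev/null || echo "read failure on $lv"
done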
I would then be tempted to create a new /home filesystem and restore from backup.
Finally, one last thought: is your RAID powered on and 'READY' or 'ONLINE' before you boot your server? You just might have a timing issue or transients on the bus otherwise.
01-02-2002 12:44 PM
Re: #$^%@# SCSI drive!!
I am running the dd command and will see what it does. The RAID is normally powered on and ready; in this last case, the server rebooted after the patch install, so there should not have been a change in state for the RAID.
If nothing else works, I could create the new fs, but will that solve the problem? Everything I can think of concerning a bad RAID would affect both /home and /download, and yet I have never (repeat, never) had an issue with /download, while /home has crapped out on me, crashed, and caused me all sorts of trouble over the last year or so (both with 11.0 and 10.20). Both are referenced (to the best of my knowledge) the same way on the server (/etc/fstab, for example). So why does one not behave?
Mark
01-02-2002 12:53 PM
Re: #$^%@# SCSI drive!!
The results of dd if=/dev/vgraid/home bs=256k of=/dev/null were:
dd read error: Invalid argument
56251+0 records in
56251+0 records out
The file system, from bdf, is 25.6 GB, with 1665172 KB used and 8380236 KB free.
I'll run the same dd on /download to see what that does.
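(For reference: 56251 records x 256 KB is roughly 14.7 GB, so the read stopped partway into the volume. One caveat: dd can also report "Invalid argument" at end-of-device when the device size isn't a multiple of the block size, so comparing against the LV size from lvdisplay would show whether the failure is at the end or genuinely mid-volume.)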
Mark
01-02-2002 12:54 PM
Re: #$^%@# SCSI drive!!
01-02-2002 01:00 PM
Solution
After this, there is a rather good chance that making a new filesystem will fix you. I think you have a very subtle fs problem, or you have not done a full fsck and the log replays are not really clearing your problem. There have been a number of VxFS patches that corrected various seldom-seen problems, and you may be one of the lucky few. I would run a newfs and then restore /home from backup.
01-02-2002 01:05 PM
Re: #$^%@# SCSI drive!!
Just for reference, the dd on /download showed no problems. So let's assume that I need to go ahead and remake the fs.
I'm going to search for how to do it, but I do have a question about what to do with the old one. Do I need to dump it first and then make the new one, calling it /home? Do I keep the old one around? Also, will I need to redo all the files like /etc/fstab, where it lists /home, or will it just look to the new one? Needless to say, this whole thing makes me a little nervous (mass restoration of hard drives). Any advice is welcome. Thanks!
Mark
01-02-2002 01:06 PM
Re: #$^%@# SCSI drive!!
Do they by any chance have a lot of links inside? What happens if you unmount /home or /download (whichever is easier) and mount it back manually?
Also, what do you have in /etc/rc.log and syslog.log? Do you see any errors other than the ones you mentioned?
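For example (assuming the usual device names):
umount /home
mount -F vxfs /dev/vgraid/home /home    # watch the console/syslog for errors while it mounts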
-Sri
01-02-2002 01:10 PM
Re: #$^%@# SCSI drive!!
I looked through rc.log and syslog, and there was nothing new. In rc.log, it says that /home is corrupted and needs to be checked. There are a few items where a command was not found, but that is because it lives under /home, which is not mounted at that point. I end up mounting /home myself, but I just use mount /home or mount -a. Should I type in the entire line?
mark
01-02-2002 01:20 PM
Re: #$^%@# SCSI drive!!
This is rather easy (it assumes a vxfs filesystem):
1) Backup existing home directory:
cd /home
tar cvf /dev/rmt/0m .
where /dev/rmt/0m is your tape drive
then
tar tvf /dev/rmt/0m
to list the contents of the backup and confirm that you have a good backup.
You can also use cpio if you like (or fbackup).
P.S. You might want to backup both /home and /download to protect yourself from yourself.
2) cd /
umount /home
3) newfs -F vxfs /dev/vgraid/rlvol1 (or whatever the current raw device for this filesystem is; be sure).
4) mount /home (since you haven't messed with /etc/fstab; no changes are needed).
5) cd /home
tar xvf /dev/rmt/0m
That should have you back in business. You can then do an exportfs -a and that should get your NFS stuff fixed as well.
Regards, Clay
01-02-2002 01:33 PM
Re: #$^%@# SCSI drive!!
I say that all the time. ; )
Another thought:
I had a similar situation a while back (and said "#$^%@# SCSI drive!!" many times), which turned out to be some sort of latent corruption on the drive.
I backed up the contents of the drive, ran mediainit on it, and restored the contents.
No more problem.
Could be another possibility to consider.
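For a plain SCSI disk that would be something like (the device path here is just a placeholder, and mediainit destroys everything on the disk):
mediainit /dev/rdsk/c0t5d0    # WARNING: wipes the disk; restore from backup afterwards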
Good Luck and Happy New Year.
Kel
01-02-2002 01:33 PM
Re: #$^%@# SCSI drive!!
Hate to be a bother, but one point of clarification, please.
I umount /home and then do a newfs -F vxfs /dev/vgraid/???? Do I use rhome here, and does that effectively destroy the old rhome? Or do I use something like rlvol1, and then just reference that when I mount /home (/dev/vgraid/lvol1 /home)? Does it really matter?
Then, when I mount /home again, there should be nothing in it, right? Thus, restore from tape.
Thanks!
mark
01-02-2002 01:48 PM
Re: #$^%@# SCSI drive!!
What I meant was to manually unmount the file system and mount it back to see if there are any errors. Some more possibilities:
1. The disks (LUNs) used by /home and /download may be shared across multiple systems. Particularly in a SAN environment, the possibility increases due to LUN security issues. What kind of backend storage do you have?
2. The logical volume device files shouldn't be accessible to others, so that ordinary users have no chance of corrupting them by hand (a quick check is sketched below).
3. Do they see a lot of activity, and are the reboots completing normally? Heavily active file systems can end up in this kind of situation when some homegrown, stubborn process refuses to exit and goes zombie; sometimes we lose patience with the reboot and power the server off before it has closed the logical volumes and flushed the buffers.
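For point 2, a quick check (expect root:sys ownership and no world access on the lvol device files):
ll /dev/vgraid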
-Sri
01-02-2002 02:00 PM
Re: #$^%@# SCSI drive!!
You do a newfs -F vxfs /dev/vgraid/rhome, and yes, your old filesystem is toast. That is why it is essential that you have a good backup before doing this. Your other option is to create an entirely new logical volume and mount it as /newhome. Then copy everything from /home to /newhome. You would then umount /home and umount /newhome. Finally, modify /etc/fstab, changing the lvol for /home, and then mount /home. This would leave your old home filesystem intact but unmounted. This is the safest way to do it but does require available disk space.
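A sketch of that safer route (the sizes and names below are placeholders):
lvcreate -L 25600 -n homenew /dev/vgraid                   # new lvol, size in MB
newfs -F vxfs /dev/vgraid/rhomenew                         # newfs runs on the raw device
mkdir /newhome
mount -F vxfs /dev/vgraid/homenew /newhome                 # mount uses the block device
cd /home && find . -depth -print | cpio -pdmu /newhome     # preserves permissions and timestamps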
01-03-2002 12:14 PM
Re: #$^%@# SCSI drive!!
Mark