Operating System - HP-UX

vxfs: mesg 056: vx_dataioerr

 
Keely Jackson
Trusted Contributor

vxfs: mesg 056: vx_dataioerr

Hi

We received the following message in the syslog at the same time that an Oracle database crashed.

Oct 17 09:04:13 medusa vmunix: vxfs: mesg 056: vx_dataioerr - /dev/vg02/lvol3 file system file data write error

/dev/vg02/lvol3 contains Oracle dbf and log files.

Any ideas what this means? Should I be unduly worried?

System is a K570 running HP-UX 11.0 with OnLine JFS and the Mar02 Support Plus pack.


Many thanks
Keely
Live long and prosper
Keely Jackson
Trusted Contributor

Re: vxfs: mesg 056: vx_dataioerr

Hello again

Having looked at a couple of other threads on this subject, I have checked the lvols for stale extents and found none. I presume the next step is to check the disks themselves.

However, the volume group contains 2 physical disks (these are actually RAID sets in SAN storage). Will any other messages be logged anywhere (apart from possibly on the controller, which is at another site) that would give me a clue as to which physical disk is the problem?

Thanks
Keely
Live long and prosper
Keely Jackson
Trusted Contributor

Re: vxfs: mesg 056: vx_dataioerr

Please ignore my question about which disk. lvdisplay tells me. I must be a bit slow this morning.
Live long and prosper
Jean-Louis Phelix
Honored Contributor

Re: vxfs: mesg 056: vx_dataioerr

Hello,

Answer from a Q & A ...

Message: 056
WARNING: msgcnt x: vxfs: mesg 056: vx_dataioerr - file system file data error

Explanation
------------------

A read or a write error occurred while accessing file data. The message
specifies whether the disk I/O that failed was a read or a write. File data
includes data currently in files and free blocks. If the message is printed
because of a read or write error to a file, another message that includes the
inode number of the file will print. The message may be printed as the result of
a read or write error to a free block, since some operations allocate an extent
and immediately perform I/O to it. If the I/O fails, the extent is freed and
the operation fails. The message should be accompanied by a message from the
disk driver containing information about the disk I/O error.
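
To look for that accompanying driver message, one option is to pull every kernel (vmunix) line logged in the same minute as the vx_dataioerr entry. This is only a sketch: the function name vx_context is made up for illustration, and it assumes the standard syslog line layout shown in the original post (month, day, time, host, vmunix:).

```shell
# vx_context LOGFILE
# Print every vmunix line logged in the same minute as a vx_dataioerr
# message, so any disk driver error next to it is easy to spot.
# (vx_context is an illustrative name, not an HP-UX or VxFS tool.)
vx_context() {
    awk '
        # Remember each kernel line with its "Mon DD HH:MM" stamp.
        /vmunix:/ {
            lines[++n] = $0
            stamp[n] = $1 " " $2 " " substr($3, 1, 5)
        }
        # Note the minutes in which a vx_dataioerr was logged.
        /vx_dataioerr/ {
            hit[$1 " " $2 " " substr($3, 1, 5)] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                if (stamp[i] in hit) print lines[i]
        }
    ' "$1"
}
```

On HP-UX the log is normally /var/adm/syslog/syslog.log, so a typical call would be vx_context /var/adm/syslog/syslog.log. Adjust the path if syslogd is configured differently.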

Action
---------

Resolve the condition causing the disk error. If the error was the result of a
temporary condition (such as accidentally turning off a disk or a loose cable),
correct the condition. Check for loose cables, etc. If any file data was lost,
restore the files from backups. Determine the file names from the inode number
(see the ncheck(1M) manual page for more information). If an actual disk error
occurred, make a backup of the file system, replace or reformat the disk
drive, and restore the file system from the backup. Consult the documentation
specific to your system for information on how to recover from disk errors. The
disk driver should have printed a message that may provide more information.
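
For the inode-to-file-name step, a small wrapper around ncheck might look like the sketch below. The wrapper name inode_to_name and the inode number 1234 are hypothetical; the -F vxfs and -i options are as I understand them from ncheck(1M), so check the manual page on your own system first. By default it only prints the command, so nothing runs until you set RUN=1.

```shell
# inode_to_name INODE SPECIAL
# Build the ncheck command that maps an inode number to a path name on
# a VxFS file system. Prints the command by default; set RUN=1 to run it.
# (inode_to_name is an illustrative helper, not a standard command.)
inode_to_name() {
    cmd="ncheck -F vxfs -i $1 $2"
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
}
```

For example, inode_to_name 1234 /dev/vg02/lvol3 prints the ncheck command for inode 1234 on the logical volume from the original post.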


THINGS TO CHECK:
----------------

1) Make sure you have the latest VxFS/LVM patches installed on your system.

2) See if you are able to read from the disk using the following command:

dd if=/dev/dsk/c#t#d# of=/dev/null bs=64k

where c#t#d# is the appropriate disk device file. If you are unable
to read the disk, then it's most likely a disk failure. Contact the
Hardware Response Center for help in further diagnosis and correction of
potential disk failure.

3) Try running a full fsck on the file system:

fsck -F vxfs -y -o full /dev/vg##/lvol##

If you are still unable to access the file system after running fsck,
and a second fsck returns no errors, then use newfs to
recreate the file system and restore the data from backup tape. If this
problem persists, then contact the Hardware Response Center for help in
further diagnosis and correction of potential disk failure.
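
When the logical volume sits on more than one physical disk, the read test in step 2 has to be repeated for each of them. A rough sketch of that: take the PV names from lvdisplay -v output and print one dd command per disk. The function names pv_names and dd_commands are made up, and the parsing assumes the PV names appear as /dev/dsk/... in the first or second column of the lvdisplay -v listing. The dd commands are printed rather than executed so they can be reviewed first.

```shell
# pv_names: read `lvdisplay -v` output on stdin and print each
# /dev/dsk/... physical volume once.
# (Illustrative helper; assumes PV names are in column 1 or 2.)
pv_names() {
    awk '
        $1 ~ /^\/dev\/dsk\// { print $1 }
        $2 ~ /^\/dev\/dsk\// { print $2 }
    ' | sort -u
}

# dd_commands: turn PV names on stdin into the read test from step 2.
# Only prints the commands; it does not touch the disks.
dd_commands() {
    while read -r pv; do
        echo "dd if=$pv of=/dev/null bs=64k"
    done
}
```

Usage would be along the lines of lvdisplay -v /dev/vg02/lvol3 | pv_names | dd_commands, then running the printed dd lines one at a time and watching for I/O errors.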



It works for me (© Bill McNAMARA ...)
Keely Jackson
Trusted Contributor

Re: vxfs: mesg 056: vx_dataioerr

Hi

The problem is indeed a failed disk, but as it is in a RAID set I don't understand why the db crashed and is now corrupt. Surely the controllers should cache the writes, and Unix/Oracle should know nothing about the disk failure.

Keely

Live long and prosper