Re: SCSI Disk Errors

MikeL_4 · ‎05-22-2006

We started receiving read errors on one of our SCSI disk drives a week ago, and they were not caught:

May 14 15:39:08 ignite1 vmunix: msgcnt 1 vxfs: mesg 038: vx_dataioerr - /dev/vgign1/ignitelv file system file data read error in block 7399426

May 19 10:42:40 ignite1 vmunix: msgcnt 13 vxfs: mesg 008: vx_direrr: vx_readdir2_1 - /var/opt/ignite file system dir inode 103811 block 6422535 dirent inode 0 error 5

When we ran our weekly Ignite recoveries this past weekend, they all failed because they could not write to the volume.

Is there a way in the Event Monitor to trap these errors so that we receive them in our Event Monitor emails?
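As a stopgap independent of EMS, messages like the two above could be picked out of a syslog file with a small script. The following is only a minimal sketch: it assumes each message sits on a single unwrapped line in the format quoted in this post, and the names `VXFS_RE`, `parse_vxfs_line`, and `find_vxfs_errors` are made up for illustration. It is not part of EMS or any HP tool.

```python
import re

# Assumed layout, taken from the two messages quoted in this thread:
# "<Mon> <day> <time> <host> vmunix: msgcnt <n> vxfs: mesg <id>: <text>"
VXFS_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+vmunix:\s+"
    r"msgcnt\s+(?P<msgcnt>\d+)\s+vxfs:\s+mesg\s+(?P<mesg>\d+):\s+(?P<text>.*)$"
)


def parse_vxfs_line(line):
    """Return a dict of fields for a vxfs kernel message, or None otherwise."""
    match = VXFS_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["msgcnt"] = int(record["msgcnt"])
    return record


def find_vxfs_errors(lines):
    """Keep only the lines that look like vxfs messages (hypothetical helper)."""
    return [rec for rec in map(parse_vxfs_line, lines) if rec is not None]


# Sample input: the two messages from this post plus one unrelated line.
SAMPLE = [
    "May 14 15:39:08 ignite1 vmunix: msgcnt 1 vxfs: mesg 038: vx_dataioerr - "
    "/dev/vgign1/ignitelv file system file data read error in block 7399426",
    "May 19 10:42:40 ignite1 vmunix: msgcnt 13 vxfs: mesg 008: vx_direrr: "
    "vx_readdir2_1 - /var/opt/ignite file system dir inode 103811 "
    "block 6422535 dirent inode 0 error 5",
    "May 19 10:43:00 ignite1 sshd[123]: unrelated message",
]

if __name__ == "__main__":
    for rec in find_vxfs_errors(SAMPLE):
        print(rec["stamp"], rec["host"], "mesg", rec["mesg"], "-", rec["text"])
```

The matching lines could then be mailed out from a scheduled job; how that is wired up depends on the site and is left out here.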

Steven E. Protter · ‎05-22-2006

Shalom,

You should be able to configure EMS with SAM to trap the issues and notify via email.

You have a bad sector on your disk and probably have block relocation set to enabled.

I'd get a good backup, identify the disk with cstm, xstm, or mstm, and replace the disk.

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Ajitkumar Rane · ‎05-22-2006

Mike,

Try /etc/opt/resmon/lbin/monconfig to configure how you receive EMS alerts.

rgds,

Ajit

Amidst difficulties lie opportunities

MikeL_4 · ‎05-23-2006

.

Andrew Merritt_2 · ‎05-24-2006

The EMS HW Monitors are enabled by default; you should not need to run monconfig to turn this on. Run 'monconfig' and select the Check option to see that the disks affected are listed.

You do NOT configure the EMS HW monitors using SAM (SEP, please note: I think you've stated this before, and it's not true).

The disk monitor, disk_em, does not monitor virtual disk arrays (XP, VA, etc.), so if that's what you have, that would explain why no EMS event was generated.

If these are directly attached SCSI disks, then you need to check you have a current version of the OnlineDiags (running 'cstm' will show the version you have installed, including the patch level).

Note also that the monitors do not generate an EMS event for every single low-level failure; there are thresholds set so that the events are only generated when the number of failures indicates a likely real problem. Hardware is resilient to a certain degree, and one or two recoverable errors are acceptable.
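To illustrate the thresholding idea in general terms, here is a minimal sketch of a sliding-window counter that only signals once enough errors have accumulated in a given period. This is not how disk_em is implemented, and the limit and window values are hypothetical; the actual thresholds used by the EMS HW monitors are not stated in this thread.

```python
from collections import deque
from datetime import datetime, timedelta


class ErrorThreshold:
    """Signal only when `limit` errors fall within a sliding `window`.

    Illustrative only: `limit` and `window` are assumed parameters,
    not values taken from the EMS hardware monitors.
    """

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.seen = deque()  # timestamps of recent errors, oldest first

    def record(self, when):
        """Record one error at time `when`; return True if the threshold is met."""
        self.seen.append(when)
        cutoff = when - self.window
        # Drop errors that have aged out of the window.
        while self.seen and self.seen[0] < cutoff:
            self.seen.popleft()
        return len(self.seen) >= self.limit


if __name__ == "__main__":
    threshold = ErrorThreshold(limit=3, window=timedelta(hours=1))
    start = datetime(2006, 5, 14, 15, 39, 8)
    for minutes in (0, 10, 20):
        fired = threshold.record(start + timedelta(minutes=minutes))
        print(minutes, "min:", "event" if fired else "below threshold")
```

With these assumed settings, one or two isolated errors stay silent, and only a cluster of three within an hour would produce an event.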

I notice you've closed the thread; did you find the answer?

Andrew
