Operating System - HP-UX

Event Monitor notification information:

 
jane zhang
Regular Advisor

Hi all:

I have an HP workstation with external disks attached on a SCSI chain.

I received Monitor notification:
Event data from monitor:

Event Time : Mon Apr 7 17:53:11 2003
Hostname : hostname.domainname
IP Address : xxx.xx.xxx.xx
Event Id : 0x003e921d7700000000 Monitor : disk_em
Event # : 100372 Event Class : I/O
Severity : CRITICAL

Disk at hardware path 10/1/4/0.2.0 : Device connectivity or hardware failure
.....

The notification also listed several things to check. I checked the cable, and it is tight. I also ran ioscan, and everything seems to be fine.

Is there anything else I need to do?

Thanks,
Jane
7 REPLIES
Michael Tully
Honored Contributor

Re: Event Monitor notification information:

Hi Jane,

If you are familiar with 'cstm/mstm/xstm' diagnostics, you can exercise the disk.
You could also do a 'dd' read from it.

# dd if=/dev/rdsk/cxtydz of=/dev/null bs=1024
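
A full read of a large disk can take a long time, so a bounded read is sometimes a useful first pass. The following is only a sketch: the function name read_test and the 100 MB default are illustrative assumptions, not part of any HP-UX tool, and the device path has to be replaced with the raw device file for your disk.

```shell
# read_test DEVICE [MEGABYTES]
# Hypothetical helper: reads the first MEGABYTES (default 100) from DEVICE
# with dd, discards the data, and reports whether dd exited cleanly.
read_test() {
    dev=$1
    mb=${2:-100}
    if dd if="$dev" of=/dev/null bs=1024k count="$mb" 2>/dev/null; then
        echo "read OK: $dev"
    else
        echo "read FAILED: $dev"
        return 1
    fi
}
```

For example, read_test /dev/rdsk/cxtydz 500 would read the first 500 MB. A clean result only shows that the blocks actually read were readable; it does not rule out an intermittent bus or cable problem.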

Are there any other SCSI or disk related messages in syslog?

Regards
Michael

Anyone for a Mutiny ?
Michael Steele_2
Honored Contributor

Re: Event Monitor notification information:

The startup procedure for daisy-chained external SCSI devices is to power off everything and then bring up the farthest device, then the workstation, then the middle device.

Run the dd test, as indicated above, and also check logtool.

STM > TOOLS > UTILITY > RUN > LOGTOOL > FILE > VIEW > RAW SUMMARY.

Note the first and last dates of the transactions and calculate the difference. If the difference is short, say 4 hours, that is important to note. Then read down the report of hardware addresses and look at the integer counts in parentheses. Anything over 150 in that 4-hour period should be called in to HP for replacement.

If it's dead, it won't respond to the dd command.
Support Fatherhood - Stop Family Law
KCS_1
Respected Contributor

Re: Event Monitor notification information:

Hi Jane,

Do you still have the problem?
If so, you should check the other logs and the status of your hardware:

# dmesg

# more /var/adm/syslog/syslog.log

# more /var/opt/resmon/log/event.log

# dd if=/dev/rdsk/c#t#d# of=/dev/null bs=1024

# ioscan -funCdisk

# diskinfo -v /dev/rdsk/c#t#d#


Also check the terminator and the cables connected to the disk drive.


Have a good day!

Easy going at all.
S.K. Chan
Honored Contributor

Re: Event Monitor notification information:

In my experience, EMS notifications can sometimes be inaccurate. This can be due to out-of-date diagnostics patches, third-party drivers that are not compatible with EMS, or a problem with EMS itself. Still, the first thing to do is check the disk at 10/1/4/0.2.0 to see whether it really has a hardware problem (see the recommendations in the other replies). If you can't find anything wrong with the disk, give HP a call. In the meantime, if you want to stop EMS from monitoring the device at 10/1/4/0.2.0, follow the steps I gave in this thread:
http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x26265fe8b250d71190080090279cd0f9,00.html
jane zhang
Regular Advisor

Re: Event Monitor notification information:

Hi all,

I ran ioscan -funC disk, and the device is claimed:
disk 3 10/1/4/0.2.0 sdisk CLAIMED DEVICE IBM DDYS-T18350N
/dev/dsk/c3t2d0 /dev/rdsk/c3t2d0

It also responds to dd (I interrupted it):
# dd if=/dev/rdsk/c3t2d0 of=/dev/null bs=1024
^C11980602+0 records in
11980602+0 records out
# diskinfo /dev/rdsk/c3t2d0
SCSI describe of /dev/rdsk/c3t2d0:
vendor: IBM
product id: DDYS-T18350N
type: direct access
size: 17921835 Kbytes
bytes per sector: 512
# dmesg
...//cut to save space here
Memory Information:
...
SCSI: Unexpected Disconnect -- lbolt: 409403, dev: 1f032000, io_id: 300bb24
SCSI: Unexpected Disconnect -- lbolt: 424808, dev: 1f032000, io_id: 300f300
SCSI: Request Timeout; Abort Tag -- lbolt: 427983, dev: 1f032000, io_id: 300f301
SCSI: Request Timeout; Abort Tag -- lbolt: 428083, dev: 1f032000, io_id: 300f302
LVM: PV 0 has been returned to vg[2].
SCSI: Unexpected Disconnect -- lbolt: 447615, dev: 1f032000, io_id: 30146c4
DIAGNOSTIC SYSTEM WARNING:
The diagnostic logging facility has started receiving excessive
errors from the I/O subsystem. I/O error entries will be lost
until the cause of the excessive I/O logging is corrected.
If the diaglogd daemon is not active, use the Daemon Startup command
in stm to start it.
If the diaglogd daemon is active, use the logtool utility in stm
to determine which I/O subsystem is logging excessive errors.
DIAGNOSTIC SYSTEM WARNING:
The diagnostic logging facility is no longer receiving excessive
errors from the I/O subsystem. 4 I/O error entries were lost.
DIAGNOSTIC SYSTEM WARNING:
The diagnostic logging facility has started receiving excessive
errors from the I/O subsystem. I/O error entries will be lost
until the cause of the excessive I/O logging is corrected.
If the diaglogd daemon is not active, use the Daemon Startup command
in stm to start it.
If the diaglogd daemon is active, use the logtool utility in stm
to determine which I/O subsystem is logging excessive errors.
DIAGNOSTIC SYSTEM WARNING:
The diagnostic logging facility is no longer receiving excessive
errors from the I/O subsystem. 14 I/O error entries were lost.
SCSI: Request Timeout; Abort Tag -- lbolt: 450783, dev: 1f032000, io_id: 30146c5
LVM: PV 0 has been returned to vg[2].
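
To see at a glance how many SCSI messages each device has logged, the dmesg text can be tallied. This is a minimal sketch using awk, assuming the message layout shown above ("SCSI: <message> -- lbolt: ..., dev: ..., io_id: ..."); the function name summarize_scsi is illustrative only.

```shell
# summarize_scsi: reads dmesg-style text on stdin and prints one line per
# (dev, message) pair in the form "<count> <dev> <message>".
# Assumes the "SCSI: <message> -- lbolt: N, dev: X, io_id: Y" layout.
summarize_scsi() {
    awk -F' -- ' '
    /^ *SCSI: .* -- / {
        msg = $1
        sub(/^ *SCSI: /, "", msg)      # keep only the message text
        split($2, field, /, */)        # field[2] is "dev: <hex>"
        dev = field[2]
        sub(/^dev: */, "", dev)
        count[dev " " msg]++
    }
    END { for (key in count) print count[key], key }' | sort
}
```

It could be run as dmesg | summarize_scsi. For the excerpt above, it would report three "Unexpected Disconnect" and three "Request Timeout; Abort Tag" messages, all for dev 1f032000.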

Even though the machine sent me that message, everything seems to be OK, and it did not send me any message today, so I am hesitant to change the cable or the disk for now.

Thanks.
Jane
Dario_1
Trusted Contributor

Re: Event Monitor notification information:

Hi Jane:

As S.K. mentioned, I have also had problems with EMS, and we fixed them simply by calling HP and upgrading. I suspect the SCSI I/O timeout is still set to the default, which is 30 seconds. Try increasing it to 60 seconds. For normal disks you can do that with pvchange, giving the device file of your disk in place of cXtYdZ:

pvchange -t 60 /dev/dsk/cXtYdZ

You can check whether it is set to the default with the following:

pvdisplay -v /dev/dsk/c4t4d0 | more


You will get output like this:

--- Physical volumes ---
PV Name /dev/dsk/c5t10d0
VG Name /dev/vgu02
PV Status available
Allocatable yes
VGDA 2
Cur LV 1
PE Size (Mbytes) 8
Total PE 2170
Free PE 1795
Allocated PE 375
Stale PE 0
IO Timeout (Seconds) default
Autoswitch On

--- Physical volumes ---
PV Name /dev/dsk/c5t12d0
VG Name /dev/vgu04
PV Status available
Allocatable yes
VGDA 2
Cur LV 1
PE Size (Mbytes) 8
Total PE 4340
Free PE 2090
Allocated PE 2250
Stale PE 0
IO Timeout (Seconds) 180
Autoswitch On

Note that the timeout on the first disk is the default, while on the second it has been increased to 180 seconds. The second disk is a very busy Oracle disk.
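
With many physical volumes, the timeout lines can be pulled out of the pvdisplay output instead of paging through it. This is a sketch only, assuming the "PV Name" and "IO Timeout (Seconds)" lines appear as in the output above; the function name pv_timeouts is made up for illustration.

```shell
# pv_timeouts: reads pvdisplay output on stdin and prints
# "<PV name> <IO timeout>" for each physical volume block.
pv_timeouts() {
    awk '
    /^ *PV Name/    { pv = $3 }          # remember the current PV
    /^ *IO Timeout/ { print pv, $NF }    # last field is the timeout value
    '
}
```

For example, pvdisplay /dev/dsk/c5t10d0 /dev/dsk/c5t12d0 | pv_timeouts would print "/dev/dsk/c5t10d0 default" and "/dev/dsk/c5t12d0 180" for the two volumes shown above.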

Regards,

DR

KCS_1
Respected Contributor

Re: Event Monitor notification information:

Hi Jane,
------------------------------
SCSI: Request Timeout; Abort Tag -- lbolt: 450783, dev: 1f032000, io_id: 30146c5
LVM: PV 0 has been returned to vg[2].

------------------------------
It sounds like the physical device is busy or timing out.

In my opinion, your disk is otherwise operating normally.

According to the message above, 'Physical Volume 0' (more precisely, /dev/dsk/c3t2d0) has been returned to vg02.
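
The dev value in the SCSI messages can be cross-checked against the device file. As a sketch, assuming the usual HP-UX legacy layout for sdisk devices (an 8-bit major number followed by a 24-bit minor of the form 0xIITL00, where II is the controller instance, T the target, and L the LUN), the value can be decoded as follows. The function name decode_dev is illustrative, and the layout is an assumption to verify against ls -l /dev/dsk on your own system.

```shell
# decode_dev HEX
# Hypothetical helper: splits an 8-digit hex dev value such as 1f032000
# into the major number and a cXtYdZ name, assuming the 0xMMIITL00 layout.
decode_dev() {
    dev=$1
    major=$(printf '%d' "0x$(echo "$dev" | cut -c1-2)")
    inst=$(printf '%d' "0x$(echo "$dev" | cut -c3-4)")
    tgt=$(printf '%d' "0x$(echo "$dev" | cut -c5)")
    lun=$(printf '%d' "0x$(echo "$dev" | cut -c6)")
    echo "major=$major c${inst}t${tgt}d${lun}"
}
```

Under that assumption, decode_dev 1f032000 prints "major=31 c3t2d0", which agrees with the /dev/dsk/c3t2d0 device reported by ioscan earlier in the thread.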

So I suggest adjusting the timeout value of the physical device (/dev/dsk/c3t2d0) and backing up all your data on vg02:

# pvchange -t 180 /dev/dsk/c3t2d0

If you still have a problem, it may be better to make sure your system is at the latest patch level.

Easy going at all.