LVM messages in syslog

peter_402 · ‎04-13-2006

Dear all,

I have a K570 server running HP-UX 11.00.
It has four 4 GB Seagate hard disks with HP firmware HPM2. One of them failed and was replaced with a disk of the same size, the same HP part number, and the same Seagate model number, but with HP firmware HPM4.

Before installing the new disk, I got the following message in the event log:
Notification Time: Tue Feb 28 16:01:27 2006

kephren sent Event Monitor notification information:

/storage/events/disks/default/10_0.4.0 is >= 1.
Its current value is SERIOUS(4).

Event data from monitor:

Event Time..........: Tue Feb 28 16:01:27 2006
Severity............: SERIOUS
Monitor.............: disk_em
Event #.............: 100272
System..............: kephren

Summary:
Disk at hardware path 10/0.4.0 : Device connectivity or hardware failure

Description of Error:

The device was not ready to process requests when it received a request
from the device driver because it is in the process of becoming ready.

Probable Cause / Recommended Action:

The device may have been powered off and may be being powered on.

Alternatively, one or both of the terminators on the SCSI bus may be
missing. Install the terminators in their proper locations at the ends of
the SCSI bus.

Alternatively, the SCSI cable may have become detached from the device.
Re-attach the cable.

Alternatively, the SCSI cable may have failed. Replace it.

Alternatively, the device may be in a state where it could not process
this, or any, request. Cycle power to the device.

Alternatively, there could be more than one device having the same address
on the SCSI bus. Make all the addresses on the SCSI bus unique.

Alternatively, the total length of all cable segments on the SCSI bus
exceeds 25 meters. Replace one or more cable segments until the total
length is less than this value.

Alternatively, if all of the above fail to correct the problem, the device
has experienced a hardware failure. Contact your HP support representative
to have the device checked.

Alternatively, if messages corresponding to this condition appear in the
log for more than one device on the SCSI bus, the device adapter may be in
a state from which it cannot extract itself. Perform a system shutdown,
cycle power to the computer and wait for it to reboot.

If, after reboot, messages corresponding to this condition continue to
appear in the log for this SCSI bus, contact your HP support
representative to have the adapter checked.

Additional Event Data:
System IP Address...: 10.1.66.21
Event Id............: 0x440457b700000000
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_disk_em.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/899
OS Version......................: B.11.00
STM Version.....................: A.31.00
EMS Version.....................: A.03.20
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/scsi.htm#100272

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v

Component Data:
Physical Device Path...: 10/0.4.0
Device Class...........: Disk
Inquiry Vendor ID......: SEAGATE
Inquiry Product ID.....: ST34371W
Firmware Version.......: HPM2
Serial Number..........: JDW9604606ZJR3

Product/Device Identification Information:

Logger ID.........: disc30; sdisk
Product Identifier: Disk
Product Qualifier.: SEAGATE ST34371W
SCSI Target ID....: 0x04
SCSI LUN..........: 0x00

SCSI Command Data Block:

Command Data Block Contents:
0x0000: 00 00 00 00 00 00

Command Data Block Fields (6-byte fmt):
Command Operation Code...(0x00)..: TEST UNIT READY
Logical Unit Number..............: 0

Hardware Status: (not present in log record).

SCSI Sense Data:

Undecoded Sense Data:
0x0000: 70 00 02 00 00 00 00 0A 00 00 00 00 04 01 02 00
0x0010: 00 00

SCSI Sense Data Fields:
Error Code : 0x70
Segment Number : 0x00
Bit Fields:
Filemark : 0
End-of-Medium : 0
Incorrect Length Indicator : 0
Sense Key : 0x02
Information Field Valid : FALSE
Information Field : 0x00000000
Additional Sense Length : 10
Command Specific : 0x00000000
Additional Sense Code : 0x04
Additional Sense Qualifier : 0x01
Field Replaceable Unit : 0x02
Sense Key Specific Data Valid : FALSE
Sense Key Specific Data : 0x00 0x00 0x00

Sense Key 0x02, NOT READY, indicates that the logical unit addressed
cannot be accessed. Operator intervention may be required to correct
this condition.

The combination of Additional Sense Code and Sense Qualifier (0x0401)
indicates: Logical unit is in process of becoming ready
-----------------------------------------------
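For readers who want to check the decoded fields against the raw bytes, here is a minimal Python sketch that parses the "Undecoded Sense Data" shown above. It assumes the standard SCSI fixed-format sense layout (sense key in the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13, FRU in byte 14). The function name `parse_fixed_sense` is made up for illustration and is not part of any HP tool.

```python
def parse_fixed_sense(hex_bytes: str) -> dict:
    """Decode a few fields from fixed-format SCSI sense data (assumed layout)."""
    data = bytes.fromhex(hex_bytes)
    if len(data) < 14:
        raise ValueError("sense data too short for fixed format")
    response_code = data[0] & 0x7F  # 0x70 = current error, 0x71 = deferred error
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "response_code": response_code,
        "sense_key": data[2] & 0x0F,   # low nibble of byte 2
        "asc": data[12],               # Additional Sense Code
        "ascq": data[13],              # Additional Sense Code Qualifier
        "fru": data[14] if len(data) > 14 else None,  # Field Replaceable Unit
    }


# Bytes copied from the event log above.
sense = parse_fixed_sense("70 00 02 00 00 00 00 0A 00 00 00 00 04 01 02 00 00 00")
print(sense)
```

With the bytes from the log this yields sense key 0x02 with ASC/ASCQ 0x04/0x01, which agrees with the monitor's own decoding (NOT READY, logical unit is in process of becoming ready).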
After I changed it, I got the following in the event log:
Notification Time: Wed Mar 1 16:07:57 2006

kephren sent Event Monitor notification information:

/storage/events/disks/default/10_0.4.0 is >= 1.
Its current value is CRITICAL(5).

Event data from monitor:

Event Time..........: Wed Mar 1 16:07:57 2006
Severity............: CRITICAL
Monitor.............: disk_em
Event #.............: 13
System..............: kephren

Summary:
Disk at hardware path 10/0.4.0 : I/O request failed.

Description of Error:

As part of the polling functionality, the monitor periodically requests
data from the device. The monitor's I/O request failed in this case. The
monitor was requesting data for Log Sense command.

Probable Cause / Recommended Action:

The monitor could not finish the requested I/O operation to the device.
Check /etc/opt/resmon/log/api.log file for an entry logged by
tl_scsi_dev_io request.

Additional Event Data:
System IP Address...: 10.1.66.21
Event Id............: 0x4405aabd00000000
Monitor Version.....: B.01.00
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_disk_em.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/899
OS Version......................: B.11.00
STM Version.....................: A.31.00
EMS Version.....................: A.03.20
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/disk_em.htm#13

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v

Component Data:
Physical Device Path...: 10/0.4.0
Device Class...........: Disk
Inquiry Vendor ID......: SEAGATE
Inquiry Product ID.....: ST34371W
Firmware Version.......: HPM4
Serial Number..........: JDM252970LL30H

Product/Device Identification Information:

Logger ID.........: disc30; sdisk
Product Identifier: Disk
Product Qualifier.: SEAGATE ST34371W
SCSI Target ID....: 0x04
SCSI LUN..........: 0x00

SCSI Command Data Block:

Command Data Block Contents:
0x0000: 4D 00 43 00 00 00 00 10 00 00

Command Data Block Fields (10-byte fmt):
Command Operation Code...(0x4D)..: LOG SENSE
Logical Unit Number..............: 0
PPC Bit..........................: 0
Save Parameters Bit..............: 0
Page Code Bits...................: 1
Page Code........................: 3 (0x03)
Parameter Pointer................: 0 (0x0000)
Allocation Length................: 4096 (0x1000)

Hardware Status: (not present in log record).

SCSI Sense Data: (not present in log record)
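
The command descriptor block in this second event can be checked the same way. The sketch below assumes the standard 10-byte LOG SENSE layout (PPC and SP bits in byte 1, page control and page code in byte 2, parameter pointer in bytes 5-6, allocation length in bytes 7-8). `parse_log_sense_cdb` is a hypothetical helper name used only for illustration.

```python
def parse_log_sense_cdb(hex_bytes: str) -> dict:
    """Decode a 10-byte SCSI LOG SENSE CDB (assumed standard layout)."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 10 or cdb[0] != 0x4D:
        raise ValueError("not a 10-byte LOG SENSE CDB")
    return {
        "ppc": (cdb[1] >> 1) & 1,       # Parameter Pointer Control bit
        "sp": cdb[1] & 1,               # Save Parameters bit
        "page_control": cdb[2] >> 6,    # reported as "Page Code Bits" in the log
        "page_code": cdb[2] & 0x3F,
        "parameter_pointer": int.from_bytes(cdb[5:7], "big"),
        "allocation_length": int.from_bytes(cdb[7:9], "big"),
    }


# Bytes copied from the event log above.
cdb = parse_log_sense_cdb("4D 00 43 00 00 00 00 10 00 00")
print(cdb)
```

The result matches the decoded fields in the log: page code 0x03 with an allocation length of 4096 bytes. In the SCSI standard, log page 0x03 is the read error counter page, so the monitor was most likely polling the drive's read error statistics when the request failed.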

I also get these messages in syslog. I can't create a filesystem on the disk, although mediainit works with it.
Mar 19 11:38:37 kephren EMS [1797]: ----- EMS Monitor Restart ----- Title: disk_em Command: /usr/sbin/stm/uut/bin/tools/monitor/disk_em Vendor: Hewlett-Packard Company Version: B.01.00 To obtain a list of currently monitored resources, execute the following: /opt/resmon/bin/resdata -M 1533300606
Mar 19 11:38:38 kephren EMS [1817]: ----- EMS Monitor Restart ----- Title: dm_core_hw Command: /usr/sbin/stm/uut/bin/tools/monitor/dm_core_hw Vendor: Hewlett-Packard Company Version: B.01.00 To obtain a list of currently monitored resources, execute the following: /opt/resmon/bin/resdata -M 1998375097
Mar 19 11:38:38 kephren EMS [1825]: ----- EMS Monitor Restart ----- Title: dm_memory Command: /usr/sbin/stm/uut/bin/tools/monitor/dm_memory Vendor: Hewlett-Packard Company Version: B.01.00 To obtain a list of currently monitored resources, execute the following: /opt/resmon/bin/resdata -M 3111254123
Mar 19 11:38:39 kephren EMS [1834]: ----- EMS Monitor Restart ----- Title: sysstat_em Command: /usr/sbin/stm/uut/bin/tools/monitor/sysstat_em Vendor: Hewlett-Packard Company Version: A.01.00 To obtain a list of currently monitored resources, execute the following: /opt/resmon/bin/resdata -M 3376148974
Mar 19 11:38:39 kephren EMS [1845]: ----- EMS Monitor Restart ----- Title: scsi123_em Command: /usr/sbin/stm/uut/bin/tools/monitor/scsi123_em Vendor: Hewlett-Packard Company Version: B.01.00 To obtain a list of currently monitored resources, execute the following: /opt/resmon/bin/resdata -M 553374747
Mar 19 11:38:39 kephren EMS [1853]: ----- EMS Monitor Restart ----- Title: dm_stape Command: /usr/sbin/stm/uut/bin/tools/monitor/dm_stape Vendor: Hewlett-Packard Company Version: B.01.05 To obtain a list of currently monitored resources, execute the following: /opt/resmon/bin/resdata -M 86234607
Apr 3 10:49:06 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129334080, dev: cb004000, io_id: 10771e
Apr 3 10:49:38 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129337280, dev: cb004000, io_id: 10771e
Apr 3 10:50:10 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129340480, dev: cb004000, io_id: 10771e
Apr 3 10:50:42 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129343680, dev: cb004000, io_id: 10771e
Apr 3 10:51:14 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129346880, dev: cb004000, io_id: 10771e
Apr 3 10:51:46 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129350080, dev: cb004000, io_id: 10771e
Apr 3 10:51:47 kephren EMS [1797]: ------ EMS Event Notification ------ Value: "CRITICAL (5)" for Resource: "/storage/events/disks/default/10_0.4.0" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 117768202 -r /storage/events/disks/default/10_0.4.0 -n 117768193 -a
Apr 3 11:52:20 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129713399, dev: cb004000, io_id: 107c8a
Apr 3 11:52:52 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129716599, dev: cb004000, io_id: 107c8a
Apr 3 11:53:24 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129719799, dev: cb004000, io_id: 107c8a
Apr 3 11:53:56 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129722999, dev: cb004000, io_id: 107c8a
Apr 3 11:54:28 kephren vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 129726199, dev: cb004000, io_id: 107c8a
Apr 3 12:58:30 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130110414, bus: 0
Apr 3 12:58:30 kephren vmunix: SCSI: Reset detected -- lbolt: 130110414, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130111714, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Reset detected -- lbolt: 130111714, bus: 0
Apr 3 13:00:41 kephren vmunix: LVM: Recovered Path (device 0x1f004000) to PV 0 in VG 7.
Apr 3 13:00:41 kephren vmunix: LVM: Restored PV 0 to VG 7.
Apr 3 13:00:41 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130113214, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Reset detected -- lbolt: 130113214, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130114714, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Reset detected -- lbolt: 130114714, bus: 0
Apr 3 13:00:41 kephren vmunix: LVM: Recovered Path (device 0x1f004000) to PV 0 in VG 7.
Apr 3 13:00:41 kephren vmunix: LVM: Recovered Path (device 0x1f006000) to PV 0 in VG 0.
Apr 3 13:00:41 kephren vmunix: LVM: Restored PV 0 to VG 7.
Apr 3 13:00:41 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130115914, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Reset detected -- lbolt: 130115914, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130117214, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Reset detected -- lbolt: 130117214, bus: 0
Apr 3 13:00:41 kephren vmunix: DIAGNOSTIC SYSTEM WARNING:
Apr 3 13:00:41 kephren vmunix: The diagnostic logging facility has started receiving excessive
Apr 3 13:00:41 kephren vmunix: errors from the I/O subsystem. I/O error entries will be lost
Apr 3 13:00:41 kephren vmunix: until the cause of the excessive I/O logging is corrected.
Apr 3 13:00:41 kephren vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Apr 3 13:00:41 kephren vmunix: in stm to start it.
Apr 3 13:00:41 kephren vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Apr 3 13:00:41 kephren vmunix: to determine which I/O subsystem is logging excessive errors.
Apr 3 13:00:41 kephren vmunix: SCSI: Resetting SCSI -- lbolt: 130118614, bus: 0
Apr 3 13:00:41 kephren vmunix: SCSI: Reset detected -- lbolt: 130118614, bus: 0
Apr 3 13:00:41 kephren vmunix: LVM: vg[0]: pvnum=0 (dev_t=0x1f006000) is POWERFAILED
Apr 3 13:00:41 kephren vmunix: LVM: vg[0]: pvnum=1 (dev_t=0x1f005000) is POWERFAILED
Apr 3 13:02:02 kephren vmunix: SCSI: Async write error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:02 kephren vmunix: blkno: 3147256, sectno: 6294512, offset: -1072177152, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: SCSI: Write error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:02 kephren vmunix: blkno: 3161544, sectno: 6323088, offset: -1057546240, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: SCSI: Async write error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:02 kephren vmunix: blkno: 3127680, sectno: 6255360, offset: -1092222976, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: blkno: 3144968, sectno: 6289936, offset: -1074520064, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: blkno: 3136152, sectno: 6272304, offset: -1083547648, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: SCSI: Read error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:02 kephren vmunix: SCSI: Async write error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:03 kephren above message repeats 2 times
Apr 3 13:02:02 kephren vmunix: blkno: 2510560, sectno: 5021120, offset: -1724153856, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: SCSI: Async write error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:02 kephren vmunix: blkno: 2473400, sectno: 4946800, offset: -1762205696, bcount: 8192.
Apr 3 13:02:02 kephren vmunix: SCSI: Read error -- dev: b 31 0x005000, errno: 126, resid: 2048,
Apr 3 13:02:02 kephren vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Apr 3 13:02:02 kephren vmunix: SCSI: Async write error -- dev: b 31 0x005000, errno: 126, resid: 1024,
Apr 3 13:02:02 kephren vmunix: blkno: 3501001, sectno: 7002002, offset: -709942272, bcount: 1024.
Apr 3 13:02:02 kephren vmunix: SCSI: Async write error -- dev: b 31 0x005000, errno: 126, resid: 8192,
Apr 3 13:02:02 kephren vmunix: blkno: 3529408, sectno: 7058816, offset: -680853504, bcount: 8192.
Can anyone provide me with information on how to deal with this problem?
Many thanks in advance to all of you.

Steven E. Protter · ‎04-13-2006

Shalom Peter,

dmesg -

dmesg

These commands clear the error log and then display it again.

If you have changed a hot swap disk on this server, the commands above should clear the lbolt and other messages.

Otherwise you have a meaningful error message.

A disk has gone bad and will eventually require replacement. You may have a loose cable, but to check that you need to shut down the machine.

I'd make sure my backups are up to date and prepare for replacement of the disk in question.

ll /dev/dsk

or

ll /dev/rdsk

You will be able to match
dev: b 31 0x005000,
to the device information and you will know what disk is the problem.
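
As a rough aid for that matching step, the Python sketch below splits a device number from the syslog into its parts. It assumes the usual HP-UX 11.00 convention where the top byte of a dev_t is the major number and the minor number reads as 0xIITL00 (II = card instance, T = SCSI target, L = LUN). That layout is an assumption, so confirm the result against the `ll /dev/dsk` output on the system itself. `decode_dev` is a hypothetical helper name.

```python
def decode_dev(dev: int) -> dict:
    """Split an HP-UX style dev_t into major, minor and c#t#d# parts (assumed layout)."""
    major = (dev >> 24) & 0xFF
    minor = dev & 0xFFFFFF
    instance = (minor >> 16) & 0xFF  # card instance -> cN
    target = (minor >> 12) & 0xF     # SCSI target   -> tN
    lun = (minor >> 8) & 0xF         # LUN           -> dN
    return {
        "major": major,
        "minor": f"0x{minor:06x}",
        "name": f"c{instance}t{target}d{lun}",
    }


# Device numbers taken from the syslog excerpt in the original post.
failing = decode_dev(0x1F005000)   # "dev: b 31 0x005000"
replaced = decode_dev(0x1F004000)  # "Recovered Path (device 0x1f004000)"
print(failing, replaced)
```

Under that assumption, `b 31 0x005000` corresponds to c0t5d0 (target 5), while the replaced disk at hardware path 10/0.4.0 corresponds to c0t4d0 (target 4).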

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Josiah Henline · ‎04-19-2006

The error messages could be the result of the SCSI bus resetting.

You can verify the disk using the "dd" command.

dd if=/dev/rdsk/cXt4d0 of=/dev/null bs=64k

If the command completes successfully, check syslog for any new errors on the drive. If the command fails with an I/O error, run it again to make sure it fails in the same spot. If it does not fail in the same spot, the problem is most likely somewhere else on the SCSI chain. If it fails in the same spot, drive 4 is also bad.
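
The decision procedure above can be written down as a small Python sketch. The inputs are the positions at which each `dd` run stopped with an I/O error (for example, the record count that `dd` reports), or `None` if the run completed. The function name and the wording of the returned messages are illustrative only.

```python
from typing import Optional


def interpret_dd_runs(first_failure: Optional[int],
                      second_failure: Optional[int] = None) -> str:
    """Apply the two-run dd check: compare where each read pass failed."""
    if first_failure is None:
        # The whole disk was readable on the first pass.
        return "read completed: check syslog for new errors on the drive"
    if second_failure == first_failure:
        # Same spot twice points at the medium itself.
        return "fails in the same spot: the drive is likely bad"
    # Different spot, or no failure at all on the second pass.
    return "fails in different spots: suspect something else on the SCSI chain"


print(interpret_dd_runs(None))
print(interpret_dd_runs(1200, 1200))
print(interpret_dd_runs(1200, 845))
```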

If at first you don't succeed, read the man page.
