Re: Disk Media error

 
Philemon_2
Frequent Advisor

Disk Media error

Hi All,

It looks like an error was reported on a Compaq disk, which is the root disk:

disk 1 1/0/1/1/0.6.0 sdisk CLAIMED DEVICE COMPAQ BD14689BB9
/dev/dsk/c4t6d0 /dev/dsk/c4t6d0s2 /dev/rdsk/c4t6d0 /dev/rdsk/c4t6d0s2
/dev/dsk/c4t6d0s1 /dev/dsk/c4t6d0s3 /dev/rdsk/c4t6d0s1 /dev/rdsk/c4t6d0s3

I don't see any read errors in the cstm report, but it does report 47 non-medium errors.
Please see below and let me know whether I really have a disk problem.


-- Information Tool Log for SCSI Disk on path 1/0/1/1/0.6.0 --

Log creation time: Thu Aug 6 10:15:10 2009

Hardware path: 1/0/1/1/0.6.0


Product Id: BD14689BB9 Vendor: COMPAQ
Device Type: SCSI Disk Firmware Rev: HPB1
Device Qualifier: COMPAQBD14689BB9 Logical Unit: 0
Serial Number: DEA1P6903YKR0636
Capacity (M Byte): 140014.41
Block Size: 512
Max Block Address: 286749487
Smart Enabled: TRUE
Error Logs
Read Errors: 0 Buffer Overruns: N/A
Read Reverse Errors: N/A Buffer Underruns: N/A
Write Errors: 0 Non-Medium Errors: 47
Verify Errors: 0
-- Information Tool Log for SCSI Disk on path 1/0/1/1/0.6.0 --




>------------ Event Monitoring Service Event Notification ------------<



Notification Time: Thu Aug 6 08:56:56 2009



veritas sent Event Monitor notification information:



/storage/events/disks/default/1_0_1_1_0.6.0 is >= 1.

Its current value is CRITICAL(5).







Event data from monitor:



Event Time..........: Thu Aug 6 08:56:56 2009

Severity............: CRITICAL

Monitor.............: disk_em

Event #.............: 100337

System..............: veritas



Summary:

Disk at hardware path 1/0/1/1/0.6.0 : Media error





Description of Error:



The device was unsuccessful in reading data for the current I/O request

due to an error on the medium. The maximum number of retries were

attempted and the data could not be read.



Probable Cause / Recommended Action:



If the event is reported against a device other than a disk drive:



- Reformatting the medium may fix the problem.

- Alternatively, the medium in the device is flawed.

- If the medium is removable, replace the medium with a fresh one.

- Alternatively, if the medium is not removable, the device has

experienced a hardware failure. Repair or replace the device, as

necessary.



If the event is reported against a disk drive on a system on which none

or only some of the disks are in a redundant environment (i.e., mirrored):



- Review applications for errors at the time the event was reported to

determine which data could not be read.

- Attempt to re-read the data.

- Re-write the data to the disk to allow the disk to reallocate to a

spare area on the disk.

- If a re-read of the data and/or a rewrite of the data are not

successful, the disk should be replaced and data restored from backup.



If the event is reported against a disk drive on a system on which all

disks are in a redundant environment (i.e., mirrored):



- When the OS is patched to current LVM and SCSI patches, reallocation

will take place automatically for these disks, and no action needs be

taken to check or replace these drives.

- To avoid unnecessary paging and notification, the severity of this

event can be changed to MINOR_WARNING by enabling the alternate

configuration for this event in

/var/stm/config/tools/monitor/default_disk_em.clcfg (and

/var/stm/config/tools/monitor/rst_disk_em.clcfg, if it exists):



- Find the following lines:

EQ:100337:CRITICAL:...



and insert a "#" in column 1.



- Remove the "#" from column 1 of the line which starts:

EQ:100337:MINOR_WARNING:...





Additional Event Data:

System IP Address...: 172.25.181.57

Event Id............: 0x4a7ad31800000000

Monitor Version.....: B.01.01

Event Class.........: I/O

Client Configuration File...........:

/var/stm/config/tools/monitor/default_disk_em.clcfg

Client Configuration File Version...: A.01.00

Qualification criteria met.

Number of events..: 1

Associated OS error log entry id(s):

0x4a7ad27500000000

Additional System Data:

System Model Number.............: ia64 hp server rx8620

OS Version......................: B.11.23

STM Version.....................: C.60.00

EMS Version.....................: A.04.20

Latest information on this event:

http://docs.hp.com/hpux/content/hardware/ems/scsi.htm#100337



v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v







Component Data:

Physical Device Path...: 1/0/1/1/0.6.0

Device Class...........: Disk

Inquiry Vendor ID......: COMPAQ

Inquiry Product ID.....: BD14689BB9

Firmware Version.......: HPB1

Serial Number..........: DEA1P6903YKR0636



Product/Device Identification Information:



Logger ID.........: sdisk

Product Identifier: SCSI Disk

Product Qualifier.: COMPAQBD14689BB9

SCSI Target ID....: 0x06

SCSI LUN..........: 0x00



I/O Log Event Data:



Driver Status Code..................: 0x0000007C

Length of Logged Hardware Status....: 36 bytes.

Offset to Logged Manager Information: 40 bytes.

Length of Logged Manager Information: 34 bytes.



Hardware Status:



Raw H/W Status:

0x0000: 00 00 00 02 F0 00 03 01 4A 2B 1E 28 00 00 00 00

0x0010: 11 01 00 80 00 3F 06 28 00 16 01 01 51 02 00 00

0x0020: 0F 3A 01 A0



SCSI Status...: CHECK CONDITION (0x02)

Indicates that a contingent allegiance condition has occurred. Any

error, exception, or abnormal condition that causes sense data to be

set will produce the CHECK CONDITION status.



SCSI Sense Data:



Undecoded Sense Data:

0x0000: F0 00 03 01 4A 2B 1E 28 00 00 00 00 11 01 00 80

0x0010: 00 3F 06 28 00 16 01 01 51 02 00 00 0F 3A 01 A0



SCSI Sense Data Fields:

Error Code : 0x70

Segment Number : 0x00

Bit Fields:

Filemark : 0

End-of-Medium : 0

Incorrect Length Indicator : 0

Sense Key : 0x03

Information Field Valid : TRUE

Information Field : 0x014A2B1E

Additional Sense Length : 40

Command Specific : 0x00000000

Additional Sense Code : 0x11

Additional Sense Qualifier : 0x01

Field Replaceable Unit : 0x00

Sense Key Specific Data Valid : TRUE

Sense Key Specific Data : 0x80 0x00 0x3F



Sense Key 0x03, MEDIUM ERROR, indicates that the command terminated

with a nonrecovered error condition that was probably caused by a

flaw in the medium or an error in the recorded data. This sense key

may also be returned if the device is unable to distinguish between a

flaw in the medium and a specific hardware failure (sense key 0x04).

For the RECOVERED ERROR, HARDWARE ERROR, or MEDIUM ERROR Sense Key,

the Sense Key Specific data indicates that 63 retries were attempted.



The combination of Additional Sense Code and Sense Qualifier (0x1101)

indicates: Read retries exhausted.



SCSI Command Data Block:



Command Data Block Contents:

0x0000: 28 00 01 4A 2A E2 00 02 00 00



Command Data Block Fields (10-byte fmt):

Command Operation Code...(0x28)..: READ

Logical Unit Number..............: 0

DPO Bit..........................: 0

FUA Bit..........................: 0

Relative Address Bit.............: 0

Logical Block Address............: 21637858 (0x014A2AE2)

Transfer Length..................: 512 (0x0200)



Manager-Specific Data Fields:

Request ID.............: 0x04000A63

Data Residue...........: 0x00038600

CDB status.............: 0x00000002

Sense Status...........: 0x00000000

Bus ID.................: 0x04

Target ID..............: 0x06

LUN ID.................: 0x00

Sense Data Length......: 0x20

Q Tag..................: 0x76

Retry Count............: 9



---------- End Event Monitoring Service Event Notification ----------<
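
For anyone who wants to cross-check the decoded fields against the raw bytes, here is a minimal sketch in POSIX shell. It assumes the standard SCSI fixed-format sense layout (error code in byte 0, sense key in byte 2, Information Field in bytes 3-6, ASC/ASCQ in bytes 12-13, retry count in bytes 16-17); the variable names and the byte helper are only illustrative.

```sh
#!/bin/sh
# Undecoded sense data, copied from the event above (32 bytes).
sense="F0 00 03 01 4A 2B 1E 28 00 00 00 00 11 01 00 80
00 3F 06 28 00 16 01 01 51 02 00 00 0F 3A 01 A0"

byte() {
    # Print byte number $1 (0-based) of $sense as a decimal integer.
    # The unquoted $sense collapses the newline into a single space.
    hex=$(echo $sense | cut -d' ' -f$(( $1 + 1 )))
    printf '%d' "0x$hex"
}

b0=$(byte 0);  b2=$(byte 2)
b3=$(byte 3);  b4=$(byte 4);  b5=$(byte 5);  b6=$(byte 6)
b16=$(byte 16); b17=$(byte 17)

valid=$(( b0 >> 7 ))          # "Information Field Valid" bit
error_code=$(( b0 & 127 ))    # low 7 bits of byte 0
sense_key=$(( b2 & 15 ))      # low 4 bits of byte 2
info=$(( (b3 << 24) | (b4 << 16) | (b5 << 8) | b6 ))
asc=$(byte 12)                # Additional Sense Code
ascq=$(byte 13)               # Additional Sense Qualifier
retries=$(( (b16 << 8) | b17 ))

printf 'valid=%d error_code=0x%02X sense_key=0x%02X\n' "$valid" "$error_code" "$sense_key"
printf 'information=0x%08X (LBA %d)\n' "$info" "$info"
printf 'asc/ascq=0x%02X/0x%02X retries=%d\n' "$asc" "$ascq" "$retries"
```

The values it prints should agree with the decoded section of the event: sense key 0x03, ASC/ASCQ 0x11/0x01, Information Field 0x014A2B1E and 63 retries.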
Michal Kapalka (mikap)
Honored Contributor
Solution

Re: Disk Media error

Hi,

This is your event message:


Event 100337

* Code:0x11 Qualifier:0x01 Key:0x03 Status:0xXX
* Device: ALL
* Severity: CRITICAL
* Event Summary: Media failure
* Problem Description: The device was unsuccessful in reading data for the current I/O request due to an error on the medium. The maximum number of retries were attempted and the data could not be read. The request was likely processed in a way which could cause damage to or loss of data.
* Probable Cause / Recommended Action:
If the event is reported against a device other than a disk drive:
o Reformatting the medium may fix the problem.
o Alternatively, the medium in the device is flawed.
o If the medium is removable, replace the medium with a fresh one.
o Alternatively, if the medium is not removable, the device has experienced a hardware failure. Repair or replace the device, as necessary.
If the event is reported against a disk drive on a system on which none or only some of the disks are in a redundant environment (i.e., mirrored):
o Review applications for errors at the time the event was reported to determine which data could not be read.
o Attempt to re-read the data.
o Re-write the data to the disk to allow the disk to reallocate to a spare area on the disk.
o If a re-read of the data and/or a rewrite of the data are not successful, the disk should be replaced and data restored from backup.
If the event is reported against a disk drive on a system on which all disks are in a redundant environment (i.e., mirrored):
o When the OS is patched to current LVM and SCSI patches, reallocation will take place automatically for these disks, and no action needs be taken to check or replace these drives.
o To avoid unnecessary paging and notification, the severity of this event can be changed to MINOR_WARNING by enabling the alternate configuration for this event in /var/stm/config/tools/monitor/default_disk_em.clcfg (and /var/stm/config/tools/monitor/rst_disk_em.clcfg, if it exists):
o Find the following lines: EQ:100337:CRITICAL:... and insert a "#" in column 1.
o Remove the "#" from column 1 of the line which starts: EQ:100337:MINOR_WARNING:...
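
The last two steps just move the "#" from one line to the other. A rough sketch with sed, run against a scratch file and not the real .clcfg: the two sample lines are placeholders that keep the "..." from the event text (the real lines are longer), and the scratch file name is made up for the example.

```sh
#!/bin/sh
# Scratch file with placeholder lines; NOT the real contents of
# default_disk_em.clcfg, only the prefixes quoted in the event text.
cfg=/tmp/disk_em_sample.$$
cat > "$cfg" <<'EOF'
EQ:100337:CRITICAL:...
#EQ:100337:MINOR_WARNING:...
EOF

# Put "#" in column 1 of the CRITICAL line and remove it from the
# MINOR_WARNING line; every other line passes through unchanged.
sed -e 's/^EQ:100337:CRITICAL:/#&/' \
    -e 's/^#\(EQ:100337:MINOR_WARNING:\)/\1/' "$cfg" > "$cfg.new"

cat "$cfg.new"
```

On a real system it would be safer to keep a backup copy of the .clcfg file and compare the result with diff before putting it in place.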


Contact HP support and ask for their recommendation. When I get this message, I have the disk replaced immediately.

Tingli
Esteemed Contributor

Re: Disk Media error

Open a call to hardware support. And you might need to replace the mirrored root disk.
Philemon_2
Frequent Advisor

Re: Disk Media error

Thanks to both of you. I have placed an order for a new disk, as this was an issue with the HDD.
Ganesan R
Honored Contributor

Re: Disk Media error

Hi Philemon,

You can use dd to check for I/O errors on the disk. This command reads the entire disk:

# dd if=/dev/rdsk/c4t6d0 of=/dev/null bs=1024k
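
If reading the whole disk takes too long, you could also aim dd at just the block named in the event. This is only a sketch: it assumes the sense Information Field (0x014A2B1E) is the failing LBA, which is how the SCSI fixed sense format defines it when the valid bit is set, and it prints the dd command instead of running it.

```sh
#!/bin/sh
# Values copied from the EMS event and the cstm log in this thread.
info_field=0x014A2B1E   # SCSI sense "Information Field"
cdb_lba=0x014A2AE2      # Logical Block Address of the failed READ
block_size=512          # "Block Size" from the cstm information log
rdev=/dev/rdsk/c4t6d0   # raw device file for path 1/0/1/1/0.6.0

# printf converts the hex constants to decimal.
bad_lba=$(printf '%d' "$info_field")
start_lba=$(printf '%d' "$cdb_lba")
offset=$(( bad_lba - start_lba ))

echo "suspect LBA: $bad_lba ($offset blocks into the failed read)"
# Print, rather than run, a dd command that reads only the suspect block.
echo "dd if=$rdev of=/dev/null bs=$block_size skip=$bad_lba count=1"
```

With the values from this event the suspect block works out to LBA 21637918, 60 blocks past the start of the failed 512-block read.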
Best wishes,

Ganesh.
Alzhy
Honored Contributor

Re: Disk Media error

You can also run "dmesg"; any errors will show up there, and you can always map the SCSI hardware device to the I/O path.

Can you post your dmesg output too?
Hakuna Matata.
Philemon_2
Frequent Advisor

Re: Disk Media error

Sure Alzhy,

Here is the report

# dmesg

Aug 7 15:05
Found adjacent data tr. Growing size. 0x311d000 -> 0x711d000.
Pinned PDK malloc pool: base: 0xe000000100ee3000 size=115828K
Loaded ACPI revision 2.0 tables.

MFS is defined: base= 0xe000000100ee3000 size= 5656 KB


Unpinned PDK malloc pool: base: 0xe000000110000000 size=262144K
Multiple MCAs/INITs feature enabled via checksum invalidation method
NOTICE: cachefs_link(): File system was registered at index 5.
255/0 ms_fake_drv
NOTICE: nfs3_link(): File system was registered at index 8.
NOTICE: mod_fs_reg: Cannot retrieve configured loading phase from KRS for module: cifs. Setting to load at INIT

1 cell
1/0 sba
1/0/0 lba
1/0/0/0/0 asio0
1/0/0/0/1 asio0
1/0/0/1/0 igelan
c8xx BUS: 0 SCSI C1010 Ultra Wide Single-Ended assigned CPU: 1
1/0/0/2/0 c8xx
c8xx BUS: 1 SCSI C1010 Ultra Wide Single-Ended assigned CPU: 2
1/0/0/2/1 c8xx
c8xx BUS: 2 SCSI C1010 Ultra Wide Single-Ended assigned CPU: 3
1/0/0/3/0 c8xx
c8xx BUS: 3 SCSI C1010 Ultra160 Wide LVD assigned CPU: 0
1/0/0/3/1 c8xx
1/0/1 lba
c8xx BUS: 4 SCSI C1010 Ultra Wide Single-Ended assigned CPU: 1
1/0/1/1/0 c8xx
c8xx BUS: 5 SCSI C1010 Ultra Wide Single-Ended assigned CPU: 2
1/0/1/1/1 c8xx
1/0/2 lba
1/0/4 lba
1/0/6 lba
1/0/8 lba
1/0/10 lba
1/0/12 lba
1/0/12/1/0 PCItoPCI
fcd: Claimed HP AD194-60001 4Gb Fibre Channel port at hardware path 1/0/12/1/0/4/0 (FC Port 1 on HBA)
1/0/12/1/0/4/0 fcd
fcd: Claimed HP AD194-60001 4Gb Fibre Channel port at hardware path 1/0/12/1/0/4/1 (FC Port 2 on HBA)
1/0/12/1/0/4/1 fcd
1/0/12/1/0/6/0 iether
1/0/12/1/0/6/1 iether
1/0/14 lba
1/120 processor
1/121 processor
1/122 processor
1/123 processor
1/250 pdh
1/250/0 acpi_node
1/250/1 ipmi
1/0/12/1/0/4/1.2 fcd_fcp
1/0/12/1/0/4/1.2.43.255.0 fcd_vbus
1/0/12/1/0/4/1.2.168.255.8 fcd_vbus
1/0/12/1/0/4/1.2.169.255.8 fcd_vbus
1/0/12/1/0/4/1.2.170.255.8 fcd_vbus
1/0/12/1/0/4/1.2.43.255.0.0 tgt
1/0/12/1/0/4/1.2.169.255.8.0 tgt
1/0/12/1/0/4/1.2.170.255.8.0 tgt
1/0/12/1/0/4/1.2.168.255.8.0 tgt
1/0/12/1/0/4/1.2.169.255.8.0.0 stape
1/0/12/1/0/4/1.2.170.255.8.0.0 stape
1/0/12/1/0/4/1.2.43.255.0.0.0 stape
1/0/12/1/0/4/1.2.168.255.8.0.0 stape
1/0/12/1/0/4/0.1 fcd_fcp
1/0/12/1/0/4/0.1.43.255.0 fcd_vbus
1/0/12/1/0/4/0.1.168.255.8 fcd_vbus
1/0/12/1/0/4/0.1.169.255.8 fcd_vbus
1/0/12/1/0/4/0.1.170.255.8 fcd_vbus
1/0/12/1/0/4/0.1.171.255.8 fcd_vbus
1/0/12/1/0/4/0.1.43.255.0.0 tgt
1/0/12/1/0/4/0.1.168.255.8.0 tgt
1/0/12/1/0/4/0.1.169.255.8.0 tgt
1/0/12/1/0/4/0.1.170.255.8.0 tgt
1/0/12/1/0/4/0.1.168.255.8.0.0 stape
1/0/12/1/0/4/0.1.43.255.0.0.0 stape
1/0/12/1/0/4/0.1.169.255.8.0.0 stape
1/0/12/1/0/4/0.1.170.255.8.0.0 stape
1/0/12/1/0/4/0.1.171.255.8.0 tgt
1/0/12/1/0/4/0.1.171.255.8.0.0 schgr
1/0/0/2/1.2 tgt
1/0/0/2/1.2.0 sdisk
1/0/0/2/1.7 tgt
1/0/0/2/1.7.0 sctl
1/0/0/2/0.7 tgt
1/0/0/2/0.7.0 sctl
1/0/1/1/0.6 tgt
1/0/1/1/0.6.0 sdisk
1/0/1/1/0.7 tgt
1/0/1/1/0.7.0 sctl
1/0/1/1/1.6 tgt
1/0/1/1/1.6.0 sdisk
1/0/1/1/1.7 tgt
1/0/1/1/1.7.0 sctl
1/0/1/1/1.8 tgt
1/0/1/1/1.8.0 sdisk
1/0/0/3/0.7 tgt
1/0/0/3/0.7.0 sctl
1/0/0/3/1.7 tgt
1/0/0/3/1.7.0 sctl
1/0/1/1/1.10 tgt
1/0/1/1/1.10.0 sdisk
1/0/1/1/1.12 tgt
1/0/1/1/1.12.0 sdisk
1/0/1/1/1.14 tgt
1/0/1/1/1.14.0 sdisk
255/1 mass_storage
Boot device's HP-UX HW path is: 1/0/1/1/0.6.0

System Console is on the Built-In Serial Interface
igelan0: INITIALIZING HP A7109-60001 PCI 1000Base-T Core at hardware path 1/0/0/1/0
iether1: INITIALIZING HP AD194-60001 PCI/PCI-X 1000Base-T 2-port 4Gb FC/2-port 1000B-T Combo Adapter at hardware path 1/0/12/1/0/6/0
iether2: INITIALIZING HP AD194-60001 PCI/PCI-X 1000Base-T 2-port 4Gb FC/2-port 1000B-T Combo Adapter at hardware path 1/0/12/1/0/6/1
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
Swap device table: (start & size given in 512-byte blocks)
entry 0 - major is 64, minor is 0x2; start = 0, size = 16777216
Dump device table: (start & size given in 1-Kbyte blocks)
entry 0000000000000000 - major is 31, minor is 0x46000; start = 2349940, size = 8388604
Starting the STREAMS daemons-phase 1
Create STCP device files
Starting the STREAMS daemons-phase 2
$Revision: vmunix: B11.23_LR FLAVOR=perf Fri Aug 29 22:35:38 PDT 2003 $
Memory Information:
physical page size = 4096 bytes, logical page size = 4096 bytes
Physical: 16760080 Kbytes, lockable: 12532056 Kbytes, available: 14552220 Kbytes