
Transient SCSI errors

 
Jack Gross
New Member

Transient SCSI errors

I have a pair of HP Proliant DL-580 servers set up as a RAC cluster running RedHat Linux 4.6 and Oracle 10g, and I've been getting a couple of errors a day on each server that look like this:

Sep 19 07:42:44 rsgisdp2 kernel: SCSI error : <0 0 2 5> return code = 0x8000002
Sep 19 07:42:44 rsgisdp2 kernel: Current sdf: sense key Aborted Command
Sep 19 07:42:44 rsgisdp2 kernel: ASC=c0 ASCQ= 0
Sep 19 07:42:44 rsgisdp2 kernel: end_request: I/O error, dev sdf, sector 118912
Sep 19 07:42:44 rsgisdp2 kernel: end_request: I/O error, dev sdf, sector 118918
Sep 19 07:42:44 rsgisdp2 kernel: device-mapper: dm-multipath: Failing path 8:80.
Sep 19 07:42:45 rsgisdp2 multipathd: 8:80: mark as failed
Sep 19 07:42:45 rsgisdp2 multipathd: oramp23: remaining active paths: 1

Sep 19 07:42:54 rsgisdp2 multipathd: 8:80: tur checker reports path is up
Sep 19 07:42:54 rsgisdp2 multipathd: 8:80: reinstated
Sep 19 07:42:54 rsgisdp2 multipathd: oramp23: remaining active paths: 2
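For anyone reading the log excerpt above, here is a minimal Python sketch of how the two numeric fields can be unpacked. It assumes the usual Linux SCSI layout for the return code (driver byte, host byte, message byte, status byte, from high to low) and the standard sd numbering under major 8 with 16 minors per disk. The function names are made up for illustration and are not part of any tool.

```python
def decode_scsi_result(result: int) -> dict:
    """Split a Linux SCSI mid-layer return code into its four bytes.

    Assumed layout (high to low): driver byte, host byte, message byte,
    SCSI status byte.
    """
    return {
        "driver_byte": (result >> 24) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "status_byte": result & 0xFF,
    }


def sd_name_from_devnum(major: int, minor: int) -> str:
    """Map a major:minor pair to an sd device name.

    Only handles major 8 (the first 16 sd disks), where each disk
    owns 16 consecutive minor numbers (minor % 16 is the partition).
    """
    if major != 8:
        raise ValueError("this sketch only handles sd major 8")
    disk_index, partition = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk_index)
    return name + (str(partition) if partition else "")


if __name__ == "__main__":
    # Values taken from the log lines in the post.
    print(decode_scsi_result(0x8000002))
    print(sd_name_from_devnum(8, 80))
```

Under those assumptions, 0x8000002 splits into a driver byte of 0x08 (sense data available) and a status byte of 0x02 (CHECK CONDITION), which is consistent with the "Aborted Command" sense key on the next log line, and path 8:80 is the same device as sdf.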



The servers are attached to an XP24000 disk array, have Qlogic HBAs, and are using the latest version of HP's Qlogic drivers and HPDM. The firmware on the HBA cards has been updated to the latest HP-provided version. The servers are running the 64-bit version of RedHat Linux 4.6 with a 2.6.9-55.0.12 kernel. The affected LUNs contain OCFS2 and ASM filesystems.

I'm a bit concerned that an excessive number of these errors could cause the Linux OS to make these LUNs read-only, but I've never seen that happen with these servers.

Since our SAN guys claim there is nothing wrong with the SAN hardware, is there anything more I can do to eliminate the errors? Should I even be concerned about the errors, since they only appear a few times a day and the OS has never prevented writing to these LUNs?
Sandeep_Chaudhary
Trusted Contributor

Re: Transient SCSI errors

I don't think there is a disk problem, but there is a problem with a path. I think you are using dual paths: one path to the disk is failing while the other path is OK, which is why data transfer continues. Please check the connectivity and the HBA on the other path.
Jack Gross
New Member

Re: Transient SCSI errors

The SCSI errors show up with LUN devices on either path, although never both paths to a particular LUN simultaneously. HPDM always has at least one path available to the LUN to work with.
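One way to check this from the logs rather than by eye is to tally the multipathd messages. The following is a rough Python sketch that assumes the syslog lines have the same wording as the excerpt in the original post; the function name and the sample list are illustrative only.

```python
import re
from collections import defaultdict

# Patterns based on the multipathd messages quoted in the original post.
FAILED = re.compile(r"multipathd: (\d+:\d+): mark as failed")
REMAINING = re.compile(r"multipathd: (\S+): remaining active paths: (\d+)")


def summarize(lines):
    """Count path failures per major:minor and track the lowest
    'remaining active paths' value seen for each multipath map."""
    failures = defaultdict(int)
    min_paths = {}
    for line in lines:
        m = FAILED.search(line)
        if m:
            failures[m.group(1)] += 1
            continue
        m = REMAINING.search(line)
        if m:
            name, count = m.group(1), int(m.group(2))
            min_paths[name] = min(count, min_paths.get(name, count))
    return dict(failures), min_paths


SAMPLE = [
    "Sep 19 07:42:45 rsgisdp2 multipathd: 8:80: mark as failed",
    "Sep 19 07:42:45 rsgisdp2 multipathd: oramp23: remaining active paths: 1",
    "Sep 19 07:42:54 rsgisdp2 multipathd: 8:80: reinstated",
    "Sep 19 07:42:54 rsgisdp2 multipathd: oramp23: remaining active paths: 2",
]

if __name__ == "__main__":
    failures, min_paths = summarize(SAMPLE)
    print("failures per path:", failures)
    print("lowest active path count per map:", min_paths)
```

If any map ever reports a minimum of 0, both paths were down at the same moment; a minimum of 1 everywhere matches the behavior described above.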

skt_skt
Honored Contributor

Re: Transient SCSI errors

Could you post the output of "multipath -ll" and "multipath -v 3"?
Jack Gross
New Member

Re: Transient SCSI errors