What has happened here??

 

Can anybody tell me what the hell has happened here and if it's serious?


Dec 6 18:54:59 nlnwgu05 vmunix:
Dec 6 18:54:59 nlnwgu05 vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 441288426, dev: 1f04c000, io_id: 41061a5
Dec 6 18:55:00 nlnwgu05 vmunix: SCSI: Request Timeout; Abort Tag -- lbolt: 441288526, dev: 1f04c000, io_id: 41061a6
Dec 6 18:55:02 nlnwgu05 vmunix: SCSI: isrEscape Controller at 0/1/0/0.
Dec 6 18:55:02 nlnwgu05 vmunix: SCSI: First party detected bus hang (HTH) -- lbolt: 441288765, dev: 1f04c000
Dec 6 18:55:02 nlnwgu05 vmunix: lbp->state: 30008
Dec 6 18:55:02 nlnwgu05 vmunix: lbp->offset: ffffffff
Dec 6 18:55:02 nlnwgu05 vmunix:
Dec 6 18:55:02 nlnwgu05 above message repeats 2 times
Dec 6 18:55:02 nlnwgu05 vmunix: lbp->nominalOffset: 360
Dec 6 18:55:02 nlnwgu05 vmunix: lbp->Cmdindex: f
Dec 6 18:55:02 nlnwgu05 vmunix: lbp->last_nexus_index: 54
Dec 6 18:55:02 nlnwgu05 vmunix: lbp->nexus_index: 55
Dec 6 18:55:02 nlnwgu05 vmunix: uCmdSent: e005b80 uNexus_offset: 10145754
Dec 6 18:55:02 nlnwgu05 vmunix: last lbp->puStatus [00000000405d5734]:
Dec 6 18:55:02 nlnwgu05 vmunix: 0003006b 0003007a 00030074 00030074
Dec 6 18:55:02 nlnwgu05 vmunix: next lbp->puStatus [00000000405d5744]:
Dec 6 18:55:02 nlnwgu05 vmunix: 00030074 00030074 00030074 00030074
Dec 6 18:55:02 nlnwgu05 vmunix: From most recent interrupt:
Dec 6 18:55:02 nlnwgu05 vmunix: ISTAT: 2a, SIST0: 40, SIST1: 01, DSTAT: 80, DSPS: 00000000
Dec 6 18:55:02 nlnwgu05 vmunix: lsp: 0x0000000048177e00
Dec 6 18:55:02 nlnwgu05 vmunix: bp->b_dev: 1f04c000
Dec 6 18:55:02 nlnwgu05 vmunix: scb->io_id: 41061a3
Dec 6 18:55:02 nlnwgu05 vmunix: scb->cdb: 2a 00 00 00 43 c0 00 00 10 00
Dec 6 18:55:02 nlnwgu05 vmunix: lbolt_at_timeout: 441291626, lbolt_at_start: 441288626
Dec 6 18:55:02 nlnwgu05 vmunix: lsp->state: 30d
Dec 6 18:55:02 nlnwgu05 vmunix: Jump Table entry [ffffffff83fdeee0]: ff010066 10141000
Dec 6 18:55:02 nlnwgu05 vmunix: lsp->puScript [00000000405d1000]:
Dec 6 18:55:02 nlnwgu05 vmunix: 08001000 80de0000 0000e501 08001000
Dec 6 18:55:02 nlnwgu05 vmunix: 47afe000 0000e580 78370000 00000000
Dec 6 18:55:02 nlnwgu05 vmunix: DSAtbl->host_iocb_index: f
Dec 6 18:55:02 nlnwgu05 vmunix: DSAtbl->host_iocb_addr: 10145b80
Dec 6 18:55:02 nlnwgu05 vmunix: stored scratcha: 0xff078066
Dec 6 18:55:02 nlnwgu05 vmunix: scratch_lsp: 0x0000000048177e00
Dec 6 18:55:02 nlnwgu05 vmunix: c8xx_iocb [ffffffff83fdeb00]:
Dec 6 18:55:02 nlnwgu05 vmunix: 0e005b80 ff008066 10141000 bf0c1f00
Dec 6 18:55:02 nlnwgu05 vmunix: 00000004 10145b60 0000000a 10145b68
Dec 6 18:55:02 nlnwgu05 vmunix: Pre-DSP script dump [ffffffff83fde898]:
Dec 6 18:55:02 nlnwgu05 vmunix: e27c0004 10146000 e2600004 10145b7c
Dec 6 18:55:02 nlnwgu05 vmunix: 7c027f00 00000000 1e000000 00000010
Dec 6 18:55:02 nlnwgu05 vmunix: Script dump [ffffffff83fde8b8]:
Dec 6 18:55:02 nlnwgu05 vmunix: 48000000 00000000 98080000 00000016
Dec 6 18:55:02 nlnwgu05 vmunix: 60000040 00000000 1f000000 00000040
Dec 6 18:55:02 nlnwgu05 vmunix: NCR chip register dump for: 0x400200a
Dec 6 18:55:02 nlnwgu05 vmunix: 00: SCNTL3: bf SCNTL2: 00 SCNTL1: 10 SCNTL0: da
Dec 6 18:55:02 nlnwgu05 vmunix: 04: GPREG: 0a SDID: 0c SXFER: 1f SCID: 47
Dec 6 18:55:02 nlnwgu05 vmunix: 08: SBCL: 26 SSID: 8a SOCL: 06 SFBR: 80
Dec 6 18:55:02 nlnwgu05 vmunix: 0c: SSTAT2: 08 SSTAT1: 0e SSTAT0: 00 DSTAT: 80
Dec 6 18:55:02 nlnwgu05 vmunix: 10: DSA: 83fdeb00
Dec 6 18:55:02 nlnwgu05 vmunix: 14: MBOX1: 00 MBOX0: 00 ISTAT1: 00 ISTAT: 28
Dec 6 18:55:02 nlnwgu05 vmunix: 1c: TEMP: 83fde358
Dec 6 18:55:02 nlnwgu05 vmunix: 24: DCMDDBC: 48000000
Dec 6 18:55:02 nlnwgu05 vmunix: 28: DNAD: 83fde8b8
Dec 6 18:55:02 nlnwgu05 vmunix: 2c: DSP: 83fde8c0
Dec 6 18:55:02 nlnwgu05 vmunix: 30: DSPS: 00000000
Dec 6 18:55:02 nlnwgu05 vmunix: 34: SCRATCHA: ff078066
Dec 6 18:55:02 nlnwgu05 vmunix: 38: DCNTL: a1 DWT: 00 DIEN: 7f DMODE: 4c
Dec 6 18:55:02 nlnwgu05 vmunix: 3c: ADDER: 83fde8c0
Dec 6 18:55:02 nlnwgu05 vmunix: 40: SIST1: 00 SIST0: 00 SIEN1: 97 SIEN0: 8f
Dec 6 18:55:02 nlnwgu05 vmunix: 44: GPCNTL: 2f MACNTL: 00 SWIDE: 00 SLPAR: 00
Dec 6 18:55:02 nlnwgu05 vmunix: 48: RESPID1: 00 RESPID0: 80 STIME1: 00 STIME0: fc
Dec 6 18:55:02 nlnwgu05 vmunix: 4c: STEST3: 80 STEST2: 00 STEST1: 0c STEST0: 76
Dec 6 18:55:02 nlnwgu05 vmunix: 50: RESV50: 00 RESV51: c0 SIDL1: 00 SIDL0: 00
Dec 6 18:55:02 nlnwgu05 vmunix: 54: CCNTL1: 01 CCNTL0: 01 SODL1: 00 SODL0: 0d
Dec 6 18:55:02 nlnwgu05 vmunix: 58: RESV58: 00 RESV59: 00 SBDL1: 00 SBDL0: 0d
Dec 6 18:55:02 nlnwgu05 vmunix: 5c: SCRATCHB: 000c0000
Dec 6 18:55:02 nlnwgu05 vmunix: 60: SCRATCHC: c0ffffff
Dec 6 18:55:02 nlnwgu05 vmunix: 64: SCRATCHD: 10145b7c
Dec 6 18:55:02 nlnwgu05 vmunix: 68: SCRATCHE: 83fdecf4
Dec 6 18:55:02 nlnwgu05 vmunix: 6c: SCRATCHF: 10140f00
Dec 6 18:55:02 nlnwgu05 vmunix: 70: SCRATCHG: bf0c1f00
Dec 6 18:55:02 nlnwgu05 vmunix: 74: SCRATCHH: 10145754
Dec 6 18:55:02 nlnwgu05 vmunix: 78: SCRATCHI: 0c01bf1f
Dec 6 18:55:02 nlnwgu05 vmunix: 7c: SCRATCHJ: 0e005b80
Dec 6 18:55:02 nlnwgu05 vmunix: bc: SCNTL4: 00
Dec 6 18:55:02 nlnwgu05 vmunix: PCI configuration register dump:
Dec 6 18:55:02 nlnwgu05 vmunix: Command: 0157
Dec 6 18:55:02 nlnwgu05 vmunix: Latency Timer: ff
Dec 6 18:55:02 nlnwgu05 vmunix: Cache Line Size: 10
Dec 6 18:55:03 nlnwgu05 vmunix:
Dec 6 18:55:03 nlnwgu05 vmunix: SCSI: Resetting SCSI -- lbolt: 441288865, bus: 4 path: 0/1/0/0
Dec 6 18:55:03 nlnwgu05 vmunix: SCSI: Reset detected -- lbolt: 441288865, bus: 4 path: 0/1/0/0
Dec 6 18:55:03 nlnwgu05 vmunix: SCSI: Read error -- dev: b 31 0x04c000, errno: 126, resid: 2048,
Dec 6 18:55:03 nlnwgu05 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Dec 6 18:55:03 nlnwgu05 vmunix: LVM: vg[2]: pvnum=7 (dev_t=0x1f04c000) is POWERFAILED
Dec 6 18:55:03 nlnwgu05 vmunix: SCSI: Write error -- dev: b 31 0x04c000, errno: 126, resid: 57344,
Dec 6 18:55:03 nlnwgu05 vmunix: blkno: 1856, sectno: 3712, offset: 1900544, bcount: 57344.
Dec 6 18:55:13 nlnwgu05 vmunix: LVM: Recovered Path (device 0x1f04c000) to PV 7 in VG 2.
Dec 6 18:55:03 nlnwgu05 vmunix:
Dec 6 18:55:13 nlnwgu05 above message repeats 3 times
Dec 6 18:55:13 nlnwgu05 vmunix: LVM: Restored PV 7 to VG 2.
Jeff Schussele
Honored Contributor

Re: What has happened here??

Hi Marvin,

Errors at the top indicate that you had a drive (or LUN) - specifically c4t12d0 - that wasn't responding for a period of time.
Errors toward the problem point to a pathing problem which could indicate that the c4 controller or the switch it connects to or the front end port on the array could also be culprits. But if you saw no other errors on disks/LUNs off the c4 controller at the same time then it was probably the disk/LUN - unless of course you have no other disks/LUNs connected to the c4 HBA.
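
For anyone wondering how dev: 1f04c000 maps to c4t12d0, here is a minimal Python sketch. It assumes the legacy HP-UX device number layout (8-bit major, then a 24-bit minor made up of a two-digit card instance, one-digit target, one-digit LUN and two option digits). That layout is an assumption on my part, not something stated in the log, and the function name is made up for illustration, so check it against `ioscan -fnC disk` and `lssf` on the box itself.

```python
def decode_hpux_dev(dev_hex: str) -> dict:
    """Split a legacy HP-UX dev_t (hex string) into its assumed fields."""
    dev = int(dev_hex, 16)
    major = (dev >> 24) & 0xFF       # 0x1f = 31, the block disk driver in the log
    minor = dev & 0xFFFFFF           # 0x04c000 in the log
    instance = (minor >> 16) & 0xFF  # card instance -> the "c" number
    target = (minor >> 12) & 0xF     # SCSI target   -> the "t" number
    lun = (minor >> 8) & 0xF         # LUN           -> the "d" number
    return {
        "major": major,
        "minor": minor,
        "name": f"c{instance}t{target}d{lun}",
    }


if __name__ == "__main__":
    info = decode_hpux_dev("1f04c000")
    print(info["major"], hex(info["minor"]), info["name"])
```

The major/minor split agrees with the "dev: b 31 0x04c000" form that appears in the read and write error lines.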

HTH,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Jeff Schussele
Honored Contributor

Re: What has happened here??

Sorry - should have read
"Errors toward the bottom....."

Rgds,
Jeff
Andy Torres
Trusted Contributor

Re: What has happened here??

I'll take a stab at it.

The disk device at path 0/1/0/0 (root disk?) is taking lbolt write errors, an indication of a failing/failed device. It has been POWERFAILED and restored to VG 2, possibly a mirror failover?

Take a look at your disk, and call HP to replace the failed device.
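
A note on the lbolt values and the scb->cdb line in the dump: lbolt looks like a kernel tick counter rather than an error type, and the cdb bytes are the SCSI command that was outstanding when the bus hung. The rough Python sketch below decodes both. The tick rate is inferred from the two "Request Timeout" lines, which were logged one second apart, so it is an estimate from this log only. The CDB decode assumes the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, 4-byte LBA, group, 2-byte length, control). The function names are made up for illustration.

```python
def ticks_per_second(lbolt_a: int, lbolt_b: int, seconds_apart: int) -> float:
    """Estimate the lbolt tick rate from two timestamped log lines."""
    return (lbolt_b - lbolt_a) / seconds_apart


def decode_rw10(cdb_hex: str) -> dict:
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    return {
        "op": opcodes.get(cdb[0], "unknown"),
        "lba": int.from_bytes(cdb[2:6], "big"),    # logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }


if __name__ == "__main__":
    # Values copied from the log above.
    hz = ticks_per_second(441288426, 441288526, 1)
    timeout_s = (441291626 - 441288626) / hz  # lbolt_at_timeout - lbolt_at_start
    print(f"tick rate ~{hz:.0f}/s, timeout window ~{timeout_s:.0f} s")
    print(decode_rw10("2a 00 00 00 43 c0 00 00 10 00"))
```

On those assumptions the outstanding request was a 16-block write at LBA 17344 with a timeout window of about 30 seconds.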
Andy Torres
Trusted Contributor

Re: What has happened here??

I cede to Jeff's analysis. I read through it too quickly.

Re: What has happened here??


It seems that the disk was faulty. Thanks All!!