ProLiant Servers (ML,DL,SL)

Diagnosing Smart Array P440ar serial log errors

 
jbrown42
Occasional Collector


I'm trying to pinpoint some platform issues, so I'm doing a deep dive into potential disk/RAID controller issues.  Checking the controller serial log, I'm seeing a lot of different errors.  Some are easy to diagnose (certain KCQ codes).  Others, not so much (other KCQs).  I've spent the last week digging into SCSI command codes, KCQs, ASC/ASCQs, sense codes, opcodes, etc.  I've got a fair handle on most of the errors I'm seeing, except for one.

There's one group of errors that I cannot decipher:

[2019-07-03 07:24:51] Drive SN: xxxxxxxxxxxxxxxx
CDB=0x85092E0000000100B600000000002F00
CC Sense Data--
00: 70 00 01 00 00 00 00 06 80 00 00 00 00 1D 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1C: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[2019-07-03 07:24:51] Recovered [host] PR=0x8146b050 D030 Op=85 PLErr=02 IopErr=04 S=02
[2019-07-03 07:24:51]  KCQ=1:00:1D
[2019-07-03 07:24:51] Drive SN: xxxxxxxxxxxxxxxx
CDB=0x85092E00000001000000000000002F00
CC Sense Data--
00: 70 00 01 00 00 00 00 06 80 00 00 00 00 1D 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1C: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[2019-07-03 07:24:51] Recovered [host] PR=0x8146b050 D031 Op=85 PLErr=02 IopErr=04 S=02
[2019-07-03 07:24:51]  KCQ=1:00:1D

 

I get a pair of these errors for each SSD attached (16 SSDs, 32 messages).  I do not get any errors for the SAS disks.  I can't find any documentation for KCQ 1:00:1D.
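
For anyone following along, here is a minimal sketch of how the KCQ line maps onto the raw sense bytes in the log. It assumes the dump is standard fixed-format sense data (response code 0x70), where the sense key is the low nibble of byte 2, the ASC is byte 12 and the ASCQ is byte 13. The function names are my own, not anything from the controller tooling.

```python
# Sense bytes copied from the "CC Sense Data" line in the log above.
SENSE_HEX = "70 00 01 00 00 00 00 06 80 00 00 00 00 1D"

# Only a few sense keys are listed here; this is not a complete table.
SENSE_KEYS = {0x0: "NO SENSE", 0x1: "RECOVERED ERROR", 0x3: "MEDIUM ERROR"}


def decode_fixed_sense(hex_bytes):
    """Return (sense key, ASC, ASCQ) from fixed-format sense data."""
    data = bytes.fromhex(hex_bytes)
    response_code = data[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    if len(data) < 14:
        raise ValueError("sense data too short to hold ASC/ASCQ")
    return data[2] & 0x0F, data[12], data[13]


def format_kcq(key, asc, ascq):
    """Format the triple the way the serial log prints it."""
    return f"{key:X}:{asc:02X}:{ascq:02X}"


key, asc, ascq = decode_fixed_sense(SENSE_HEX)
print(format_kcq(key, asc, ascq), SENSE_KEYS.get(key, "unknown key"))
```

So the "1" is sense key 1 (RECOVERED ERROR), which matches the "Recovered" wording in the log line, and 00:1D is the ASC/ASCQ pair.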

These errors correlate to periods of poor performance.  It almost seems like there's a bus reset happening, but I would imagine that would impact the SAS drives as well.  Any ideas?

 

Thanks in advance.

Bunsol
HPE Pro

Re: Diagnosing Smart Array P440ar serial log errors

Hi Jbrown,

We need to check whether the drives are SATA SSDs. If the SCSI errors are seen only for the SSDs and not for the SAS disks, that suggests this sense code applies only to the SATA interface. If so, please make sure the controller driver and firmware, as well as the drive firmware, are updated to a supported configuration. We are not aware of this KCQ either; however, we do not believe it is actually an issue.
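
One way to check the SATA angle from the log itself is to decode the CDB. The sketch below assumes the standard ATA PASS-THROUGH (16) layout from the SCSI/ATA Translation (SAT) standard: opcode 0x85, PROTOCOL in bits 4:1 of byte 1, CK_COND in bit 5 of byte 2, LBA (7:0) in byte 8 and the ATA command in byte 14. The function name and the small lookup tables are illustrative only and far from complete.

```python
# CDBs copied from the two log entries in the original post.
CDB_1 = "0x85092E0000000100B600000000002F00"
CDB_2 = "0x85092E00000001000000000000002F00"

# Partial tables, just enough for this log.
ATA_PROTOCOLS = {3: "Non-data", 4: "PIO Data-In", 5: "PIO Data-Out", 6: "DMA"}
ATA_COMMANDS = {0x2F: "READ LOG EXT"}


def decode_ata_pass_through_16(cdb_hex):
    """Pull the interesting fields out of an ATA PASS-THROUGH (16) CDB."""
    text = cdb_hex[2:] if cdb_hex.lower().startswith("0x") else cdb_hex
    cdb = bytes.fromhex(text)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not an ATA PASS-THROUGH (16) CDB")
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,  # 4 = PIO Data-In
        "extend": cdb[1] & 0x01,           # 1 = 48-bit command
        "ck_cond": (cdb[2] >> 5) & 0x01,   # 1 = return ATA status in sense data
        "t_dir": (cdb[2] >> 3) & 0x01,     # 1 = transfer from device
        "count": cdb[6],                   # sector count (7:0)
        "lba_low": cdb[8],                 # for READ LOG EXT: log address
        "command": cdb[14],                # ATA command code
    }


for cdb in (CDB_1, CDB_2):
    f = decode_ata_pass_through_16(cdb)
    print(
        ATA_COMMANDS.get(f["command"], hex(f["command"])),
        ATA_PROTOCOLS.get(f["protocol"], f["protocol"]),
        f"log=0x{f['lba_low']:02X}",
        f"ck_cond={f['ck_cond']}",
    )
```

Both CDBs decode as an ATA READ LOG EXT (0x2F) sent through ATA PASS-THROUGH, which only makes sense for SATA devices, so that would explain why the SAS disks never log it. CK_COND is set in both, and as far as I can tell from the T10 ASC/ASCQ assignments, 00h/1Dh is "ATA PASS THROUGH INFORMATION AVAILABLE", meaning the ATA status is handed back through sense data even when the command succeeds. Treat that reading as an assumption to verify against the SAT standard, not a confirmed diagnosis.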

Regards,

Bunsol



I am a HPE Employee