ProLiant Servers (ML,DL,SL)

DL380 G3 cpqcissm Timeout Errors with SmartArray 5i

 
Daniel Sawyer
Occasional Visitor

We started getting Event 9 cpqcissm timeout errors yesterday, and they have occurred more frequently today:

The device \Device\Scsi\cpqcissm1 did not respond within the timeout period

The error occurs several times in a row, 10 to 30 seconds apart. During this time the computer is unresponsive both over the network and at the local UI. That's a problem since it's our backend database.

When this error last occurred, on 10/5/05, it occurred once, then the system marked the drive as failed and we replaced it (RAID 5).

I've run the HP Array Diagnostic Utility and looked at the other configuration/utility tools, and they do not report any problems or degraded items.

I compared the number of disk errors and hours in service on this server to our other DL380s, and it has a MUCH higher rate of errors.

Another server has drives with twice the hours and only 1 error recorded across 6 drives.

3 of the 6 drives in the problem system, each with 11,100 hours, show Not Ready Errors in the 60s under Failure Indicators and one or more SCSI Bus Faults and/or Other Timeouts under Problem Indicators. They have read about 71 billion sectors and written 3 billion.

1 drive with 11,100 hours has only 10 Not Ready Errors, but also 2 Other Timeouts and 1 SCSI Bus Fault.

Another drive with 11,100 hours has NO ERRORS on it, and the drive replaced on 10/5/05 has 960 hours on it with no errors.
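
To put the counters above on a common scale, here is a rough Python sketch that normalises them to errors per 1,000 hours in service. It is only an illustration with several assumptions: "in the 60s" is treated as a lower bound of 60 Not Ready Errors, the unspecified bus fault and timeout counts on those three drives are left out, and "twice the hours" on the other server is taken as 22,200 hours per drive. The function and labels are made up for this example and do not come from any HP tool.

```python
# Rough per-drive error rates from the counters quoted in the post.
# Assumption: "in the 60s" is treated as a lower bound of 60 errors.
# Assumption: "twice the hours" means 2 * 11,100 hours per drive.

def errors_per_1000_hours(errors: int, hours: float) -> float:
    """Return the error count normalised to 1,000 hours in service."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return errors * 1000.0 / hours

# (label, recorded errors, hours in service)
drives = [
    ("each of the 3 suspect drives (at least)", 60, 11_100),
    ("drive with 10 Not Ready + 2 timeouts + 1 bus fault", 13, 11_100),
    ("drive with no errors", 0, 11_100),
    ("drive replaced on 10/5/05", 0, 960),
]

rates = {label: errors_per_1000_hours(e, h) for label, e, h in drives}

# Other server: 1 error in total across 6 drives.
reference_rate = errors_per_1000_hours(1, 6 * 2 * 11_100)

for label, rate in rates.items():
    print(f"{label}: {rate:.2f} errors per 1,000 h")
print(f"other server, all 6 drives: {reference_rate:.4f} errors per 1,000 h")
```

Even with the conservative lower bound, the suspect drives come out hundreds of times above the other server's rate, which is the gap described above.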

Does anyone have any ideas what the root problem may be? Am I looking at the possibility of more than one hard drive failing at once? Do I have 3 bum drives, and could/should I get them replaced? Thanks

P.S. This thread has been moved from Disk to ProLiant Servers (ML,DL,SL). - HP Forum Moderator

3 REPLIES
Daniel Sawyer
Occasional Visitor

Re: DL380 G3 cpqcissm Timeout Errors with SmartArray 5i

An update ... we decided to risk replacing the drives one at a time with drives from a spare server that had no errors. We've replaced all 3 and are waiting to see if any more errors occur. We had replaced 2 last night and got the same error this morning on the remaining suspect drive.
Basil Vizgin
Honored Contributor

Re: DL380 G3 cpqcissm Timeout Errors with SmartArray 5i

It's possible for more than one hard drive to fail at once, especially if they are from the same lot and all have a manufacturing defect.
The new HP Insight Diagnostics supports a new feature called Smart Array Drive Diagnosis.
---
It provides testing and troubleshooting recommendations when testing logical drives created using HP Smart Array and Modular Smart Array (MSA) controllers. Smart Array Drive Diagnosis also provides several other unique capabilities:
* Prediction of imminent failure
* Generates specific repair recommendations based on Diagnosis results
* Detects whether or not a disk drive in a fault mode is defective
---
Maybe you could try this?
http://h18013.www1.hp.com/support/files/server/us/download/22903.html

Daniel Sawyer
Occasional Visitor

Re: DL380 G3 cpqcissm Timeout Errors with SmartArray 5i

Thanks for that suggestion Basil. I did install that on both the original server and the server I put the suspect drives on. It says the drives are operating within parameters and should not be replaced.

Now that I've changed those 3 drives out, I no longer get the errors. Also, since the last of the 3 was removed, the drives that replaced them have had no new Not Ready or other errors.

Go figure.