Multiple errors for Raid Controller prior to system hanging

Ayman Altounji
Valued Contributor

I have experienced multiple situations, on different hardware, where an event is logged in the system log shortly before a Windows 2000 Server hangs. When this happens the server is unresponsive to commands and, in some situations, the services on the server fail.

I have seen this on a Proliant 5500 with a Smart Array 3200 controller. The logged error is:
The device, \Device\Scsi\Cpqarray1, did not respond within the timeout period.
I found the following article, which identifies a firmware upgrade on the 3200 controller as a potential fix: http://www.compaq.com/support/techpubs/customer_advisories/ex990921_cw01_0.html

The firmware was upgraded on the array controller and the error appeared to have disappeared for a short period of time.

We are now seeing the same situation on a 5300 Array controller (Firmware Version 2.12) in a DL380. The error is the following: The device, \Device\Scsi\cpqcissm1, did not respond within the timeout period.

We have found numerous articles but none that clearly explain a resolution to these errors. In some situations the server will effectively lose functionality but no services are affected and thus, we never get monitoring alerts on the server. The following article sort of explains the hangs: http://www.compaq.com/support/techpubs/customer_advisories/EM010823_CW03_1.html

However, this article references a Microsoft article that explains that this issue occurs when using fiber. The servers that experience this error are not using fiber adapter cards.

We have now seen this issue on multiple array controllers (3200 and 5300) on multiple servers: Proliant 5500, DL380, and DL580. All servers that experience this issue are running either Windows 2000 Server or Advanced Server.

My team has not been able to find an effective resolution. We keep running into this issue, and it is affecting our business.

If someone could provide me with any help it would be appreciated.

Thanks,

snooge
4 REPLIES
mike_287
Occasional Visitor

Re: Multiple errors for Raid Controller prior to system hanging

snooge, did you ever figure this one out? What SP are you running? I have a DL380 G2 and am seeing these errors every 10-15 minutes throughout the day. Sometimes it seems like the server hangs, but I haven't matched the events up with that. It is my SQL Server.
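
One way to match the timeout events up with the hangs is sketched below in Python. It assumes the System log has already been exported to a text file with one event per line in a `YYYY-MM-DD HH:MM:SS<TAB>message` layout. That layout and the function names are assumptions made for illustration only, so adjust the parsing to whatever your export actually looks like.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Matches the timeout message quoted in this thread and captures the device name.
TIMEOUT_RE = re.compile(
    r"The device, (\\Device\\Scsi\\\w+), did not respond within the timeout period"
)


def parse_timeouts(lines):
    """Return (timestamp, device) pairs for every timeout event in the lines.

    Assumes each line is 'YYYY-MM-DD HH:MM:SS<TAB>message' (hypothetical layout).
    Lines that do not contain the timeout message are skipped.
    """
    events = []
    for line in lines:
        stamp, _, message = line.partition("\t")
        match = TIMEOUT_RE.search(message)
        if match:
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
            events.append((when, match.group(1)))
    return events


def events_before(events, hang_time, window_minutes=30):
    """Return the timeout events logged in the window leading up to a hang."""
    start = hang_time - timedelta(minutes=window_minutes)
    return [event for event in events if start <= event[0] <= hang_time]


def count_by_device(events):
    """Count timeout events per device name."""
    return Counter(device for _, device in events)
```

Call `parse_timeouts` on the exported lines, then pass the result and the time you noticed a hang to `events_before`. If timeouts cluster just before each hang, that supports the link; if they appear all day with no hang, the two may be unrelated.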
Brent Stark
Occasional Visitor

Re: Multiple errors for Raid Controller prior to system hanging

Snooge,
I am currently experiencing this problem on three 8500s with attached SCSI arrays using the 5300 card, as well as an Overland tape array. Were you ever successful in resolving this?
Patric H
Occasional Visitor

Re: Multiple errors for Raid Controller prior to system hanging

Hi!
I also have the same problem on a Proliant 5500 win2k Server. I have searched everywhere for an answer but haven't found one.
Have you guys or girls found a solution now?
"If we weren't supposed to eat animals, they wouldn't be made out of meat." -Homer Simpson-
Mike Mayfield
Occasional Visitor

Re: Multiple errors for Raid Controller prior to system hanging

I'm seeing a similar situation on a DL580 that's got Netware 6 loaded on it, so I don't think it's a NOS-specific issue. My basic symptom is that whenever I attempt to DOWN or RESTART my server, as soon as CPU 1 is taken offline, the array goes into a mode where all drives are fully on, the keyboard is unresponsive, and it remains in this state forever. I have to disconnect power to down my server. The same thing occurs when I attempt to install a Support Pack: the server hangs as it is about to enter the file copy phase (with all drives fully on) and I have to cycle power, so I'm not able to apply the Support Pack. I can duplicate the failure with all server apps unloaded, all PCI slots powered off and drivers unloaded, and all volumes dismounted. I just replaced my Integrated SmartArray Controller with no change in symptoms. I'm also unable to update my SmartArray firmware from 1.42 to 1.50 (the latest rev) using the SmartStart (v6.30) CD. The routine says that the firmware update was completed, but the Results don't show the new firmware installed.