ML370G3 - crash

Daniel Moyson
Occasional Visitor

Hello,

One of our ML370G3 servers crashes about once a week. The crash always happens during intensive use (Oracle data warehouse updates). FYI: these updates occur 3 times a day.
The error message I receive from the HP agents is: "The device, \Device\Scsi\cpqcissm1, did not respond within the timeout period." See today's crash system log, attached.
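
One way to check whether the timeouts line up with the three daily Oracle update windows is to tally them by hour of day. The following is only a minimal sketch: it assumes the System event log has been exported to CSV with hypothetical column names (`date`, `time`, `source`, `message`), and the sample rows are made up for illustration, not taken from the attached log.

```python
import csv
import io
from collections import Counter

# Fragment of the error message quoted in the post above.
TIMEOUT_MARKER = "did not respond within the timeout period"


def timeouts_by_hour(csv_text, driver="cpqcissm"):
    """Count timeout events per hour of day for one driver.

    Assumes a hypothetical CSV export with 'time' formatted as HH:MM:SS
    and the full event text in a 'message' column.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        message = row["message"]
        if driver in message and TIMEOUT_MARKER in message:
            hour = int(row["time"].split(":")[0])
            counts[hour] += 1
    return counts


# Made-up sample rows (illustration only, not real log data).
SAMPLE = r"""date,time,source,message
2000-01-03,02:15:10,cpqcissm,"The device, \Device\Scsi\cpqcissm1, did not respond within the timeout period."
2000-01-03,09:00:00,Tcpip,"Some unrelated event."
2000-01-10,02:40:55,cpqcissm,"The device, \Device\Scsi\cpqcissm1, did not respond within the timeout period."
2000-01-12,14:05:31,cpqcissm,"The device, \Device\Scsi\cpqcissm1, did not respond within the timeout period."
"""

if __name__ == "__main__":
    for hour, n in sorted(timeouts_by_hour(SAMPLE).items()):
        print(f"{hour:02d}:00  {n} timeout event(s)")
```

If the counts cluster in the hours when the data warehouse updates run, that would support the link to heavy I/O load; it would not by itself say whether the driver, the controller or other hardware is at fault.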

I have installed the latest drivers from hp.com via the RDU, as well as the latest BIOS firmware.

I suppose the problem comes from a driver, but I can't work out which one. I already updated them all last week!

Summary of the HW components:
ML370G3 - 2 processors @ 3 GHz - 6400 controller with 3 HD (one RAID 5 volume) - AIT tape drive (on the internal SCSI controller)
Windows 2000 SP4 with the latest updates from Windows Update.

Thanks in advance for your advice.

Rgds,

Daniel Moyson
IT Mgr
8 REPLIES
Sunil Jerath
Honored Contributor

Re: ML370G3 - crash

Hello Daniel,
The cpqcissm driver belongs to the array controller. You have an SA6400 controller. If the controller firmware is not up to date, please update it. Also, please reapply the latest array controller driver (cpqcissm) locally rather than through the RDU. If this does not fix the problem, try moving the controller to a different slot to see if that helps. If the problem still persists, replace the controller. Most likely, though, this is due to the driver, i.e. cpqcissm.
Please let me know how you make out.

Regards,
Daniel Moyson
Occasional Visitor

Re: ML370G3 - crash

Hi,

I have checked the firmware on the controller and it is not the latest available. I'll perform a full backup tonight and update the firmware tomorrow.
Pls allow me a few days before I send feedback, as I usually have to wait 5-8 days before a crash occurs!

Thanks for your help

Rgds,

Daniel
Daniel Moyson
Occasional Visitor

Re: ML370G3 - crash

Hi,

Some (bad) news. I have updated the firmware of the controller and the situation got even worse: the server now crashes almost every day during heavy load (Oracle).
I have checked all HW components with the HP DOS diagnostics and they did not report any error.
HP will change the motherboard, as I also received another system error:
c000021a unknown hard error
NMI parity check
memory parity error

Thx for your help

Rgds,

Daniel
Sunil Jerath
Honored Contributor

Re: ML370G3 - crash

Hello Daniel,
Please have a look at the following article:

http://support.microsoft.com/default.aspx?scid=kb;en-us;330303

Regards,
Daniel Moyson
Occasional Visitor

Re: ML370G3 - crash

The server is already updated with the latest service pack (SP4).

Daniel
Sunil Jerath
Honored Contributor

Re: ML370G3 - crash

Hello Daniel,
I have checked with a few fellow engineers and everyone suggested contacting MS for a resolution. In the meantime, if I do come across something, I shall definitely let you know. Please post the solution if you do resolve it by calling MS.

Regards,
Daniel Moyson
Occasional Visitor

Re: ML370G3 - crash

Hi,

Thank you for the support.
May I ask why you believe this is a software issue rather than HW? The technical staff of our HW provider will replace the motherboard this week. And, if this does not solve the issue, they suggest replacing the memory and the RAID controller.
Anyway, how do I contact MS for this kind of issue, please?

Rgds,

Daniel

Re: ML370G3 - crash

Hey Daniel,

Was this ML370G3 set up with SmartStart?

For performance and stability reasons, both Oracle and Microsoft recommend that you put the database on a different RAID array from the operating system. You shouldn't mix the operating system and the Oracle database on the same spindles.

Normally, for my customers, I recommend putting W2k on 2 "small" disks in RAID 1, the Oracle datafiles on another RAID 1, and the Oracle archives and logs on a third RAID 1.

Since few of us can afford that, just 2 different arrays would already be nice. That way you wouldn't mix operating system and paging operations with database operations.
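
To make the layout point concrete, here is a minimal sketch that flags arrays shared by more than one role. The `Volume` class, the array labels and the role names are all hypothetical; the "current" layout assumes the OS and Oracle both live on the single RAID 5 volume mentioned in the first post, which the thread does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    role: str        # what lives on it, e.g. "os" or "oracle-data"
    array: str       # hypothetical label for the physical disk array
    raid_level: int


def shared_arrays(volumes):
    """Return {array: sorted roles} for every array hosting more than one role."""
    roles_by_array = {}
    for vol in volumes:
        roles_by_array.setdefault(vol.array, set()).add(vol.role)
    return {
        array: sorted(roles)
        for array, roles in roles_by_array.items()
        if len(roles) > 1
    }


# Assumed current layout: everything on the one RAID 5 volume.
current = [
    Volume("os", "A", 5),
    Volume("oracle-data", "A", 5),
    Volume("oracle-logs", "A", 5),
]

# Layout recommended above: three separate RAID 1 arrays.
recommended = [
    Volume("os", "A", 1),
    Volume("oracle-data", "B", 1),
    Volume("oracle-logs", "C", 1),
]

if __name__ == "__main__":
    print("current:", shared_arrays(current))
    print("recommended:", shared_arrays(recommended))
```

An empty result means no array is shared, which is the goal of the three-array layout; the two-array compromise would still report one shared array.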

I have seen your error in many scenarios: bad disks, faulty controllers, outdated drivers and firmware, bad cables, etc. Some cases also come simply from stress conditions.

Just adding a thought!