ProLiant Servers (ML,DL,SL)

Re: ML350 G5 Disk Failure

 
fricci
Advisor

Re: ML350 G5 Disk Failure

Hi, Pac
thank you for your suggestions, but I never observed an unwanted reboot on that server.
Instead, I got corrupted files (and partitions) after a ***regular*** reboot (for example after applying some patches), as well as while using some software applications.
The message "1792-Drive Array Reports Valid Data Found in Array Accelerator" (almost) always appears after a ***regular*** reboot with driver 6.6.0.32 or older, and similar behaviour is observed with driver 6.8.0.32.
The only version that does not seem to be affected by this issue is 6.6.2.32.
I also got some file corruption after a simple reboot with the latest driver, 6.8.0.32.

Please note that just before clicking "reboot" I ran chkdsk on all volumes and everything was OK.
After rebooting, during POST, the message "1792-Drive Array Reports Valid Data Found in Array Accelerator" appeared. After login I ran chkdsk and found file corruption on a volume. Chkdsk was unable to fix the problem, even during the next restart, so I had to reformat the volume and restore it from a previous backup.
I have had 6-7 file/partition corruptions since the end of May.
On August 18 I had to completely restore the whole system......

I know that the PSU could be involved in this issue, but I think that checking it should be the job of the HP support technician.... I have no PSU to swap in, they have them! Cabling and the backplane are also potentially involved in this issue, but I have no backplane to swap in, because I don't work at HP....
I saw the output of HPSRPT; I called it a diagnostic utility because it collects logs that are useful for diagnostics.

This was the response of 2nd level support (translated from Italian):
------------------------------------------
To solve the data loss issue, please perform the following operations in the order shown
1) Install driver 6.8.0.32
2) Install KB932755, which solves many problems.
------------------------------------------

If you read KB932755 you will notice that it suggests installing driver 6.8.0.32 BEFORE installing KB932755, otherwise you could get a BSOD. And I can't find anything in KB932755 directly related to this specific issue. They didn't read KB932755 at all!
At the end of August I didn't install KB932755, because after installing driver 6.8.0.32 I got data corruption and had to downgrade to version 6.6.2.32; if I installed KB932755 on top of that older driver, I could get a crash.

After 3 months of data corruptions this is the only answer I got! An endless trial and error debugging procedure at the expense of the customer!

I think these people should not be in a technical role..... they are unable to work through a technical issue.

Regards.
fricci
Advisor

Re: ML350 G5 Disk Failure

Sorry, in my previous post I wrote the steps of the 2nd level support response in the wrong order. This is the correct response of 2nd level support:
------------------------------------------
To solve the data loss issue, please perform the following operations in the order shown
1) Install KB932755, which solves many problems.
2) Install driver 6.8.0.32
------------------------------------------

Regards.

Re: ML350 G5 Disk Failure

WOW!
I wish that I had found this thread a month ago. I've been investigating an unexpected reboot issue on numerous ML350 G5 servers. Forgive me for not reading this whole thread, but I believe I'm having the same issue. In several conversations with HP support, they kept telling me that it was a problem with one of my applications or with Microsoft. Only after I found KB932755 did the support person admit that it may be related to the HP E200i driver. After updating all HP drivers, firmware and BIOS, I installed Win2K3 SP2 and my problem seems to have gone away. It's only been 2 days, so I'm crossing my fingers.

Did this solve everyone else's issues?

Thanks
Kris
Jim Philipson
Occasional Advisor

Re: ML350 G5 Disk Failure

Sorry for no update in a while. Here is a brief recap.

Same error as the folks above. Called HP and they ultimately sent the correct part and replaced the system board/E200i. They diagnosed the controller as bad, but it is integrated, so the whole board had to go. I wanted to replace the SCSI backplane too, but that didn't fit the diagnosis.

System rebooted again with a different stop error; sorry, I don't have that with me. HP diagnosed it as a bad power backplane and replaced it.

System rebooted again - no stop error this time, just a complete mystery. Told HP the customer is unwilling to troubleshoot this any further on a new computer. Replace or return are the only options.

They now want to replace the SCSI backplane.

I have had four different cases open on this server and have zero confidence in the box at this point. It may run a week or longer before the problem comes up again. How can I put that into production? How long does it have to run before I can declare it stable? Why would the customer put up with this?

Every computer company makes a lemon or two. I think I've got one of HP's.

Good Luck.
Jason Riebe
New Member

Re: ML350 G5 Disk Failure

Hi all, I too have been troubled by a customer's ML350 Exchange server with the E200i controller after installing PSP 7.9, firmware 7.9 and Win2kSP2. I have just installed the KB932755 patch and all has been good for 24 hours, fingers crossed.
PinnacleCS
Occasional Advisor

Re: ML350 G5 Disk Failure

Were you having data corruption problems or blue screen problems? I am getting the 1792 errors on reboot, with no BSOD or data corruption yet, but the server isn't in production. I applied the KB AFTER the 7.9 support pack. I sure hope this is resolved!
francis obuon
New Member

Re: ML350 G5 Disk Failure

I have a similar problem with exactly the same server, an ML350 G5. It keeps reporting that both PSUs have failed or are unplugged and should be replaced. I've cleared the error logs and re-installed HP SIM, but to no avail. Any ideas?

Re: ML350 G5 Disk Failure


Does anyone have the definitive fix for this? I have a brand new ML350 G5 with the same issue - intermittent 1792 notification on boot. I am supposed to go into production with this box in the next week or so and, after reading this, am reluctant to do so. I have the latest firmware and drivers (7.91 series) installed and am running SBS 2003 R2 SP2. I don't believe I have the aforementioned MS KB installed but will look into it.

Thanks.

Scott
PinnacleCS
Occasional Advisor

Re: ML350 G5 Disk Failure

Have both PSUs replaced and install the hotfix from the KB article mentioned above; that will fix it. I also back-revved to support pack 7.8 just to be safe, because 7.9 had a lot of issues. I have another ML350 G5 coming in, so I'll let you know what I find there too.

Blake

Re: ML350 G5 Disk Failure


Thanks PinnacleCS. I checked both supplies against the Customer Advisory and both are good. I installed hotfix 932755 and will beat the snot out of the system and reboot 50 times before I go into production. The few reboots I've done since the hotfix have been clear of 1792s, although I have not done a lot of I/O on the system. Please do let me know how testing goes with your new G5 as well. Thanks again.

Scott