ProLiant Servers (ML,DL,SL)

Re: ML350 G5 Disk Failure

 
Antony Ryan
Occasional Advisor

Re: ML350 G5 Disk Failure

To all

Today we installed another P400 controller, ran our stress tests, and the server passed with flying colours! Yay...

Now all we need to do is convince HP that there is some fault with the E200i (be it drivers or firmware - I don't really care) on the ML350 G5, and get them to supply the part (P400 controller) for free (and compensate my colleagues and me for the many hours we have wasted!).

We also need to get another P400 for our other client that is experiencing the exact same issues on the exact same hardware.

Ant

Re: ML350 G5 Disk Failure

Ant,

Are you using the P400 w/bbc? And good luck trying to get HP to compensate you. ;)
fricci
Advisor

Re: ML350 G5 Disk Failure

Unfortunately, the reseller's technical support changed its mind.....
They told me (like HP technical support did) this is a software issue caused by an **improper configuration** of the server. They made this decision after our conversation, without ever having a look at the server!
Maybe working with HP products causes this kind of intellectual damage? ;-)
So, considering that the customer doesn't want to pay more than 500 Euro for a new controller, I think this could lead to legal action....

Ant,

please let us know any news about the P400 testing (with BBWC?).


Franco
Antony Ryan
Occasional Advisor

Re: ML350 G5 Disk Failure

Franco/Scott

No BBC on the P400, just the standard 256 MB cache. One thing I was reading about the E200i - it will support RAID 5 with 128 MB cache - both our servers had this upgrade in place, but still failed.

I will let you all know how we go with HP in regards to getting these parts for free - and also what sort of response we get from HP about the inability of the E200i to work properly.

Ant
PinnacleCS
Occasional Advisor

Re: ML350 G5 Disk Failure

I originally had an ML350 G5 with the E200i controller and experienced the same issues. I sent it back and replaced it with an ML370 with a P400 controller. I received the same error on the P400 w/BBWC installed, although not as often. I installed the 7.9 support pack and the MS Hot Fix for the STOR port driver and haven't seen it again. This server was an SBS 2003, SP2. I am 99% sure these issues started after applying SP2.

About a month ago we needed to purchase a replacement server for one of our applications so we knowingly purchased an ML350 w/E200 and BBWC. I did some testing on this server. It was a plain Jane Windows 2003 build. I intentionally did not put SP2 on it. I tested w/7.9 and after 20 reboots and gigs of data, I did not receive the error. I put SP2 on the server and after just two reboots, I got the error message. I was not able to duplicate it performing any particular action. I installed the STOR Port update from MS, another 20 reboots later and gigs of data and no error. The ML370 has been in production for 3 months and the ML350 for about a month. Knock on wood, we haven't had any issues with either. Fortunately for my customer, I caught this and questioned it before it went into production. I've built enough servers and been in this industry long enough to know that any Array Controller status message after the server is built is not a good thing, regardless of what Tech Support says.

IMHO, I believe SP2 is the culprit and the issue occurs during shutdown. I have not seen nor heard of anyone having the same issue with a Linux or Novell based host. I don't think the SCSI subsystem in Windows is working with the driver correctly. I have not been able to replicate this issue once I installed the updated STOR port driver from MS. I wish I had time to build a Novell or Linux box and test with that.

BTW, I second everyone's opinion on HP's support. The techs I dealt with didn't know the product and didn't seem to care, and the only thing most execs seem to see is green. Sorry......

Hope this helps.

Blake
crisscross
New Member

Re: ML350 G5 Disk Failure

Blake, you are 100% correct. HP is staffed from top to bottom by MORONS. the execs are morons because they forsake customer satisfaction for a quick buck. the techs are all morons because they just are. there is nothing worse than dealing with a tech who acts like he knows what he's doing when he doesn't. i'm on hold with them about this issue now. i demanded to speak to a second tier engineer and they refused. i stated that i knew in advance that the first guy couldn't help me yet i was forced to waste 15 minutes with him. he insisted that the 1792 and 1794 errors would go away if I installed firmware 1.66 for the e200i. i checked and told him it was already installed so he was WRONG. moreover, i don't care if the error reporting goes away. i only care if the problem is resolved. frankly, i'm shocked that hp is even allowing this thread to stay published due to all the negative statements (backed by proof) about them. this community should move to another forum so we can stay in touch in the event that hp pulls their usual tactics of obscuring their own failures. these american companies operate like third world dictatorships.

Re: ML350 G5 Disk Failure

Amazing to me is how this is fixed for some when all of the patches and hotfixes are installed, and for others it isn't. I tried rebuilding TWICE, and the best that I was able to do in both cases was to make the problem intermittent.

Scott
crisscross
New Member

Re: ML350 G5 Disk Failure

scott, fortunately, I haven't experienced any data loss or crashes yet, but there is no way i'm going into production after reading this thread and seeing the error messages myself.

i'm running firmware 1.66 on the e200i with driver 6.8.0.32. i manage about 30 of these ML350 G5's at various clients. they are all identical and were all purchased within 6 months of each other. almost all report 1794 errors which are preceded by 1792 errors - saying the battery charge is low. fortunately, i'm running sas drives, so i hope to not experience the failures of the entire sata raid array that were reported above when a single drive fails. i want to get these tech issues resolved, but I want to do something to send a loud message to hp corporate so it doesn't happen. you know that many of their components are IBM, right? it's just how they integrate that's different. god knows IBM has been sucking lately too.
Antony Ryan
Occasional Advisor

Re: ML350 G5 Disk Failure

HP are still faffing around with us. They provided us with a new firmware for the e200i that hasn't been published on the website yet. They assured us this would fix the problem - of course it didn't!

Now they have asked for one of my techs to go onsite again (so much time being wasted on this issue) so they can try playing around with the e200i settings, turn off write caching or something.

Will let you all know how it goes of course - needless to say (but I will say it anyway) I'm getting really p*ssed off with all this.
fricci
Advisor

Re: ML350 G5 Disk Failure

After six months, I can now confirm that DISABLING the ACCELERATOR is the ultimate workaround for all E200i+BBWC issues.
Since disabling it I have not seen a single "1792-Drive Array Reports Valid Data Found in Array Accelerator" message, nor any DATA CORRUPTION problem, with any driver version (tested versions: 6.6.2.32, 6.8.0.32, 6.10.0.32).
So, if you get problems with the E200i+BBWC controller, DISABLE the ACCELERATOR !!!
I hope this helps.

In the meantime I discovered another nasty issue on the ML350 G5, but I will post it in a separate thread........

Franco
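
For anyone who wants to check whether a change (firmware, driver, or disabling the accelerator) really makes the 1792/1794 messages stop, here is a minimal Python sketch that counts those codes in saved log lines. The log format, the sample lines, and the name count_post_codes are assumptions made up for illustration; nothing here comes from HP tooling, so adapt the pattern to however you capture your own POST or event log text.

```python
import re
from collections import Counter

# Matches messages that start with the two codes discussed in this thread,
# e.g. "1792-Drive Array Reports Valid Data Found in Array Accelerator".
# The "code followed by a hyphen" shape is an assumption about the log text.
POST_CODE = re.compile(r"\b(1792|1794)-")


def count_post_codes(lines):
    """Return a Counter of how often each watched code appears in the lines."""
    counts = Counter()
    for line in lines:
        match = POST_CODE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; real lines would come from your own saved log.
    sample = [
        "boot 1: 1792-Drive Array Reports Valid Data Found in Array Accelerator",
        "boot 2: clean",
        "boot 3: 1794-(message text not quoted in this thread)",
    ]
    print(dict(count_post_codes(sample)))
```

Running it over logs captured before and after a change gives a rough before/after comparison. It only tells you whether the messages still show up, not whether the underlying problem is fixed.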