
ProLiant DL380G6 - error 640006 and POST Error 1779

Jason_Z
Occasional Advisor

ProLiant DL380G6 - error 640006 and POST Error 1779

Hi there,

We have a ProLiant DL380G6. Logical drive 2 is a RAID 5 consisting of three hard drives (3, 4, and 5).

 

The HP Diagnose tool has indicated that drive 4 has failed (error 640006) and recommends replacement.

 

We hot-swapped drive 4 for a new hard drive, but logical drive 2 disappeared and POST Error 1779 (drive array controller detects replacement drives) came up.

 

 

We put back the original hard drive and re-enabled the array configuration, and the system is running fine now. However, error 640006 still comes up when we run HP Diagnose. I am a bit nervous about swapping the drive out in case the same thing happens again.

 

I am just wondering why this occurred. We have replaced hard drives before and have never lost the logical drive. Can anyone shed some light on this, please?

 

Cheers

Jason

 

10 REPLIES
PGTRI
Honored Contributor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

hi,

 

There are still errors on the drive:

 

 

Physical Drive 1I:1:4    Read Errors Hard                     0x0000000b

 

Thanks

 

regards,

Jason_Z
Occasional Advisor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

Hi there,

I want to swap out hard drive 4, but I am scared that the POST error will occur again and our logical drive will disappear again. Is there any explanation for why this happened?

 

I don't want to count on luck to recover data. I will try to back it up, though.


Cheers

PGTRI
Honored Contributor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

hi,

 

That's strange behaviour; it shouldn't happen. What drive are you using as the replacement?

 

Thanks

 

regards,

Jason_Z
Occasional Advisor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

Existing hard drives

 

[Attached screenshot: Capture.JPG]

 

waaronb
Respected Contributor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

It doesn't seem like the controller thinks drive 4 is in a pre-failure state.

 

I didn't look at the diagnostic report, but it could have been a temporary issue with the drive, showing a failure to read from it.

 

Normally if the controller detects that a drive is going to fail soon, it will turn on the amber LED for that drive and the controller will show a pre-failure warning instead of "OK" for the status.

 

Your other problem is that when you removed the drive you thought was bad, the array disappeared?  It shouldn't do that on a RAID 5.  It *should* have gone into recovery mode the moment that drive came out, and the IML should have logged an entry saying as much.

 

If none of that happened when you removed the drive, I'd double-check all of the configurations because either it's not really a RAID 5 or the controller might be having a problem.

 

I'll probably go back and look at your diag report now since it will show for sure what the RAID level is and the status of the drives, but if I were you I'd make sure your backups are ready just in case you need them.

waaronb
Respected Contributor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

One thing I notice from the report... your firmware is VERY old. 3.0 came out in March 2010 and the latest version is 6.40 so you should really consider looking at updating firmware on the controller, the system ROM, etc. The drives also need some firmware updates.

Drive 4 shows the last failure reason as "hot removed", which is probably from when you pulled it, so that's to be expected, but drives 3 and 5 show the same thing. Have you removed all 3 of those drives before for maintenance or anything? Seems strange to see all of them being hot removed.

Otherwise I didn't really see any problems with the array. Array A says RAID 1, Array B says RAID 5.

Update those firmwares (you'll need to reboot) and then see how things look. I didn't see any problems with drive 4 in particular... not sure why it would give a POST code.

Of course, removing and reinserting a drive will sometimes clear up some things, so maybe it "fixed itself". :)
Jason_Z
Occasional Advisor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

Hey, thanks for replying.


Yeah, the LED on the hard drive is showing that the drive is in a healthy state, but when I run the hardware diagnostics, this comes up telling me hard drive 4 needs to be replaced. I am getting mixed messages and am very confused!

 

We have replaced hard drive 3 before (before I started). I don't think we have replaced drive 5.

 

[Attached screenshot: Capture.JPG]

waaronb
Respected Contributor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

Hmm... the ADU report you attached earlier does show 0x3C (60) hard read errors on that drive since factory, and 0x1F (31) read errors ECC corrected.

It's reallocated 0x3C sectors which I assume correlate to the read errors.

Of those errors, 11 were since the last system reset (power cycle), if that helps you determine how frequently these happen.

I don't know how many spare sectors HP drives set aside, but once you've used them all to map out the bad stuff, the drive gets failed.

There's a parameter in the physical drive info called reserved blocks, and it's 65536 (32MB worth of space), but something tells me that's not how many spare sectors it has. That seems pretty high, so it must reserve some for other things.
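For anyone following the numbers, here is a quick sketch of the arithmetic behind the counters and the reserved-blocks figure quoted in this thread. The dictionary labels are my own wording, not field names from the ADU report, and the 512-byte block size is an assumption (it is what makes 65536 blocks come out to 32MB).

```python
# Counter values as quoted in this thread; the labels are illustrative only.
counters = {
    "hard read errors since factory": "0x3C",
    "read errors ECC corrected": "0x1F",
    "reallocated sectors": "0x3C",
    "hard read errors since last reset": "0x0000000b",
}

# Convert each hex string to a decimal count.
decoded = {name: int(value, 16) for name, value in counters.items()}
for name, value in decoded.items():
    print(f"{name}: {value}")

# ASSUMPTION: 512-byte blocks; the block size is not stated in the thread.
BLOCK_SIZE_BYTES = 512
reserved_blocks = 65536
reserved_mib = reserved_blocks * BLOCK_SIZE_BYTES / (1024 * 1024)
print(f"reserved area: {reserved_mib:.0f} MiB")
```

If the drive actually uses a different block size, only the last figure changes; the hex conversions stay the same.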

I'd try swapping the drive with its replacement again. Like I said, it *shouldn't* cause the array to disappear... it's a RAID 5, so it should go into recovery mode when the old drive is unplugged and it should start rebuilding onto the new drive from the data and parity on the remaining drives.
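To show why a RAID 5 should survive losing one drive, here is a minimal sketch of the XOR parity idea. It is only an illustration of the general technique, not how the Smart Array firmware is implemented: the block contents and the `xor_blocks` helper are made up, and a real RAID 5 rotates the parity block across the drives from stripe to stripe instead of keeping it on one drive.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across a three-drive RAID 5: two data blocks plus one parity block.
drive3 = b"DATA-ON-3"
drive4 = b"DATA-ON-4"
drive5_parity = xor_blocks(drive3, drive4)

# Drive 4 is pulled: its block is recomputed from the two surviving drives.
rebuilt_drive4 = xor_blocks(drive3, drive5_parity)
print(rebuilt_drive4 == drive4)
```

The same calculation is what lets the controller keep serving reads while a drive is missing and then rebuild onto the replacement, which is why the logical drive vanishing is unexpected.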

That's a lot of "should", and since it didn't work that way before, just make sure your data is backed up.

And also, do plan to update your firmware. I'd update your firmware first, because maybe the problem you had earlier was something that was fixed in one of the many firmware updates your P410 has missed over the past 4 years. :) Even the drives need updating. That failing drive, for instance, has firmware HPDD, but I think the latest for it is HPDJ.

The other model of 146GB drive you have has firmware HPDB but the latest is HPDD.

HP doesn't often update their drive firmwares so when they do, there's usually a very good reason for it, like preventing drive corruption, fixing performance issues, etc.

That's why they started (many years ago) adding a feature to the array controller updates to check your drives against a list of firmware versions and start alerting you during POST that certain drives need to be updated.

In fact, if you updated your P410 first and then booted, it would probably tell you that some of your drives need updating.
Jason_Z
Occasional Advisor

Re: ProLiant DL380G6 - error 640006 and POST Error 1779

The reinserted hard drive failed again, so we said screw it and inserted another hard drive today, and that fixed it.

 

Looks like the first time around the replacement may have been a sick hard drive, even though we thought it was new...