
Proliant DL360 Gen9 2012r2 - Won't boot

 
kines454

I'm currently having issues with my DL360 Gen9: I'm getting a blue screen error, "Inaccessible Boot Device", shortly after the HP logo screen during boot. This is a production server running Windows Server 2012 R2, used for SQL, so it runs 24/7. No changes had been made, no updates, not even a restart, to explain why it was working one day and not the next. I run hourly backups during office hours on this server and everything was working fine, but I started getting backup failures at 8am one morning. We're using Arcserve, which started reporting that Microsoft VSS doesn't support drives bigger than 64TB. This server only has a 1.8TB drive.

My initial steps were to reinstall the backup agent and check the logs. I was able to get to the iLO web interface, but it was showing errors and I was unable to log in. I rebooted the server and received a "Smart Storage Battery 1 failure" error. It would continue on, get to the HP loading screen, and then loop. I ordered a genuine HP replacement battery and installed it, but the server still wouldn't boot. I would get errors during bootup: either a single iLO error, or multiple iLO errors and the failed battery error again. I decided to swap out the entire system board, and ensured it was a like-for-like replacement.

After the board was installed, it seemed to boot without any errors until the HP loading screen, where it showed a blue screen with "Inaccessible Boot Device", which it didn't before the board swap. I booted from Windows install media to access a command prompt and was able to confirm that the C drive and the D drive containing all the data were accessible. I tried a few commands I found online using bootrec, bcdedit, etc. All of them reported success when run, but it made no difference. Trying to boot into safe mode gives the same error too.

The server had been offline for a few days by now, so in an attempt to get it back online I restored from the last successful backup. I excluded the D drive from this and just restored C, the EFI partition, and OEM reserved. When it finished, it got to the HP loading screen and, instead of the blue screen, showed it was installing new devices. It then got into Windows like normal. I assumed at this point all was good. Over the next day of monitoring, I noticed that the backups were still failing with the same error. I restarted and got the blue screen error again.

I restored again, but tried a snapshot from a few days earlier. Once that finished, I booted into Windows and tried a restart straight away, and it was fine. Again I assumed the problem was fixed, but backups were still failing. I did quite a lot of digging and found quite a lot of devices in Device Manager showing as unknown with a yellow ! next to them. Looking at the information on them, they appeared to relate to the chipset. I ran the Service Pack for ProLiant for Gen9 servers, which scanned and found a lot of drivers to install. I agreed to them all; it ran and asked to be restarted. I clicked OK, and got the blue screen error again.

I ran another restore, but used the latest backup again. When it booted up, it didn't show any unknown devices. OK, give it a restart. Blue screen error.

I did a few more checks by booting from the install disc again and loading up cmd. I ran chkdsk on C:, which found no errors. I took a screenshot of diskpart's "list volume" output, restored the server, and loaded up diskpart in Windows to compare. When in Windows, diskpart shows C: as having a "boot" flag and the EFI partition as having a "system" flag. When in its failed state, C: has no flag, and EFI is "hidden".
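
For anyone wanting to make the same comparison repeatable, here is a minimal Python sketch that diffs the volume flags between the two states. It is only an illustration: the dictionaries are typed in by hand from the flags described above (they are not parsed diskpart output), and the names `working`, `failed` and `diff_volume_flags` are hypothetical.

```python
# Flags from diskpart's "Info" column as described in the post.
# These dictionaries are entered by hand; nothing here parses real diskpart output.
working = {"C:": "Boot", "EFI": "System"}   # restored server, diskpart run inside Windows
failed = {"C:": "", "EFI": "Hidden"}        # failed state, diskpart run from install media


def diff_volume_flags(before, after):
    """Return {volume: (before_flag, after_flag)} for volumes whose flag differs."""
    changes = {}
    for volume in sorted(set(before) | set(after)):
        old = before.get(volume, "")
        new = after.get(volume, "")
        if old != new:
            changes[volume] = (old, new)
    return changes


if __name__ == "__main__":
    for volume, (old, new) in diff_volume_flags(working, failed).items():
        print(f"{volume}: {old or 'none'} -> {new or 'none'}")
```

Keeping the flags in a plain mapping per state makes it easy to add further snapshots (for example, one per restore attempt) and diff any pair.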

A few more things I noticed: when opening msconfig, the Boot tab doesn't show any OSes. I ran sfc /scannow, which found no errors. When I run "bcdedit" in Windows, I get an error that the requested system device cannot be found. However, I am able to assign the EFI partition a letter in Disk Management and access it. I even ran chkdsk on it, which shows no errors.