
HP ProLiant G7 :: "Array" Recovery

 
TopHatPrdxns115
Occasional Advisor

HP ProLiant G7 :: "Array" Recovery

I have a ProLiant DL580 G7 with all 8 front drive bays filled. The server's drives were configured so that each drive is its own logical volume, set up as a single-disk RAID 0 array (which provides no redundancy; each drive is simply treated as its own virtual array of one disk). I also noted each drive's position and physically labeled it, so that if I ever had to remove them all, I'd know where each one should go. The HDD in slot 1 is the one that I installed ESXi on. The others hold VMs and datastores. With this background information out of the way, here's the issue:

Last night, I cloned the HDD in slot 1 (ESXi) to a new, larger capacity SSD. In order to do this, I powered down the server and removed all other drives first. Only the HDD in slot 1 remained (ESXi). I then placed the new SSD in slot 3, and booted into Windows to clone the HDD over to the SSD (sector-by-sector). After the cloning operation finished, I powered down the server and moved the SSD from slot 3 to slot 1, thinking that it should work. I don't remember cloning being a destructive process. The server refused to boot from the SSD, and did not recognise it as a valid boot device. When I checked the Array Configuration tool, it saw no logical disks. 

After seeing this, I tried replacing the SSD with the original HDD - the server reacted to it the same way that it did the SSD. At this point, I began to wonder if it simply didn't like that I had changed up the drive positions. I re-inserted all drives, in their original positions and rebooted. The server detected the drives, but still did not see the logical volumes/arrays. It also refused to boot from both the original HDD and the cloned SSD. When booting into any other OS, the OS can't see them. When attempting to re-install/repair ESXi, the installer can't see any of the locally-attached drives either. 

I know the drives are in the correct order, because I labeled them. All of the supposed "arrays" that I made were a single drive each, and I find it hard to believe that all 8 of the disks, from at least 4 different manufacturers, could have died simultaneously. Is there any way to recover this? Could I possibly have the server boot from the disks without (re-)configuring them in the Array Configuration tool, and losing all of my data?

7 REPLIES

TopHatPrdxns115
Occasional Advisor

Re: HP ProLiant G7 :: "Array" Recovery

It's been a rough few days, but the issue got resolved, thankfully without wiping all of the work that started over a year ago. Here's how I got out of the woods:

I initially thought that I needed to run the Smart Array P410i in HBA/IT Mode. To attempt this, I grabbed a copy of an HPE SPP from 2017 and updated the server's firmware, including the P410i's, which brought the controller to v6.X. That version is new enough to run in IT Mode and pass through the actual drives without first turning them into arrays/volumes. However, after installing the firmware update and rebooting, the Smart Array controller refused to let me switch its mode because it reported an "unsupported configuration", which showed up as a Critical error. If the controller is already having issues with whatever is attached to it, it may not allow the user to switch modes at all. So, passing the drives through raw was not on the table.

However, I ran into this online, earlier today:

https://serverfault.com/a/840138

The proposed idea was simple: creating a single-drive array is non-destructive to data that already exists on the drive. I was not willing to test this on just any drive, however, so I tried the process on the cloned SSD first. The SSD is a clone of the server's original boot volume, so if it got nuked by what I was testing, I'd still have the original boot volume untouched.

I used the SPP to turn the cloned SSD back into a single-drive array, and set it as the boot volume. After that, I rebooted the server to see if it would boot into ESXi. It did, although all of my VMs and datastores showed as errored out. Since ESXi itself came up intact from the re-created array, this indicated that I could use the same process to safely restore the other drives in the front bays. I then used that same SPP to turn the 7 remaining drives into single-drive arrays. This was how I originally set up the server, at least a year ago.
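For anyone who would rather script this step than click through the SPP's graphical tool, here is a rough sketch of the equivalent command-line form. It only prints commands; it does not run them. It assumes HPE's `hpssacli` (or the newer `ssacli`) utility is available, that the embedded P410i sits at `slot=0`, and that the drive addresses follow the `1I:1:1` style seen in the ADU report. Every address other than `1I:1:1` and the helper name `build_create_ld_commands` are hypothetical. As in the post, it would be wise to try this on a clone first.

```python
# Sketch only: builds (does not execute) Smart Array CLI commands that
# re-create one single-drive RAID 0 logical drive per physical drive.
# Assumptions: hpssacli/ssacli is installed, the controller is in slot 0,
# and the drive addresses below are placeholders for the real ones.

def build_create_ld_commands(drive_ids, slot=0, cli="hpssacli"):
    """Return one 'create type=ld' command string per physical drive."""
    return [
        f"{cli} ctrl slot={slot} create type=ld drives={drive} raid=0"
        for drive in drive_ids
    ]


if __name__ == "__main__":
    # Hypothetical addresses; list the real ones first with:
    #   hpssacli ctrl slot=0 pd all show
    drives = ["1I:1:1", "1I:1:2"]
    for command in build_create_ld_commands(drives):
        print(command)
    # Afterwards, the resulting logical drives can be reviewed with:
    #   hpssacli ctrl slot=0 ld all show
```

Printing the commands rather than executing them keeps a human in the loop, which seems prudent when the controller already reports an unsupported configuration.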

After turning them all into individual arrays, I rebooted the server once more. The VMs and datastores were still errored out, meaning that the datastores needed to be re-mounted to recover the VMs. I logged in to the ESXi shell and used esxcli to persistently re-mount the datastores. After that, everything came back, allowing me to check the VMs.
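The exact commands are not given above, so the following is only a guess at what the re-mount step may have looked like. It assumes ESXi treated the datastores as unresolved VMFS "snapshot" volumes after the logical drives were re-created, in which case `esxcli storage vmfs snapshot list` shows them and `esxcli storage vmfs snapshot mount -l <label>` mounts one persistently (adding `--no-persist` makes the mount temporary). The datastore labels and the helper name `build_snapshot_mount_commands` are hypothetical.

```python
# Sketch only: builds (does not execute) esxcli commands for mounting
# unresolved VMFS volumes while keeping their existing signatures.
# Assumption: the datastores show up in 'esxcli storage vmfs snapshot list'.
import shlex


def build_snapshot_mount_commands(volume_labels, persistent=True):
    """Return one argv list per datastore label."""
    commands = []
    for label in volume_labels:
        argv = ["esxcli", "storage", "vmfs", "snapshot", "mount", "-l", label]
        if not persistent:
            # Without this flag the mount persists across reboots.
            argv.append("--no-persist")
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Hypothetical labels; take the real ones from the snapshot list output.
    labels = ["datastore1", "vm-storage-02"]
    print("esxcli storage vmfs snapshot list")
    for argv in build_snapshot_mount_commands(labels):
        print(shlex.join(argv))
```

Mounting with the existing signature, rather than resignaturing, should leave the VMs' registered paths valid, which would match everything coming back once the datastores were mounted.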

It was definitely not a good 4 days; I was sweating buckets. Given the way the controller acted after this incident, I'm wondering if it's starting to go bad. I can't really tell at this time, so I'll have to look into it more when I have the time. For now, the server project is back on track...

BPSingh
HPE Pro

Re: HP ProLiant G7 :: "Array" Recovery

Greetings!

 

Unfortunately, I am unable to access the ADU report at the web link you shared. That link is blocked for me.

 

Could you please share it via an alternate source?



I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
TopHatPrdxns115
Occasional Advisor

Re: HP ProLiant G7 :: "Array" Recovery

BPSingh
HPE Pro

Re: HP ProLiant G7 :: "Array" Recovery

Greetings!

I see that you have shared the ADU report captured when the issue occurred.

It contains the following warning message and shows no logical drives, which means the logical drive configuration was lost.

Warning:  The array controller has an unsupported configuration. You may reconfigure the controller, but the existing configuration and data will be overwritten and potentially lost.

There are no read/write errors seen on the drives, though.

I also noticed that the physical drive (400 GB SAS 512e SSD) 1I:1:1 is non-HPE. It's recommended to use HPE drives for better synchronization between the controller and the drive.

I understand that the issue started after the cloning process and that the logical drive information for all the drives was lost.

I am not sure how the controller lost the logical drive (LD) configuration while the cloning software was in use.


Regards,
Bhupendra 



TopHatPrdxns115
Occasional Advisor

Re: HP ProLiant G7 :: "Array" Recovery

Would it be possible to retrieve a log of some sort from the Storage Controller, to see what happened leading up to the incident?
BPSingh
HPE Pro

Re: HP ProLiant G7 :: "Array" Recovery

Greetings!

 

Unfortunately, there is no slot x report captured in the ADU logs, so they contain only limited information.


