ProLiant Servers (ML,DL,SL)

The Creation of a Raid 1+0 break the GPT

 
NicoCaceres
New Member


Hello everyone, I'm having a small problem:

I have a 500 GB disk in the first bay of an HP ProLiant DL180 G6, containing a fully functional ESXi installation.

Then I inserted a blank disk (identical to the first one) in the second bay to create a RAID 1+0.

When the server restarts, it can't find any media to boot from.

Why does that happen?

 

To solve it, I have to boot from a USB disk with a Unix OS and rebuild the GPT. After that, I restart the server and the problem is solved.

 

I have to do this on 250 servers, so I don't want to repeat it 250 times.

 

Any ideas?

(Sorry for my English.)

Thank you!

2 REPLIES
xmate
Valued Contributor

Re: The Creation of a Raid 1+0 break the GPT

I guess you used the 500 GB disk as a JBOD (without previously configuring the RAID controller). When you create a RAID 1+0, you tell the controller to build a logical drive and present it to the operating system. The OS then sees a new "logical disk" provided by the controller instead of the physical disk it used before. That's why you need to rebuild the GPT.
Because rebuilding the GPT is a service task rather than common practice, I guess there is no ready-made automated solution. Still, if you want, you could try building a PXE boot image with your Unix OS and an autostart script that rebuilds the GPT.
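As a starting point for such an autostart script, here is a minimal Python sketch that only checks whether a disk's primary and backup GPT headers are still intact, so the repair step runs only where it is needed. It relies on the standard GPT layout (header in LBA 1, backup header in the last LBA, "EFI PART" signature, CRC32 over the header with the CRC field zeroed). The function names, the 512-byte sector size, and the `/dev/sda` path in the usage note are assumptions for illustration; the actual repair would still be done with whatever tool you already use from the USB disk.

```python
import os
import struct
import zlib

GPT_SIGNATURE = b"EFI PART"
MIN_HEADER_SIZE = 92  # smallest header size allowed by the GPT format


def header_is_valid(raw: bytes) -> bool:
    """Return True if raw starts with a GPT header whose signature and CRC32 match."""
    if len(raw) < MIN_HEADER_SIZE or raw[:8] != GPT_SIGNATURE:
        return False
    header_size = struct.unpack_from("<I", raw, 12)[0]
    if header_size < MIN_HEADER_SIZE or header_size > len(raw):
        return False
    stored_crc = struct.unpack_from("<I", raw, 16)[0]
    # The header CRC is calculated with the 4-byte CRC field set to zero.
    zeroed = raw[:16] + b"\x00\x00\x00\x00" + raw[20:header_size]
    return (zlib.crc32(zeroed) & 0xFFFFFFFF) == stored_crc


def check_gpt(path: str, sector_size: int = 512) -> dict:
    """Check the primary (LBA 1) and backup (last LBA) GPT headers of a disk or image."""
    with open(path, "rb") as disk:
        size = disk.seek(0, os.SEEK_END)
        if size < 3 * sector_size:
            raise ValueError("device is too small to hold a GPT")
        disk.seek(sector_size)            # LBA 1: primary header
        primary = disk.read(sector_size)
        disk.seek(size - sector_size)     # last LBA: backup header
        backup = disk.read(sector_size)
    return {
        "primary_ok": header_is_valid(primary),
        "backup_ok": header_is_valid(backup),
    }
```

On a booted rescue system this could be called as `check_gpt("/dev/sda")` (as root); a result with `primary_ok` or `backup_ok` set to `False` marks a server that needs the rebuild. It only reads from the disk and does not repair anything.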
Matti_Kurkela
Honored Contributor

Re: The Creation of a Raid 1+0 break the GPT

Are you using the software-based SmartArray B110i RAID controller, or have you installed a real hardware-based SmartArray RAID add-on controller (Exxx or Pxxx models)? This problem sounds like something you might see with the SmartArray B110i specifically.

 

The B110i is a software-based controller: the hardware is just a basic AHCI standard SATA/SAS controller that is built into most Intel chipsets today. All the RAID functionality comes from the server firmware (at boot time) or the driver (when the OS or VMware ESXi is running).

 

The following applies *only* to the B110i, not to the real hardware-based SmartArray models nor to the subsequent SmartArray B120:

 

If you use the RAID configuration utility built into the server firmware, it writes a SmartArray RAID metadata block to the disk and sets up the RAID array. When the OS boots up, the SmartArray B110i driver is supposed to read the metadata block and use the RAID set as configured.

 

But if the SmartArray B110i driver is not installed, the generic AHCI driver that is built into most operating systems will take over the controller and completely ignore the RAID metadata. It will present the disks as separate physical disks, not as a RAID set. It will also allow the SmartArray RAID metadata to be overwritten (by e.g. an OS installer that overwrites it with a GPT partition table).

 

Apparently, the firmware has a fail-safe mode for this situation: if the SmartArray RAID metadata block cannot be found, the firmware attempts to access the boot disk as basic, standalone, non-RAID disk. This might be why your ESXi installation currently works.

 

If you use the firmware-based RAID configuration utility again after installing the OS with a generic AHCI driver, it will again write the SmartArray metadata to the same disk location, corrupting your GPT partition table.

 

(With a traditional MBR partition table, a simple workaround would have been to leave a bit of unused space at the end of the disk, if the SmartArray RAID metadata is located at the end of the disk. But the GPT partition table includes one copy at the beginning of the disk and another at the end, so a part of GPT is going to be damaged, no matter where the RAID metadata is located.)
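To make the point above concrete, here is a rough Python sketch of the standard GPT layout for 512-byte sectors with the usual 32-sector partition entry array. Where the B110i really puts its metadata, and how large it is, is not known here, so the metadata ranges in the example (and the function names) are purely hypothetical.

```python
def gpt_regions(total_sectors: int, entry_sectors: int = 32) -> dict:
    """Return the (first LBA, last LBA) of each GPT structure on a disk."""
    last = total_sectors - 1
    return {
        "protective_mbr": (0, 0),
        "primary_header": (1, 1),
        "primary_entries": (2, 1 + entry_sectors),
        "backup_entries": (last - entry_sectors, last - 1),
        "backup_header": (last, last),
    }


def damaged_regions(total_sectors: int, meta_first: int, meta_last: int) -> list:
    """List the GPT structures that overlap a metadata block at the given LBAs."""
    return [
        name
        for name, (first, last) in gpt_regions(total_sectors).items()
        if first <= meta_last and meta_first <= last
    ]


disk = 1_000_000  # toy disk size in sectors, not a real 500 GB drive

# Hypothetical 64-sector metadata block at the end of the disk:
print(damaged_regions(disk, disk - 64, disk - 1))
# ['backup_entries', 'backup_header']

# The same hypothetical block at the start of the disk:
print(damaged_regions(disk, 0, 63))
# ['protective_mbr', 'primary_header', 'primary_entries']
```

Either placement hits one copy of the GPT, which is why reserving unused space at the end of the disk is not enough here.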

 

Looks like the SmartArray B110i is not supported in VMware ESXi, so there is no driver for it.

 

HP has not released the source code for the SmartArray B110i Linux driver, so only HP could make an ESXi driver for SmartArray B110i. So adding the B110i driver and using the B110i controller as HP intended it to be used is not an option for you.

 

I am not very familiar with the details of ESXi, but the only workaround I can suggest is that you *ignore* the firmware-based RAID configuration utility. Don't use it at all. If ESXi includes software RAID functionality, use that instead. If the GPT corruption happens as soon as you boot the system with the second disk added (i.e. not only when using the firmware-based RAID configuration utility), this obviously won't help you.

 

If your hardware vendor recommended a DL180 G6 without an add-on hardware-based SmartArray controller specifically for ESXi use, then the vendor has given you an unsupportable configuration. The vendor has clearly made a mistake, and you should contact him/her, explain the problem and see if s/he can find a satisfactory solution for you.

 

If you chose the hardware yourself, or did not tell the vendor that you were planning to use ESXi, then you obviously can only blame yourself for not checking the RAID controller compatibility with ESXi.

 

Some newer ProLiant models have SmartArray B120, which has at least one significant improvement for situations like this: it has an "off" switch for the firmware RAID functionality. In effect, you can tell the server firmware "no, I'm not using a SmartArray Bxxx software RAID driver; do not write any SmartArray RAID configuration metadata blocks anywhere!" This makes it safe to use it with the generic AHCI driver, if a specific SmartArray Bxxx driver is not available.

MK