
MSA2324i vDisk Rebuild Problem

 
SOLVED
InTech2000
Advisor

Hi,

We have an issue with our MSA2324i and need your help.

Initial Problem

  • It looks like we had a couple of disks with SMART errors, which put the vdisk into a degraded state.
  • The disks were marked as LEFTOVR
  • I cleared the metadata on the disks and assigned them as dedicated spares.
  • The vdisk started rebuilding - Great.

However, when the rebuild finished, the vdisk showed "Fault" with a red cross and the message "The virtual disk is offline. Disks may be down or missing. Data may be lost."

I tried clearing the metadata again and added the disks as spares. The disks are now showing as:

2 x AVAIL

2 x VDISKSP

The vDisk is a RAID 10 with 10 Disks allocated.

I have all green ticks showing on all the disks in the enclosure - but the rebuild will not start.

I tried reallocating them again, but the rebuild still didn't start.

What am I doing wrong? How do I start the rebuild process manually?

Many Thanks.


Re: MSA2324i vDisk Rebuild Problem

It seems the damage has already been done, and vdisk recovery does not look possible.

You shouldn't clear metadata on multiple drives when you have QTOF or offline vdisks. Always get help from storage experts before you try anything.

In a RAID 10 vdisk, it's possible that the two drives whose metadata you cleared were part of the same sub-vdisk. In that case the vdisk has lost that mirror set, which is why data recovery would not be possible.

You can still log a case with HPE support and try your luck.

 

Hope this helps!
Regards
Subhajit

I am an HPE employee

If you feel this was helpful please click the KUDOS! thumb below!

***********************************************************************



I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Thanks for replying.

If I add more disks, do you think it would start rebuilding then?

Also, how do I bring the volume back online manually?

Thanks.

Re: MSA2324i vDisk Rebuild Problem

No, that will not help.

Try to understand the logic: every drive has a section called metadata, which contains the vdisk configuration details. If you clear that metadata, the drive loses all of those details, so it cannot participate in RAID recovery.

For example, suppose a vdisk is created from 5 drives in RAID 5. This gives the equivalent of 4 data drives and 1 parity drive.

In this condition, if 1 data drive fails you can get the data back, because the other 3 data drives and the parity drive are used to rebuild the failed drive's data. But if two drives fail back to back, the data can't be recovered. That is called a RAID failure.

The same happened in your case. The only difference is that your vdisk is RAID 10, where a sub-vdisk RAID set failed.
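The RAID 5 example above can be sketched in a few lines of Python. This is only an illustration of XOR parity on a single stripe, assuming four made-up data blocks and one parity block; it is not how the MSA firmware lays out data, and the names `xor_blocks` and `rebuild` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four made-up data blocks plus one parity block, as in the 5-drive example.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]

def rebuild(stripe, lost):
    """Rebuild the one missing block from the surviving blocks.

    stripe: list of 5 blocks (4 data + parity); lost: set of lost indexes.
    Returns None unless exactly one block is lost.
    """
    if len(lost) != 1:
        return None  # two or more losses: XOR parity cannot recover them
    survivors = [blk for i, blk in enumerate(stripe) if i not in lost]
    return xor_blocks(survivors)

print(rebuild(stripe, {1}))     # b'BBBB' (one drive lost, data rebuilt)
print(rebuild(stripe, {1, 3}))  # None (two drives lost, RAID failure)
```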

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

The drive was in Leftover state. The only way to rebuild was to clear the metadata. This did work and it started rebuilding.

After the rebuild, it showed the vdisk as offline.

The config at the moment shows :

Raid 1-1  - Disks 1,2

Raid 1-2  - Disks 1,2

Raid 1-3 - Disks 1,2

Raid 1-4 - Disks 1,2

Spare - 1,2,3

 

There are no errors on the disks, so I am not sure why it's not rebuilding, as there are spares available.

Is there a manual way to start the rebuild?

Re: MSA2324i vDisk Rebuild Problem

Thanks for sharing the details, but you have still not grasped the RAID logic.

Your vdisk was created with 10 drives in RAID 10, which means you have 5 RAID 1 sub-vdisks striped together in RAID 0 to build the complete RAID 10 vdisk.

As you can see, only 4 of the 5 sub-vdisks are available. The 5th sub-vdisk is completely missing because its two drives lost their metadata, which means one RAID 1 sub-vdisk is gone.

That's why the complete vdisk shows as offline.
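As a rough sketch of that logic: a RAID 10 vdisk survives as long as every mirror pair keeps at least one good member, and goes offline as soon as one pair loses both. The layout and the function name `raid10_state` below are hypothetical, and the returned labels are plain words, not MSA status codes.

```python
def raid10_state(mirror_sets, failed):
    """Classify a RAID 10 vdisk from its mirror pairs and failed drives.

    mirror_sets: list of (drive_a, drive_b) tuples, one per RAID 1 sub-vdisk.
    failed: set of drives that are down or no longer carry vdisk metadata.
    """
    lost_per_set = [sum(d in failed for d in pair) for pair in mirror_sets]
    if any(n == len(pair) for n, pair in zip(lost_per_set, mirror_sets)):
        return "offline"   # a whole mirror is gone, so the stripe is broken
    if any(lost_per_set):
        return "degraded"  # every mirror still has at least one good member
    return "online"

# Hypothetical 10-drive layout: five mirrored pairs striped together.
sets = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
print(raid10_state(sets, set()))    # online
print(raid10_state(sets, {2, 9}))   # degraded
print(raid10_state(sets, {9, 10}))  # offline
```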

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Apologies, I gave you the wrong information.

The original config had 8 disks:

Raid 1-1  - Disks 1,2

Raid 1-2  - Disks 1,2

Raid 1-3 - Disks 1,2

Raid 1-4 - Disks 1,2

---------------

After the failure it's now showing:

9 Disks (not sure why!)

Raid 1-1  - Disks 1,2

Raid 1-2  - Disks 1,2

Raid 1-3 - Disks 1,2

Raid 1-4 - Disks 1

Spare - Disk 1

 

I was hoping it would rebuild from the dedicated spare. There are no quarantined disks.

Is there a manual way to do this?

Re: MSA2324i vDisk Rebuild Problem

Can you please share the output of the command below?

show vdisks <Vdisk_name>

where Vdisk_name is the vdisk that you are trying to recover.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

# show vdisks vDisk1_R10
Name        Size      Free  Own  Pref  RAID    Disks  Spr  Chk  Status  Jobs
  Serial Number
---------------------------------------------------------------------------
vDisk1_R10  2398.0GB  0B    A    A     RAID10  8      1    64k  OFFL
  00c0ff1162050000c9371b4e00000000
---------------------------------------------------------------------------
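If you want to check this output from a script, a minimal Python sketch could look like the one below. It assumes the two-line record layout shown above (a row of fields, then the serial number on its own line), that vdisk names contain no spaces, and that an empty Jobs column is simply missing from the row. The function name `parse_show_vdisks` is hypothetical and not part of any HPE tool.

```python
SAMPLE = """\
Name Size Free Own Pref RAID Disks Spr Chk Status Jobs
Serial Number
---------------------------------------------------------------------------
vDisk1_R10 2398.0GB 0B A A RAID10 8 1 64k OFFL
00c0ff1162050000c9371b4e00000000
---------------------------------------------------------------------------
"""

def parse_show_vdisks(text):
    """Turn the pasted 'show vdisks' text into a list of dicts."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop blank lines and the dashed separator lines.
    lines = [ln for ln in lines if ln and not set(ln) <= {"-"}]
    header = lines[0].split()  # column names from the first header line
    rows = []
    # Records start after the two header lines and take two lines each.
    for i in range(2, len(lines) - 1, 2):
        record = dict(zip(header, lines[i].split()))  # Jobs absent if empty
        record["Serial Number"] = lines[i + 1]
        rows.append(record)
    return rows

for vdisk in parse_show_vdisks(SAMPLE):
    print(vdisk["Name"], vdisk["Status"], "disks:", vdisk["Disks"], "spares:", vdisk["Spr"])
```

For the output in this post it reports vDisk1_R10 as OFFL with 8 disks and 1 spare.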

Re: MSA2324i vDisk Rebuild Problem

I would suggest logging a call with HPE support, because without checking the MSA logs it will be difficult to work on this further.

As the vdisk is in an offline state, we may have to execute the commands below, but they are not safe to run without checking the logs first; otherwise you risk data loss:

trust enable

trust vdisk vDisk1_R10

The vdisk rebuild won't start until the vdisk is back in a degraded state. A vdisk in QTOF or offline state won't start a rebuild even though you have a spare configured.
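The rule in the last paragraph can be written down as a small check. This is only a model of the statement above, using the two status codes quoted in this thread (OFFL and QTOF); it is not the controller's real decision logic, and `rebuild_blocker` is a hypothetical name.

```python
BLOCKING_STATES = {"OFFL", "QTOF"}  # status codes quoted in this thread

def rebuild_blocker(status, spare_count):
    """Return the reason a rebuild cannot start, or None if nothing blocks it."""
    if status in BLOCKING_STATES:
        return f"vdisk is {status}; it must be back in a degraded state first"
    if spare_count == 0:
        return "no spare is available"
    return None

# vDisk1_R10 as posted: Status OFFL with 1 spare configured.
print(rebuild_blocker("OFFL", 1))
```

So a configured spare alone is not enough; the vdisk status is checked first.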

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Thanks for your assistance.

I think I have at least 3 faulty disks, as the same error is shown with different serial numbers:

There is a problem with a FRU. (FRU type: disk, enclosure: 1, device ID: 3, vendor: HP , product ID: EG0600FBVFP , SN: KWxxxxx, version: HPDC, related event serial number: A66399, related event code: 55)

I think this error is generated when the MSA does the disk scrub. However, the disk still shows a green tick in the enclosure view.

I have ordered 5 brand new disks to be added tomorrow to the enclosure.

Do you think it would be best to add the new disks, configure them as dedicated spares, and then use the trust command?

Can I send you the logs in a private message to have a look?

Thanks

 

 

Re: MSA2324i vDisk Rebuild Problem

Please give me the details below:

1> How many drives in total were present in your system before you tried the metadata clear?

2> How many drives in total were configured as spares before you tried the metadata clear?

3> Of course multiple drives are having issues, otherwise the vdisk couldn't go into an offline state.

4> What are the locations of the drives whose metadata you cleared?

5> Do you have the sub-vdisk details, meaning which drive is paired with which drive to make one RAID set? I need those details for all 4 RAID sets.

You can send me the logs in private as well. I can give it a try.

If you have valid support with HPE, then it's always good to log a support case officially.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Answers below :

1) 15 drives present in enclosure

2) Zero drives Configured as spare

3) .

4) Slots 4, 7, 9 and 13

5) 

Disk1-1 Raid1
Disk1-2 Raid1
Disk1-3 Raid2
Disk1-5 Raid2
Disk1-8 Raid3
Disk1-10 Raid3
Disk1-11 Raid4
Disk1-13 Spare

I don't know how to attach logs; I can't see any upload button in the DM.

Re: MSA2324i vDisk Rebuild Problem

So this is how I understand the RAID sets right now:

RAID set 1-> 1,2

RAID set 2-> 3,5

RAID set 3-> 8,10

RAID set 4 -> 11, ?

So you are saying that 4, 7 and 9 were not part of any of the above RAID sets, and only 13 was part of them. That isn't possible, because a drive can only go into the LEFTOVER state if it was part of some vdisk and the controllers marked it as LEFTOVER for some reason.

Do you remember how the RAID sets looked before the drives went to the LEFTOVER state?

Clearing metadata on 4 drives is always risky for data recovery.

You would be extremely lucky if no 2 drives out of 4, 7, 9 and 13 were part of the same RAID set. In that case we have a good chance of recovering this vdisk.

Without logs it's difficult, but it may help if you can tell me the condition of the RAID sets when everything was fine, and then which drive joined which RAID set after you cleared the metadata. For example, let's say that in RAID set 2 drives 3 and 4 were originally paired, but later the drive at location 4 went to the LEFTOVER state. Then you configured the drive at location 5 as a spare and it joined drive 3 as its partner. The same applies to the other sub-vdisks.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Thanks for taking the time to look at this, I really appreciate it.

Luckily I took a screenshot when everything was OK, as follows:

vDisk1

RAID set 1-> 1,2

RAID set 2-> 3,5

RAID set 3-> 8,10

RAID set 4 -> 11, 7

vDisk2

RAID set 1-> 6,14

RAID set 2-> 12,15

Disk 13 must have been a global spare at that particular time.

Not sure about disks 4 and 9, though.

Sorry, I should have mentioned that I have a 2nd vDisk volume.

Does that help?
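The earlier question, whether any two of the cleared drives shared a RAID set, can be checked against this layout with a short Python sketch. It assumes the screenshot layout above is the one that applied when the drives dropped out, which the thread has not confirmed; `cleared_members` is a hypothetical helper name.

```python
# vDisk1 mirror pairs from the screenshot taken while everything was healthy.
VDISK1_SETS = {1: (1, 2), 2: (3, 5), 3: (8, 10), 4: (11, 7)}
CLEARED = {4, 7, 9, 13}  # slots whose metadata was cleared

def cleared_members(raid_sets, cleared):
    """Map each RAID set to its members whose metadata was cleared."""
    hits = {}
    for set_id, members in raid_sets.items():
        affected = sorted(d for d in members if d in cleared)
        if affected:
            hits[set_id] = affected
    return hits

hits = cleared_members(VDISK1_SETS, CLEARED)
fully_lost = [s for s, m in hits.items() if len(m) == len(VDISK1_SETS[s])]
print(hits)        # {4: [7]}
print(fully_lost)  # []
```

Under that assumption only RAID set 4 lost a member (disk 7), and no set lost both of its drives.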

Solution

Re: MSA2324i vDisk Rebuild Problem

So it's possible that 4, 9 and 13 went to the LEFTOVER state long ago.

7 went to the LEFTOVER state recently.

Going by the information you have provided, you can try the commands below, and 13 should participate in rebuilding vDisk1:

trust enable

trust vdisk vDisk1_R10

show vdisks vDisk1_R10

 

Note: You can still log an HPE support call and get official help before you try the above.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Unfortunately we only have hardware support on this MSA, so I am unable to log a call with HP.

I am waiting for the 5 brand new disks to be installed before I use the trust command.

I think it is best to set disk 13 back to "avail" and configure the new disks as dedicated spares, so that the volume rebuilds?

What do you think?

Re: MSA2324i vDisk Rebuild Problem

I don't know whether drive 13 has medium errors or any other kind of hardware error. If it does, you should wait for the new drives. Otherwise you can use any of 4, 7, 9 and 13 for the vDisk1 data recovery, provided none of them has any hardware errors.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Thanks,

Will let you know how it goes.

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

We added the 5 disks on Friday in bays 16-20, and we had some issues with the controllers going offline.

We managed to restart the controller and configure 4 of the 5 as dedicated spares.

We ran the "trust" command; the volume came back online, did a reconstruct momentarily, and then went offline again. It now showed RAID set 4 -> 16 only; previously it was disk 11.

We ran the "trust" command again; the volume came back online and did a media scrub.

It all shows online now, configured as follows:

vDisk1

RAID set 1-> 1,2

RAID set 2-> 3,5

RAID set 3-> 8,10

RAID set 4 -> 11, 16

vDisk2

RAID set 1-> 6,14

RAID set 2-> 12,15

According to the previous logs, there have been issues with disks 8 and 11, but at the moment everything shows green ticks, the vdisk is online and the health status is "ok".

What are the best next steps? I am worried disks 8 and 11 may fail together. Are there any commands I can run to check the disks?

Also, is it better to configure a dedicated or a dynamic spare?

Thanks.

 

Re: MSA2324i vDisk Rebuild Problem

Nice to hear that we were able to recover the vdisk. It was not easy without checking the logs, but thanks for your co-operation in making this possible.

Regarding drive health, the only option is to check the logs and see whether any hardware errors exist.

Even if 8 and 11 went down together, your vdisk would still be up and the data would remain accessible, because they belong to different RAID sets.

The real challenge comes when both drives in the same sub-vdisk or RAID set fail.

Please mark the thread as resolved if you have no more outstanding queries on this issue.

This will help everyone who is following your thread.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Thank you for your help; without it I would have been completely stuck.

I ran a vdisk verify overnight and, as suspected, it has shown errors on disks 8 and 11:

An error was detected by a disk drive. (disk: channel: 0, ID: 7, SN: 6XR1xxxxxxxxxxxxx, enclosure: 1, slot: 8)(Key,Code,Qual:0x1,0x17,0x1)(CDB:Rd 03bf6280 0040)(Info:0x03BF629A)(CmdSpc:0x11201100, FRU:0x0, SnsKeySpc:0x800001)(Recovered Error, recovered data with retries)

An error was detected by a disk drive. (disk: channel: 0, ID: 10, SN: 6XR1MAKxxxxxxxx, enclosure: 1, slot: 11)(Key,Code,Qual:0x1,0x17,0x1)(CDB:Rd 00239e80 0040)(Info:0x00239EBC)(CmdSpc:0x11201100, FRU:0x0, SnsKeySpc:0x800001)(Recovered Error, recovered data with retries)
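To pull the slot number and the SCSI sense fields out of events like these, something along the following lines could be used. It is a sketch that assumes the event text keeps the exact `slot: N)(Key,Code,Qual:...)` shape shown above; the regular expression and the name `parse_disk_event` are my own, and the lookup table covers only the sense key seen in this thread (0x1, Recovered Error).

```python
import re

SENSE_KEYS = {0x1: "Recovered Error"}  # only the key seen in this thread

EVENT_RE = re.compile(
    r"slot: (?P<slot>\d+)\)"
    r"\(Key,Code,Qual:(?P<key>0x[0-9a-fA-F]+),"
    r"(?P<asc>0x[0-9a-fA-F]+),(?P<ascq>0x[0-9a-fA-F]+)\)"
)

def parse_disk_event(message):
    """Extract the slot and sense key / code / qualifier from one event line."""
    m = EVENT_RE.search(message)
    if m is None:
        return None
    key = int(m["key"], 16)
    return {
        "slot": int(m["slot"]),
        "sense_key": key,
        "asc": int(m["asc"], 16),    # additional sense code
        "ascq": int(m["ascq"], 16),  # additional sense code qualifier
        "meaning": SENSE_KEYS.get(key, "unknown"),
    }

event = (
    "An error was detected by a disk drive. (disk: channel: 0, ID: 7, "
    "SN: 6XR1xxxxxxxxxxxxx, enclosure: 1, slot: 8)(Key,Code,Qual:0x1,0x17,0x1)"
    "(CDB:Rd 03bf6280 0040)"
)
print(parse_disk_event(event))
```

Both events above carry key 0x1 with code 0x17 and qualifier 0x1, which the log itself describes as recovered data with retries, so the reads succeeded after retrying.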

The RAID is still in a healthy state. What's the best way to replace the drives?

  1. Pull them out one at a time and rebuild with the dedicated spare?
  2. Wait until a drive fails and the RAID is in a degraded state, then rebuild with a spare? I think this is more risky, in case more than 2 drives fail again.

Thanks

Re: MSA2324i vDisk Rebuild Problem

Yes, you can go for proactive replacement instead of waiting for those drives to fail, but I would suggest replacing 1 drive at a time.

Please mark the thread as resolved if you have no more outstanding queries on this issue.

This will help everyone who is following your thread.

 

Hope this helps!
Regards
Subhajit

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Thank you so much for your help.

Could not have done this without you.

Many Thanks.

InTech2000
Advisor

Re: MSA2324i vDisk Rebuild Problem

Hi,

Just to update this thread, it looks like the original disk 8 (RAID set 3) and disk 11 (RAID set 4) have now gone into the LEFTOVR state.

Current config :

vDisk1

RAID set 1-> 1,2

RAID set 2-> 3,5

RAID set 3-> 10,18

RAID set 4 -> 16,17

vDisk2

RAID set 1-> 6,14

RAID set 2-> 12,15

 

Global Spare 

Disk 19

Disk 20

 

Disks which have currently been pulled from the enclosure :

4, 7, 9, 11, 13, 21, 22, 23, 24

 

Both vdisks are showing healthy and the status is FTOL.

Is it safe to pull disks 8 and 10 out of the enclosure?

Thanks.
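Before pulling anything, it may be worth checking each slot against the current config posted above. The sketch below does only that lookup; it is based purely on the layout in this post, so please verify it against the live array first. `role_of` is a hypothetical helper name.

```python
# Current layout as posted in this update.
CURRENT_SETS = {
    "vDisk1": [(1, 2), (3, 5), (10, 18), (16, 17)],
    "vDisk2": [(6, 14), (12, 15)],
}
GLOBAL_SPARES = {19, 20}

def role_of(disk):
    """Report what a slot is currently used for, according to the posted config."""
    for vdisk, sets in CURRENT_SETS.items():
        for index, pair in enumerate(sets, start=1):
            if disk in pair:
                return f"active member of {vdisk} RAID set {index}"
    if disk in GLOBAL_SPARES:
        return "global spare"
    return "not in any vdisk or spare list"

for disk in (8, 10):
    print(disk, "->", role_of(disk))
# 8 -> not in any vdisk or spare list
# 10 -> active member of vDisk1 RAID set 3
```

According to the posted layout, disk 10 is still an active member of RAID set 3, while disk 8 is not part of any set.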