Lekhpal
Advisor

Clarifications, Steps , Procedure and Queries - MSA-2040 VDRAIN and Re-Creation(MSA-2040)

Dear All,
We are looking for clarification on the VDRAIN process and the disk-group re-creation that follows it. We have two pools, A and B.
Pool B has two VDGs and three volumes in total. We intend to delete two of those volumes, then delete one VDG and expect its data to move to the other VDG.
We want to formulate the exact steps, which is the purpose of this discussion.

Will these be the steps?
1. Stop I/O on the volumes to be deleted at the host end. If possible, remove the volumes or their mappings on the host side.
2. Delete the two volumes on the MSA.
delete volumes VOL01,VOL02
3. Delete the VDG.
delete volume-groups VDG1 <= We have two VDGs, so which one should be deleted here? Pool B has three volumes in total and we want to delete two of them. Can we name either VDG here, given that we don't know which volume is in which VDG?
4. Does deleting the VDG automatically kick off the data draining, or is there a command to do that?
5. What needs to be done if the process does not start automatically? The document says: Any data in that DG need to be drained to the remaining disk groups. Does this statement in the official doc mean that draining is a manual command? It does not say how to do it, or whether the process is automatic, so the doc seems incomplete.
6. After VDRAIN completes, how do we re-create the new VDG?
Does the new VDG need to be created with one spare drive, or do we need to create a new global hot spare? Note: the Dynamic Spare Configuration option is 'Disabled' as per our checks. show vdisks indicates that the pools are RAID6, so do we need a global spare? Where do we select the RAID type: at pool level or at VDG level?
Also, before creating the new VDG, do we have to clear the disk metadata, since the disks were used earlier? I have also read that some drives can still hold the metadata of the original disk-group. So, in step 3 above, do we first need to run: remove disk-groups VDG1, to make the drives AVAIL? And how do we make sure, in the steps above, that the VDG never goes into the QTOF state? I have read that even this command sometimes does not work: dequarantine disk-group VDG1, and that contacting Support is then the only option.

Please advise, as basically we don't at all want to end up in any of the above situations.

Best

JonPaul
HPE Pro

@Lekhpal 
First, VDRAIN can take a long time to complete. Are you prepared for that? What are your expectations for the completion timeframe?
What is the current configuration and what are you attempting to accomplish with the Disk-Group re-create?

When you delete the volume, you will delete the data in the volume.  Is the data in the volumes backed up or not necessary?

The remaining volume will need to be small enough to fit on the remaining Disk Group.  Are your 2 disk groups the same size?  Are your 2 disk-groups the same type (Performance, Standard or Archive)?

Once the disk-group is removed the disk-group will start the VDRAIN process.  The disk-group will need to remain in the system until the data drains from the deleted disk-group to the remaining disk-group.  THIS WILL TAKE A LONG TIME, depending on a bunch of factors.  Possibly days.

After VDRAIN the disks will be AVAIL and can be configured as a new disk-group in either pool or, since this is a 2040, as a linear disk-group.

RAID is set at disk-group creation time. A spare drive is recommended, but that depends on your risk tolerance, how quickly you can service the system and, since this system is End Of Life, the availability of replacement drives (that last one may be your key for determining your risk level).

Deleting the Volumes will allow the Pool to 'forget' about the pages of data stored for those volumes.  This effectively frees space in the pool.

Disk-groups go quarantined for a number of reasons, among them because too many drives are lost in a RAID set (2 drives in RAID 5, 3 drives in RAID 6). When a disk-group is quarantined, support should review the logs to determine which drives should be kept and whether some drives should be removed before attempting recovery. This is why it often takes HPE support to de-quarantine a disk-group.
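
As a rough illustration of the drive-loss rule above, here is a minimal Python sketch. It is not an MSA tool; the function name and the state labels are made up for this example and are not the array's own status codes. It only encodes the general rule that RAID 5 survives one failed drive and RAID 6 survives two.

```python
# Illustrative only: how many failed drives each RAID level survives.
FAILURES_TOLERATED = {"RAID5": 1, "RAID6": 2}


def disk_group_state(raid_level: str, failed_drives: int) -> str:
    """Return a descriptive label (hypothetical, not an MSA status code)."""
    if failed_drives < 0:
        raise ValueError("failed_drives cannot be negative")
    tolerated = FAILURES_TOLERATED[raid_level]
    if failed_drives == 0:
        return "fault tolerant"
    if failed_drives <= tolerated:
        return "degraded"
    # More drives lost than the RAID level can absorb.
    return "quarantine risk"


if __name__ == "__main__":
    # A RAID 6 disk-group with one failed drive still has one drive of tolerance left.
    print(disk_group_state("RAID6", 1))
    print(disk_group_state("RAID6", 3))
```
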



I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
Lekhpal
Advisor

Hi Jon. Thanks for the reply and the insights. Yes, we are prepared to wait until the VDRAIN process completes. Time is not an issue, and we have learnt that it also depends on how much space is actually in use.
We have a virtual pool configuration. Our DGs are RAID6. One disk has failed in Pool B, and a replacement disk is not available.
Pool A has one DG. Pool B has two DGs. The failed drive is in one of these two DGs. It has been identified that if we delete two volumes, the remaining data will fit in the remaining DG. Then we remove the DG, which starts the VDRAIN process. After VDRAIN, we re-create the DG with one less drive. That is why we were looking for the right procedure.  Best

Lekhpal
Advisor

@JonPaul 

Hi Jon.

As you asked for the configuration, it looks like this:

volumes:
Pool  Name    Total Size  Alloc Size  Class    Type  Health  Reason  Action
---------------------------------------------------------------------------
B     VOL01   34.0TB      30.4TB      Virtual  base  OK
B     VOL03   38.9TB      38.9TB      Virtual  base  OK
B     VOLO4   39.9TB      39.9TB      Virtual  base  OK
A     VOL05   54.9TB      54.9TB      Virtual  base  OK
A     STORE1  2199.9GB    1797.0GB    Virtual  base  OK
A     STORE2  2199.9GB    1274.2GB    Virtual  base  OK

volume groups:
Group Name Serial Number Type Number of Members
----------------------------------------------------------------------
LNVR 000000000000000000000000000000000 Volume 2

Pool  Name    Total Size  Alloc Size  Class    Type  Health  Reason  Action
---------------------------------------------------------------------------
B     VOL01   34.0TB      30.4TB      Virtual  base  OK
B     VOL03   38.9TB      38.9TB      Virtual  base  OK

vdisks:
Name     Size    Free      Own  Pref  RAID   Class    Disks  Spr  Chk  Status  Jobs  Job%  Serial Number                      Spin Down  SD Delay  Sec Fmt  Health
------------------------------------------------------------------------------------------------------------------------------------------------------------------
ISPOOL1  59.9TB  1879.2GB  A    A     RAID6  Virtual  12     0    64k  FTOL                000000000000000000000000000000000  Disabled   0         Mixed    OK
ISPOOL2  59.9TB  5255.9GB  B    B     RAID6  Virtual  12     0    64k  FTDN                000000000000000000000000000000000  Disabled   0         512e     Degraded
dgB01    59.9TB  5255.9GB  B    B     RAID6  Virtual  12     0    64k  FTOL                000000000000000000000000000000000  Disabled   0         512e     OK

One disk in the RAID-6 disk group failed. Reconstruction cannot start because there is no spare disk available of the proper type and size.
- Replace the disk with one of the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or greater capacity. For continued optimum I/O performance, the replacement disk should have performance that is the same as or better than the one it is replacing. Configure the new disk as a spare so the system can start reconstructing the vdisk.
- To prevent this problem in the future, configure one or more additional disks as spare disks.

Pools:
Name  Serial Number                      Class    Total Size  Avail     Snap Size  OverCommit  Disk Groups  Volumes  Low Thresh  Mid Thresh  High Thresh  Sec Fmt
--------------------------------------------------------------------------------------------------------------------------------------------------------------
A     000000000000000000000000000000000  Virtual  59.9TB      1879.2GB  0B         Enabled     1            3        25.00 %     50.00 %     99.64 %      Mixed
B     000000000000000000000000000000000  Virtual  119.8TB     10.5TB    0B         Enabled     2            3        25.00 %     50.00 %     99.82 %      512e

disk-groups:
ISPOOL1 59.9TB 1879.2GB Virtual A Archive 100 A A RAID6 12 0 64k FTOL 000000000000000000000000000000000 Disabled 0 Mixed OK
ISPOOL2 59.9TB 5255.9GB Virtual B Archive 50 B B RAID6 12 0 64k FTDN 000000000000000000000000000000000 Disabled 0 Mixed Degraded
dgB01 59.9TB 5255.9GB Virtual B Archive 50 B B RAID6 12 0 64k FTOL 000000000000000000000000000000000 Disabled 0 Mixed OK

There are 2 pools: Pool A and Pool B.
There is a disk group with the name ISPOOL2.
Pool B is built from 2 DGs: ISPOOL2 (59.9TB, 5255.9GB free) and dgB01 (59.9TB, 5255.9GB free).
Pool B's total size is around 119.8TB, with around 10.5TB free. dgB01 is a RAID 6 DG.
The failed drive is in the ISPOOL2 disk group.
Now we need to delete volumes in Pool B to create free space in the pool.
dgB01 must have enough free space to accommodate the data in the ISPOOL2 disk group before ISPOOL2 can be removed.

So, what are the steps and the command sequence, starting with deleting the volumes, to accomplish the above without any issues?

JonPaul
HPE Pro

@Lekhpal 
What is the end goal?

It looks like you have 6TB Archive drives and you are using most of the capacity in Pool B.
Volume sizes:  34.0TB + 38.9TB + 39.9TB = 112.8TB 
Used capacity:  30.4TB + 38.9TB + 39.9TB = 109.2TB  of Pool capacity 119.8TB
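
To make that arithmetic concrete, here is a minimal Python sketch (not an MSA tool) that checks which volume deletions would leave the allocated data within one 59.9TB disk-group. The figures are the allocated sizes from the output posted above; the function name is hypothetical, and the check ignores pool metadata overhead and any space the pool has not yet reclaimed.

```python
from decimal import Decimal
from itertools import combinations

# Allocated sizes (TB) of the Pool B volumes, taken from the posted output.
ALLOCATED_TB = {
    "VOL01": Decimal("30.4"),
    "VOL03": Decimal("38.9"),
    "VOLO4": Decimal("39.9"),
}
# Size (TB) of the single disk-group that would remain in Pool B.
REMAINING_DG_TB = Decimal("59.9")


def deletions_that_fit(allocated, capacity, count):
    """Return every set of `count` volumes whose deletion leaves the rest within `capacity`."""
    fits = []
    for doomed in combinations(sorted(allocated), count):
        remaining = sum(size for name, size in allocated.items() if name not in doomed)
        if remaining <= capacity:
            fits.append(doomed)
    return fits


if __name__ == "__main__":
    print("Allocated in Pool B:", sum(ALLOCATED_TB.values()), "TB")
    print("Deleting one volume is enough:", bool(deletions_that_fit(ALLOCATED_TB, REMAINING_DG_TB, 1)))
    print("Pairs that could be deleted:", deletions_that_fit(ALLOCATED_TB, REMAINING_DG_TB, 2))
```

With these figures no single deletion is enough, while deleting any two of the three volumes leaves the remaining one within 59.9TB.
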

Using VDRAIN in any scenario is going to take a long time.  See the following.
https://support.hpe.com/hpesc/public/docDisplay?docId=a00018177en_us&docLocale=en_US
https://community.hpe.com/t5/msa-storage/msa-2050-vdrain-takes-months/td-p/7103498

You should replace the faulty drive in the disk-group to get maximum fault tolerance.  This will take a while.
You should get a backup of all data.
You will need to delete 2 volumes to get below the 59.9TB capacity of a single disk-group.
You can then remove a disk-group -- Did I mention this will take a long time??
    If you have the ability to backup the data and restore after that would be the recommended process

But before all that, what is the end goal?



Lekhpal
Advisor

Hello Jon.

The end goal is to get the array working in good shape without replacing the faulty disk, since a replacement is unavailable. So basically we are not in a position to replace the faulty drive. We are open to deleting the two volumes to bring the used space below the 59.9TB of a single DG. We are also ready to restore after the re-creation. That is why we needed the right steps, which seem to be: 1. Delete the two volumes. 2. Remove the DG from Pool B, hope the drain kicks in, and let it run for day(s). 3. Re-create the DG, but how, and with what RAID config? This step is still not clear to me, nor are the commands to be used, as the DGs were originally created by an HPE tech. How do we create the new DG, and with which RAID level?  BR.

JonPaul
HPE Pro
Solution

@Lekhpal 
Your current state is 'degraded', with the disk-group having 1 drive of fault tolerance available.  The performance of that disk-group will be degraded as well.  Since the system cannot be fully supported (parts cannot be replaced), I would recommend reviewing your needs and moving to something that is supportable.  You will likely experience additional drive faults in the coming months.
I looked on https://parts.hpe.com to see if any of these drives (J9F43A) were available through spares, but they don't seem to have any.  They recommend trying to find one through a reseller.

You can go through the steps to reduce the drive set and keep RAID 6 but be aware this is a long process and will put a lot of activity on drives which are already well used.  
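
For planning the reduced drive set, here is a minimal Python sketch of the capacity you would give up. It assumes 6TB drives (my reading of your output, not a confirmed figure) and uses the general RAID 6 rule that two drives' worth of space goes to parity; the function name is made up for this example. The size the array actually reports will be slightly lower, as the 59.9TB shown for the current 12-drive groups suggests.

```python
def raid6_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Approximate usable capacity of a RAID 6 set, ignoring formatting overhead."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    # Two drives' worth of capacity is consumed by parity.
    return (drive_count - 2) * drive_tb


if __name__ == "__main__":
    ASSUMED_DRIVE_TB = 6.0  # assumption: 6TB drives
    print("12-drive RAID 6:", raid6_usable_tb(12, ASSUMED_DRIVE_TB), "TB")
    print("11-drive RAID 6:", raid6_usable_tb(11, ASSUMED_DRIVE_TB), "TB")
```

Under these assumptions, re-creating the disk-group with 11 drives gives up roughly one drive's worth of usable space compared with the current 12-drive group.
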


