Understanding EVA reconstruction and leveling

sls
Advisor

Can anyone explain exactly what is happening when an EVA goes through the reconstruction and leveling processes (or point me at any docs that explain this)? I cannot find enough information to be able to complete my understanding of these processes.

Ultimately I am trying to determine whether or not an EVA could tolerate an additional disk failure whilst it is still leveling but, in trying to find the answer, my lack of knowledge regarding the EVA's ability to cope with disk failure has been exposed :(
8 REPLIES
Uwe Zessin
Honored Contributor

Re: Understanding EVA reconstruction and leveling

Reconstruction restores the VRAID redundancy. Whether the EVA can cope with another failure depends on many factors (the VRAID level, *which* disk drive has failed and *which* disk drive could fail next).

The problem is: in theory we can make great assumptions about how many disk drives can fail at once, but *nobody* knows which ones will fail.


Leveling happens after the reconstruction and just makes sure data is redistributed equally across all disk drives in a disk group.
Steven Clementi
Honored Contributor

Re: Understanding EVA reconstruction and leveling

"The problem is: in theory we can make great assumptions how many disk drives can fail at once, but *nobody* knows which ones will fail."


Well, there ARE proactive failure warnings... unfortunately, it seems like the disk(s) always fail within minutes of the warning (or at least that's the experience of one of my customers)

;o)
Steven Clementi
HP Master ASE, Storage and Clustering
MCSE (NT 4.0, W2K, W2K3)
VCP (ESX2, Vi3, vSphere4, vSphere5)
RHCE
NPP3 (Nutanix Platform Professional)
Uwe Zessin
Honored Contributor

Re: Understanding EVA reconstruction and leveling

I don't think my point has anything to do with pre-failure alerts.

If you can ungroup a disk drive before it fails, great, but then no reconstruction is necessary anyway.
Daniel_279
Advisor

Re: Understanding EVA reconstruction and leveling

Reconstructing - A volume is inaccessible and redundant data is being regenerated and moved to other storage in this disk group.


Leveling - The moving of PSEGs between volumes in a disk group to provide equal distribution of capacity for each virtual disk across all members of the disk group.

During the leveling process, the VRAID protection must be maintained and when PSEGs are moved from one disk to another, the RSS integrity must not be violated.


PSEG - Physical segment: the smallest unit of allocated physical capacity (2 MB) on a physical disk drive, used to build a virtual disk.

RSS - Redundant Storage Set: a collection of RStores that span 6 to 11 physical volumes.
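
To make the leveling definition above a bit more concrete, here is a minimal Python sketch of the general idea: move 2 MB PSEGs from the fullest disk to the emptiest until every disk holds roughly the same number. This is only an illustration, not the EVA's actual algorithm. The function name `level` and the per-disk PSEG counts are made up for the example, and the sketch deliberately ignores the VRAID and RSS constraints mentioned above.

```python
PSEG_MB = 2  # PSEG size quoted in this post


def level(pseg_counts):
    """Greedy illustration of leveling (hypothetical, not EVA firmware logic).

    pseg_counts maps a disk name to the number of PSEGs allocated on it.
    One PSEG at a time is moved from the fullest disk to the emptiest disk
    until the counts differ by at most one.
    Returns (balanced_counts, number_of_moves).
    """
    counts = dict(pseg_counts)  # work on a copy
    moves = 0
    if not counts:
        return counts, moves
    while True:
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] <= 1:
            return counts, moves
        counts[fullest] -= 1
        counts[emptiest] += 1
        moves += 1


# Example: an empty disk joins a group of two equally filled disks.
balanced, moves = level({"disk1": 1000, "disk2": 1000, "disk3": 0})
print(balanced, moves, "moves,", moves * PSEG_MB, "MB moved")
```

In this toy case 666 PSEGs are moved, about 1.3 GB at 2 MB each. On a real disk group the amount of data shuffled is far larger, which fits with leveling taking a long time.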


Disk groups can handle multiple disk failures (with reduced redundancy) and still keep virtual disks accessible to hosts.

1) Depends on whether the multiple failures are in the same RSS
2) Depends on the VRAID level

Key to success: capacity management.

Proper settings for the protection level, occupancy alarm, and available free space give the array the resources it needs to respond to capacity-related faults.


Dan
sls
Advisor

Re: Understanding EVA reconstruction and leveling

Many thanks for the prompt replies :) I have a couple of follow-ups:

Does reconstruction use unassigned capacity within the affected RSS only (to regenerate the redundant data)? If the loss of a disk took an RSS below the minimum number of members, would any subsequent RSS adjustment take place as part of the reconstruction process?

> During the leveling process, the VRAID protection must be maintained and when PSEGs are moved from one disk to another, the RSS integrity must not be violated.

So the loss of a second disk in an RSS during leveling could result in data loss?

Thanks!
McCready
Valued Contributor
Solution

Re: Understanding EVA reconstruction and leveling

>Does reconstruction use unassigned capacity within the affected RSS only (to regenerate the redundant data)?

No - capacity across the entire EVA disk group that contained the failed disk is used, so as part of the reconstruction process some initial transfer of data to other RSS groups may take place.

>If the loss of a disk took an RSS below the minimum number of members, would any subsequent RSS adjustment take place as part of the reconstruction process?

Yes - if the RSS group is too small to exist on its own, it will be joined to another. Depending on the number of groups and disks, it may in turn be split again (somehow, I'm not an RSS internals expert).

>So the loss of a second disk in an RSS during leveling could result in data loss?

No, the loss of a second disk in the same RSS while RECONSTRUCTING could result in data loss. Note that if your data is RAID 5, the loss of any other disk in the affected RSS group before reconstruction is over will result in data loss. If your data is RAID 1, you will only lose data if the physical disk that was the failed disk's mirror fails. With RAID 6, two more disks would have to fail in the same RSS group while reconstruction is going on.

Note you could have lots of other disk failures, but as long as each one is in a different RSS group, you are fine.
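
The rules above can be written down as a small Python check. Treat it as a sketch of the reasoning in this post, not as anything the EVA exposes: the function `data_loss_possible`, the `rss_of` map (disk to RSS group) and the `mirror_of` map (disk to its mirror partner) are all hypothetical names invented for the example. `failed` means disks that have failed and whose data has not yet been reconstructed.

```python
from collections import Counter


def data_loss_possible(vraid, failed, rss_of, mirror_of=None):
    """Apply the per-RAID-level rules from this post (illustrative only).

    vraid     -- "vraid1", "vraid5" or "vraid6"
    failed    -- disks failed and not yet reconstructed
    rss_of    -- hypothetical map: disk -> RSS group id
    mirror_of -- hypothetical map: disk -> mirror partner (VRAID1 only)
    """
    failed = set(failed)
    per_rss = Counter(rss_of[d] for d in failed)
    worst = max(per_rss.values(), default=0)  # most failures in one RSS

    if vraid == "vraid5":
        # A second failure in the same RSS is enough.
        return worst >= 2
    if vraid == "vraid6":
        # Two more failures in the same RSS are needed, three in total.
        return worst >= 3
    if vraid == "vraid1":
        # Only the failed disk's mirror partner matters.
        mirror_of = mirror_of or {}
        return any(mirror_of.get(d) in failed for d in failed)
    raise ValueError(f"unsupported VRAID level: {vraid}")


# Made-up layout: three disks in RSS "A", one in RSS "B".
rss_of = {"d1": "A", "d2": "A", "d3": "A", "d4": "B"}
mirror_of = {"d1": "d2", "d2": "d1"}
print(data_loss_possible("vraid5", ["d1", "d4"], rss_of))  # different RSS groups
print(data_loss_possible("vraid5", ["d1", "d2"], rss_of))  # same RSS group
```

The first call covers the "each failure in a different RSS group" case and the second covers two failures in the same RSS group with RAID 5.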

And yes, I have seen several instances of multiple disk failures within a 24-36 hour period (fortunately, while leveling) after an initial disk failure in my EVAs. Seems that the leveling "disk exercise" will sometimes cause additional failures. Only logical...


Thanks!
check out evamgt.wetpaint.com and evamgt google group
mollo_fire
Occasional Visitor

Re: Understanding EVA reconstruction and leveling

Can I carry out an EVA XCS upgrade while leveling is still running? Leveling takes days.

leix
Regular Advisor

Re: Understanding EVA reconstruction and leveling

You can upgrade it while leveling is running.
But I would still suggest doing it when the EVA has no outstanding issues; data safety is the most important thing, after all.