Re: How to view RSS on EVA! + Expansion

UselessUser · ‎11-04-2009

Hi,

As a follow-up to my previous post: I currently have an EVA 4400 with 4 shelves and 24 FSAS disks.

I have the protection level set to double, in line with the HP Best Practices config guide, simply because the last two times I needed to replace a disk it took over a week.

The person who set it up used VRAID 5 to create the LUNs, which means I can lose a single disk at a time in each RSS before data loss. (But if reconstruction completes after the first failed disk, I can lose one more from each RSS due to double "protection".)

I believe that because I only have 4 shelves I do not have shelf redundancy at the moment, because the parity and data chunks of the VRAID 5 LUNs could reside on disks in the same shelf?

Now if that's the case, I believe I have 2 options to give me shelf redundancy: either buy 4 more shelves and rearrange my existing disks across them, or convert all my VRAID 5 LUNs to VRAID 1, because I am led to believe that VRAID 1 splits each mirror pair so that the two disks do not reside on the same shelf? Can I verify this in any way on my EVA from the command line?

If I were to buy the shelves, how would I add them and redistribute the disks and keep it online at the same time? And once this is done can I view the RSS layout to verify it has done the right thing?

IBaltay · ‎11-04-2009

Hi,

1. One RSS member per disk shelf (that is, only one physical disk of any given RSS in each shelf) gives you full verticality for both RAID5 and RAID1.

2. The current (non-vertical) and future (vertical) RSS layout can be checked via the SSSU CLI.

3. The redistribution can be done online by ungrouping, reshuffling and regrouping the disks, so that each "horizontal" RSS becomes a "vertical" RSS in its new positions in the new disk enclosures. (This is online but time-consuming.)

4. Or it can be done offline, all at once, after shutting down the EVA. (Offline but quick.)

the pain is one part of the reality

UselessUser · ‎11-05-2009

Hi,

I am sorry, I do not fully understand answer 1.

Does this mean I am correct in what I am saying about shelf redundancy at this moment in time???

Víctor Cespón · ‎11-05-2009

Several misconceptions here:

1) You do not need "disk failure protection" = double on an enclosure with 24 disks. It's only needed on enclosures with more than 100 disks and where a spare disk can take several days to get there.
"disk failure protection" = double does not mean you can lose two disks from the same RSS. It means there is space reserved to be sure of being able to perform two rebuilds.

2) Currently you cannot see the RSS state in Command View. It can be deduced from the SSSU output, but has to be done manually.

3) Even if you add another 4 enclosures to have 8, the RSSs will not be automatically modified so there's one disk on each enclosure.

4) Even if you move disks around to reach that state, it can change at any moment, after you add or remove a disk.

IBaltay · ‎11-05-2009

Hi,
1 RSS member=1 physical disk of the same RSS in each disk shelf is giving you full verticality both for RAID5 and RAID1...

This was meant in the sense that if you have only 1 RSS member per disk enclosure, then even in the (very rare) event that a whole disk enclosure fails, there is no double disk failure affecting the whole DG.
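
To make "verticality" concrete, here is a minimal Python sketch that checks whether any RSS has more than one member in the same shelf. The `layout` dictionary, the disk names and the shelf numbers are made up for illustration; the real membership would have to be worked out by hand from the SSSU output, whose format is not reproduced here.

```python
from collections import Counter


def shelf_collisions(layout):
    """Return {rss_id: [shelves holding more than one member of that RSS]}.

    `layout` maps a disk name to an (rss_id, shelf_number) tuple. This
    structure is hypothetical; it is not SSSU output.
    """
    per_rss = {}
    for disk, (rss_id, shelf) in layout.items():
        per_rss.setdefault(rss_id, Counter())[shelf] += 1

    collisions = {}
    for rss_id, shelves in per_rss.items():
        crowded = sorted(s for s, count in shelves.items() if count > 1)
        if crowded:
            collisions[rss_id] = crowded
    return collisions


# One 8-disk RSS spread over 4 shelves: two members per shelf ("horizontal").
horizontal = {f"disk{i}": (0, i % 4 + 1) for i in range(8)}
# The same RSS spread over 8 shelves: one member per shelf ("vertical").
vertical = {f"disk{i}": (0, i + 1) for i in range(8)}

print(shelf_collisions(horizontal))
print(shelf_collisions(vertical))
```

An empty result means no shelf holds two members of the same RSS, so a single shelf failure takes at most one disk out of each RSS.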

the pain is one part of the reality

sam bell · ‎11-05-2009

The RSS layout can be seen in Command View by opening the following page:

https://server:2372/nsafieldservice.htm

Just select the appropriate EVA and choose "Disk Groups and Redundant Storage Sets (RSS)"

@vcespon: Although double disk drive failure protection does not protect him against losing two disks of one RSS at the same time, it does mean that he can lose a second drive from one RSS once the first failed drive has been reconstructed.

Víctor Cespón · ‎11-05-2009

@sam bell
Please DO NOT post HP restricted information on a public forum.

That page has a big red box saying:

The features offered in this menu are intended for use by authorized Hewlett-Packard service engineers only. If not properly used, some features can cause data loss or corruption. Do not use these features unless you are authorized to do so.

Several people in this forum have access to HP internal advisories, tools and documentation, but we refrain from posting things that are not public.

UselessUser · ‎11-05-2009

I do not understand this still...

"disk failure protection" = double does not mean you can lose two disks from the same RSS. It means there are space reserved to be sure to be able to perform two rebuilds.

My thought for this was from this:

"As an example, a Protection Level of 1 provides continued operation in the event of two disk failures, assuming the restore from the first failure completes before the second disk fails."

However, thinking about this: is the reason it cannot survive 2 disk failures from the same RSS that, if I lose a disk in the RSS, it will rebuild the data from this disk onto the remaining disks evenly across the entire disk group? Some of those disks will be in the same RSS as the failed disk, which means that with VRAID 5 I get into a situation where one of the remaining disks in the RSS has both a parity and a data chunk for each ?kb stripe, and therefore losing one more breaks my RAID5? (Which leads to the obvious question: can I lose more than 1 disk in an RSS if it is using VRAID1 LUNs, as long as it is not 2 of the same pair at the same time and reconstruction from the first failure has finished?)

I also do not understand the recommendation of double protection unless you have over 100 disks AND it takes a long time for a replacement to arrive. I got my idea for double from this:

"Conversely, the statistical availability of disks and the typical service time to replace a failed disk (MTTR2) would indicate that a Protection Level of two would be unnecessary in Disk Groups of fewer then 168 disks in all but the most conservative installations. A mitigating condition would be if the service time (MTTR) might exceed seven days, then a Protection Level of 2 might be considered."

I just do not think I understand exactly where this protection level comes into play. And how crucial it is, as it sits outside of the VRAID setups

I would have assumed if I gave the EVA the best opportunity to create redundancy (ie blocks of 8 disks spread evenly over 8 shelves) it would do its hardest to align itself for that purpose (shelf redundancy) but obviously I am wrong.

I got the whole idea of using VRAID 1 because I have less than 8 shelves from this taken from the best practices guide:

"The highest level of redundancy is achieved when eight or more disk shelves are used. In this way, the array minimizes the conditions in which the RSS has two disks in the same disk shelf."

"With Vraid1, the EVA firmware attempts to place the individual members of a mirror pair on different shelves. Because of this, the guidelines are much simpler, and the suggested number of shelves can be less than eight"

I am happy to bow down to other people's advanced knowledge, hence even asking the question.

IBaltay · ‎11-05-2009

Hi,

I'll try to give you an example for RAID1:

1. Single disk failure (rebuild)
If one disk of the RSS carrying RAID1 stripes fails, it is only a single disk failure of the RAID1 mirror set. All of its mirror stripes are copied to other RSS members and mirrored again to adjacent RSS members in new locations, so that a complete mirrored set exists again.

2. Double disk failure (rebuild)
If one disk of the RSS carrying RAID1 stripes fails, and at the same time a second disk fails that does not hold any of the mirror stripes of the first failed disk, then the rebuild starts as described in variant 1.

3. Double disk failure (data loss)
If one disk of the RSS carrying RAID1 stripes fails, and at the same time the second disk holding the "secondary" mirror stripes fails, then it is a double disk failure of the whole disk group.

the pain is one part of the reality

sam bell · ‎11-05-2009

@vcespon: Sorry, I didn't know that it's not permitted to publish such information on the forum, and I won't do it again. However, I think it's less than optimal that we don't have any option to view the RSS information in Command View. Even though the EVA takes care of the RSS design and you usually don't have to worry about it, the purpose of RSS is important to understand, and apart from the best practices whitepaper we are kept out of the loop on the whole thing.

@Useless1

> However thinking about this is the reason
> it cannot survive 2 disk failures from the
> same RSS is because if I lose a disk in
> the RSS, it will then rebuild the data
> from this disk over to the remaining disks
> evenly within the entire disk group.
> However some disks will be in the same RSS
> set as the failed disk, which means with
> VRAID 5 I get into a situation where one
> of the remaining disks in the RSS has both
> a parity and data chunk for each ?kb
> stripe, and therefore losing one more
> breaks my RAID5? (Which leads to the
> obvious questions can I lose more than 1
> disk in an RSS if it is using VRAID1 LUN
> as long as it is not 2 of the same pair at
> the same time.. and reconstruction from
> the first failure has finished?)

I'm currently thinking about this too, and I think that in the following configuration the theoretical behaviour is as follows:

Given on an EVA4400:

* 24 disks
* 3 RSS with 8 disks each
* LUNs: 3x Vraid 5, 2x Vraid1, 2x Vraid6
* Protection Level 1

If I understand correctly, then:

- I can lose one disk in each RSS without affecting any of my Vdisks. I can lose a second disk in one of those RSSs if the disk that failed first has been reconstructed to the free/protection level reserved space.

- If two disks in one RSS fail simultaneously I'll definitely lose *all* my Vraid5 disks (since the Vdisks are striped over the whole disk group) and possibly also all of my Vraid 1 LUNs (only if the failed disks were married as a pair).

- The Vraid 6 LUNs can withstand a simultaneous failure of two disks in each RSS, which means a total of six drives (two per RSS).

- The Vraid 1 LUNs can withstand a simultaneous failure of four disks in each RSS as long as no married pair is affected, which means a total of 12 drives (very unlikely but theoretically possible).

Not exactly sure though.

IBaltay · ‎11-05-2009

A small erratum in my wording:

I'll try to give you an example for RAID1:

1. Single disk failure (rebuild)
If one disk of the RSS carrying RAID1 stripes fails, it is only a single disk failure of the RAID1 mirror set. All of its mirror stripes are copied to other RSS members and mirrored again to adjacent RSS members in new locations, so that a complete mirrored set exists again.

2. Double disk failure (rebuild)
If one disk of the RSS carrying RAID1 stripes fails, and at the same time a second disk from a different RSS fails that does not hold any of the mirror stripes of the first failed disk, then the rebuild starts as described in variant 1.

3. Double disk failure (data loss)
If one disk of the RSS carrying RAID1 stripes fails, and at the same time the second disk, from a different RSS, holding the "secondary" mirror stripes fails, then it is a double disk failure of the whole disk group.

the pain is one part of the reality

sam bell · ‎11-05-2009

@IBaltay: If I understood the corrections in your description correctly, you mean that the mirrors of a RAID 1 disk reside in different RSSs. Are you sure about that?

My understanding was that an RSS forms a complete RAID protection domain, as stated by Ulrich Zessin in the following thread:

--snip--
A disk group is divided into multiple RSSes (Redundant Storage Sets). Each RSS forms a complete RAID protection domain.

Disks are 'married' to pairs to define the VRAID-1 mirror members and both members are always contained in the same RSS. If one of the members fails, yes, the EVA picks up the data of the remaining member and stores it on a different member pair. Then it releases the data from the 'widow'.
--snap--

http://forums13.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1257422203270+28353475&threadId=1111467

UselessUser · ‎11-05-2009

WOW...

I was not expecting so many replies so quickly...

Reading other forum messages I am 99% sure of the following:

Single level protection

--Cannot lose more than 1 disk in an RSS if using VRAID 5, full stop.

--Can lose more than 1 disk in an RSS if using VRAID 1, as long as they are not from the same pair.

Double level protection

--Cannot lose more than 1 disk in an RSS if using VRAID 5, full stop.

--Can lose more than 1 disk in an RSS if using VRAID 1, as long as they are not from the same pair.

--"Protection Level" is just spare space to allow the EVA to rebuild the redundancy set somewhere TEMPORARILY until the physical disk is replaced. I assume that if I had a maxed-out EVA, lost a disk in a VRAID 5 set or a disk of a mirror pair in VRAID 1, and had no protection level set, the EVA would continue to function but the RAID controller would have to calculate the data "on the fly" as opposed to simply reading the data off a disk?

--Single protection provides enough space on the EVA to rebuild 2 failed disks' data somewhere else

--Double protection provides enough space on the EVA to rebuild 4 failed disks' data somewhere else

These are multiples of 2 (2 and 4) because of VRAID 1 and the fact that a failure is fixed by moving the data off the remaining disk of the pair.

Bottom line - level of protection does not influence data protection, just availability of data!

So the recommendation of double for groups larger than 100 disks etc. is simply a mathematical calculation, based on the fact that multiple disk failures are significantly more likely in a group of 100 disks than in, say, a group of 32.
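
A rough way to see the effect of group size is to estimate the chance that another disk fails while one is already down. The Python sketch below assumes independent failures at a constant rate; the 3% annual failure rate and the 7-day window are illustrative assumptions only, not HP figures.

```python
import math


def p_another_failure(disks, window_days, afr):
    """Probability that at least one of the remaining disks fails in the window.

    Assumes independent failures at a constant rate. `afr` is an annual
    failure rate picked for illustration, not a vendor number.
    """
    # Chance that one given disk fails during the window.
    p_one = 1.0 - math.exp(-afr * window_days / 365.0)
    # Chance that at least one of the other (disks - 1) drives fails.
    return 1.0 - (1.0 - p_one) ** (disks - 1)


for n in (24, 32, 100, 168):
    print(n, round(p_another_failure(n, window_days=7, afr=0.03), 4))
```

Whatever rate is plugged in, the probability grows with the number of disks and with the length of the replacement window, which is the reasoning behind tying the recommendation to both.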

And as nobody has corrected me otherwise regarding the whole shelf redundancy issue I am sticking to VRAID 1 and hoping RSS has separated each disk pair out onto separate shelves.

I am hoping Uwe will respond at some point!

IBaltay · ‎11-05-2009

Sorry, another erratum in my wording; it is quite hard to explain in words (hopefully this will be better :-)):

I'll try to give you an example for RAID1:

1. Single disk failure (rebuild)
If one disk of the RSS carrying RAID1 stripes fails, it is only a single disk failure of the RAID1 mirror set. All of its mirror stripes are copied to other RSS members and mirrored again to adjacent RSS members in new locations, so that a complete mirrored set exists again.

2. Double disk failure (rebuild)
If one disk of the RSS carrying RAID1 stripes fails, and at the same time a second, different disk of the same RSS fails that does not hold any of the mirror stripes of the first failed disk, then the rebuild starts as described in variant 1.

3. Double disk failure (data loss)
If one disk of the RSS carrying RAID1 stripes fails, and at the same time a second, different disk of the same RSS holding the "secondary" mirror stripes fails, then it is a double disk failure of the whole disk group.

the pain is one part of the reality

sam bell · ‎11-05-2009

Hi,

> Single level protection
> --Can not lose more than 1 disk in an
> RSS if using VRAID 5 full stop.
>
> Double level protection
> --Can not lose more than 1 disk in an
> RSS if using VRAID 5 full stop.

What do you mean by "full stop" with regard to Vraid5? Regarding the failure of disks, please note that even with Double Level protection you cannot lose more than one disk in an RSS simultaneously without losing all your Vraid 5 LUNs.

As I understand it, with Single Level Protection you can lose one disk per RSS simultaneously without affecting any Vraid5 LUNs. Once the failed disk has been reconstructed to the free space or the space that has been reserved for the Protection Level, another disk from the same RSS as the first failed disk can fail. This disk, however, cannot be reconstructed as long as there is no free space available (either available space or space reserved for the Protection Level).

> --Single protection provides enough space
> on the EVA to rebuild 2 failed disk's data
> somwhere else
> --Double protection provides enough space
> on the EVA to rebuild 4 failed disk's data
> somewhere else

I think Single protection provides only enough space for one failed disk and Double protection for two failed disks. This is because if you are using Vraid1 in your disk group, each disk in that group contains Vraid1 information (as the EVA stripes all Vdisks across all disks). And if I understood correctly, the EVA always reconstructs *both* the disk that contains the Vraid1 data *and* the disk that contains its mirror!

Not sure what happens if you are only using Vraid5 in your disk group - maybe you can reconstruct two failed disks even with a Single protection Level?

UselessUser · ‎11-05-2009

Hi,

OK I think we are getting somewhere...

Taking this:

As I understand it, with Single Level Protection you can lose one disk per RSS simultaneously without affecting any Vraid5 LUNs. Once the failed disk has been reconstructed to the free space or the space that has been reserved for the Protection Level, another disk from the same RSS as the first failed disk can fail. This disk, however, cannot be reconstructed as long as there is no free space available (either available space or space reserved for the Protection Level).

I would tend to agree with this statement, because if a disk fails as the EVA uses all disks in a disk group to spread the recovered data onto, then my argument of a single disk from the original RSS containing both data and parity chunks from the same stripe would never happen. This would of course also equate to the two stages of recovery which I believe the EVA performs, the first where it rebuilds the data from within the RSS set, and then once this is done it performs levelling, which is the dispersing of it to the other RSS's equally.

The second statement:

I think Single protection provides only enough space for one failed disk and Double protection for two failed disks. This is because if you are using Vraid1 in your disk group, each disk in that group contains Vraid1 information (as the EVA stripes all Vdisks across all disks). And if I understood correctly, the EVA always reconstructs *both* the disk that contains the Vraid1 data *and* the disk that contains its mirror!

I also agree with this, but I don't think I wrote it well. Technically speaking, as the EVA reserves 2 x the largest disk size in single mode, it can protect 2 disks. However, with the EVA you never get to use a single disk on its own; it's always in a VRAID, and as you pointed out, the failure of a disk with VRAID 1 would move both copies onto the space, so it's only protection for 1.

The only reason I doubt this is this:

As an example, a Protection Level of 1 provides continued operation in the event of two disk failures, assuming the restore from the first failure completes before the second disk fails.

But this quote does not explain what kind of operation this is... and I think this ties in with what you are saying in that a protection of single, provides enough space for 1 disk to fail, and be recovered within the EVA. If a second were to fail after the rebuild, then the EVA would still be able to operate as enough data would exist on the disks for the EVA controllers to work out the missing data chunks on the fly.

sam bell · ‎11-05-2009

> The only reason I doubt this is this:
>
> As an example, a Protection Level of 1
> provides continued operation in the event
> of two disk failures, assuming the restore
> from the first failure completes before
> the second disk fails.

Hmm, I currently cannot see where this contradicts what I have written?

BTW - what do you think about the example I posted a few posts above? Would you see it the same way?

Given on an EVA4400:

* 24 disks
* 3 RSS with 8 disks each
* LUNs: 3x Vraid 5, 2x Vraid1, 2x Vraid6
* Protection Level 1

If I understand correctly, then:

- I can lose one disk in each RSS without affecting any of my Vdisks. I can lose a second disk in one of those RSSs if the disk that failed first has been reconstructed to the free/protection level reserved space.

- If two disks in one RSS fail simultaneously I'll definitely lose *all* my Vraid5 disks (since the Vdisks are striped over the whole disk group) and possibly also all of my Vraid 1 LUNs (only if the failed disks were married as a pair). So I think this really means that with really bad luck you could lose all your data just because of two simultaneous disk failures in one RSS (when only using Vraid5 disks, for example).

- The Vraid 6 LUNs can withstand a simultaneous failure of two disks in each RSS, which means a total of six drives (two per RSS).

- The Vraid 1 LUNs can withstand a simultaneous failure of four disks in each RSS as long as no married pair is affected, which means a total of 12 drives (very unlikely but theoretically possible).

Víctor Cespón · ‎11-05-2009

Let me try to explain it in short sentences:

1 disk failed = No problem at all
2 disks failed, different RSS = No problem at all
2 disks failed, same RSS = RAID 5 fails, RAID 1 depends on whether the disks had the same information

The parameter "disk failure protection" confuses many people. It does NOT add any further protection against disk failures. 2 disks failed on the same RSS will invalidate all RAID 5 vdisks, whether you have None, Single or Double.

It's a parameter intended to reserve spare space, so that when a disk fails, the EVA can rebuild RAIDs using that space and go back to redundant state as soon as possible. In the extreme case there's no free space anywhere, the RAID will remain in degraded state until you physically replace the failed disk.

Why 2 x (biggest drive on disk group)?
When a disk fails, the EVA has to rebuild the RAIDs. RAID 5 info is regenerated from the remaining 4 chunks of each stripe and written to a disk in the RSS. RAID 1 data has to be moved to another PAIR of disks. The disk that failed had a partner, with the other copy of the data. We need to have two copies of this data, so we need to move it to the other disks, making two copies of each stripe.

That's why the spare space is 2 x biggest disk. It's to make sure we can preserve data redundancy in the worst case: one of the biggest disks fails and it's full of RAID 1 data. You're going to need twice its size to store all the RAID 1 data.

Disk failure protection = double means the EVA can have a disk failure, a rebuild, and another disk failure and still be fully redundant, before you replace the failed disks.
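
As a sketch of the arithmetic described above: the reserved space is the protection level times 2 times the largest disk in the disk group. The function name and the disk sizes below are hypothetical, and the factor for double (two rebuilds, so 4 x the largest disk) is inferred from the explanation above rather than taken from a documented formula.

```python
# Number of rebuilds each protection level is meant to cover.
PROTECTION_LEVELS = {"none": 0, "single": 1, "double": 2}


def reserved_space_gb(disk_sizes_gb, level):
    """Space set aside for rebuilds: level x 2 x largest disk in the group.

    The factor 2 covers the worst case of a failed disk full of RAID 1
    data, which needs two fresh copies of every stripe.
    """
    return PROTECTION_LEVELS[level] * 2 * max(disk_sizes_gb)


# Hypothetical disk group: 24 disks of 1000 GB each.
group = [1000] * 24
for level in ("none", "single", "double"):
    print(level, reserved_space_gb(group, level), "GB")
```

Note that the result depends only on the largest disk and the level, not on which Vraid types are actually in use.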

sam bell · ‎11-06-2009

Thanks for your explanations!

> That's why the spare space is 2 x biggest
> disk. It's to make sure we can preserve
> data redundancy in the worst case: one of
> the biggest disks fails and it's full of
> RAID 1 data. You're going to need twice
> it's size to store all the RAID 1 data.

Am I correct in assuming that once you are using at least one Vraid1 disk the reconstruction process will always restore two disks? Since even if there is only one Vraid1 disk it is striped across all disks in the disk group so every disk holds Vraid1 data and has a mirror / acts itself as a mirror for another disk.

UselessUser · ‎11-06-2009

vcespon... can you help me with this:

2 disks failed, same RSS = RAID 5 fails, RAID 1 depends on whether the disks had the same information

Now we know that this is the case if 2 disks fail in a single RSS AT THE SAME TIME.

However is this the case if:

Protection Level is single
A single disk in an RSS has failed and reconstruction has completed
A second disk in the same RSS as the first has just failed...

I think this is the real question being posed here...

I assumed as the protection level allocates space across all disks then the recovery of data must be placed across all disks (thus splitting across RSS if more than 11 disks), therefore technically as an RSS is a failure domain you could in theory lose another disk in the same RSS without losing VRAID 5 data.

Any thoughts?

Víctor Cespón · ‎11-06-2009

@sam bell

"disk failure protection" always reserves 2 x (biggest disk on disk group), to have enough space to rebuild in the worst case scenario.
It does not matter whether you have vdisks in RAID 1 or not; it will reserve this space.

@Useless1
Once a disk fails, the rebuild process regenerates the missing data and puts it on the remaining disks on the RSS.
Once the rebuild process completes, we're back to the fully redundant state, so we can lose another disk.

RAID failure happens only if a disk fails before an ongoing rebuild completes.

After rebuild completes, RSS reorganization or leveling will be launched.

It's easy to see with RAID 1:

When a disk fails, we have a set of data that's only on one disk. We must have two copies of this data, so we copy the data to the other disks on the RSS. Each data segment on two different disks.

Until this process has completed, we cannot lose another disk. If we lose the disk containing the only copy of the data before we have 2 new copies on two other disks, there's no way to recover the data.

As you can imagine, losing two disks that form a mirror within the few hours that a rebuild takes is very unlikely.

Disk failure protection only reserves free space to perform rebuilds; it does not enhance the rebuild process.
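
The rules in this post and the earlier one can be condensed into a small decision function. This is only a summary sketch in Python; the function and parameter names are invented for illustration and nothing here is an EVA API.

```python
def second_failure_outcome(same_rss, rebuild_done, same_mirror_pair):
    """Summarise what a second disk failure does to RAID 5 and RAID 1 vdisks.

    Returns a (raid5_survives, raid1_survives) tuple. The names are
    hypothetical and only encode the rules stated in this thread.
    """
    if rebuild_done or not same_rss:
        # Either redundancy was already restored, or the two failures
        # hit different RSSs, which are separate protection domains.
        return True, True
    # Second failure in the same RSS while the rebuild is still running:
    # RAID 5 is lost, RAID 1 only if the two disks held the same data.
    return False, not same_mirror_pair


print(second_failure_outcome(same_rss=True, rebuild_done=False, same_mirror_pair=False))
print(second_failure_outcome(same_rss=True, rebuild_done=True, same_mirror_pair=True))
```

The protection level does not appear as a parameter because it only decides whether there is space to rebuild, not what a second failure during the rebuild does.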

UselessUser · ‎11-06-2009

Thanks for this, I am reaching clarity!

So technically with protection level set to single on a maxed out EVA so there is no free space to use first before the reserved protection space...

With single protection we can technically lose any 2 physical disks before losing data, as long as a rebuild completes before losing the second.

The issue of course is, if this happens, the array will not have any spare space to rebuild the redundancy and will be operating in degraded mode - calculating data on the fly?

sam bell · ‎11-06-2009

Useless,

could you assign some points in this thread please? I think it has evolved into a really good source for understanding RSS/redundancy on the EVA, and people should see at first glance that this thread has good (magical :) answers.

Thanks

Víctor Cespón · ‎11-06-2009

@Useless1

OK, let's say I have a 32 disk disk group filled with vdisks / snapshots / snapclones, to more than 99% capacity.

1) This is not recommended as the leveling process needs some space to work. Minimum is 5 GB free.

2) If the EVA runs out of space on the disk group while performing a snapshot, the snapshot will go to "snapshot overcommit" state and all snapshots will be deleted.

3) If the EVA is replicating with another one and the disk group is almost full, in the event of a link failure or a DR Group being suspended the log disk will run out of space and will be invalidated, this will then force a full copy of the DR group.

So, I have 32 disks and one fails. RAID 1 data is stored on pairs of disks. Disk 31 has no pair, so we must get all the RAID 1 data out of it and copied to a pair of disks.

While doing this, another disk fails. If it's disk 31, then we have lost the only copy of the data. If it was on the same RSS as 31 but it's not disk 31, then we lose RAID 5 but not RAID 1.

If it's on another RSS, now we have to perform rebuild on two RSS, also, this will trigger a change on the RSS as we cannot have two RSS with an odd number of members.
As we have 2 disks that have no partner, one will be moved to the other RSS and they will form a pair to store RAID 1 data.

If there's no free space anywhere to perform the rebuild, the RAIDs will work in degraded mode: RAID 1 with only one copy of the data, RAID 5 regenerating the missing data from the 3 data chunks and 1 parity chunk that are left.
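
To illustrate how a degraded RAID 5 stripe can still serve data, the Python sketch below rebuilds one missing chunk from the 3 surviving data chunks and the parity chunk. It assumes plain XOR parity as in textbook RAID 5; the EVA's actual on-disk layout is not described in this thread, and the chunk contents are made up.

```python
from functools import reduce


def xor_chunks(chunks):
    """XOR equally sized byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# A 4 data + 1 parity stripe with made-up contents.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_chunks(data)

# The disk holding data[2] fails: 3 data chunks and 1 parity chunk remain.
survivors = [data[0], data[1], data[3], parity]
rebuilt = xor_chunks(survivors)

print(rebuilt == data[2])
```

This is the "on the fly" calculation: every read of the missing chunk costs reads of the other four, and a further failure in the same stripe leaves too little information to reconstruct anything.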

So you see, it's complex... is not "I can lose 2 disks"... it all depends on what disks and when each failure happens.
