
Raid5 with distributed parity layout and failure question

 
SOLVED
Mike_316
Frequent Advisor

Raid5 with distributed parity layout and failure question

Hey Guys!

I have a question about RAID-5 with distributed parity, to refresh my aging memory. Basically, the question relates to two things...

1. How parity data is actually striped and used across the disks.

2. A formula for how many disks can fail in a RAID-5 with distributed parity before the entire RAID device goes down.

(*Everything that follows assumes five disks, as that is what I am using for now, although a formula for fewer and/or more disks would be great!*)

My understanding of RAID-5 with distributed parity has the data and parity laid out as in FIGURE 1 (see the attached JPG if it doesn't make sense as text):

FIGURE 1:
------------------------------------
|Disk1 |Disk2 |Disk3 |Disk4 |Disk5 |
|======|======|======|======|======|
| 1 | 2 | P1,2 | 3 | 4 |
|------+------+------+------+------|
| P3,4 | 5 | 6 | P5,6 | 7 |
|------+------+------+------+------|
| 8 | P7,8 | 9 | 10 |P9,10 |
|------+------+------+------+------|
| 11 | 12 |P11,12| 13 | 14 |
|------+------+------+------+------|
|P13,14| 15 | 16 |P15,16| 17 |
|------+------+------+------+------|
| 18 |P17,18| 19 | 20 |P19,20|
|__________________________________|

...where the parity for slices 1 and 2 (P1,2) can be used to reconstruct slice 2 (using 1 and P1,2) or slice 1 (using 2 and P1,2), but P1,2 CANNOT reconstruct BOTH 1 and 2 (you must have either 1 OR 2 along WITH P1,2 to reconstruct the missing data).
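
Here is a minimal Python sketch of that two-slice case (the byte values are made up, just for illustration) - parity over two slices is simply their XOR:

# Parity over two data slices is simply their XOR.
slice1 = bytes([10, 20, 30])
slice2 = bytes([40, 50, 60])
p12 = bytes(a ^ b for a, b in zip(slice1, slice2))

# Lose slice1: rebuild it from slice2 and P1,2.
rebuilt = bytes(a ^ b for a, b in zip(slice2, p12))
assert rebuilt == slice1

# Lose slice1 AND slice2: only P1,2 remains, which is one equation
# with two unknowns, so neither slice can be rebuilt.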

Also, if the above is the correct distribution, is the parity always distributed in this pattern, where you have two data slices (1 and 2), then their parity slice (P1,2), then two more data slices (3 and 4), then their parity slice (P3,4), etc.?


HOWEVER, I have also seen diagrams such as the following in FIGURE 2...

(again, see the attached JPG if it doesn't lay out right as text)

FIGURE 2:
------------------------------------
|Disk1 |Disk2 |Disk3 |Disk4 |Disk5 |
|======|======|======|======|======|
| 1a | 2a | 3a | 4a | Pa |
|------+------+------+------+------|
| 1b | 2b | 3b | Pb | 4b |
|------+------+------+------+------|
| 1c | 2c | Pc | 3c | 4c |
|------+------+------+------+------|
| 1d | Pd | 2d | 3d | 4d |
|------+------+------+------+------|
| Pe | 1e | 2e | 3e | 4e |
|------+------+------+------+------|
| 1f | 2f | 3f | 4f | Pf |
|__________________________________|

...which implies (to me, at least) that the parity for stripe "a" (Pa) could be used to reconstruct (for example) slice 1a by using Pa, 2a, 3a and 4a.

IF this IS the case, then how many active data slices do you need to reconstruct from parity? (i.e., could you reconstruct BOTH 1a AND 3a using Pa, 2a and 4a?)
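
Again a quick Python sketch, with made-up byte values, to test that idea:

from functools import reduce

# Pa is the XOR across all four data slices of stripe 'a'.
stripe = {"1a": 0x11, "2a": 0x22, "3a": 0x33, "4a": 0x44}
pa = reduce(lambda x, y: x ^ y, stripe.values())

# One missing slice is recoverable: XOR Pa with the three survivors.
rebuilt_1a = pa ^ stripe["2a"] ^ stripe["3a"] ^ stripe["4a"]
assert rebuilt_1a == stripe["1a"]

# Two missing slices (1a and 3a) are not: Pa ^ 2a ^ 4a only gives
# the combined value 1a ^ 3a, not either slice individually.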

The big reason for this question, besides a deeper understanding of parity distribution, is: how many disks can a RAID-5 with distributed parity lose before the array is non-functional?

If the distribution is as in the first diagram, I get a 0% chance of non-functionality with one disk failure, an 80% chance of non-functionality with two disk failures, and a 100% chance with three disk failures.

However, if it is as the second diagram lays it out, you get a 0% chance of non-functionality with one disk failure and a 100% chance of non-functionality with two disk failures.

That seems like a big difference, especially as the odds of a dysfunctional array would decrease as more disks are added under the first diagram's layout, but would NOT necessarily decrease under the second.
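
In case anyone wants to check my percentages, here is a small Python sketch that enumerates the failure combinations. The stripe-group sets are my own reading of how the two figures continue (an assumption on my part), and the array counts as dead once any stripe has lost two or more of its members:

from itertools import combinations

# Stripe groups as I read the two figures: each set lists the disks
# (1-5) holding one stripe's data + parity. With single XOR parity,
# a stripe is unrecoverable once it loses two or more members.
figure1 = [{1, 2, 3}, {4, 5, 1}, {2, 3, 4}, {5, 1, 2}, {3, 4, 5}]
figure2 = [{1, 2, 3, 4, 5}]   # every stripe spans all five disks

def death_rate(groups, failures, ndisks=5):
    combos = list(combinations(range(1, ndisks + 1), failures))
    dead = sum(1 for failed in combos
               if any(len(g & set(failed)) >= 2 for g in groups))
    return dead / len(combos)

for k in (1, 2, 3):
    print(k, "failure(s):",
          "fig1 =", death_rate(figure1, k),
          "fig2 =", death_rate(figure2, k))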

Sorry for being so long-winded, but I appreciate all the great help (as always!)
Mike
"If we treated each person we met as if they were carrying an unspeakable burden, we might treat each other as we should" - Dale Carnegie
4 REPLIES
Mike_316
Frequent Advisor

Re: Raid5 with distributed parity layout and failure question

Oh...I guess I can add the attachment here as well.

Sorry for taking up space!
Mike
"If we treated each person we met as if they were carrying an unspeakable burden, we might treat each other as we should" - Dale Carnegie
Uwe Zessin
Honored Contributor
Solution

Re: Raid5 with distributed parity layout and failure question

Ah. Interesting. Your first picture looks very similar to the way the EVA does RAID-5. It uses 4D+1P - 4 data segments and 1 parity segment - which means a fixed overhead of 25%. However, due to the underlying virtualization, the actual placement on the disks will look very different. I won't go into detail, because RAID-5 can be confusing enough.

Your second figure looks similar to what DEC/Compaq/HP's StorageWorks HSx controllers use. The parity is calculated horizontally over all disks, which means that the 'overhead' becomes smaller with more disks. On the other hand, you have to move far more data when a disk has failed.

I have also seen a slightly different variant that looks like this:
|01|02|03|04|Pa|
|06|07|08|Pb|05|
|11|12|Pc|09|10|

Parity is still calculated horizontally over all data (Pa = 01 x 02 x 03 x 04, Pb = 05 x 06 x 07 x 08, and so on), but there might be slightly better throughput, because a sequential read will proceed steadily across all 5 disks (01,02,03,04,05 then 06,07,08,09,10 and so on).
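
A small Python sketch of that placement, as I understand it (so treat the exact rotation as an assumption): it maps a logical block to its row and disk, and reprints the three rows of the diagram above:

def locate(block, ndisks=5):
    # Map a 0-based logical block to (row, disk) in the rotated layout.
    row = block // (ndisks - 1)
    parity_disk = (ndisks - 1 - row) % ndisks   # Pa on disk 5, Pb on disk 4, ...
    disk = (parity_disk + 1 + block % (ndisks - 1)) % ndisks
    return row, disk

# Reprint the first three rows (blocks are 1-based in the diagram).
rows = [["--"] * 5 for _ in range(3)]
for b in range(12):
    r, d = locate(b)
    rows[r][d] = "%02d" % (b + 1)
for r in range(3):
    rows[r][(5 - 1 - r) % 5] = "P" + "abc"[r]
    print("|" + "|".join(rows[r]) + "|")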

In both cases RAID-5 uses a single parity segment within a group of disks. For the EVA that group is always 5 disks (4D+1P); for other controllers it can be a variable number of disks.

The array cannot reconstruct your example of 1a and 3a going bad. If more than one disk in a group goes bad, you have lost too much data. Do you recall how RAID-5 uses the XOR mechanism to build parity or rebuild data?

I'm not very good with math, sorry, but I think it is clear that with the EVA's mechanism of 'sub-groups' or 'sub-slices' or whatever we like to call them, we can afford to lose one disk per 'sub-group'.

There are some arrays (VA7xx0 and MSA1000) that can provide 'stronger' parity protection. On the VA7xx0 it is called RAID5DP (Double Protection, if I recall correctly) and on the MSA it is called ADG (Advanced Data Guarding).

Both arrays, however, do NOT simply duplicate the parity information on a second disk - you still cannot recover 2 lost data segments that way. I have read that they use 2 different mathematical algorithms to calculate P and Q. The formula was a little too big for my small brain - I didn't understand it, so I will leave it at that, OK?
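
For the curious, here is a rough Python sketch of the general P+Q idea - this is the textbook dual-parity construction over GF(2^8), purely my illustration, and not necessarily what the VA or MSA firmware really does. P is the plain XOR; Q weights each data segment by a distinct power of a generator, giving a second, independent equation, which is why two lost data segments become solvable:

def gf_mul(a, b):
    # Multiply two bytes in GF(2^8), polynomial x^8+x^4+x^3+x^2+1 (0x11d).
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        hi = a & 0x80
        a = (a << 1) & 0xFF
        if hi:
            a ^= 0x1D
    return p

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 254)   # a^254 is a's inverse in GF(2^8)

data = [0x11, 0x22, 0x33, 0x44]          # one byte per data disk
P, Q = 0, 0
for i, d in enumerate(data):
    P ^= d                               # plain XOR parity
    Q ^= gf_mul(gf_pow(2, i), d)         # Q weights disk i by 2^i

# Lose disks 0 and 2; recover both from the survivors plus P and Q.
i, j = 0, 2
A = P ^ data[1] ^ data[3]                # = d_i ^ d_j
B = Q ^ gf_mul(gf_pow(2, 1), data[1]) ^ gf_mul(gf_pow(2, 3), data[3])
# B = 2^i*d_i ^ 2^j*d_j; substitute d_j = A ^ d_i and solve for d_i.
d_i = gf_mul(B ^ gf_mul(gf_pow(2, j), A),
             gf_inv(gf_pow(2, i) ^ gf_pow(2, j)))
d_j = A ^ d_i
assert (d_i, d_j) == (data[0], data[2])
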
Mike_316
Frequent Advisor

Re: Raid5 with distributed parity layout and failure question

Thanks for the excellent reply. I am one of those sick individuals who likes both long-winded replies and brain-stifling mathematical formulas. The information was very helpful and got me on the right track!

Which basically means I have discovered that my array will not support RAID-6, nor does it have any possibility of surviving multiple drive loss. I can, however, daisy-chain a second (or third, fourth, etc.) array to it and use them as mirrors or redundant data locations.

Thanks again!
Mike
"If we treated each person we met as if they were carrying an unspeakable burden, we might treat each other as we should" - Dale Carnegie
Michael Schulte zur Sur
Honored Contributor

Re: Raid5 with distributed parity layout and failure question

Hi,

If you want to know more about RAID5DP, which is double parity, by the way, look here:
http://www.hp.com/products1/storage/products/disk_arrays/midrange/va7410/infolibrary/index.html

greetings,

Michael