
Unfortunate RSS Allocation

 
Mark Poeschl_2
Honored Contributor


I've added a 4th drive enclosure to a 2C3D EVA3000 so that we now have a fully populated 2C4D with 56 HDDs. We are running VCS 3.025.

When initially setting this 2C3D EVA up I allocated 16 spindles for a "log disk group" and made sure they were the physically farthest left spindles in the cabinet - so 5, 5, and 6 spindles per shelf in the "log disk group". The remaining spindles were allocated to the default disk group. All VDisks in both disk groups in this EVA are VRAID1. Unfortunately, I never looked too closely at how the RSS were laid out at that time.

After adding the 4th shelf, I wanted to retain just 16 spindles in the "log disk group" and also wanted to retain the most symmetrical left/right split. Therefore I took these steps:

1) Added the 4 leftmost spindles of the new shelf to "log disk group" and let leveling complete.

2) Ungrouped the 4 disks from bays 5 and 6 in the original shelves that had been in "log disk group" and let migration complete.

3) Added the (now) 14 ungrouped disks to the default disk group and let leveling complete.

After this was complete I was dismayed to find that rather than re-arranging existing RSS to make use of the new spindles, I basically had some new RSS with most of the new spindles in them. (See document attached). Note that when looking at the attachment the original enclosures were numbered 2, 3, and 4 and the new one is number 1.

The way I read this if I lose enclosure 1 my default disk group will be toast because RSS 8 can't possibly survive. I'm assuming that the controllers are smart enough to pair the VRAID1 redundancy in the "log disk group" so that an enclosure 1 failure doesn't have the same catastrophic consequences there.

Surely there would have been a better way of allocating the new spindles to RSS' than what happened. My questions are now:

- What, if anything, could I have done differently to get a better RSS allocation? Add 1 drive at a time to the default disk group? That seems awfully painful...

- What, short of ungrouping the whole thing and rebuilding, can I do to rectify this situation?
Uwe Zessin
Honored Contributor

Re: Unfortunate RSS Allocation

Mark,

unless I'm interpreting your picture incorrectly, RSSid 1, 2, 3, 4 + 7 cannot survive an enclosure failure either, because all of them have at least one disk-pair in a single enclosure.

When I upgraded a 2C3D to a 2C4D, I manually moved existing disk drives into the new enclosure to keep the RSS state intact. Then I put in the new disks, distributed over all 4 enclosures, and added them to the disk groups.


I am not sure if you can get out of this mess by ungroup/group. The allocation algorithms are unknown (to me at least). I would physically move the disk drives.
Mark Poeschl_2
Honored Contributor

Re: Unfortunate RSS Allocation

Thanks for the good news, Uwe ;-( But that's what I meant about hoping the controllers are smart enough: because everything is VRAID1 and no RSS except RSS 8 has more than half of its disks in the same enclosure, the other RSS should be able to survive an enclosure failure. For example, in RSS 1, if we assume that both halves of a mirrored data block are never stored in the same enclosure, even a failure of enclosure 1 is not fatal.
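
To make the "no more than half" condition concrete, here is a minimal Python sketch. The input format and the example layout are made up for illustration and are not taken from the attachment. The check is only a necessary condition, and it rests on the assumption (not confirmed anywhere in this thread) that the controllers never place both halves of a mirror in the same enclosure.

```python
from collections import Counter

def rss_at_risk(rss_layout):
    """Return (rss_id, enclosure) for every RSS where a single enclosure
    holds more than half of the member disks.

    rss_layout maps an RSS id to the list of enclosure numbers of its
    member disks (hypothetical input format).
    """
    at_risk = []
    for rss_id, enclosures in sorted(rss_layout.items()):
        # The enclosure that holds the most disks of this RSS.
        enclosure, count = Counter(enclosures).most_common(1)[0]
        if count * 2 > len(enclosures):
            at_risk.append((rss_id, enclosure))
    return at_risk

# Illustrative layout only, NOT the real one from the attachment.
example = {
    1: [1, 1, 1, 1, 2, 2, 3, 3, 4, 4],
    8: [1, 1, 1, 1, 1, 1, 2, 3],
}
print(rss_at_risk(example))
```

An RSS that passes this check can still be lost if the actual mirror pairing puts both members of a pair in one enclosure, so it only rules out the hopeless cases.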
Uwe Zessin
Honored Contributor

Re: Unfortunate RSS Allocation

Like I said: I might interpret your document wrong, but I read it that the last four disks in RSSid:1 (idx:6 enc:1, idx:7 enc:1 and idx:8 enc:1, idx:9 enc:1) form two mirrored pairs which are located in the same enclosure:1.
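
The reading above can be sketched in a few lines of Python. The assumption, which is mine and may not match what the controllers really do, is that the disks at index 2n and 2n+1 of an RSS form a mirrored pair. The tuple format is hypothetical; the four sample entries are the ones quoted above.

```python
def pairs_in_same_enclosure(disks):
    """Return (rss_id, idx_a, idx_b, enclosure) for each assumed mirror
    pair whose two disks sit in the same enclosure.

    disks is a list of (rss_id, idx, enclosure) tuples.
    ASSUMPTION: idx 2n and idx 2n+1 within one RSS form a pair.
    """
    by_slot = {(rss, idx): enc for rss, idx, enc in disks}
    bad = []
    for (rss, idx), enc in sorted(by_slot.items()):
        # Only look at the even index of each assumed pair.
        if idx % 2 == 0 and by_slot.get((rss, idx + 1)) == enc:
            bad.append((rss, idx, idx + 1, enc))
    return bad

# The four RSSid:1 entries mentioned above, all in enclosure 1.
disks = [(1, 6, 1), (1, 7, 1), (1, 8, 1), (1, 9, 1)]
print(pairs_in_same_enclosure(disks))
```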
Mark Poeschl_2
Honored Contributor

Re: Unfortunate RSS Allocation

Ah no - sorry for the confusion Uwe. I have no knowledge of which disks are "paired" with which other ones in the RSS for purposes of VRAID1 - or, for that matter, whether the "pairing" is on an entire-disk basis. Do you know a way to figure that out? The document DOES NOT show the disks in RSS:IDX order. Is that relevant to VRAID1 mirror configuration?

All the document is trying to show is which disks are in which RSS.
Uwe Zessin
Honored Contributor

Re: Unfortunate RSS Allocation

Mark,
I have put up some SSSU commands in this thread:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=958939

If you can come back with the output I'll try to run an analysis.
Mark Poeschl_2
Honored Contributor

Re: Unfortunate RSS Allocation

Here you go Uwe - thanks
Uwe Zessin
Honored Contributor
Solution

Re: Unfortunate RSS Allocation

Thanks, that's interesting data.

If you look at the SSSU output you sent me, you see that both disk groups have an RSS disk state of 'none'.
"rssdiskstate : none"

I am a bit confused by the enclosure numbering on your system. My software uses the disk drive's loop-id to calculate the physical position and so far it has worked well for EVA-3000s. On the other hand, the XML output seems to confirm that you have an enclosure numbered 1, strange.

As I cannot easily change my code to swap enclosure numbers, I am leaving it as-is - I am confident that you can make the translation rather easily.

If my guesses are correct, you should be able to 'fix' the RSS disk states with just 5 disk swaps as indicated in the middle of the attached report - enjoy!
Mark Poeschl_2
Honored Contributor

Re: Unfortunate RSS Allocation

Thanks Uwe - that's a pretty cool tool! Yes - our enclosures are definitely numbered 1 through 4 as seen in the LEDs on the back. I'm assuming your enclosure 6 is my enclosure 1.

Disk swaps such as this would need to be done cold - correct?
Uwe Zessin
Honored Contributor

Re: Unfortunate RSS Allocation

Well, if you like a little adventure, you can try to do it online. The default disk replacement delay is 1 minute, but you can temporarily increase it.

I have checked and it looks like all suggested swaps except one (E:6 B:1 - E:4 B:1) contain disks from different RSSIDs, so a dual disk failure should not hurt in those cases. You can replace E:4 B:1 with B:2 and should be safe.

Here is the new table (cut&paste, replace '.' with SPACE and view in a non-proportional font ;-)

**********.move.disks
|Disk.055=BD30058232.|Disk.035=BD30058232.|
|2000-0014-C301-EE30.|2000-0011-C631-D7BE.|
|R08:07..............|Q04:03..............|
|E:06.B:11->E:04.B:11|E:04.B:06->E:06.B:05|
|Disk.044=BD30058226.|Disk.037=BD30058232.|
|2000-0000-877E-4021.|2000-0011-C631-E6CE.|
|R08:04..............|R06:01..............|
|E:06.B:05->E:04.B:06|E:04.B:03->E:06.B:03|
|Disk.045=BD30058226.|Disk.046=BD30058226.|
|2000-0000-877E-4A21.|2000-0000-877E-4028.|
|R01:10..............|Q01:04..............|
|E:06.B:03->E:04.B:03|E:06.B:01->E:04.B:02|
|Disk.011=BD30058232.|Disk.017=BD30058232.|
|2000-0011-C631-D92E.|2000-0011-C631-E740.|
|R02:04..............|R04:01..............|
|E:04.B:11->E:06.B:11|E:04.B:07->E:06.B:07|
|Disk.051=BD30058226.|Disk.004=BD30058232.|
|2000-0000-877A-E8C7.|2000-0011-C631-E71A.|
|R08:02..............|R06:03..............|
|E:06.B:07->E:04.B:07|E:04.B:02->E:06.B:01|
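
If you would rather not read the move lines by eye, a small Python sketch like the one below can pull the source and target positions out of the table. It assumes the "E:nn.B:nn->E:nn.B:nn" layout used above, with E as the enclosure and B as the bay; the function name is made up, and only the first two move rows are pasted in as sample input.

```python
import re

# One move looks like "E:06.B:11->E:04.B:11" ('.' stands in for a space).
MOVE = re.compile(r"E:(\d+)\.B:(\d+)->E:(\d+)\.B:(\d+)")

def parse_moves(table_text):
    """Return a list of ((src_encl, src_bay), (dst_encl, dst_bay))."""
    moves = []
    for m in MOVE.finditer(table_text):
        src_e, src_b, dst_e, dst_b = map(int, m.groups())
        moves.append(((src_e, src_b), (dst_e, dst_b)))
    return moves

# First two move rows of the table, copied as-is.
rows = """\
|E:06.B:11->E:04.B:11|E:04.B:06->E:06.B:05|
|E:06.B:05->E:04.B:06|E:04.B:03->E:06.B:03|
"""
for src, dst in parse_moves(rows):
    print(f"enclosure {src[0]} bay {src[1]} -> enclosure {dst[0]} bay {dst[1]}")
```

Remember that enclosure 6 in this table corresponds to enclosure 1 on the cabinet.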