Re: EVA Best Practice (RSS, Balancing Loops and drive numbers)

compaqact · ‎06-03-2009

Hi Everybody,

I would like to put to bed something that has been a burning question in my company for quite some time now. I finally have the supporting evidence to ask the appropriate questions and, hopefully, find the answers.

The EVA best practice guide for the 4x00/6x00/8x00 states the following:
1) Keep the number of disks in each disk group to a multiple of 8.
2) Let the EVA choose which disks are to be added into the relevant disk groups.
3) The number of disks per drive enclosure should not differ by more than one.
4) Try to load balance the number of disks within the loops.
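
Rules 1 and 3 in the list above are simple arithmetic checks, so they can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary inputs and the example numbers are assumptions made for the sketch, not output from SSSU or any HP tool.

```python
def check_layout(disk_group_sizes, disks_per_enclosure):
    """Return warnings for rule 1 (multiple of 8) and rule 3 (spread <= 1).

    Both arguments are hypothetical dictionaries built by hand:
    disk group name -> disk count, and enclosure name -> disk count.
    """
    warnings = []
    for name, size in sorted(disk_group_sizes.items()):
        if size % 8 != 0:
            warnings.append(f"{name}: {size} disks is not a multiple of 8")
    counts = list(disks_per_enclosure.values())
    if counts:
        spread = max(counts) - min(counts)
        if spread > 1:
            warnings.append(f"enclosures: disk counts differ by {spread}")
    return warnings


# Made-up example: 18 shelves, one of them short by three disks.
groups = {"DG01": 48, "DG02": 50}
shelves = {f"E{n:02d}": 8 for n in range(1, 19)}
shelves["E18"] = 5

for warning in check_layout(groups, shelves):
    print(warning)
```

Rules 2 and 4 are not covered here, because they depend on how the array itself assigns disks and on the loop cabling.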

Four EVAs have recently been installed at two separate customers of ours. All are EVA 8100 2C18D arrays, with 144 disks in one pair and 200 disks in the other. Let’s discount the fact that the EVAs have room for expansion and imagine that they will never have another disk added.

On the first pair of EVAs I spent quite some time balancing the number of disks per loop, drive enclosure and disk group. On the second I followed HP's best practice: I inserted all of the disks prior to initialization and then carved out the disk groups. It should be noted that both arrays have 4 disk groups, which are split by I/O profile and vRaid type.

I have just analysed the results of the SSSU capture of both arrays and I find something a bit puzzling. On the arrays where I spent some time balancing everything manually, one array has no drives with shelves in common and the other has 2 drives with shelves in common. However, where the EVA was left to decide where the disk groups would be carved out, the following can be found:
RSS #3 has 4 drives that have shelves in common.
RSS #5 has 4 drives that have shelves in common.
RSS #6 has 4 drives that have shelves in common.
RSS #7 has 4 drives that have shelves in common.
RSS #12 has 4 drives that have shelves in common.
RSS #15 has 6 drives that have shelves in common.
RSS #16 has 6 drives that have shelves in common.

Now how come there is such a difference between the two?
What have the EVAs done differently?
Why is this so?
Should I be worried about it?
Should I leave the EVAs to their own devices every time?
Does anybody have a technical reason for one method versus the other? I am looking for some deep-dive technical reasons and not just an "HP says so".

I can provide any RSS maps, warnings etc if anybody would like to see them.

Many Thanks

Andrew

IBaltay · ‎06-03-2009

Hi, in general, SSSU is reporting that more than one disk of an RSS sits in the same enclosure. It can be repaired by ungrouping and regrouping the specific RSS members by swap, but that can take some time. From the data accessibility point of view it is always good to have all RSS members in different disk enclosures; with that design you can lose a whole enclosure without downtime. Unfortunately the internal RSS algorithm is not self-repairing, and after e.g. disk failures the layout can be broken. That's why it is good to check the RSS verticality from time to time and do the repair in a systematic way. At the same time, it is true that a whole-enclosure failure is very rare, so you need not panic: you can create a map of all RSSs on each EVA and then prepare the ungroup/regroup swap procedures.
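
The verticality check described above amounts to grouping disks by RSS and looking for an enclosure that appears more than once. A minimal Python sketch follows, assuming the disk inventory has already been reduced by hand to (disk name, RSS id, enclosure number) tuples. That tuple layout, the function name and the sample data are assumptions for illustration; this is not a parser for real SSSU output.

```python
from collections import defaultdict


def shared_enclosure_members(disks):
    """Map each RSS id to the disks that share an enclosure with another member.

    disks: iterable of (disk_name, rss_id, enclosure) tuples (hypothetical format).
    RSSs whose members all sit in different enclosures are left out of the result.
    """
    per_rss = defaultdict(lambda: defaultdict(list))
    for name, rss_id, enclosure in disks:
        per_rss[rss_id][enclosure].append(name)

    report = {}
    for rss_id, by_enclosure in per_rss.items():
        shared = sorted(
            name
            for names in by_enclosure.values()
            if len(names) > 1
            for name in names
        )
        if shared:
            report[rss_id] = shared
    return report


# Made-up inventory: RSS 3 has two members in enclosure 1, RSS 4 is clean.
inventory = [
    ("Disk 001", 3, 1),
    ("Disk 002", 3, 1),
    ("Disk 003", 3, 2),
    ("Disk 004", 4, 1),
    ("Disk 005", 4, 2),
]

for rss_id, names in sorted(shared_enclosure_members(inventory).items()):
    print(f"RSS #{rss_id} has {len(names)} drives that have shelves in common: {names}")
```

The resulting map is also a starting point for planning the ungroup/regroup swaps, since it names the specific disks that would have to move.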

the pain is one part of the reality

Uwe Zessin · ‎06-03-2009

Could you provide a "ls disk full XML" as an attachment (.TXT in .ZIP would be great), so I can run my own analyzer on it?

.

compaqact · ‎06-04-2009

Hi Uwe,

Please find attached the LS DISK FULL output for 3 of the arrays. At present I don't have the scans for the fourth on hand. Please ignore the 23 FATA disks in EVA 2 Pair 2, as they are temporarily housed on this EVA until a new array can be procured, and one disk has failed.

Uwe Zessin · ‎06-04-2009

Thanks for the data. I haven't touched my code for some time, but the data was processed without a single hiccup.

Looks like EVA003 made a major screw-up on DG04!! Has the disk group been expanded at one time?

You also have a duplicate disk name (022) in EVA001 at E:01/B:02 + E:18/B:02 and a 'parity state' degradation.

EVA002 looks fine.
Reports are attached - enjoy.

Please ignore the "NewFW" column in the disk listing - I have not kept the tables current.

.

compaqact · ‎06-04-2009

Thanks Uwe,

Funnily enough, that disk group wasn't expanded... I'm debating uninitializing that array again (it's been troublesome since I turned it on about 10 days ago).

Based on your findings, do you know why EVAs do this? And do you have any answers to my questions above?

PS: What application are you working with? Is that a Navigator report?

Uwe Zessin · ‎06-05-2009

I say "the RSS code simply does not cut it".

More than 4 years ago I heard rumors that engineering was thinking about code enhancements to allow 'automatic RSS repairs', but apparently that never happened, and even the basic code is far from perfect. Here on the ITRC forums I've read that HP gives no support if somebody attempts to fix it on their own - sorry, I don't have a pointer to the thread. Maybe you can ask HP for an official statement.

The 'application' does not have a name. Well, it is just a set of APIs without a nice user interface. It is based on 9- to 10-year-old code I wrote to analyze / model HSG-based storage arrays.

.
