HPE EVA Storage

EVA 5000 RSS Disk Problem

 
Squick_1
Occasional Advisor

EVA 5000 RSS Disk Problem

Hi

I have inherited an EVA 5000, and a recent survey of it found that 7 of the 12 shelves hold more than one disk from the same RSS.

The EVA is fully populated, but it seems that whoever added the disks didn't follow best practices, and my boss would now like them moved. I understand only a little about RSS, which is dangerous in itself, and would like a bit of advice on how to proceed.

We contacted HP, and they advised us that we would need to turn the EVA off (which is not going to happen soon, because it hosts a critical 24/7 application) and upgrade Command View to 7 and VCS to 4.

A SAN consultant advised us that we could ungroup the disks that sit in the problem RSSs and then group them again, and the firmware would sort it out.

To make matters worse, there is an RSS with 6 disks while the others all have 8. Would the firmware move disks dynamically, or would it just stay at 7 until a new disk was grouped and add that to the RSS with 7 disks? Hopefully that hasn't sounded like complete garbage. Attached is an Excel spreadsheet that shows the SAN survey and the problem disks. It doesn't have any macros and is virus free!
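
The survey check described above (shelves holding more than one member of the same RSS) can be sketched in a few lines of Python. This is only a minimal illustration: the row layout `(disk name, shelf, RSS id)` and all sample values are assumptions made up for the example, not data from the attached spreadsheet.

```python
from collections import defaultdict

# Hypothetical survey rows: (disk_name, shelf_number, rss_id).
# Made-up values for illustration only, not the real survey.
survey = [
    ("Disk 001", 1, 0),
    ("Disk 002", 1, 0),
    ("Disk 003", 2, 0),
    ("Disk 004", 2, 1),
    ("Disk 005", 3, 1),
]

def shelves_with_shared_rss(rows):
    """Return {(shelf, rss_id): [disk names]} for each shelf holding 2+ members of one RSS."""
    members = defaultdict(list)
    for name, shelf, rss_id in rows:
        members[(shelf, rss_id)].append(name)
    return {key: names for key, names in members.items() if len(names) > 1}

conflicts = shelves_with_shared_rss(survey)
for (shelf, rss_id), names in sorted(conflicts.items()):
    print(f"shelf {shelf}: RSS {rss_id} has {len(names)} members: {', '.join(names)}")
```

Any shelf/RSS combination that shows up in `conflicts` is one that a single shelf loss would hit more than once.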
Uwe Zessin
Honored Contributor

Re: EVA 5000 RSS Disk Problem

Squick,
can you provide the output of SSSU's "show disk full" or "ls disk full XML", store it in a .TXT file, .ZIP it up and attach it with a response? I'd like to take a look at it with my own utility.

> The EVA is fully populated, but it seems that whoever added the disks didn't follow best practices

Well, that might not be correct. The EVA can screw up RSS configurations quite easily and is NOT able to fix it. An ungroup/group might work, but it also might make the situation worse.
.
Squick_1
Occasional Advisor

Re: EVA 5000 RSS Disk Problem

Thanks for the reply, is this what you wanted?
Uwe Zessin
Honored Contributor

Re: EVA 5000 RSS Disk Problem

Almost, can you provide the XML version, please?
I can only parse 'show' or 'ls ... XML' outputs.
.
Squick_1
Occasional Advisor

Re: EVA 5000 RSS Disk Problem

Xml output attached.
Uwe Zessin
Honored Contributor
Solution

Re: EVA 5000 RSS Disk Problem

Very interesting...

The system has two duplicate names:
- Disk 093
- Disk 127

You can easily rename one disk of each duplicate pair.

Please ignore the "NewFW" column in the disk drive table, I have not updated the table in a while.


It looks like the controller firmware screwed up an RSS a little.
There is a 'hole' in the index numbers of ID:003, but it does not cause an RSS state degradation.

> To make matters worse there is a rss with 6 disks while the others all have 8.

True, but that is how the EVA works.
It does not make sense to move one disk drive from an 8-member RSS to the 6-member RSS.
If the EVA did that, it could not store any VRAID-1 data on the last disk drive of either of the two resulting 7-member RSSes.
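
The arithmetic behind this can be written out as a small Python sketch. It assumes, as the explanation above implies, that VRAID-1 mirrors disks two at a time within one RSS, so an odd member count leaves one disk without a partner; the function name is made up for the example.

```python
def mirror_pairs(rss_size):
    """Return (pairs, unpaired), assuming VRAID-1 pairs disks two at a time inside one RSS."""
    return rss_size // 2, rss_size % 2

# Compare the existing 8 + 6 split with a rebalanced 7 + 7 split.
for sizes in ([8, 6], [7, 7]):
    unpaired = sum(mirror_pairs(n)[1] for n in sizes)
    print(sizes, "unpaired disks:", unpaired)
```

Under that assumption the 8 + 6 layout pairs up all 14 disks, while 7 + 7 leaves one disk per RSS without a mirror partner.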

Hope it helps.
.
Squick_1
Occasional Advisor

Re: EVA 5000 RSS Disk Problem

I am having a problem opening the zip file. It says it is a corrupt archive.
Uwe Zessin
Honored Contributor

Re: EVA 5000 RSS Disk Problem

Hm... works for me - I used the function built into Windows:
Send to - Compressed (zipped) folder.
.
Del_3
Trusted Contributor

Re: EVA 5000 RSS Disk Problem

Moving disks in an EVA is always a risk. You must ask yourself why. The only practical reason I can think of is that you think you will lose an entire shelf - DUAL switch loops, DUAL power supplies, DUAL blowers all at the same time. From an engineering standpoint the odds have got to be almost 0 - even on an EVA.

But it (RSS aligning) can be done. It takes downtime (risky) and then moving cold disks from one bay to another (risky*3).

We have covered the steps several times here before.

BTW HP will not officially support this, and they no longer even report the RSS parity status on the newer EVAs.

BTW the CV 7 and VCS 4 upgrade has nothing to do with the alignment. I have no idea what they are thinking.
Tom O'Toole
Respected Contributor

Re: EVA 5000 RSS Disk Problem

I agree with Del - the cost/benefit of fixing it is not as good as the cost/benefit of leaving it. I have done the live group/ungroup several times and it's time intensive and scary: you must pick each pair very carefully to make sure you are not creating the same problem as you fix it. Never choose to ungroup a drive in an RSS of 6: the rest of the drives will merge with a new RSS and renumbering will occur that's out of your control. You must move a drive to a temporary slot and make sure it picks up the same rssid/rssindex as it had before. Then move the second drive to the spot vacated (where it picks up the same id/index again). Finally, move the temporary drive to the spot that was occupied by the 2nd drive. It SUCKS.
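
The three-move order described above is the same as swapping two values through a temporary variable. The Python sketch below only models the bookkeeping, assuming an empty bay is available as the temporary slot; the slot labels and disk names are hypothetical, and it does not talk to an EVA or check the rssid/rssindex for you.

```python
def swap_via_spare(layout, slot_a, slot_b, spare):
    """Swap the drives in slot_a and slot_b through an empty spare slot; return the moves."""
    if layout.get(spare) is not None:
        raise ValueError("spare slot must be empty")
    moves = []
    # Order matters: first drive to spare, second drive to the vacated slot,
    # then the first drive from the spare to where the second drive was.
    for src, dst in ((slot_a, spare), (slot_b, slot_a), (spare, slot_b)):
        layout[dst] = layout[src]
        layout[src] = None
        moves.append((layout[dst], src, dst))
    return moves

# Hypothetical layout keyed by "shelf/bay"; None marks an empty bay.
layout = {"3/4": "Disk 020", "7/9": "Disk 055", "12/14": None}
for disk, src, dst in swap_via_spare(layout, "3/4", "7/9", "12/14"):
    print(f"move {disk}: {src} -> {dst}  (confirm RSS id/index is unchanged before continuing)")
```

Each printed move corresponds to one physical step, with the id/index check in between left to the operator.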

It does work, but over time with disk failures, automatically ungrouping drives, etc... you will probably start degrading again.

What happens in the extremely rare case that you DO lose a shelf containing more than one drive in an RSS? The RSS goes bad, but as long as the OS stalls correctly (e.g. VMS), you don't lose any data: if you quickly fix the shelf, you will get all your data back. (What would happen to cause a shelf loss here, where everything is redundant? Maybe you have a bad loop which is scheduled for service and - bad luck - the I/O module on the other loop fails. Maybe you could steal one from a shelf on the other loop pair to fix?)
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Uwe Zessin
Honored Contributor

Re: EVA 5000 RSS Disk Problem

> Maybe you could steal one from a shelf on the other loop pair to fix?)

That is not possible as the A and B I/O modules are not interchangeable.
.
Rob Leadbeater
Honored Contributor

Re: EVA 5000 RSS Disk Problem

Hi,

> what would happen to cause a shelf loss
> here where everything is redundant?

I've never had the misfortune/opportunity to test this on an EVA, but I have had a (self induced) shelf failure on an HSG80 based system with 4300 disk shelves.

Whilst troubleshooting a problem, I had one of the power supplies removed for more than a few minutes, which causes the whole shelf to turn off.

As the same power supply/fan modules are used in an EVA, the same may well happen. I've got access to an EVA to play with, so I'll try it out if I get a few minutes...

Cheers,

Rob
Rob Leadbeater
Honored Contributor

Re: EVA 5000 RSS Disk Problem

> We contacted HP and they advised us that we would need to
> upgrade the command view to 7 and the VCS to 4.

Unfortunately, the stock response from HP support always seems to be to get everything to the latest version before even looking at the problem. I've had arguments with first level support on that issue. Fortunately, the field service guys seem to take a more logical approach to things.

What you probably should do, is to ensure that you are on a supported version of VCS - 3.028 is still current at the moment - although only till March. There's no reason whatsoever to move to v4 VCS or CV EVA 7.

Cheers,

Rob
Tom O'Toole
Respected Contributor

Re: EVA 5000 RSS Disk Problem


That's why I'm saying steal one from the other loop pair. Both loops of the other loop pair are assumed to be up in this scenario, so you can steal a module of the same type as the failed one from a shelf there, and the stolen-from shelf will still have a functioning loop. NOT that this is a great scenario :-)
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
IBaltay
Honored Contributor

Re: EVA 5000 RSS Disk Problem

Hi,
1. The quickest way to get a vertical RSS layout is really to take the EVA offline and prepare the reshuffle map in advance.
That way the RSS algorithm is switched off and you avoid unneeded RSS merge/split operations. After the relocation you can have nicely vertically spread RSSs with only one member of each in any one enclosure.
This is always applicable, but it is worth considering mainly if you have many disks in unwanted positions.


2. The other method is to find possible pairs for a mutual exchange, ungroup them, and then group them again. The rule here is to group the RSS with the lower number first. Otherwise there is a potential "risk" of an RSS split/merge and the creation of an RSS group with 9-11 disks. This method is online, but very time consuming and sometimes unpredictable.
This is applicable mainly in situations where only a few RSS groups need to be relocated.
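
Finding candidate pairs for method 2 can be roughed out in Python. This is a sketch under stated assumptions, not how the firmware works: the disk table is made up, each disk is assumed to stay in its bay and simply take over the other disk's RSS membership, and the grouping-order rule and any split/merge behaviour are not modelled.

```python
from collections import Counter
from itertools import combinations

# Hypothetical disks: name -> (shelf, rss_id). Made-up values, not the real survey.
disks = {
    "Disk 010": (1, 0),
    "Disk 011": (1, 0),  # second member of RSS 0 on shelf 1: a conflict
    "Disk 012": (2, 1),
    "Disk 013": (2, 1),  # second member of RSS 1 on shelf 2: a conflict
    "Disk 014": (3, 0),
}

def exchange_candidates(disks):
    """List (disk_a, disk_b) pairs whose RSS exchange removes a conflict and creates none."""
    count = Counter(disks.values())  # (shelf, rss_id) -> number of members
    result = []
    for a, b in combinations(sorted(disks), 2):
        (shelf_a, rss_a), (shelf_b, rss_b) = disks[a], disks[b]
        if shelf_a == shelf_b or rss_a == rss_b:
            continue  # exchanging these two changes nothing useful
        fixes = count[(shelf_a, rss_a)] > 1 or count[(shelf_b, rss_b)] > 1
        # After the exchange, a belongs to rss_b and b to rss_a, each still on its own shelf.
        creates = count[(shelf_a, rss_b)] > 0 or count[(shelf_b, rss_a)] > 0
        if fixes and not creates:
            result.append((a, b))
    return result

pairs = exchange_candidates(disks)
for a, b in pairs:
    print(f"candidate exchange: {a} <-> {b}")
```

The output is only a shortlist; each pair still has to be checked by hand against the real RSS ids and indexes before anything is ungrouped.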
the pain is one part of the reality
Squick_1
Occasional Advisor

Re: EVA 5000 RSS Disk Problem

Many thanks for all the help. In the end we decided to take the EVA offline once we have moved the current live service to its new location, and then start from scratch.
Squick_1
Occasional Advisor

Re: EVA 5000 RSS Disk Problem

Case closed. Eventually it was easier to start again than to fix the problem with any degree of certainty.