Re: 2 x EVA 4400 Questions

UselessUser · ‎01-27-2009

Hi,

Can someone explain to me the protection level of the EVA (Double, Single or None)?

I understand how an EVA spreads the vdisk data across all disks in the disk group but I am unsure as to why a single protection level requires 2 x largest spindle and double requires 4 x largest spindle??

Also we are getting a lot of these two events very close to each other... should I log?!

--HSV Controller Event--

Date/Time Stamp: 27-Jan-2009/12:44:21
StorageCell WWN - Name: 5001-4380-013C-0150 AGEVA
Event Type: HSV Controller Event
Event Code and Description: 0e13ca19: CAC=ca - System Data Center Services Status Change Status of a System Data Center element has changed. Battery module cannot hold up its cache.

--HSV Controller Event--

Date/Time Stamp: 27-Jan-2009/12:44:21
StorageCell WWN - Name: 5001-4380-013C-0150 AGEVA
Event Type: HSV Controller Event
Event Code and Description: 0e110019: CAC=00 - System Data Center Services Status Change Status of a System Data Center element has changed. Battery module operating normally.

IBaltay · ‎01-27-2009

Hi,

A Disk protection level

the protection levels mean the following:
NONE - no virtual spare drive
SINGLE - 2 virtual spare drives
DOUBLE - 4 virtual spare drives

Battery problem
Keep an eye on this to see whether it was a transient problem. If it recurs, it may be a power problem inside the rack. (When a controller's cache is not battery-backed, cache write-back I/O is switched to the second controller only; there is no cache mirror, so that controller could be overloaded and the system degraded at that time.)

the pain is one part of the reality

UselessUser · ‎01-27-2009

Hi,

I don't understand this??

the protection levels mean the following:
NONE - no virtual spare drive
SINGLE - 2 virtual spare drives
DOUBLE - 4 virtual spare drives

Basically in my limited and I am sure stupid way of thinking, if I clicked single, it would be the equivalent of saying I would like 1 hot spare drive. If I clicked double, it would be like saying I want 2 hot spare drives..

So why does the EVA retain 2 drives for single and 4 for double?

Víctor Cespón · ‎01-27-2009

The explanation is RAID 1 vdisks. If a disk fails and there are vdisks in RAID 1, you'll have a disk that has no partner, so no redundancy for that data.
To avoid this, when a disk fails, the data on that disk and on its partner disk is moved to the remaining disks in the disk group, so there are again two copies of all data.
To ensure that it can do this even if the disks are completely filled with RAID 1 data, it reserves the equivalent of 2 disks.

Disk failure protection = double reserves enough spare space to allow for 2 disks failing (not at the same time; the rebuild process must complete between failures).
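The rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic discussed in this thread, not HP's actual algorithm; the names `SPARE_DISKS` and `reserved_capacity_gb` are made up for the example.

```python
# Sketch of the reservation rule described in this thread:
# "single" reserves the equivalent of 2 of the largest disks, "double" 4.
# Names are illustrative only, not part of any HP tool.
SPARE_DISKS = {"none": 0, "single": 2, "double": 4}

def reserved_capacity_gb(disk_sizes_gb, level):
    """Return the capacity (GB) set aside for rebuilds in one disk group."""
    return SPARE_DISKS[level] * max(disk_sizes_gb)

print(reserved_capacity_gb([300] * 24, "single"))   # 600
print(reserved_capacity_gb([1000] * 8, "double"))   # 4000
```

The reservation depends only on the largest disk in the group and the chosen level, not on how many disks the group holds.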

The messages about the battery having no charge and then, in the same second, having charge are a known bug fixed in firmware 09006000.

UselessUser · ‎02-02-2009

Hi,

I am going to be honest and say I still do not understand this process of disk allocation for single or double protection? Can someone please spell it out for me... I appreciate I am thick!!

Also I am trying to work out how the capacities are determined...

I have one disk group which has 24 300GB SAS disks which shows a capacity of 5583.4GB and another disk group which has 8 1TB FATA disks which has a capacity of 3723.78...

How does this work out when technically the FATA disk group has more space but less is available??

Steven Clementi · ‎02-02-2009

"I have one disk group which has 24 300GB SAS disks which shows a capacity of 5583.4GB and another disk group which has 8 1TB FATA disks which has a capacity of 3723.78...

How does this work out when technically the FATA disk group has more space but less is available??"

First question... what IS your Disk protection level for both of these groups?

For these calculations, I will use hard numbers (sales numbers, not actual numbers).

24*300
No Protection:
vRAID0 = 7200GB
vRAID5 = 5760GB
vRAID1 = 3600GB

Single Protection:
vRAID0 = 6600GB
vRAID5 = 5280GB
vRAID1 = 3300GB

Double Protection:
vRAID0 = 6000GB
vRAID5 = 4800GB
vRAID1 = 3000GB

8*1000
No Protection:
vRAID0 = 8000GB
vRAID5 = 6400GB
vRAID1 = 4000GB

Single Protection:
vRAID0 = 6000GB
vRAID5 = 4800GB
vRAID1 = 3000GB

Double Protection:
vRAID0 = 4000GB
vRAID5 = 3200GB
vRAID1 = 2000GB

"How does this work out when technically the FATA disk group has more space but less is available??"

Your protection is based on larger disks. At the very most, you will reserve 1.2TB in the 300GB disk group.

In the 1TB disk group, the very most you can reserve is 4TB

Even with Single protection, you are reserving 600GB vs. 2TB.
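For anyone who wants to reproduce the figures above, here is a minimal Python sketch. It assumes the simplified model used in this post (sales gigabytes, reservation of 0/2/4 disks, vRAID5 at 4/5 and vRAID1 at 1/2 of what remains); the function and table names are hypothetical.

```python
# Simplified capacity model using "sales" gigabytes, as in the figures above.
# Assumptions: reservation = 0/2/4 x disk size; vRAID5 keeps 4/5, vRAID1 keeps 1/2.
SPARE_DISKS = {"none": 0, "single": 2, "double": 4}

def usable_capacity_gb(n_disks, disk_gb, level):
    """Return usable GB per vRAID type for a group of identical disks."""
    raw = (n_disks - SPARE_DISKS[level]) * disk_gb
    return {"vRAID0": raw, "vRAID5": raw * 4 // 5, "vRAID1": raw // 2}

for n_disks, disk_gb in ((24, 300), (8, 1000)):
    for level in ("none", "single", "double"):
        print(n_disks, "x", disk_gb, level, usable_capacity_gb(n_disks, disk_gb, level))
```

With double protection, the 8 x 1000GB group ends up with 4000GB of vRAID0 space, while the 24 x 300GB group keeps 6000GB, which is why the group with more raw capacity shows less available space.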

Steven

Steven Clementi
HP Master ASE, Storage, Servers, and Clustering
MCSE (NT 4.0, W2K, W2K3)
VCP (ESX2, Vi3, vSphere4, vSphere5, vSphere 6.x)
RHCE
NPP3 (Nutanix Platform Professional)

Uwe Zessin · ‎02-02-2009

Looks like you are running each disk group with the so-called 'protection level': double.

In that case, the EVA reserves 4x the size of the largest disk in the disk group for data rebuilds and distributes the capacity across all physical disk drives in the group.
Think of it as the capacity of 4 disk drives being 'rotated' by 90 degrees and 'striped' across all disk drives.

Why does the FATA group have less free capacity? A larger reservation, because a single drive is sooo much larger:
4*300GB = 1,200GB
4*1000GB = 4,000GB

((24-4)*300 HWGB)/1.024/1.024/1.024 ~= 5587 SWGB

((8-2)*1000 HWGB)/1.024/1.024/1.024 ~= 3725 SWGB

HWGB is 'HardWare GigaBytes' (number of blocks on the disk drive * 512 Byte/block divided by 1,000,000,000)

SWGB is 'Software GigaBytes' - Command View EVA divides by 1,073,741,824 = 1024 * 1024 * 1024

(1024 = 2^10)

The EVA uses Fibre Channel and FATA disk drives - no SAS as of today.

.

Uwe Zessin · ‎02-02-2009

Sorry, cut&paste error:
((8-4)*1000 HWGB)/1.024/1.024/1.024 ~= 3725 SWGB
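The conversion can be checked with a short Python sketch. It assumes both groups run with 'protection level' double (4 reserved disks) and uses nominal disk sizes; the function names are made up for illustration.

```python
# Convert "hardware" gigabytes (10^9 bytes) to "software" gigabytes (2^30 bytes),
# then estimate the capacity a disk group shows after the reservation is taken out.
def hwgb_to_swgb(hwgb):
    return hwgb * 10**9 / 2**30

def group_capacity_swgb(n_disks, disk_hwgb, spare_disks):
    """Capacity in SWGB left after reserving `spare_disks` disk equivalents."""
    return hwgb_to_swgb((n_disks - spare_disks) * disk_hwgb)

print(round(group_capacity_swgb(24, 300, 4), 1))   # 5587.9
print(round(group_capacity_swgb(8, 1000, 4), 1))   # 3725.3
```

Both estimates land close to the 5583.4 and 3723.78 shown for the two disk groups; the estimates are based on nominal sizes, so a small difference is to be expected.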

.

UselessUser · ‎02-03-2009

Hi,

Thank you for your replies...

I feel like I have finally got the capacity part...

Although I now have 2 further questions...!!

I understand the RAID 5 space calculation from here:

http://forums13.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1233650243409+28353475&threadId=855601

Now I was under the impression normal RAID 5 capacity was 2/3 of disk space... why is the EVA using a different mechanism (4D + 1P) and is this shown in the EVA documentation anywhere?!

Now I am still trying to work out why the single protection requires 2 drives and the double requires 4?

Also where can I find the 0906000 firmware download... all I can find is the 0904000 which I am currently running?!

Víctor Cespón · ‎02-03-2009

I don't think it's documented anywhere public, and recently even the RSS state was removed from XCS and Command View, so customers can no longer see the distribution of the RSS in the disk group.
Internally the data for a vRAID 5 logical drive is segmented and stored as 4 data segments with one parity segment. And these segments are written on groups of disks (usually 8 disks, which form an RSS).

The issue of "disk failure protection" = single requiring 2 drives is because if a disk fails and there are logical drives in vRAID 1, you need to move the data that was on the failed disk and its partner to the remaining disks, so there are again two copies of all data.
To be able to do this even if the disk group is all vRAID 1, the controllers reserve twice the size of the biggest disk in the disk group.

Firmware 09006000 is here:

http://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=T4256-63119

Steven Clementi · ‎02-03-2009

basically what happens is...

D1 paired D5 (data)
D2 paired D6 (data)
D3 paired D7 (data)
D4 paired D8 (reserved for single protection)

If you have a vRAID1 disk allocated.. and D1 fails... it moves/copies the data from D1/D5 to D4/D8.

Steven


Uwe Zessin · ‎02-03-2009

> Now I was under the impression normal RAID 5 capacity was 2/3 of disk space...

No, that depends completely on the array implementation. Some traditional RAID arrays can do between 2/3 and 13/14, depending on the number of disk drives (3 to 14 in this case).

The EVA 'virtualizes' 4D+1P RAID-5 and distributes it onto different disk drives within the disk group. This even works in a disk group with 240 disk drives. In that case the data rebuild for a failed disk drive is MUCH faster than rebuilding a 239D+1P array.

Another reason for smaller RAID-5 chunks is the (potential) ability to withstand multiple concurrent disk drive failures within a disk group. It is not a true RAID-6, but the disk group is divided into multiple 'failure domains' - the RSS (Redundant Storage Set) that vcespon has mentioned - each of which contains isolated RAID protection.
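The capacity fractions mentioned above follow from a simple ratio: an nD+1P stripe stores n data segments for every n+1 segments written. A small Python sketch (the function name is made up for the example):

```python
from fractions import Fraction

def raid5_efficiency(data_disks):
    """Usable fraction of an nD+1P RAID-5 stripe."""
    return Fraction(data_disks, data_disks + 1)

# 3-disk set, the EVA's 4D+1P layout, and a 14-disk set
for d in (2, 4, 13):
    print(f"{d}D+1P -> {raid5_efficiency(d)}")
# 2D+1P -> 2/3
# 4D+1P -> 4/5
# 13D+1P -> 13/14
```

So the 2/3 figure is just the 3-disk case; the EVA's 4D+1P layout gives 4/5 regardless of how many disks are in the disk group.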

.

UselessUser · ‎02-04-2009

Hi,

I can not believe I am saying this but I finally get it!!

May I ask why the "double" protection cannot sustain 2 drive failures before 1 has finished rebuilding?!

And as put in another post I have HP SIM running on a VM... how do I get it to see the EVA..

At the moment it just shows up as unmanaged!

Finally I had a disk fail, which I logged with HP and they sent a replacement... the problem is the new one has finished levelling but now says attention..

It does not have one of its loop ports connected... I have tried reseating it but no joy...

Does this mean only one controller can see the drive (as in, if this controller fails the other controller can't) or is there some internal switching mechanism, and the failure of one port simply means it cannot be accessed by both controllers simultaneously?!

Víctor Cespón · ‎02-04-2009

Even with disk failure protection = double, you cannot sustain 2 **simultaneous** drive failures, because if they are in the same RSS you'll lose vRAID 5 data (a rebuild is not possible with only 3 out of 5 disks), and if they are a vRAID 1 pair, both copies of the data are gone.

In SIM you must first discover the server with Command View and provide SIM the credentials to log into it. (I think this is done already.) If it sees the EVA, edit the device details to fill in the fields: tell SIM it's a storage enclosure, its serial number, etc.

Regarding the disk you replaced, if after reseating it it's still connected to only one loop, call HP and request another one.
One loop connection lost means the disk is accessed only by loop A or loop B, not both. It works but it's not fault-tolerant: if the active loop fails, the disk cannot be accessed through the other loop.

UselessUser · ‎02-04-2009

Hi,

I appreciate your response... however the HP SIM server is not connected to the EVA in any way bar being on the same LAN as the management interface.

Is this some kind of pre-requisite...

I have updated to the latest firmware and Command View EVA is now hosted on the management module itself...

What do I need to do?!

Uwe Zessin · ‎02-04-2009

I did intentionally write **so-called** 'protection level', because it does not actively protect anything - it is the virtualized equivalent of a set of spare disks. The redundancy is provided by the VRAID-level, or not:
Run your EVA with a virtual disk in VRAID-0, experience the failure of a single disk drive and you have lost data! No matter what the **so-called** 'protection level' is set to: none, single or double.

Let me try to read your mind ;-)
Why virtualize the spare disk(s), too, might be your next question.

What does a spare disk do in a traditional RAID array?
- it rotates
- consumes power
- creates noise and heat
but it does not do any work (I/Os) for you.

On the EVA, data and space reservation for RAID rebuilds are distributed across all disk drives. Every disk in a disk group will do I/Os for you.

.

UselessUser · ‎02-10-2009

Hi...

Thanks to everyone who has responded..

Regarding my loop failure problem, I have a new disk now...

However the disk group is currently in transition state reconstructing...

Is it best to wait for it to finish reconstructing before moving the disk from the "Ungrouped disks" group to the disk group which it belongs, or can this be done whilst the original group is reconstructing?!

Uwe Zessin · ‎02-10-2009

Please wait for the reconstruct to finish - that is signalled in the EVA's event log. Do not remove the offending disk drive during the reconstruct.

Make sure the disk addition policy is set to manual. Otherwise the EVA might automatically add the replacement disk to the WRONG disk group!

Then remove the failed disk drive and put the replacement in the same slot. Manually add the disk drive to the original disk group. You can do this while data leveling is running.

.

UselessUser · ‎02-11-2009

Hi,

Thanks I will wait for it to finish...

I have been searching the forums for more information regarding the EVA and am trying to summarise it here... if a pro could go over this and just double check it I would appreciate it!

I have also got an Excel spreadsheet which I have created to show an example of the distribution.. I cannot show how the VRAID-5 allocation works so if someone could fill that it would be good!

RSS is the base layer of the protection mechanism inherent in an EVA. I assume this is needed because with a standard RAID 5, (Which can only stand a single disk failure), the chances of more than a single failure occurring at any one time (Without finishing any necessary rebuilding following a replaced disk) increase each time more disks are added?

Obviously with EVA where you can have loads of disks in the same VRAID, this could be a problem as it does not scale...

So the EVA uses RSS, which splits the disk group (if more than n disks) into separate RSS's.

Each RSS can hold between 6 and 11 disks and if a diskgroup has 11 disks and a 12th is added, the EVA automatically creates another RSS and reallocates disks according to an algorithm.

This next bit is the most confusing part, so any clarity would be good!

Any data and its protection (Be it another copy of the data for VRAID-1 or the parity bit for VRAID-5) is contained in a single RSS?

As data is placed on the EVA, it spreads it across all the RSS groups automatically?

OK, next redundancy measures

Using VRAID-5 a minimum of 8 shelves is recommended... is this because along with the recommendation of multiples of 8 disks, this could survive a failure of a shelf? I figure this because if I had say 16 disks in 8 shelves with 2 disks in each shelf, the EVA would create 2 RSS groups, each with 8 disks, ideally with each RSS having 1 disk from each of the 8 shelves. Now if 1 shelf failed each RSS would have 7 disks, and as RAID 5 allows one disk to fail because it can calculate the missing information from either the parity bit, or recalculating the parity bit from the actual data...

The protection level is purely a mechanism for ensuring that there is enough disk space to recover from a disk failure.. This works with the RSS group somehow??

For single protection the highest amount necessary is the equivalent of 2 disks because if a disk fails which contains VRAID-1 data, the EVA moves the VRAID-1 data from the remaining disk to the capacity that has been reserved on a disk that has got a mirror.

Víctor Cespón · ‎02-11-2009

In vRAID 5 data is stored as 4 data + parity, and all those segments are distributed among the disks on the RSS.
The reason for the RSS is that a normal RAID 5 with 200+ disks would have only one parity drive, and the failure of 2 drives would be too likely.

In an EVA the information is divided into segments, and each segment and its parity is stored fully in one RSS, distributed among the disks in that RSS. This allows you to lose one disk in each RSS without losing the RAID 5s.

If the RSS contains disks of different size, not all space will be used, as the smaller disks fill faster than the big ones, this leads to strange numbers on the available capacity, that can change when the RSS distribution changes. (See a recent case on this forum)

Regarding the distribution of the RSS, if there are enough shelves, yes, EVA will try to create vertical RSS, so there's only one disk of each shelf, and you can lose one entire shelf. But this happens in very few cases, and changes as disks fail and you replace them.

Some time ago Command View showed you the RSS state (none/parity/mirror), which indicated what would survive a shelf failure: nothing, vRAID 5, or vRAID 1. But since a shelf failure is very rare and this caused a lot of confusion among customers, it was removed.
Currently only people with HP internal tools can see how the RSS are distributed.

The "disk failure protection" is intended to make sure you have enough spare space to rebuild the RAIDs in the event of a disk failure, so the EVA goes back to a full redundant state. It's similar to a spare disk on a traditional RAID, but the spare space is taken from all the disks.

Uwe Zessin · ‎02-11-2009

> Any data and its protection (Be it another copy of the data for VRAID-1 or the parity bit for VRAID-5) is contained in a single RSS?

Correct. That's the key to provide multiple independent 'failure domains' within a diskgroup. The EVA can tolerate concurrent disk failures as long as they happen in different RSSes.

The VRAID-5 implements a 4D+1P redundancy, but it does not use a fixed set of 5 disks. A 4D+1P stripe is always stored on 5 _different_ disks (that's why an RSS uses 6 disk drives minimum - if one fails, full redundancy can still be restored) -- think of VRAID-5 data as being 'shuffled' across all disk drives in an RSS.
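The 'failure domain' idea can be illustrated with a toy Python check. This is a deliberately simplified model of the rule stated above (concurrent failures are tolerated as long as they fall in different RSSes); it ignores VRAID-0 and the finer detail of VRAID-1 pairs, and the RSS names and the function are invented for the example.

```python
from collections import Counter

def survives_concurrent_failures(rss_of_disk, failed_disks):
    """True if no RSS has more than one failed disk at the same time."""
    failures_per_rss = Counter(rss_of_disk[d] for d in failed_disks)
    return all(count <= 1 for count in failures_per_rss.values())

# Hypothetical 16-disk group split into two RSSes of 8 disks each
rss_of_disk = {f"disk{i}": ("RSS-A" if i < 8 else "RSS-B") for i in range(16)}

print(survives_concurrent_failures(rss_of_disk, ["disk0", "disk9"]))  # True
print(survives_concurrent_failures(rss_of_disk, ["disk0", "disk3"]))  # False
```

Two failures in different RSSes each leave their own RSS with enough redundancy to rebuild; two failures in the same RSS before a rebuild completes are the case the array cannot be relied on to survive.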

> As data is placed on the EVA, it spreads it across all the RSS groups automatically?

Implicitly. During creation / expansion of a virtual disk, allocation is striped across all RSSes/disks in an RSS.

> 8 shelves is recommended

To withstand a 'shelf meltdown', yes. Many arrays work fine with fewer shelves and complete meltdowns are quite rare, phewww...

> protection level ... This works with the RSS group somehow??

It does not. The so-called 'protection level' is just a guaranteed space reservation for data reconstruction (think of it as striped across all disk drives in a disk group). You can change it any time - it isn't a fixed area on the disk drives. As long as enough 'free space' exists in the disk group, the EVA can even reconstruct without the 'protection level'. In fact, the EVA first picks up from 'free space'.

> the EVA moves the VRAID-1 data

Correct.

Great summary, well done research!

.
