StoreVirtual Storage

(network) raid question

johndoew
Occasional Contributor

I started working in an environment with the following storage cluster setup:

* 1 storage cluster containing 2 storage sites 

* each storage site has 3 storage nodes 

* each storage node contains 12  600GB SAS drives

* site A contains 2 P4500 nodes and 1 P4530 node (the same goes for site B)

* every P4500 node contains 2 raid arrays: so 6 drives/array 

* every P4530 node contains 1 raid array: so 12 drives/array 

* every node produces the same amount of usable space: 5.32 TB

* The cluster creates 32.627,27 GB total space

 

For example: I create a network RAID 10 array of 1 TB.

To achieve this I need to take 500 GB of storage on 4 nodes.

 

Every p4500 node contains 2 raid arrays. 

Does that mean that 1 physical P4500 is seen as 2 logical nodes?

 

So in my example, when I need to take 500 GB on 4 nodes:

I take 500 GB from my first array and 500 GB from my 2nd array.

I do the same for my 2nd P4500 node.

Because by now I use 4 nodes in total, I don't need to waste space on my other node (the P4530).

 

I'm not able to test this out, which is why I'm asking it here theoretically.

 

So the big question is: do the 2 arrays on 1 P4500 get seen as 2 different nodes?

If not, then all of the above is false, but...

How can I achieve network RAID 10 between my storage sites if I only have 3 physical nodes per site?

To achieve network RAID 10 you need a minimum of 4 nodes...

 

Little drawing: 

 

-Site A-

* P4500 (2x RAID 5 arrays, 6 drives/array) --> total space: 5.32 TB

* P4500 (2x RAID 5 arrays, 6 drives/array) --> total space: 5.32 TB

* P4530 (1x RAID 5 array, 12 drives/array) --> total space: 5.32 TB

 

-Site B-

* P4500 (2x RAID 5 arrays, 6 drives/array) --> total space: 5.32 TB

* P4500 (2x RAID 5 arrays, 6 drives/array) --> total space: 5.32 TB

* P4530 (1x RAID 5 array, 12 drives/array) --> total space: 5.32 TB

 

Thanks in advance! 

11 REPLIES
oikjn
Honored Contributor

Re: (network) raid question

Are you 100% sure your 4530 has 600GB disks? The HP site says that should be 12x 4TB disks. If that is the case, you really need to split your cluster up so you have two: one cluster with the 4500s and another cluster with the 4530s. To know for sure, check what CMC says for "Raw space" and what's usable; if your nodes match, those values should be the same.

 

The physical RAID structures within a node stay within that node. They aren't seen as individual nodes, only as a sum of the space available for that specific node. The actual sum is the "raw space" for the node. The "usable space" is what is actually contributed to the cluster capacity for that specific node. The reason the usable space could be smaller than the raw space is that every node in the cluster contributes the same amount of space to the cluster, so if you have 6 nodes, 5 of which are 10TB in size and one that is 3TB in size, every 10TB node will only be able to contribute 3TB of storage to the cluster, resulting in a total cluster usable size of 3*6=18TB. If you simply remove the 3TB node, you could have a cluster available size of 10*5=50TB! Think of cluster nodes as disks in a traditional RAID array: the cluster is only as fast as the slowest node and only as large as the smallest node. This means it is critical to make sure that your nodes match in both performance and capacity when joined to the same cluster.
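The capacity rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this post (every node contributes as much as the smallest node); the function name is made up for the example and is not part of CMC or any HP tool.

```python
def cluster_usable_tb(node_sizes_tb):
    """Usable cluster capacity when every node contributes only as much
    as the smallest node (the rule of thumb described in this post)."""
    if not node_sizes_tb:
        return 0
    return min(node_sizes_tb) * len(node_sizes_tb)

mixed = [10, 10, 10, 10, 10, 3]   # five 10 TB nodes plus one 3 TB node
matched = [10, 10, 10, 10, 10]    # same cluster with the 3 TB node removed

print(cluster_usable_tb(mixed))    # 18
print(cluster_usable_tb(matched))  # 50
```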

 

This is not the case for the management group, so you can have two clusters in the same management group and those can be different sizes/speeds and not affect each other, which is why I say you probably need to make two clusters (one for the 4500s and one for the 4530s).

 

As for network RAID, Network RAID 10 (NR10) means the data is always replicated to two nodes in a cluster. NR10+1 makes sure the data is stored on three nodes, and NR10+2 holds the data on four nodes. Forget about NR5 for anything in a multi-site cluster, and really for anything in a single-site cluster other than a static data library that is read-only.
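To make the copy counts concrete, here is a rough Python sketch of how much cluster space a fully provisioned volume would consume at each level. The copy counts for NR10, NR10+1 and NR10+2 come from this post; NR0 as a single copy is how it is described elsewhere in this thread. The function name is hypothetical, and the model deliberately ignores snapshots, thin provisioning and any metadata overhead.

```python
# Copies of each block kept per Network RAID level, as described in this thread.
COPIES = {"NR0": 1, "NR10": 2, "NR10+1": 3, "NR10+2": 4}

def cluster_space_consumed_gb(volume_gb, level):
    """Cluster space a fully provisioned volume consumes in this simplified
    model: volume size multiplied by the number of copies kept."""
    return volume_gb * COPIES[level]

for level in COPIES:
    print(level, cluster_space_consumed_gb(1000, level))
```

Under these assumptions, a 1 TB NR10 volume takes about 2 TB of cluster space in total, spread over the nodes of the cluster rather than taken from specific nodes.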

 

NR10 is similar to disk RAID 10 in that it's a stripe of mirrors, but it is a little more advanced in that it does allow for having an odd number of nodes, so if you have nodes A,B,C and data 1,2,3, the data would be split as follows: A12,B13,C23. This way there are always two copies of your data on two different nodes. Now in your case with 6 nodes at two sites, if you have your nodes set up correctly you will end up with nodes A,B,C,D,E,F and data 1,2,3,4,5,6 and the data will look like: A12,B34,C56,D12,E34,F56, so your replicated data is always located at different sites.
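The two layouts in this post can be reproduced with a toy model in Python. This is not the actual placement algorithm of the StoreVirtual/LeftHand OS (which is not described in this thread); both functions are hypothetical helpers that only rebuild the A12,B13,C23 and A12,B34,C56,D12,E34,F56 examples and show that every block ends up on two different nodes.

```python
from itertools import combinations

def single_site_layout(nodes):
    """Toy model: give every pair of nodes one shared block, so each block
    has exactly two copies on two different nodes."""
    layout = {n: [] for n in nodes}
    for block, (a, b) in enumerate(combinations(nodes, 2), start=1):
        layout[a].append(block)
        layout[b].append(block)
    return layout

def two_site_layout(site_a, site_b, blocks_per_pair=2):
    """Toy model: pair the i-th node of site A with the i-th node of site B
    and mirror the same blocks on both, so the copies sit at different sites."""
    layout = {}
    block = 1
    for a, b in zip(site_a, site_b):
        blocks = list(range(block, block + blocks_per_pair))
        layout[a] = blocks
        layout[b] = list(blocks)
        block += blocks_per_pair
    return layout

print(single_site_layout(["A", "B", "C"]))
print(two_site_layout(["A", "B", "C"], ["D", "E", "F"]))
```

In the two-site case, losing all of site A (nodes A, B, C) still leaves blocks 1 through 6 available on D, E and F.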

 

To achieve Network RAID 10 you need a minimum of TWO nodes. If you are looking for two copies of your data at each site (so you can have an entire site go down and also an additional node fail and maintain data availability), that would require a minimum of four nodes, and that is what NR10+2 is.

 

johndoew
Occasional Contributor

Re: (network) raid question

Yes, I'm absolutely sure the P4530 nodes contain 12 SAS 6.0 600 GB disks and they generate the same usable space as any other node in the cluster. 

 

As for network RAID 10: I follow your explanation when you mention you need only 2 nodes to achieve network RAID 10.

But what about network RAID 5? I assumed you need at least 3 nodes for this? But 3 nodes for network RAID 5 and only 2 for network RAID 10? Doesn't seem logical...

 

One more question: how does my cluster know which node it has to mirror the data of my network RAID 10 volume to? Is it automatically stored on a node at the other site, or can it be stored on another node in the same site? I don't fully understand your explanation of "data 1,2,3,4,5,6".

 

Before I forget, thanks for your clear explanation! 

Bart_Heungens
Honored Contributor

Re: (network) raid question

Hi,

 

For an explanation of Network RAID in a single-site and a multi-site solution, check out my blog:

http://www.bitcon.be/?p=2537

 

And yes, it is correct: 2 nodes for NR10 and 3 nodes for NR5. In the blog you will see that with NR10 every block of data will be written twice, so you need at least 2 nodes, and for NR5 you need at least 2 data blocks and a 3rd block for the parity... So 3 nodes...

 

 

Kr,

Bart

 

--------------------------------------------------------------------------------
If my post was useful, click on my KUDOS! "White Star" !
My blog: http://blog.bitcon.be
oikjn
Honored Contributor

Re: (network) raid question

Bart's link is a good one. Check that out.

 

Avoid NR5 at all costs. Forget it's an option and don't use it, just like you shouldn't use NR0. In practice it doesn't get you any significant savings and it's only really a good solution for something like an .iso storage library.

 

As for where the data is stored, that's just part of the StoreVirtual OS. As long as you have assigned your nodes to their correct sites, it will automatically make sure that your mirror data is kept on two different sites at all times. Multi-site clusters are an easy sales feature and are simple to turn on through CMC, but their actual proper implementation is COMPLEX. It's very rare to actually want to do this, and it requires a lot of support infrastructure and additional costs which you might not actually want. You really should read the multi-site documentation and not cut corners. Keep in mind that your storage system latency will now be worse than your site-to-site latency and your storage bandwidth will be limited by your site-to-site bandwidth, which means your site-to-site link will likely be $$$$$ unless you are talking about sites which are physically very close (like same building or at least same street/campus). Most of the time people see multi-site listed as an available option and go "ooooh I want that" even though all they really need is remote snapshots.

johndoew
Occasional Contributor

Re: (network) raid question

Thanks everyone for the answers!
I just have one more question to fully understand this.

 

So in my multi-site environment with RAID 10 volumes set in my CMC:

Network RAID 10, so I use 2 nodes.

My CMC then chooses to take 1 node from site A & 1 node from site B to create my volume, right?

 

And what about network raid 5?

My CMC decides to take 2 nodes from site A & 1 node from site B to create the volume?

 

And how exactly does network RAID 1 work?

I assume you need at least 2 nodes, so 1 node from site A & 1 node from site B?

But that's exactly the same as network RAID 10 then?

I feel I'm almost getting there but need a little bit more help.

 

Bart_Heungens
Honored Contributor

Re: (network) raid question

Hi,

 

You should not do NR5 across 2 sites... For NR5 you need a minimum of 3 nodes, which cannot be split equally across 2 sites...

 

2 sites? Always go for NR10 or higher...

 

Where did you see NR1? For me it was always NR10 (2 copies), NR10+1 (3 copies) and NR10+2 (4 copies)...

 

Kr,

Bart

johndoew
Occasional Contributor

Re: (network) raid question

Hi, 

 

Forget my (stupid) question regarding Network RAID1.

I was asking this question because I have the following situation:

 

We're running Exchange 2010 with a DAG configured.

The DAG consists of 2 mailbox servers.

I have 3 mailbox databases running in the DAG.

On every mailbox server there is a LUN created per database. 

 

So in the CMC, there are 6 LUNs created at network RAID 10 level (3 LUNs per mailbox server).

So my 6 LUNs are replicated to the other site due to RAID 10.

Also, because of the HA provided by the DAG config, my 3 databases get replicated to the other mailbox server.

 

I'm not quite sure if there's a best practice regarding this topic?

Looks like a bit of overkill to me.

Because I have HA built in via the Exchange DAG, can I maybe lower the RAID level on my storage?

Change it from network RAID 10 to network RAID 0 or network RAID 5?

I'm guessing network RAID 0 is maybe the best choice?

I lose more storage space by configuring network RAID 5 instead of network RAID 0 (3 nodes vs 2 nodes).

Because every node itself is protected by hardware RAID 5, it seems like no problem to choose network RAID 0.

Can anyone share some experience on this issue?

 

HPstorageTom
HPE Pro

Re: (network) raid question

Network RAID 5 is not an option at all for any multi-site configuration - we simply do not support it in multi-site setups because NWR 5 is not site aware.

 

Network RAID 0 is not really a good option for any production data. First of all, if you lose one node of the cluster you would lose data and you would lose access to the data. And the Exchange DAG would not help you here. All NWR 0 volumes will be unavailable as soon as you lose a node of the cluster. Furthermore, all NWR 0 volumes need to be taken offline when you do a firmware or LeftHand OS upgrade if a node reboot is required.

 

 

johndoew
Occasional Contributor

Re: (network) raid question

Assume I have 2 Network RAID 0 volumes on separate nodes.

One volume contains the active database and the other volume contains the database copy.

 

Let's assume one node fails. Due to the DAG, my database copy becomes active on the other node.

So with this setup, I have a fault-tolerant situation?

 

By the way, can I choose on which node my volumes are created? If my 2 Network RAID 0 volumes are created on the same node then I have a problem...

oikjn
Honored Contributor

Re: (network) raid question

Forget NR0. It's a stripe, with your data evenly split across ALL nodes. It appears you are thinking the LUN is assigned to a single node, but that just isn't the case. It's very hard to find a situation where NR0 is a good idea at all. Yes, they exist, but 99.9999% of the time it's NR10. It's such a rule that if you make an NR0 LUN just to test, you will see that CMC will constantly pester you to change it.

 

With NR10, you can do firmware upgrades and almost anything you want and maintain 100% LUN availability with ZERO downtime. Each node will actually reboot during these maintenance periods, but because of the NR10 structure the servers don't care one bit. HOWEVER, if you use NR0, any time ANY node is rebooted or loses connection or has any availability problem, your LUN is instantly taken offline.

 

 

I get your question about the inefficiencies of the storage with Exchange and it being unneeded, and you are probably RIGHT... That said, ask yourself why the data is on the SAN at all. The nice advantage with the new Exchange server is that they say you can now use local disks instead of SAN. IMO, unless you are running a huge Exchange instance with many servers, I would just as soon keep the data on the SAN in NR10 simply because I know the data is going to be available all the time... Actually, I would probably see about making sure the storage of one of the DAG members is NOT on the SAN, so if for some reason hell freezes over and the SAN goes down, Exchange doesn't stop.

GilPhilbert
Advisor

Re: (network) raid question

Johndoew,

 

What you've implemented means that you have six copies of the data (three mailbox databases x two copies of each) which are then mirrored at the storage level, totalling 12 copies of the data and excellent resilience, since you can lose a P4000 node and multiple Exchange Mailbox servers and still maintain availability.
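As a quick check of that count, here is the arithmetic written out in Python. The figures are taken from the setup described earlier in the thread (3 mailbox databases, each with a copy on both DAG members, on NR10 volumes); the variable names are only for illustration.

```python
databases = 3      # mailbox databases in the DAG (from the original post)
dag_copies = 2     # one copy of each database per mailbox server
nr10_copies = 2    # Network RAID 10 keeps two copies of every block

exchange_level = databases * dag_copies        # database copies Exchange maintains
storage_level = exchange_level * nr10_copies   # copies once NR10 mirrors each LUN
per_database = dag_copies * nr10_copies        # copies of any single database

print(exchange_level, storage_level, per_database)
```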

 

However, if you change your volumes to Network RAID (NWR) 0, you'll actually remove all of your storage resilience meaning that the loss of a single P4000 storage node would result in the loss of every Exchange volume and, therefore, all of your Mailbox servers. This is because in NWR0 your volume isn't located on a single storage system, it's striped (spread) across every node in the system.

 

Think of NWR10 this way: imagine you're out for a camping trip. There's four of you going and you all need a tent to sleep in. With NWR10, you've got two identical tents and each person holds a part of each tent. If one person gets eaten by a passing Griffin, you've still got all the parts to make up a single tent:

 

  • Person 1: Rods / Floor
  • Person 2: Floor / Inner
  • Person 3: Inner / Outer
  • Person 4: Outer / Rods

If Person 2 gets eaten, for example, you've still got a floor and inner to make up a complete tent.

 

Extending my admittedly odd metaphor, in NWR0 you'd all share the same tent, each carrying a single part of the tent.

 

  • Person 1: Rods
  • Person 2: Floor
  • Person 3: Inner
  • Person 4: Outer

Now if Person 2 gets eaten, you can't put your tent together and you're stuffed.

 

In my example, the tent is the volume. With NWR0 we have a single copy while with NWR10 we have two copies - that's basically the difference. Most people get mixed up assuming each volume sits on one node (NWR0) or two nodes (NWR10) but that's NOT the case.
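The tent metaphor can be checked mechanically. The sketch below, in Python, uses exactly the two carrying schemes listed above and tests what happens when any single person is lost; `tent_survives` is a hypothetical helper for this illustration and says nothing about how the storage system is implemented internally.

```python
PARTS = {"Rods", "Floor", "Inner", "Outer"}

# Who carries what, copied from the two lists in this post.
NWR10 = {
    "Person 1": {"Rods", "Floor"},
    "Person 2": {"Floor", "Inner"},
    "Person 3": {"Inner", "Outer"},
    "Person 4": {"Outer", "Rods"},
}
NWR0 = {
    "Person 1": {"Rods"},
    "Person 2": {"Floor"},
    "Person 3": {"Inner"},
    "Person 4": {"Outer"},
}

def tent_survives(carriers, eaten):
    """True if the remaining people still hold every part of the tent."""
    remaining = set()
    for person, parts in carriers.items():
        if person != eaten:
            remaining |= parts
    return remaining >= PARTS

for person in NWR10:
    nwr10 = "ok" if tent_survives(NWR10, person) else "lost"
    nwr0 = "ok" if tent_survives(NWR0, person) else "lost"
    print(f"{person} eaten: NWR10 {nwr10}, NWR0 {nwr0}")
```

Whichever single person is removed, the NWR10 scheme still has a full set of parts, while the NWR0 scheme is always missing one.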

 

As the others have said, NWR0 isn't an option and neither is NWR5 (mainly due to performance). That pretty much leaves you with NWR10.

 

I hope that helps rather than muddies the waters...