Array Setup and Networking

PokerMunkee
Advisor

Pulling the Trigger On A CS300

Hi all,

This will be my first shared storage system. I did help set up an EqualLogic for a sister company a year ago with a 2-node VMware cluster, and it's been working great other than a few bad firmware upgrades. I have a great VAR that I have worked with for a number of years, but I want another opinion before I spend a lot of money on my employer's behalf. I've been impressed with the Nimble presentations and don't think there is any other product out there I'd want. I do think VMware's VSAN is interesting, but it's too new and expensive to get on board with at the moment.


My current environment is 8x ESXi hosts with about 45 VMs.  Not a lot of I/O intensive apps.  I have 3x physical SQL 2000 servers that I can't upgrade for at least another year.


I'm a Dell shop and have no reason to change.  I'll be going with 3x Dell R630 servers for my hosts. Probably 320GB each, dual procs with 12 cores each, VMware on mirrored SD cards.

Will buy two HP ProCurve switches dedicated to iSCSI traffic.  My core is a ProCurve 5412zl, but I will keep iSCSI isolated on its own switches.

Our CS300 will have 24TB raw (12x2TB HDD, 4x160GB SSD).

Now to my questions...

My VAR is recommending we use 1GbE instead of 10GbE since our requirements don't justify the costs of 10GbE; they tell me that with multiple 1GbE ports trunked and VMware's MPIO, performance will be fine.  Is this sound advice or should I bump up to 10GbE?  Could I order my servers with 10GbE but connect to 1GbE ports to future-proof?  I believe 10GbE is backward compatible, but is that OK for connecting to the iSCSI switches?  Not sure how much more 10GbE controllers are over the 1GbE ones on Nimble.  Upgrading on Dell servers is less than $1K.

Having all of my eggs in the Nimble basket makes me nervous, won't lie.  I have never had an outage in the 7 years I've managed this environment, all with local storage.  Without replicating to another Nimble, what's the most cost-effective way to have a Disaster Recovery Plan in the event of a complete Nimble outage?  I'll probably switch from Unitrends to Veeam for our backup solution, and I've heard Veeam has a decent replication feature at no additional cost.  Does anyone use Veeam as their DR plan?

In the event a controller fails or we fail over on purpose, is an outage created?  Would SQL or Exchange have a problem?

Is there a single point of failure on a Nimble unit?  I can't find any horror stories online with a unit going completely down.  Has anyone seen one go down?  What spare parts are good to keep onsite?

I think that's about it for now.

Appreciate any feedback!

19 REPLIES
Nick_Dyer
Honored Contributor

Re: About to pull the trigger on a CS300

Hi Pokermunkee, welcome to the forum. I'll try to answer your questions as honestly as possible...


pokermunkee wrote:



My VAR is recommending we use 1GbE instead of 10GbE since our requirements don't justify the costs of 10GbE; they tell me that with multiple 1GbE ports trunked and VMware's MPIO, performance will be fine.  Is this sound advice or should I bump up to 10GbE?  Could I order my servers with 10GbE but connect to 1GbE ports to future-proof?  I believe 10GbE is backward compatible, but is that OK for connecting to the iSCSI switches?  Not sure how much more 10GbE controllers are over the 1GbE ones on Nimble.  Upgrading on Dell servers is less than $1K.



In my experience, 99% of the time environments don't require 10Gb networking, as performance is mostly random IO and thus bottlenecked on IOPS rather than MB/sec (which is typically associated with sequential IO). The optimal networking configuration for a 1Gb Nimble unit is to reserve at least one NIC per controller for management (two is best practice for resilience, but not needed if you have a single management switch) - this will leave you with four or five 1Gb ports available for iSCSI data. Using Nimble's MPIO toolkits for VMware and Windows will ensure you get as much bandwidth as possible, although throughput will ultimately be limited by the number of host-side NICs.
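To put some rough numbers against that, here's a back-of-the-envelope sketch; the port counts, block size, and ~85% usable-line-rate figure are assumptions for illustration, not measured values:

```python
# Rough iSCSI throughput estimate: aggregated 1GbE paths vs 10GbE links.
# Assumes MPIO spreads load evenly and ~85% of line rate is usable payload
# (real-world protocol overhead varies) -- illustrative numbers only.

GBIT_TO_MBYTE = 1000 / 8  # 1 Gbit/s = 125 MB/s theoretical

def usable_mb_per_sec(ports, gbit_per_port, efficiency=0.85):
    """Approximate usable sequential throughput across MPIO paths."""
    return ports * gbit_per_port * GBIT_TO_MBYTE * efficiency

print(f"4 x 1GbE  : ~{usable_mb_per_sec(4, 1):.0f} MB/s")    # ~425 MB/s
print(f"6 x 1GbE  : ~{usable_mb_per_sec(6, 1):.0f} MB/s")    # ~638 MB/s
print(f"2 x 10GbE : ~{usable_mb_per_sec(2, 10):.0f} MB/s")   # ~2125 MB/s

# For a mostly random workload (as above) IOPS is the real limit:
# e.g. 5,000 IOPS at 8 KB is only ~39 MB/s, well within a few 1GbE paths.
print(f"5,000 x 8 KB IOPS = ~{5000 * 8 / 1024:.0f} MB/s")
```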

Having said that, you have a few options available here. Firstly, any Nimble system you purchase (aside from the CS210) can be upgraded from 1Gb to 10Gb non-disruptively in the future should you need to make the step up - although the costs associated with doing so are slightly higher than when purchasing upfront. Secondly, you also have the option to buy a Nimble system populated with 10GBase-T ports (RJ45 connectors), which will negotiate down to 1Gb if needed. This could be a good way to protect your investment should you ever want to step up to 10Gb in the future. All Dell servers also offer 10GBase-T on the motherboard as an option, to make it even simpler.


Having all of my eggs in the Nimble basket makes me nervous, won't lie.  I have never had an outage in the 7 years I've managed this environment, all with local storage.  Without replicating to another Nimble, what's the most cost-effective way to have a Disaster Recovery Plan in the event of a complete Nimble outage?  I'll probably switch from Unitrends to Veeam for our backup solution, and I've heard Veeam has a decent replication feature at no additional cost.  Does anyone use Veeam as their DR plan?



I'll let other forum users comment on their direct experiences with Veeam; however, with Nimble there is no single point of failure, so anything that can go wrong (disks, SSDs, fans, controllers, PSUs) is automatically protected through resilience, and using our InfoSight data analytics engine we can proactively spot and avert these issues before they become a problem. The same is true of the firmware and software on the arrays.

There certainly is a need for backups, and both Veeam and Unitrends are good options. With the latter there's a toolset called ReliableDR which actually integrates with Nimble and can provide VMware SRM-like functionality for DR and failover... this does require Nimble replication, so you would need two Nimble systems (they don't have to be the same model, though).


In the event a controller fails or we fail over on purpose, is an outage created?  Would SQL or Exchange have a problem?



We've designed the Nimble platform to be more intelligent than standard active/active or active/passive solutions, so as long as the Nimble MPIO toolkit is installed within VMware and Windows, no outage will be registered within VMware or any Windows apps such as SQL and Exchange. During these types of failovers (even when we perform firmware upgrades) we'll typically see anywhere between 3 and 10 dropped data packets before connectivity resumes. This is a standard test during any POC or installation we deliver.
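If you want to see that for yourself during a POC, one easy check is to run a continuous probe against one of the array's iSCSI data IPs while you trigger a manual controller failover and count the drops. A minimal sketch along those lines (the target IP, duration, and interval are placeholders for your environment, and ICMP is only a rough proxy for the iSCSI session itself):

```python
# Count dropped probes against an array data IP during a planned controller failover.
# The IP address and duration are placeholders -- adjust for your environment.
import subprocess
import time

TARGET = "192.168.50.10"   # example iSCSI data IP (placeholder)
DURATION_SEC = 120         # run this while you trigger the failover

sent = dropped = 0
end = time.time() + DURATION_SEC
while time.time() < end:
    sent += 1
    # one ICMP echo with a 1-second timeout (Linux ping syntax)
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", TARGET],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    if result.returncode != 0:
        dropped += 1
    time.sleep(0.5)

print(f"{dropped} of {sent} probes lost during the failover window")
```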


Is there a single point of failure on a Nimble unit?  I can't find any horror stories online with a unit going completely down.  Has anyone seen one go down?  What spare parts are good to keep onsite?



Again, I'll leave this question for our end users to answer (my customer feedback has always been very positive). Things that can fail are drives, SSDs, fans, power supplies, etc. (as previously mentioned). If you have a 4-hour SLA for support, that should get parts onsite within a short period of time... however, it's always a good idea to keep a spare drive and SSD on the shelf should you want to replace a component more quickly.

Nick Dyer
twitter: @nick_dyer_
CBVista
Occasional Advisor

Re: About to pull the trigger on a CS300

G'day!

Your story sounds exactly like mine. We just purchased and installed a CS300 (36TB + 3.2TB), 10G iSCSI and 4x R630 VMware hosts all at 10G.

We already had a pair of Extreme Networks X670V switches (stacked @ 160Gbps), so we had plenty of 10G available. While Nick is correct that most traffic will probably be fine on 1Gb, where 10G truly shines is vMotion (and Storage vMotion when migrating data onto the Nimble).
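To put a rough number on that: with roughly 45 VMs to Storage vMotion onto the new array, here's a back-of-the-envelope estimate of the bulk migration time on each link speed (the 100 GB average VM size and ~70% sustained link utilization are assumptions for illustration):

```python
# Rough bulk-migration time estimate at 1GbE vs 10GbE.
# The 100 GB average VM size and ~70% sustained utilization are assumptions.

def hours_to_move(data_gb, link_gbit, utilization=0.70):
    mb_per_sec = link_gbit * 125 * utilization   # 1 Gbit/s = 125 MB/s theoretical
    return (data_gb * 1024) / mb_per_sec / 3600

total_gb = 45 * 100   # ~45 VMs averaging ~100 GB each (assumed)
print(f"{total_gb} GB over 1GbE : ~{hours_to_move(total_gb, 1):.1f} hours")    # ~14.6
print(f"{total_gb} GB over 10GbE: ~{hours_to_move(total_gb, 10):.1f} hours")   # ~1.5
```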

Yes, 10G is backwards compatible with 1Gb, so I'd highly recommend making the host NICs 10G from day 1. You'll have to talk to your sales rep about the cost of 10G vs 1Gb cards in the Nimble controller; again, if the cost is minimal I'd suggest doing this from day 1 (having leftover parts from upgrades is just another pain). That would just leave you to purchase some 10G switches further down the track.

We already have CommVault (it was doing our backups & replication). Had I not managed to convince the boss to buy CommVault, Veeam would have been our next choice.

Like you, I was very conscious of the 'whole eggs in one basket' concern but many, many, many hours of research convinced me I'd be fine. We're planning the purchase of another CS300 for a remote office (and replication site) now.

Can't comment on the spare parts, haven't had anything go wrong yet

Breaking stuff since forever
Valdereth
Trusted Contributor
Solution

Re: About to pull the trigger on a CS300


pokermunkee wrote:



My VAR is recommending we use 1GbE instead of 10GbE since our requirements don't justify the costs of 10GbE; they tell me that with multiple 1GbE ports trunked and VMware's MPIO, performance will be fine.  Is this sound advice or should I bump up to 10GbE?  Could I order my servers with 10GbE but connect to 1GbE ports to future-proof?  I believe 10GbE is backward compatible, but is that OK for connecting to the iSCSI switches?  Not sure how much more 10GbE controllers are over the 1GbE ones on Nimble.  Upgrading on Dell servers is less than $1K.


I've found a lot of customers are stepping up to 10GbE switching when purchasing Nimble arrays.  Depending on your vSphere licensing and host config, you could significantly cut down on the number of cables a host requires with a dual-NIC card or two.  In the past I've dealt with mixed 1GbE and 10GbE iSCSI and, I'll be honest, it always left me suspicious when looking into latency or networking issues; I really wish I had had consistent bandwidth (all 1GbE or all 10GbE) just to rule that out.


Having all of my eggs in the Nimble basket makes me nervous, won't lie.  I have never had an outage in the 7 years I've managed this environment, all with local storage.  Without replicating to another Nimble, what's the most cost-effective way to have a Disaster Recovery Plan in the event of a complete Nimble outage?  I'll probably switch from Unitrends to Veeam for our backup solution, and I've heard Veeam has a decent replication feature at no additional cost.  Does anyone use Veeam as their DR plan?



In the event a controller fails or we fail over on purpose, is an outage created?  Would SQL or Exchange have a problem?



Is there a single point of failure on a Nimble unit?  I can't find any horror stories online with a unit going completely down.  Has anyone seen one go down?  What spare parts are good to keep onsite?


Unitrends and Veeam both have the ability to present NFS shared storage from their repositories so you can fire up protected VMs immediately without having to restore them to new storage first.  Obviously you sacrifice some performance in order to gain this ability, but it's a lifesaver when you find yourself in that situation.  I've used the feature more for testing and running through mock DR scenarios to ensure the recoverability of my VMs - always a good idea to be prepared for a failure ahead of time!

Unitrends and Veeam both have replication features built into the base products but Veeam has a WAN accelerator that can be utilized with their Enterprise Plus licensing.

In my experience Nimble has provided the fastest controller failovers of all the arrays I've worked with.  vSphere, MSCS, SQL, and Exchange have all continued to function without a hiccup when failing over controllers during maintenance.  I think the key here is to follow Nimble's best practice guides and use Nimble's integration kits where appropriate.

PokerMunkee
Advisor

Re: Pulling the Trigger On A CS300

Thanks everyone.

I pulled the trigger and signed the PO today.  Went with the recommended 1GbE to save around $10K.

This is what I have coming:

1x CS300 24TB Raw

3x R630 (8x1GbE, 2x 12 Core 2.5GHz, 256GB, dual SD card)

ESXi 5.5 Essentials Plus for 6 procs

2x ProCurve 2920-24G for iSCSI network

I'm excited to get rid of all of my various ESXi hosts and have a fast, redundant, and compact system.  Can finally start migrating off Server 2003 and not have storage issues.

Valdereth
Trusted Contributor

Re: Pulling the Trigger On A CS300

Congrats!

Sounds like you've got some projects to look forward to - good luck on those!

In the meantime I'd recommend reading up on Nimble's best practice guides since it sounds like you'll have some fresh switches and hosts to work with - your VAR or Nimble SE should be able to get some material together for ya.

PokerMunkee
Advisor

Re: Pulling the Trigger On A CS300

rugby0134
Esteemed Contributor

Re: Pulling the Trigger On A CS300

it's been 60 minutes - up yet?

Nick_Dyer
Honored Contributor

Re: Pulling the Trigger On A CS300

a thing of beauty!

Nick Dyer
twitter: @nick_dyer_
Valdereth
Trusted Contributor

Re: Pulling the Trigger On A CS300

Still amazes me, so much power in just 3U