
MS cluster and dual path MSA1000

 
CG_2
Frequent Advisor

I have the following:
>> Two DL580G4s, each with two FC2142s
>> One MSA1000 with two internal 2/8 SAN switches

I'm trying to set up an active/passive Windows 2003 cluster with as few single points of failure as possible. I'd also like to load balance the traffic from the active node through to the disk.

Do I need anything besides the active/active firmware for the MSA and SecurePath software?

TIA
7 REPLIES
Mike Bollman_1
Respected Contributor
Solution

Re: MS cluster and dual path MSA1000

You do not have to purchase or use SecurePath; you could use MPIO instead. If you go that route, you will need to install the MPIO manager on the servers:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=18964&prodSeriesId=421492&prodNameId=421495&swEnvOID=1005&swLang=13&mode=2&taskId=135&swItem=co-45035-1

You will also need the active/active MPIO driver, found here:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=18964&prodSeriesId=421492&prodNameId=421495&swEnvOID=1005&swLang=13&mode=2&taskId=135&swItem=co-44892-1

I would use the Storport driver for your servers, but be sure you apply the latest Storport hotfix, MS KB916048. You will also want to look at the MS hotfix in KB911030.

Both of these have been included in SP2, but I am not sure where HP stands on supporting SP2 at this time.

On the heartbeat side, I would install 2 additional NICs in your servers for a total of 4. Team 2 with failover only for the public interface. Then configure the other 2 independently on separate networks for the heartbeat.
mpapet
Advisor

Re: MS cluster and dual path MSA1000

It sounds like you are trying to do two things.

1. run an active/passive cluster

2. load balance across two SAN switches?

As far as running active/passive, you've got everything you need.

Maybe I'm not reading the question right, but attempting to load balance the SAN switches is a single, critical point of failure.
CG_2
Frequent Advisor

Re: MS cluster and dual path MSA1000

Wow, thanks to you both for the quick and detailed responses.

Michael, you just saved me many $K. I do have the NICs, but didn't mention them.

Mpapet, I thought that with balancing I would at least run on the remaining path if one goes down. I'd be at 50% throughput, but not completely failed. Do I not get that from the MPIO that Michael mentioned?
Mike Bollman_1
Respected Contributor

Re: MS cluster and dual path MSA1000

CG, please be kind and assign points.
CG_2
Frequent Advisor

Re: MS cluster and dual path MSA1000

(Sorry on the points delay...)

OK, I now have the MSA at firmware v7, plus the MPIO driver and the MPIO manager. It looks good so far.

I created a drive and I'll start testing by pulling paths while it is being accessed.

Should I see 'roughly' 4 Gb/s aggregate throughput to the disk while everything is up and "green"?

How should I test controller and switch failure, by hot unplugging? (Or will I damage something?)

Also, the "Fault Indicator" light is on for both controllers. I can't find any other indication of what is up in SMH or ACU. How do I find what is wrong?

Patrick Terlisten
Honored Contributor

Re: MS cluster and dual path MSA1000

Hello,

If you're using the active/passive firmware for the MSA1000 (latest version is 5.20), you can't distribute the traffic from the active node over the controllers. There is only one active controller in the MSA1000, and one active HBA in the server.

Please meet the requirements for clustering. The following sentences are taken from the readme for firmware release 5.20:

Critical: FW 5.20 is currently NOT supported with Windows Clustering AND Multipath software (SecurePath or MPIO). New versions of MPIO and SecurePath are needed to work properly with the new MSA1500cs 5.20 firmware.

The MSA 1000/1500cs FW 5.20 requires one of the following software updates for multi-controller Windows clustering:

HP Secure Path 4.0C Service Pack 2-9 for Windows

HP MPIO Basic DSM v1.40 for MSA 1000/1500cs Active-Passive Disk Arrays

Kind regards,
Patrick
CG_2
Frequent Advisor

Re: MS cluster and dual path MSA1000

I'm using the version 7.0 Active/Active firmware on the MSA1000.
The MSA is Active/Active; the Windows cluster will be Active/Passive.

I started some simple throughput and failover tests. I'm getting rather low throughput.


The failover test just involved unplugging FC cables from the back of the server.

The MPIO is set to Round Robin. I chose this primarily because I was worried more about failover and figured that I'd be guaranteed to utilize both paths when they were up.
I can see that the data is going over both cards via Perfmon, the flickering activity lights on the switch and the Emulex HBAnyware app.

Failover works just fine. When I pull a cable there is about a 20 second halt in IO and then it picks back up on the remaining path. When I plug it in it picks back up and starts using both paths again.
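For anyone who wants to put a number on that halt, here is a minimal sketch of one way to measure it. It is not an HP or Microsoft tool, just a loop that writes and syncs a small block and records the longest gap between completed writes. The function names (`probe_writes`, `longest_stall`) and the target path are made up for illustration; point the path at a file on the MSA-backed volume and pull the cable while it runs.

```python
import os
import tempfile
import time


def longest_stall(timestamps):
    """Return the largest gap in seconds between consecutive timestamps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return max(gaps, default=0.0)


def probe_writes(path, duration_s=60.0, block=b"x" * 4096):
    """Rewrite and fsync one small block for duration_s seconds.

    Returns the start time followed by one timestamp per completed write.
    """
    stamps = [time.monotonic()]
    deadline = stamps[0] + duration_s
    with open(path, "wb") as f:
        while time.monotonic() < deadline:
            f.seek(0)             # overwrite the same block so the file stays small
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the write down to the device
            stamps.append(time.monotonic())
    return stamps


if __name__ == "__main__":
    # Hypothetical target: replace with a file on the volume under test.
    target = os.path.join(tempfile.gettempdir(), "mpio_probe.bin")
    stamps = probe_writes(target, duration_s=5.0)
    print(f"writes: {len(stamps) - 1}, longest stall: {longest_stall(stamps):.2f} s")
    os.remove(target)
```

A stall close to the halt seen by eye would suggest the pause is the path failover itself rather than something in the application doing the IO.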

The alerting/e-mail notifications in the MPIO Mgr also work as advertised.


So far so good, but here is where things go wrong...

My throughput test is a local copy of a directory holding about 4GB worth of files, plus a copy of a single 4GB flat file.
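A copy test like this can be timed with a few lines of script instead of a stopwatch. The sketch below is only an illustration: `copy_mb_per_s` is a made-up helper, and the sample file it creates in the temp directory is a stand-in for the real 4GB file on the MSA volume. Note that OS file caching can inflate the figure for small files.

```python
import os
import shutil
import tempfile
import time


def copy_mb_per_s(src, dst):
    """Copy src to dst and return the average rate in MB/s (10**6 bytes)."""
    size = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return size / elapsed / 1e6


if __name__ == "__main__":
    # Hypothetical paths: substitute the source file and a destination on the array.
    tmp = tempfile.gettempdir()
    src = os.path.join(tmp, "copy_test_src.bin")
    dst = os.path.join(tmp, "copy_test_dst.bin")
    with open(src, "wb") as f:
        f.write(os.urandom(16 * 1024 * 1024))  # 16 MB sample file
    print(f"{copy_mb_per_s(src, dst):.1f} MB/s")
    os.remove(src)
    os.remove(dst)
```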

The indicator lights on the front of the MSA show that both are in "active" mode, no standby. The CLI confirms that I'm running v7 and that "Current Redundancy Mode: Asym-Active/Active". Only the #1 controller seems to be busy based on the indicator lights and the CLI 'show perf' command. This seems to not be fully Active/Active.

I only get about 30MB/sec throughput and my disk queue averages over 100! With both paths connected, each is moving about half of the total. When I unplug one path, the other jumps to 30. What's up with that? If one path can do 30, then I should get 60 aggregate. The slowest bus (weakest link) in this is the 2 Gb/s Fibre Channel interface of the SAN switch. Even if I'm 'bottlenecking' through just one of those, I could move almost 10 times the data I am now (theoretically).
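As a rough sanity check on that headroom, the arithmetic can be sketched as below. It assumes the usual 2 Gb/s Fibre Channel figures, which are not stated in this thread: a 2.125 Gbaud line rate with 8b/10b encoding. Frame overhead is ignored, so real payload will come in somewhat lower (200 MB/s is the commonly quoted nominal figure). The 30 MB/s value is the one measured above.

```python
def fc_payload_mb_per_s(line_rate_gbaud):
    """Approximate one-way payload rate of an 8b/10b-encoded FC link in MB/s.

    Assumption: 8 data bits are carried per 10 line bits; frame overhead ignored.
    """
    data_bits_per_s = line_rate_gbaud * 1e9 * 8 / 10
    return data_bits_per_s / 8 / 1e6


link_mb_s = fc_payload_mb_per_s(2.125)  # assumed 2 Gb/s FC line rate
observed_mb_s = 30.0                    # measured throughput from the test above

print(f"single link ceiling: ~{link_mb_s:.1f} MB/s")
print(f"headroom over observed: ~{link_mb_s / observed_mb_s:.1f}x")
```

Under those assumptions a single link tops out a little over 200 MB/s, so the headroom is closer to 7 times than 10, but either way the fibre link is nowhere near saturated at 30 MB/s.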

Please help me speed this up.


Thanks