Storage Software

Polyserve / HBA timeout issues across SAN switch firmware upgrades

johnCatBE
Advisor

We are planning to upgrade our switches from 5.3.x to 6.0.x and beyond, in readiness to start doing Fibre Channel routing.

We have a mixed SAN: HP Polyserve / EVA8100 and Sun Cluster / Sun 3510 storage, in a dual-fabric environment.

Recently, I had occasion to reset the routing policy back to port-based routing (from exchange-based). I did this one switch at a time, one fabric at a time, during a Sunday maintenance window. This was to correct settings for EVA CA replication.

Our only issue was that one node of our 6-node Polyserve cluster panicked, i.e. it rebooted itself and its only virtual server was rescued by another node. Luckily for me, although this is a production system, it is not currently in use.

To try to avoid something similar happening across three upcoming SAN switch firmware upgrades, I want to understand why only one node misbehaved. I'm confident that there is no interruption to I/O across an upgrade from 5.3.x to 6.0.x, but there is a pause whilst the CPs swap over.

So, I started wondering whether disk I/O timeouts may have provoked this one server into leaving the party. I found via HBAnyware that NodeTimeOut has been set to 10s, whereas the default is 30s. This 10s value seems a bit tight in the context of a firmware upgrade, since the upgrade does involve a pause in I/O.
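
One way to test this idea is to compare NodeTimeOut across all six nodes and see whether the 10s value is unique to the node that panicked. The sketch below is only an illustration: the node names, the per-node values, and the expected pause length are hypothetical placeholders, to be replaced with values read from each node's HBA settings and a pause estimate for your own switches.

```python
# Sketch only: compares per-node NodeTimeOut values against the default
# and against an assumed I/O pause. All input data here is hypothetical.

DEFAULT_NODE_TIMEOUT_S = 30  # default value quoted in the post


def flag_tight_timeouts(node_timeouts, expected_pause_s,
                        default_s=DEFAULT_NODE_TIMEOUT_S):
    """Return {node: explanation} for nodes whose timeout looks suspicious.

    A node is flagged if its timeout differs from the default, or if the
    timeout does not exceed the pause expected during the switch upgrade.
    """
    findings = {}
    for node, timeout_s in sorted(node_timeouts.items()):
        reasons = []
        if timeout_s != default_s:
            reasons.append(f"differs from default {default_s}s")
        if timeout_s <= expected_pause_s:
            reasons.append(
                f"does not exceed expected pause of {expected_pause_s}s")
        if reasons:
            findings[node] = f"NodeTimeOut={timeout_s}s: " + "; ".join(reasons)
    return findings


if __name__ == "__main__":
    # Hypothetical values, entered by hand after checking each node.
    observed = {"node1": 30, "node2": 30, "node3": 10}
    for node, why in flag_tight_timeouts(observed, expected_pause_s=15).items():
        print(node, "->", why)
```

If only the node that panicked is flagged, the timeout becomes a stronger suspect; if every node shows 10s, something else distinguished that node.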

My Windows colleague has told me that the HP ProLiant Support Pack install may be setting this. My question is: should this matter in a Polyserve environment? Has anyone experienced similar issues?

No timeouts have been manually tuned to the best of my knowledge.

It seems reasonable to me to expect to do this work without interruption in a dual-fabric configuration.

regards,

John
2 REPLIES
Tammy Lawson
Advisor

Re: Polyserve / HBA timeout issues across SAN switch firmware upgrades

I know this is an old post but it may help someone else. I had a similar problem when my server support team was updating some drivers. I found that following the provided SAN maintenance instructions in the admin guide helped. You do this on each server, as you are doing maintenance on each one:

Change a Host Bus Adapter or Driver

Matrix Server and the psd driver must be disabled when you add, remove, or update a Host Bus Adapter or its driver. The following procedure describes how to change a Host Bus Adapter or driver on a server. All commands are run from the Command Prompt.

1. Stop Matrix Server:
    net stop matrixserver
2. Disable the Matrix Server service:
    mxservice -uninstall
3. Remove the psd driver from the driver stack:
    psdcoinst -uninstall
4. Reboot the server. The server will come up without Matrix Server and the psd driver.
5. Make the necessary change to the HBA or driver. Reboot the system if the HBA installation prompts you to do so.
6. (Optional.) Verify that the new HBA is supported:
    mxcheck
7. Update the registry:
    regedit /s psd.reg
8. Enable Matrix Server and the psd driver:
    mxservice -install
    psdcoinst -install
9. Reboot the server to return the psd driver to the driver stack.
10. When the system is rebooted, Matrix Server will still be disabled in the Windows Services Control Panel. Re-enable it for Automatic startup if desired.
11. Start Matrix Server (or wait until the next reboot):
    net start matrixserver

(Source: Chapter 19: SAN Maintenance, page 254. Copyright © 1999-2007 PolyServe, Inc. All rights reserved.)
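
Because the quoted procedure is repeated on every server, it may help to keep the commands in one place, grouped by the reboots that separate them. The following is a minimal sketch, not part of the admin guide: the phase names, `PHASES`, and `run_phase` are made up for illustration, and only the command lines themselves come from the procedure. It defaults to a dry run that prints each command instead of executing it. The reboots, the HBA or driver change itself, and re-enabling Automatic startup in the Services Control Panel remain manual steps.

```python
import subprocess

# Commands copied from the quoted procedure. The grouping into phases is
# an assumption for illustration: a reboot separates each phase, and the
# HBA/driver change happens between the first and second phase.
PHASES = {
    "before_change": [                 # steps 1-3, then reboot (step 4)
        ["net", "stop", "matrixserver"],
        ["mxservice", "-uninstall"],
        ["psdcoinst", "-uninstall"],
    ],
    "after_change": [                  # steps 6-8, then reboot (step 9)
        ["mxcheck"],                   # step 6 is optional
        ["regedit", "/s", "psd.reg"],
        ["mxservice", "-install"],
        ["psdcoinst", "-install"],
    ],
    "after_final_reboot": [            # step 11
        ["net", "start", "matrixserver"],
    ],
}


def run_phase(name, dry_run=True):
    """Run (or, by default, only print) the commands of one phase in order.

    Returns the command lines as strings so the caller can log them.
    Stops at the first failing command when dry_run is False.
    """
    handled = []
    for cmd in PHASES[name]:
        line = " ".join(cmd)
        if dry_run:
            print("[dry-run]", line)
        else:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
        handled.append(line)
    return handled


if __name__ == "__main__":
    for phase in PHASES:
        print("Phase:", phase)
        run_phase(phase)  # dry run: nothing is executed
```

Calling `run_phase("before_change", dry_run=False)` from an elevated Command Prompt session would execute steps 1 to 3; reviewing the dry-run output first is the safer habit.
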
johnCatBE
Advisor

Re: Polyserve / HBA timeout issues across SAN switch firmware upgrades

Tammy

Thanks for your post. Whilst the scenario you describe is not exactly the same as ours (I had no intention of, or perceived need for, disabling or stopping Matrix Server), your notes will be useful in their own right.

In fact, since this post, we've upgraded again from 6.0.x and are presently at 6.1.1d, with no more Polyserve issues (it was VMware that played up last time!).

john