HPE EVA Storage

No Failover with Securepath 4.0C SP3

 

Hi,

We are seeing strange behaviour at one of our customer sites: SecurePath does not seem to perform a failover under heavy load.

The server is a ProLiant DL585 G1 running Windows 2003 SP2 with the latest patches. It is connected to a B-Series SAN Director through 2 x QLogic 2214 HBAs.
SecurePath 4.0C SP3 is installed.
The server is a member of an MSCS cluster.
The shared cluster disks are on an EVA 5000 with the latest 3.x firmware.
(System and boot disks are local; no boot from SAN.)
Sometimes we receive Raidisk event 1026, followed by NTFS messages reporting that data could not be written to disk.
We already replaced the preferred HBA and the fibre cable to the SAN Director. However, the problem still exists.
SecurePath performs OK when we test the functionality in a maintenance window with no users online.
However, we still get these 1026 and NTFS messages (followed by a crash of the services using the shared cluster disk) when the system is under load.

Questions:
Has anybody experienced the same problem?
Is this a limitation of SecurePath, and/or can it be solved?

3 REPLIES
Uwe Zessin
Honored Contributor

Re: No Failover with Securepath 4.0C SP3

Is the FCA2214 running with SCSIport or StorPort driver?
Any other events in the log?
.

Re: No Failover with Securepath 4.0C SP3

Hi Uwe,
We're using StorPort drivers.
Sometimes we get SecurePath 515 events directly after the failure.
But it doesn't look like the other HBA is taking over. I would expect SecurePath 1028 events for that.
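
One way to check that expectation against an exported event log is sketched below. It lists every Raidisk 1026 event that is not followed by a SecurePath 1028 event within a short time window. This is only a rough sketch: the record layout (timestamp, source, event ID), the function name, the 60-second window, and the sample data are all assumptions made for illustration and do not come from any SecurePath tool.

```python
from datetime import datetime, timedelta


def unanswered_failures(events,
                        window=timedelta(seconds=60),      # assumed window, tune as needed
                        failure=("raidisk", 1026),
                        failover=("securepath", 1028)):
    """Return timestamps of failure events with no failover event within `window`.

    `events` is an iterable of (timestamp, source, event_id) tuples;
    this layout is hypothetical and must be adapted to the real export.
    """
    events = sorted(events)
    failover_times = [t for t, src, eid in events
                      if (src.lower(), eid) == failover]
    missing = []
    for t, src, eid in events:
        if (src.lower(), eid) != failure:
            continue
        # A failure counts as answered if any failover event falls in [t, t + window].
        if not any(t <= ft <= t + window for ft in failover_times):
            missing.append(t)
    return missing


if __name__ == "__main__":
    # Invented sample data, not taken from the affected system.
    sample = [
        (datetime(2000, 1, 1, 10, 0, 0), "Raidisk", 1026),
        (datetime(2000, 1, 1, 10, 0, 5), "Securepath", 1028),
        (datetime(2000, 1, 1, 11, 0, 0), "Raidisk", 1026),
        (datetime(2000, 1, 1, 11, 0, 2), "Securepath", 515),
    ]
    for ts in unanswered_failures(sample):
        print("1026 without 1028 at", ts)
```

Run against the real log, an empty result would mean every 1026 was followed by a 1028 inside the window; any timestamps listed are the failures where no takeover was logged.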

Since SecurePath does not fail over properly, we received NTFS events 55 and 50 and had corrupted file systems, which had to be restored from backup.

The system has been up and running for 2 years now, and there has been no hardware change.
Last year we upgraded SecurePath 4.0C to SP3 and applied the latest PSP 8.10.
We didn't have any problems with this until the end of April.
Very strange...
Uwe Zessin
Honored Contributor

Re: No Failover with Securepath 4.0C SP3

I've seen a number of systems that ran for some time and then suddenly failed or hung, but in all of those cases I have seen events from the port driver as well.

You could try updating the Microsoft storport.sys and then the QLogic driver. Before you do that, you might have to upgrade the Smart Array driver:

c00715130 - Advisory: Blue Screen and Stop 0x000000D1 on ProLiant Server Configured with Smart Array E200, P400, or P600 Controller Under High Utilization and Running STORPORT.SYS and HPCISSS2.SYS Version 5.8.0.xx and Any Edition of Microsoft Windows Server 2003

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c00715130
.