HPE EVA Storage

SAN Boot cluster Event 51 errors

 
Tony Tregidgo
New Member

I have 2 HP ProLiant DL580 G2 servers. I want each server to boot from our HP StorageWorks SAN and to run in a clustered configuration.

Each server has 4 Emulex FCA2408 HBAs: 2 for boot and 2 for data. This is the recommended supported configuration for a SAN boot cluster.

I have tested this configuration with Windows 2000 SP4 and Windows 2003. I am running the latest HBA drivers, the latest version of Secure Path, and HP SmartStart drivers v. 7.10, and I have applied the latest BIOS updates to the motherboard and the Emulex HBAs.

The switches are running firmware: 4.2.0b

In all scenarios, whenever the boot volume is accessed for a large file read/write operation, the system intermittently pauses, and lots of "Event 51" errors appear in the system event log.

I have checked that the Vdisk presentations, LUN IDs, and zones are all correct.

I suspect the problem is caused by running 4 HBAs in each server, which may introduce some kind of device conflict. If I disable the 2 data HBAs in Windows Device Manager and restart the system, the problem goes away. As soon as I re-enable them, the problem comes back.

The specific message I am getting is:

An error was detected on device \Device\Harddisk8\DR9 during a paging operation.
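
When the log fills with these entries, it can help to tally them per device to see whether one disk object or several are affected. The following is only a minimal sketch: it assumes the Event 51 message texts have already been exported from the system event log as plain strings, and the second device name in the sample is hypothetical.

```python
import re
from collections import Counter

# Matches the device path in the Event 51 message text quoted above.
DEVICE_RE = re.compile(
    r"An error was detected on device (\S+) during a paging operation"
)

def count_event51_by_device(messages):
    """Return a Counter mapping device path -> number of Event 51 messages."""
    counts = Counter()
    for text in messages:
        match = DEVICE_RE.search(text)
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample input: the first message is from this post, the Harddisk3 entry
# is a made-up example to show more than one device.
sample = [
    r"An error was detected on device \Device\Harddisk8\DR9 during a paging operation.",
    r"An error was detected on device \Device\Harddisk8\DR9 during a paging operation.",
    r"An error was detected on device \Device\Harddisk3\DR3 during a paging operation.",
]

for device, count in count_event51_by_device(sample).most_common():
    print(device, count)
```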

To eliminate potential page file problems, I put a local SCSI drive in the servers and used that for the page file. No improvement: the problem still exists.

The system is completely unusable in its current state, and I would greatly appreciate some help on this matter.

--
TT
4 REPLIES
JonL
Advisor

Re: SAN Boot cluster Event 51 errors

Tony

What storage system is this on? I'll assume EVA, as you mentioned Vdisks.

It really sounds like your LUN presentation is incorrect, and I know you've stated that you've checked it. But have you created two hosts per node (one for boot and one for data) so that the LUNs aren't presented to all four cards?
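
The check being asked about can be sketched as a simple set comparison: for each node, no Vdisk should appear under both its boot host and its data host. This is an illustration only. It assumes the presentations have been written down by hand from the management interface, and every host and Vdisk name below is a hypothetical placeholder.

```python
def shared_vdisks(nodes):
    """Return {node: sorted Vdisks presented to both its boot and data host}."""
    overlaps = {}
    for node, hosts in nodes.items():
        common = hosts["boot"] & hosts["data"]
        if common:
            overlaps[node] = sorted(common)
    return overlaps

# Hypothetical layout: names are placeholders, not taken from the thread.
# node2 deliberately has its boot Vdisk presented to the data host as well.
nodes = {
    "node1": {"boot": {"boot_node1"}, "data": {"sql_data", "quorum"}},
    "node2": {"boot": {"boot_node2"}, "data": {"sql_data", "quorum", "boot_node2"}},
}

print(shared_vdisks(nodes))  # {'node2': ['boot_node2']}
```

An empty result means the boot and data presentations are separate on every node. Shared data Vdisks across the two nodes are expected in a cluster and are not flagged.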
Tony Tregidgo
New Member

Re: SAN Boot cluster Event 51 errors

It is an EVA 5000.

The boot and data hosts are completely separate within Command View EVA.
Tony Tregidgo
New Member

Re: SAN Boot cluster Event 51 errors

If I change the way the boot and data paths are configured so that the boot path goes through one switch and the data path goes through the other (as opposed to spreading the boot and data paths across the 2 switches), then the problem goes away. This suggests a problem with the switches.

This is the configuration displayed in the HP diagram for a SAN boot cluster (see attached). However, it is a pointless solution because it does not provide any redundancy: if either switch 1 or switch 2 fails, the whole system will stop working (single point of failure).
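
The redundancy argument can be made concrete with a small model. In this sketch each role (boot or data) lists the switch that each of its two HBAs connects to, and a switch counts as a single point of failure for a role when every path for that role runs through it. The switch names and both layouts are assumptions made for illustration.

```python
def single_points_of_failure(paths):
    """paths: {role: [switch per HBA]}.

    Returns {switch: sorted roles that lose all paths if that switch fails}.
    """
    switches = {sw for sws in paths.values() for sw in sws}
    result = {}
    for sw in sorted(switches):
        lost = sorted(
            role for role, sws in paths.items() if all(s == sw for s in sws)
        )
        if lost:
            result[sw] = lost
    return result

# Boot and data each spread across both switches (the original setup).
spread = {"boot": ["switch1", "switch2"], "data": ["switch1", "switch2"]}

# Boot entirely on switch 1, data entirely on switch 2 (the workaround).
split = {"boot": ["switch1", "switch1"], "data": ["switch2", "switch2"]}

print(single_points_of_failure(spread))  # {}
print(single_points_of_failure(split))   # {'switch1': ['boot'], 'switch2': ['data']}
```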

So it looks as though it is not possible for us to implement a SAN boot SQL Server cluster using 4 HBAs per box (2*boot and 2*data). Instead we will use 2 local-boot DL580 servers with 2 HBAs in each for the SAN data drives.
Tony Tregidgo
New Member

Re: SAN Boot cluster Event 51 errors

We fixed the problem some time back, and I figured I should post the solution in order to conclude this thread.

It was a dodgy GBIC!! - which was intermittently failing. The offending item was replaced, and everything worked fine. It seems odd that such a small component can throw the whole system out of line so easily - plus it is quite difficult to diagnose in a multipath environment.
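
One way to narrow down a fault like this in a multipath setup is to list the components each path passes through and look for those that appear on every failing path but on no healthy one. The sketch below is hypothetical throughout: the component names are invented, and it assumes you already know which paths show errors.

```python
def suspect_components(failing_paths, healthy_paths):
    """Components common to every failing path and absent from all healthy ones."""
    if not failing_paths:
        return set()
    common = set.intersection(*(set(p) for p in failing_paths))
    healthy = set().union(*(set(p) for p in healthy_paths))
    return common - healthy

# Hypothetical paths: each is the list of components an I/O passes through.
failing = [["hba3", "sw1_port4", "gbic_sw1_p4", "switch1"]]
healthy = [["hba1", "sw1_port1", "gbic_sw1_p1", "switch1"]]

print(sorted(suspect_components(failing, healthy)))
# ['gbic_sw1_p4', 'hba3', 'sw1_port4']
```

This clears the shared switch but cannot separate an HBA from the port and GBIC it is plugged into. Telling those apart still takes swapping parts one at a time.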

In light of the problems we faced with SAN booting in general, we decided to go for local SCSI-booting machines with 2*HBAs connecting to the SAN for app and data volumes. Much more stable!