Windows Server 2003

Windows 2003 Clustering

 
Cynthia Phalen
Occasional Contributor

We are seeing an issue when building clusters: right after clustering is installed everything looks good, but after a couple of reboots problems start to appear.

You can bring node 1 up fine, but when you try to join node 2 to the cluster, you get SCSI bus reset errors on each resource in the cluster and the cluster service will not start.

After about 5 minutes the SCSI bus resets itself, and then you can start the cluster service and the second node runs as it should.

We are seeing this issue on both SCSI-based and Fibre Channel-based clusters.

We have been building clusters for years and this problem has just started to occur.

Has anyone seen this before? Any thoughts?
Patrick Terlisten
Honored Contributor

Re: Windows 2003 Clustering

Hello Cynthia,

Can you please post some event IDs and a short overview of your software setup (OS, SP level, ...)?

Regards,
Patrick
Kevin Walker_3
Occasional Advisor

Re: Windows 2003 Clustering

I have two DL360 G4p servers running Windows Server 2003 Enterprise Edition SP1. Each has a single QLogic HBA attached to an EVA3000.

I presented a LUN for the quorum drive and proceeded to install / configure MS Cluster.

Cluster install went fine.

When the primary node comes online, all services start and the server is operational. When the redundant node is powered on, I start to receive error messages in Event Viewer (Event IDs 1209, 118, 1034, 7031). They basically say that the cluster service is starting before the HBA / LUN is available, and since the quorum is not present, the cluster service fails. A few seconds later, once the quorum comes online and is available, the cluster service starts.

If you fail the cluster over from the primary node to the secondary node, the failover succeeds. If you then reboot the primary node, exactly the same thing happens to it.

Has anyone seen this issue before?

Any help is appreciated.

Thanks.
Rune J. Winje
Honored Contributor

Re: Windows 2003 Clustering

Check out:
http://support.microsoft.com/kb/923830

The Clusdisk.sys update is important.
The storport.sys update is important. If you are using Storport, use this version: http://support.microsoft.com/?id=916048

An updated/correct version of the Storport driver component (or the SCSIport driver component) for the QL2300 card is important.

QLogic 2300 = HP FCA2214, so search for FCA2214 on HP's pages. For Windows 2003:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=315741&prodTypeId=12169&prodSeriesId=439556&swLang=8&taskId=135&swEnvOID=1005

If you boot from SAN on the same controller as the cluster disks (for some necessary reason...), the registry setting in
http://support.microsoft.com/?id=886569 is necessary on all nodes.

Review the design of the SAN. Zoning etc. is important. LUN masking may be important. Are there tape devices, other devices, or other servers on the same fabric that could send SCSI resets?

Check the cluster.log for quorum disk arbitration failures.
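
If the log is large, a small script can pull out the lines worth reading first. This is only a rough sketch: the keyword list (arbitrat, quorum, reserv) is my own guess at useful search terms, not an official list of cluster.log messages, and `find_quorum_lines` is just a helper name made up for this example. Run it against a copy of the log and pass the path on the command line.

```python
import re
import sys

# Assumption: these keywords are a guess at what is worth reading first,
# not an official catalogue of cluster.log message texts.
KEYWORDS = re.compile(r"arbitrat|quorum|reserv", re.IGNORECASE)


def find_quorum_lines(lines):
    """Return (line_number, text) pairs for lines that match KEYWORDS."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if KEYWORDS.search(line):
            hits.append((number, line.rstrip()))
    return hits


# Only runs when a path to a copy of the log is given as an argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for number, text in find_quorum_lines(handle):
            print(f"{number}: {text}")
```

Compare the timestamps of the matching lines with the SCSI bus reset events in the system event log to see whether they line up.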

Check that SCSI-3 persistent reservations are not enabled:
http://support.microsoft.com/?id=911030
Also search your registry for UsePersistentReservations and change it to 0 if it is 1.
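
One way to do the registry search on several nodes is to export the relevant hive from Registry Editor to a .reg file on each node and scan the text. A minimal sketch, assuming the value is stored as a DWORD and the export uses the usual `"Name"=dword:0000000x` text form; `find_persistent_reservations` is a made-up helper name, and the key paths in the usage note below are placeholders, not the real location of the setting.

```python
import re

# A key header in a .reg export looks like: [HKEY_LOCAL_MACHINE\...]
KEY_HEADER = re.compile(r"^\[(.+)\]$")
# Assumption: the value is a DWORD, exported as "Name"=dword:XXXXXXXX
VALUE_LINE = re.compile(
    r'^"UsePersistentReservations"=dword:([0-9a-f]{8})$', re.IGNORECASE
)


def find_persistent_reservations(reg_text):
    """Return (key_path, value) pairs for every UsePersistentReservations entry."""
    current_key = None
    found = []
    for raw in reg_text.splitlines():
        line = raw.strip()
        header = KEY_HEADER.match(line)
        if header:
            current_key = header.group(1)
            continue
        value = VALUE_LINE.match(line)
        if value:
            # The DWORD is written in hexadecimal in the export.
            found.append((current_key, int(value.group(1), 16)))
    return found
```

Registry Editor exports are normally UTF-16, so read the file with something like `open(path, encoding="utf-16").read()` before passing the text in. Any pair that comes back with a value of 1 is a place to change to 0.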

Watch out for GBIC problems; they are tricky to troubleshoot...

Cheers,
Rune