11-19-2004 11:35 PM
Problem with HP Cluster
Hi,
Our HP cluster failed this week for no apparent reason. We’re running Windows 2003 Enterprise with Microsoft clustering on two HP DL740s, using HP Secure Path back to an MSA1000.
According to the cluster log, the quorum drive went offline and was inaccessible for approximately 9 minutes. Then several other drives on the SAN went offline as well.
Whatever happened caused the SCSI buses not to close down properly, and our SQL database was corrupted.
The quorum drive consists of four 36.4 GB 10,000 rpm hard drives. These four drives are in a RAID 1 setup and then striped. All disks are split over different controllers.
There is no way that this should have failed unless the entire SAN failed (unlikely). As the other drives went offline over a period of 9 minutes, this doesn’t make sense either.
As all the drives, SAN and controllers appear to be OK and the cluster is now performing correctly, I have no idea how to diagnose what happened.
Any advice or comments would be appreciated.
2 REPLIES
11-22-2004 02:25 AM
Re: Problem with HP Cluster
The best way to fix something like that is to eliminate the impossibilities and fix whatever remains. Scientific method: find ways to disprove possibilities based on the evidence.
If all the LUNs went off at once, then it indicates the problem can't be based in a single LUN. We're left with a problem with the server, Fibre Channel HBA(s), RAID manager, switches, or fabric.
Multiple servers had the same problem? That eliminates a software- or server-based problem.
Redundant fabrics? Then it can't be a switch problem, or a lost connection.
Redundant array managers? Then it can't be the control board in the disk array, but it still could be the backplane.
Check the SAN switches for module disconnects or power events. Possibility of a "cleaning crew unplugged it" problem?
A word of caution: don't eliminate anything unless the evidence shows it isn't possible. Don't eliminate that new shiny GBIC just because it was only installed last week.
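The elimination process above can be sketched in a few lines of Python. This is only an illustration: the candidate names, the `Evidence` type, and the observations listed are hypothetical stand-ins, not findings from this cluster. A candidate is removed only when a piece of evidence explicitly rules it out.

```python
from dataclasses import dataclass

# Hypothetical fault domains, loosely following the list in the reply.
CANDIDATES = {
    "single LUN",
    "server or software",
    "HBA",
    "switch or fabric link",
    "array controller",
    "backplane",
}


@dataclass(frozen=True)
class Evidence:
    description: str        # what was observed
    rules_out: frozenset    # candidates this observation disproves


def remaining_suspects(candidates, evidence):
    """Remove only the candidates that some observation disproves."""
    suspects = set(candidates)
    for item in evidence:
        suspects -= item.rules_out
    return suspects


# Hypothetical observations; replace with what the logs actually show.
observations = [
    Evidence("several LUNs went offline", frozenset({"single LUN"})),
    Evidence("both nodes saw the problem", frozenset({"server or software"})),
    Evidence("fabrics are redundant", frozenset({"switch or fabric link"})),
    Evidence("array controllers are redundant", frozenset({"array controller"})),
]

suspects = remaining_suspects(CANDIDATES, observations)
print(sorted(suspects))
```

Note that the backplane and the HBA stay on the list here because no observation disproves them, which is the point of the caution above.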
There have been Innumerable people who have helped me. Of course, I've managed to piss most of them off.
11-22-2004 06:09 AM
Re: Problem with HP Cluster
Stephen
Why do you have such a behemoth quorum disk? Where is your data? Please post information about your disk and cluster groups.
Have you reviewed the System and Application logs? Read them and look for references to Secure Path events at the time of the problems.
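If the logs are long, one way to narrow them down is to export them to CSV and filter by time window and keyword. The Python sketch below assumes a CSV export with `TimeGenerated`, `SourceName` and `Message` columns and a `YYYY-MM-DD HH:MM:SS` timestamp; those column names, the timestamp format, the `securepath` keyword and the sample rows are all placeholders, not real log output, so adjust them to match your actual export.

```python
import csv
import io
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout of the export


def events_in_window(csv_text, start, end, keyword):
    """Return rows between start and end whose source or message mentions keyword."""
    keyword = keyword.lower()
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        when = datetime.strptime(row["TimeGenerated"], TIME_FORMAT)
        text = (row["SourceName"] + " " + row["Message"]).lower()
        if start <= when <= end and keyword in text:
            matches.append(row)
    return matches


# Placeholder rows for illustration only; these are not real events.
SAMPLE = """TimeGenerated,SourceName,Message
2004-01-01 10:00:00,ExampleSource,placeholder entry before the window
2004-01-01 10:05:00,SecurePath,placeholder path event
2004-01-01 10:07:00,ExampleSource,placeholder entry mentioning securepath
2004-01-01 10:30:00,SecurePath,placeholder entry after the window
"""

# A 9-minute window, matching the outage length reported in the first post.
start = datetime(2004, 1, 1, 10, 2, 0)
end = start + timedelta(minutes=9)

hits = events_in_window(SAMPLE, start, end, "securepath")
for row in hits:
    print(row["TimeGenerated"], row["SourceName"], row["Message"])
```

Widening the window a few minutes either side of the outage may help catch whatever happened just before the quorum drive went offline.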
How do you know "the SCSI buses didn't close down properly"?
Regards