08-03-2010 08:44 AM
Known issue with ESX and MSCS (Microsoft Cluster) = very slow BfS. Architectural limitation of ESX with MSCS
Terri was working on a Microsoft Clustering issue with VMware ESX and at least reached an understanding of the situation:
************************************************************************************
This is an FYI only.
ISSUE: c-Class BL495 blade servers using BfS (Boot from SAN) with ESX 4.0 U2. If a blade was rebooted, it could take up to 22 minutes to re-acquire access to the LUN and actually boot. This was a Virtual Connect solution with Brocade SAN switches.
I was able to verify that the delay was not in the blade logging back in to the Brocade switch. Once the customer accepted this, he worked with VMware tech support, who provided the following information:
Using VMDKs would prevent the boot issues by containing the SCSI reservations within a file instead of an actual volume. However, that limits you to keeping both MSCS nodes on the same host, which is not ideal.
Unfortunately, beyond that, the only thing we can do about the slow boot is adjust the Scsi.UWConflictRetries count to reduce the wait time. This is an architectural limitation of ESX with MSCS, based simply on the reservation types that the two use.
One thing we may want to try: we can reduce the amount of time that hostd waits for the volumes. If the driver itself is what is taking the time, this will not help, but if it is actually ESX that is taking that long, it may improve things. To do so, go to Configuration -> Advanced Settings (Software) -> Scsi.UWConflictRetries and change the value to 80. I'd go ahead and make the change on all of your hosts with MSCS clusters, and consider continuing to isolate them as you get downtime for maintenance. Ideally, we'd have the MSCS nodes off in their own 2-4 host cluster so that the normal VM clusters are completely separate from them and unaffected by the other side effects of MSCS clustering.
***********************************************************************************
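For anyone who would rather make the Scsi.UWConflictRetries change described above from the ESX service console than through the client, here is a rough sketch. It only prints the commands; it does not run them. The console path /Scsi/UWConflictRetries is my assumption, mapped from the GUI name in the VMware note, so check it with the -g (get) form on your own host before setting anything. The function and constant names are made up for illustration.

```python
# Sketch only: builds (does not run) the ESX service-console commands
# that read and then set the advanced option named in the post.
# ASSUMPTION: the GUI setting Scsi.UWConflictRetries maps to this console path.
OPTION_PATH = "/Scsi/UWConflictRetries"
NEW_VALUE = 80  # value suggested by VMware support in the post


def build_commands(value=NEW_VALUE, option=OPTION_PATH):
    """Return the commands to show the current value, then set the new one."""
    if not isinstance(value, int) or value < 0:
        raise ValueError("retry count must be a non-negative integer")
    return [
        f"esxcfg-advcfg -g {option}",          # read the current setting
        f"esxcfg-advcfg -s {value} {option}",  # write the new setting
    ]


if __name__ == "__main__":
    for cmd in build_commands():
        print(cmd)
```

Running the -g command first gives you the old value to write down, so the change can be backed out if it makes no difference to the boot time.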
Let us know if you have run into this situation and what you have done to resolve the problem if anything.