QDSKINTERVAL Setting
05-24-2011 12:33 PM
OpenVMS V8.3
Two EVA5000 disk storage arrays, with quorum disks defined.
I discovered that attempting to upgrade the disk firmware on our EVA5Ks would cause a quorum disk timeout when the EVA5K pauses at the end of the upgrade. I opened a case and found that HP has a "note" about this and recommends changing QDSKINTERVAL from the default of 3 seconds to 10 seconds.
How would this affect the cluster integrity?
I asked about also changing the RECNXINTERVAL but was told to leave it at 20 seconds.
Thoughts please?
TIA
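For reference, the conventional way to make a system parameter change like this persistent is via MODPARAMS.DAT and AUTOGEN; a sketch, to be adapted to your configuration (whether you include the REBOOT phase is your call):

```
$ ! Add the recommended value to SYS$SYSTEM:MODPARAMS.DAT:
$ !     QDSKINTERVAL = 10
$ !
$ ! Then have AUTOGEN validate and write the new parameter set:
$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS
$ !
$ ! The new value takes effect at the next reboot; QDSKINTERVAL
$ ! should generally be set to the same value on all cluster members.
```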
05-24-2011 12:48 PM
Re: QDSKINTERVAL Setting
The HP-provided work-around implies that the device might be catatonic for longer than ten seconds, too.
I might well look at connecting a shared SCSI bus between the two boxes (if you can get a supported configuration for that) and moving the quorum disk off the SAN. Or add a third voting node to the cluster.
RECNXINTERVAL: host-to-host activity
QDSKINTERVAL: host-to-quorum-disk activity
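Both values can be checked on a running system from SYSGEN; a quick look (output format varies by OpenVMS version):

```
$ MCR SYSGEN
SYSGEN> SHOW RECNXINTERVAL
SYSGEN> SHOW QDSKINTERVAL
SYSGEN> EXIT
```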
In short: follow what HP recommends. You have paid them for the privilege of calling them for support, and particularly to have them resolve these cases for you, after all.
05-25-2011 05:50 AM
Re: QDSKINTERVAL Setting
There are no cluster integrity issues with raising the value of QDSKINTERVAL. The idea is to raise it enough to ride out the pause during your storage bounce.
Additionally, I'm not sure how your nodes are connected to each other. The only way someone would say to leave RECNXINTERVAL at 20 is if they are hard-connected.
The two parameters, as Hoff stated, differ in what they are looking for. My feeling on two-node clusters is to keep the nodes close and run a crossover cable between them. Best $7 you will ever spend for SCS communications. SCS always speaks down the path of least resistance, plus we can use SCACP to prioritize that channel. Thus, as long as both nodes are up, you are good. But that is only for host-to-host.
- Dave Sullivan
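The SCACP prioritization mentioned above is done from the SCACP utility; a sketch (the device name EWB and the priority value are assumptions for the crossover NIC — check SCACP's HELP for the exact qualifiers on your version):

```
$ MCR SCACP
SCACP> SHOW CHANNEL                    ! list SCS channels in use
SCACP> SET LAN_DEVICE EWB /PRIORITY=2  ! prefer the crossover NIC (EWB assumed)
SCACP> EXIT
```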
05-25-2011 06:38 AM
Re: QDSKINTERVAL Setting
But to be clear, it isn't a replacement for the quorum disk.
The quorum disk is what allows this cluster to survive the failure of either host within the cluster. (The alternative here being the addition of a third voting node.)
The multi-host (shared) SCSI with a quorum disk on that shared SCSI bus allows you to avoid the issues arising from the quorum disk access delay should that EVA 5000 box go walkabout.
As for avoiding switches, that's your call. Some folks do like that configuration. But I prefer having available ports on the core cluster network, and (ignoring the expensive managed switches, which can and sometimes do fail more often than I'd prefer) the "dumb", cheap, unmanaged switches tend to be quite reliable.
05-25-2011 07:27 AM
Re: QDSKINTERVAL Setting
My thinking on QDSKINTERVAL is that lengthening the polling interval would reduce, but not eliminate, the chance of a quorum disk timeout; i.e., it would still be possible for the polls to occur while the disk controller was "frozen".
Am I thinking correctly here?
05-25-2011 08:03 AM
Solution
Four failed polling I/O requests sent to the quorum disk cause the votes from the quorum disk to be discounted; the quorum disk is effectively ejected from quorum calculations. You get four failed I/Os (for whatever reason), basically.
This processing means that the maximum delay expected for a departing host, with QDSKINTERVAL set to 10 here, is circa 40 seconds. (Clean cluster exits can be and usually are faster than that.) This is also the duration of the quorum hang during an unclean exit.
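The 40-second figure is just the four failed polls at the 10-second interval; as a trivial DCL check:

```
$ qdskinterval = 10
$ polls = 4
$ delay = polls * qdskinterval
$ WRITE SYS$OUTPUT "Worst-case quorum disk timeout: ''delay' seconds"
```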
If you have tighter timing tolerances and more stringent timing requirements here, then you do have some choices:
- Move the quorum disk to an interconnect with sufficiently fast and, particularly, more consistent response times.
- Add additional voting nodes (to total three or more).
- Migrate to a different clustering technology that better meets your application's needs.
- Replace that EVA controller with one that reacts within your timing requirements.
- Work with HP to get that EVA to react more quickly in these cases.
FWIW, this EVA (mis)behavior and its pause hit all I/O activity, not just the quorum disk polling. (You're just not seeing the application pause because you're wedged in a cluster-wide quorum hang here, and the applications are apparently not specifically coded to continue operations during the quorum hang. Yes, you can do that to a degree, if you're willing and careful, and play within the rules.)
As a side note, I'd consider asking HP what to expect as the worst-case EVA controller wedge duration. That's the key determinant here, and it looks to be somewhere between 10-ish and 40 seconds.
Prior to about V7.2, the default QDSKINTERVAL was 10.
Here is a write-up on a low-end cluster configuration of two voting hosts:
http://labs.hoffmanlabs.com/node/569
My previous reply indicated 3x. That recollection was incorrect, based on what I see in the available low-level cluster docs. It's a 4x poll before a decision is rendered.
05-25-2011 08:06 AM
Re: QDSKINTERVAL Setting
Clean cluster exit: a shutdown, or a crash, or any of the failure paths that send out the so-called last-gasp datagram.
Disconnecting a cable won't send that datagram, for instance, so you'll get the full 4x poll.
05-25-2011 08:51 AM
Re: QDSKINTERVAL Setting
We'll start scheduling downtime to change QDSKINTERVAL.