02-17-2005 08:06 PM
Cluster hang when one node under SYSBOOT
We've got a small problem I've never seen before. The cluster consists of two DS25s running VMS 7.3-2, with an MSA1000 and all patches applied.
A quorum disk is defined and all votes are OK (EXPECTED_VOTES 3, VOTES 1, QDSKVOTES 1).
When we shut down both nodes, rebooting one of them into SYSBOOT (b -fl x,1) prevents the other from booting (it hangs just after starting CPU #1).
Continuing the boot of the first member (c at the SYSBOOT> prompt) makes everything work fine.
We have no problem if we do this (put the other node at SYSBOOT) while one node is up. Any ideas?
Thx
02-17-2005 08:48 PM
Re: Cluster hang when one node under SYSBOOT
If the booting node hangs, just force a crash: press HALT, then enter >>> crash
Once everything is up again, you can look at the dump and try to figure out why the node does not continue.
Volker.
02-17-2005 08:53 PM
Re: Cluster hang when one node under SYSBOOT
To me, that sounds just like the desired behaviour.
(I am assuming that you boot both nodes from the same system disk.)
In the early phases of the bootstrap the system does a physical access of the system disk. It knows where to find the boot block, and from there, where to find the xVMSSYS.PAR file. After reading that file, the system waits at the SYSBOOT> prompt.
At that moment you are accessing the disk, but have NOT mounted it. And you still have the ability to modify system parameters (including VAXCLUSTER) before they are loaded and used (that is the PURPOSE of SYSBOOT).
So the system also does not yet know about the quorum disk.
Now, if you try to boot the second system, after starting the CPUs it tries to access the system disk. It is a good thing that this is not allowed: it would mean two systems accessing the same disk without coordination, an all too easy way to generate corruption.
Soon after you give the CONTINUE command, the disk is formally mounted as the system disk, and the system "knows" that it is to be a cluster member and what the quorum disk is. The quorum disk is then accessed as well, and now there is a valid cluster in which another booting node is allowed access to the system disk. So the second node can continue, as you have experienced.
In the case of one node being shut down and rebooting while the other stays up, it directly encounters a situation where the other node and the quorum disk form a valid configuration, so it can simply continue.
Hth,
Proost.
Have one on me.
Jan
02-17-2005 08:59 PM
Re: Cluster hang when one node under SYSBOOT
I agree with Jan. If your system is at SYSBOOT, I guess it does not yet know it is part of a cluster and the system disk is not formally mounted, and as a result the other node will not boot.
Jan, thanks a lot for your explanation... I am really becoming your fan :) You have managed to explain it very well.
regards
Mobeen
02-17-2005 09:16 PM
Re: Cluster hang when one node under SYSBOOT
>(I am now assuming that you boot from the same system disk).
And you're right ;-)
>In the early phases of the bootstrap the system does a physical access of the system disk. It knows where to find the bootblock, and from there, where to find the xVMSSYS.PAR file. After reading that is when the SYSBOOT> wait function is.
>At that moment you are accessing the disk, but have NOT mounted it. And you still have the ability to modify system params (including VAXCLUSTER) before loading them and using them (that is the PURPOSE of SYSBOOT).
>So, the system also does not yet know about the quorum disk.
Yes
>Now, if you try to boot the second system, after starting the CPU's you are trying to access the system disk. It is a good thing that that is not allowed, it would be two systems uncoordinatedly accessing the same disk, an all too easy way to generate corruption.
Yes, but... when you're at SYSBOOT, the only things you can do are modify system parameters (yes, including VAXCLUSTER, VOTES...) or change the system startup file (SET/STARTUP). So I was thinking (but I may be wrong) that, in that case, the other node would form the cluster (with all the required votes, including the quorum disk vote) and then decide whether it can "allow" the first node to join the cluster?
02-17-2005 09:46 PM
Re: Cluster hang when one node under SYSBOOT
02-17-2005 10:19 PM
Re: Cluster hang when one node under SYSBOOT
Fortunately we have a similar configuration here (the only difference is the VMS version: V7.3-1, not V7.3-2). And since we're still in the process of configuring the cluster (no applications or users yet), I took the opportunity to test it. And guess what: with one node at the SYSBOOT> prompt, the other node boots normally and happily forms a cluster with the assistance of the quorum disk.
So Jan,
>To me, that sounds just like desired behaviour.
I think that proves you wrong (no pun intended).
So, IMHO, either they (= engineering) have added a feature to bootstrapping and/or clustering, or something else is wrong with your configuration. As already advised, you might want to take a crash dump on the "hanging" node if and when this happens.
Regards,
Kris (aka Qkcl)
02-17-2005 10:28 PM
Re: Cluster hang when one node under SYSBOOT
>in that case, the other node will form the cluster (with all the required votes and quorum disk vote) and then decide if it can "allow" the first node to join the cluster ?
The point is that _BEFORE_ the second node can reach the point where it can form a cluster, it has to get _ITS_ parameters from disk. And it is not allowed to access that disk... A chicken-and-egg situation.
That is why I concluded that those systems boot from the same disk.
If you have a configuration with two (or more) system disks, then, starting from cluster-is-down, if you boot one node into SYSBOOT from disk A, you CAN boot another system from another disk, and THAT system can then form the cluster.
Configurations like that include multi-architecture clusters, but also mixed-version clusters (e.g. during a rolling upgrade).
Proost.
Have one on me.
Jan
02-17-2005 10:47 PM
Re: Cluster hang when one node under SYSBOOT
There's nothing wrong with the idea of having multiple systems at the SYSBOOT prompt just accessing (reading and/or writing) their system parameter files (using the boot driver) on the same disk. It should not cause any problems or conflicts with other nodes booting or running from that same system disk.
There's NOTHING in OpenVMS that PREVENTS you from booting 2 systems with VAXCLUSTER=0 from the same disk.
Philippe, before forcing a crash, you can also use the VERBOSE mode boot flag to obtain maximum information during boot. Consider capturing the console output to a file (using a terminal emulator or console management application):
>>> b -fl x,30001
This will make it easier to find out which operations have completed and which have not, once the system hangs.
Volker.
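For readers unfamiliar with the flag values: the second field of -fl is a hexadecimal bit mask. The following is a minimal illustrative sketch in Python (not anything that runs on the console) that decodes the two values used in this thread. The bit names are an assumption based on the commonly documented OpenVMS Alpha boot flags, and only the three bits relevant here are listed; check the documentation for your version before relying on them.

```python
# Subset of OpenVMS Alpha boot flag bits (hexadecimal).
# Assumption: names and meanings follow the commonly documented flag table;
# only the bits used in this thread are included.
BOOT_FLAGS = {
    0x00001: "CONV",       # conversational boot: stop at the SYSBOOT> prompt
    0x10000: "DBG_INIT",   # verbose messages during boot
    0x20000: "USER_MSGS",  # descriptive messages during boot
}


def decode_boot_flags(flags_hex: str) -> list[str]:
    """Return the names of the known bits set in a hex boot-flag string."""
    value = int(flags_hex, 16)
    names = [name for bit, name in BOOT_FLAGS.items() if value & bit]
    # Report any bits this small table does not know about.
    unknown = value & ~sum(BOOT_FLAGS)
    if unknown:
        names.append(f"other bits: {unknown:X}")
    return names


if __name__ == "__main__":
    for flags in ("1", "30001"):
        print(flags, "->", ", ".join(decode_boot_flags(flags)))
```

Under these assumptions, 30001 still has the conversational bit set, so a node booted with it also stops at SYSBOOT> and has to be continued before the verbose messages appear.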
02-17-2005 11:38 PM
Re: Cluster hang when one node under SYSBOOT
With VERBOSE boot messages turned on, it should be quite easy to find out where the system is hanging. Just make a note of the last message printed before the system hangs, then enter SYSBOOT> CONT on the other node and capture the next message issued by the 'hung' (and now continuing) node.
This 'next' message should relate to the 'blocked' resource or access.
Volker.
02-17-2005 11:39 PM
Re: Cluster hang when one node under SYSBOOT
It looks as if the other node has EXPECTED_VOTES set too high or has no quorum disk configured. You mentioned that you checked this, but check it again on both nodes.
As posted before, it should be possible to have several systems at SYSBOOT> without their holding each other up. My guess would be to check VOTES, EXPECTED_VOTES and VAXCLUSTER again (SHOW/CLUSTER).
AvR
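To make the vote arithmetic concrete: OpenVMS derives quorum from EXPECTED_VOTES as (EXPECTED_VOTES + 2) / 2, truncated to an integer. The Python sketch below applies that formula to the values given in the first post (EXPECTED_VOTES 3, VOTES 1 per node, QDSKVOTES 1). It is a simplified model for illustration only; the function names are made up, and it ignores details such as the delay before a booting node is allowed to count the quorum disk's votes.

```python
def quorum(expected_votes: int) -> int:
    """Quorum as derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def can_form_cluster(expected_votes: int, node_votes: list[int], qdskvotes: int = 0) -> bool:
    """Simplified check: do the present votes reach quorum?

    node_votes holds VOTES for each node that is actually up;
    qdskvotes is counted only if the quorum disk is reachable.
    """
    return sum(node_votes) + qdskvotes >= quorum(expected_votes)


if __name__ == "__main__":
    # Values from the first post: one node up, quorum disk available.
    print(quorum(3))                        # quorum is 2
    print(can_form_cluster(3, [1], 1))      # one node + quorum disk reaches quorum
    print(can_form_cluster(3, [1], 0))      # one node alone does not
    print(can_form_cluster(5, [1], 1))      # EXPECTED_VOTES set too high: no quorum
```

With the posted values a single node plus the quorum disk has 2 votes against a quorum of 2, so a lone node should be able to form the cluster; an EXPECTED_VOTES value that is too high would leave it waiting for quorum instead.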
02-17-2005 11:51 PM
Re: Cluster hang when one node under SYSBOOT
$ MCR SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> PARAMETERS SHOW/CLUSTER
and compare the results displayed for each node, in particular EXPECTED_VOTES, DISK_QUORUM and QDSKVOTES.
Purely Personal Opinion
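If the SYSMAN output is captured to a file, the comparison can be done mechanically. The sketch below is a hypothetical Python helper, not an OpenVMS tool: it assumes the parameter values have already been collected into one dictionary per node, and the node names and the DISK_QUORUM device name in the example are invented for illustration.

```python
def param_mismatches(nodes: dict[str, dict[str, object]], names: list[str]) -> dict[str, dict[str, object]]:
    """Return {parameter: {node: value}} for every listed parameter whose
    value is not identical on all nodes."""
    mismatches = {}
    for name in names:
        values = {node: params.get(name) for node, params in nodes.items()}
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches


if __name__ == "__main__":
    # Hypothetical values; NODEA, NODEB and $1$DGA1 are placeholders.
    nodes = {
        "NODEA": {"EXPECTED_VOTES": 3, "DISK_QUORUM": "$1$DGA1", "QDSKVOTES": 1},
        "NODEB": {"EXPECTED_VOTES": 3, "DISK_QUORUM": "", "QDSKVOTES": 1},
    }
    for name, values in param_mismatches(nodes, ["EXPECTED_VOTES", "DISK_QUORUM", "QDSKVOTES"]).items():
        print(name, values)
```

In this made-up example only DISK_QUORUM is reported, which is the kind of mismatch (a node with no quorum disk defined) that the comparison is meant to catch.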
02-18-2005 12:08 AM
Re: Cluster hang when one node under SYSBOOT
I've asked the operator to force a crash, and I will try a "verbose" boot (a very good idea, why wasn't it mine ;-)
It is not a quorum problem. I've been working with VMS clusters for a long time, and that was the first thing I checked. The hang also does not occur at the point where the cluster is formed (I've seen that many, many times on many, many clusters); it happens right at the beginning of the boot sequence (starting CPU #1).
Thanks to you all. Stay tuned, please ;-)
Thx
02-18-2005 02:17 AM
Re: Cluster hang when one node under SYSBOOT
Thanks for your help and patience