04-22-2021 04:26 PM
Re: Two node cluster, but only one at a time is up
>> Management is in the process of doing it and it takes weeks for renewal, bad timing it broke is all.
Good to hear. Too bad they did not plan ahead and make use of the prior system manager's (lifelong) experience to cross-train, but unfortunately that is too often how it goes.
>> But I thought this is the forum to discuss and get help each other as I am interested to debug and fix.
And you did great so far, as I indicated. You found the best of the best. But there are limits to what one can convey in a forum, and the back-and-forth can take a lot of time.
Those limits showed when you started to describe disks in totally amateurish terms: "quorum disk was changed from thick to thin". We were pleased to see you identified it as a quorum disk, though! That was essential.
It wouldn't surprise me if Dave Lennon identified a critical step: was that disk initialized? Or really: when the disk was replaced, what steps were taken to restore its original contents? Was a backup restored as per your system operations playbook? Or was it restored through magic storage actions?
>> It will be discouraging for unix people to learn openvms when I see this.
I don't think so, but to each their own opinion.
Folks have been going out of their way to help get you on track and have been very responsive to a problem which originally had NOTHING pertinent to go on beyond "it doesn't work" - no error message, no (screen) output to show what led you to the conclusion it was not working, barely an identification of the bits and pieces. The expression "like pulling teeth" comes to mind. Now that you have learned a bunch more, I encourage you to read back your original problem report and see how it really needs a mind reader to help you. Fortunately, you found one.
Good luck,
Hein.
04-22-2021 06:43 PM
As others have stated, there is only so much that can be done in a forum like this. Where are you located? Even a phone consult may help resolve this. There appears to be a fundamental pathway missing here that may more easily be found via phone or a terminal session. Send a private message to us to set it up. There may be a charge, as this is how many of us make a living, but if it is important to get this system working, take advantage of the contact points here.
Dan
04-23-2021 12:18 AM - edited 04-23-2021 06:25 AM
I agree with the advice given by others: you do need an experienced OpenVMS consultant to diagnose and fix this problem. Maintaining a working OpenVMS cluster does NOT need a full-time OpenVMS consultant, but in a situation like this, you need experienced help - as you've probably learned by now. Go and convince your management. Note that you could contact e.g. Dan (abrsvc) via personal mail in this forum.
Being in the same/similar timezone as 'the problem' also helps - although it gives me a lot of time to diagnose the information you've posted 'last night' and prepare some more questions to further narrow down on the problem. It also allows me more time to re-think and re-edit my reply.
Here is a refined problem description:
2 node Itanium OpenVMS V8.4 Blade SAN cluster with quorum disk - only ONE node can be started at a time, the 2nd one hangs after the following console messages:
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
%CNXMAN, Have "connection" to quorum disk
Google is your friend, but you need experience in OpenVMS troubleshooting to know what to search for...
Start searching for "Have connection to quorum disk" - you'll find a couple of articles with this symptom. None of them will give you a solution, but they will help you learn about the context. This message is output by the connection manager if the node cannot create or join the cluster within about 2 minutes after boot.
The important thing here is what's NOT shown on the console! Assuming you've literally copied ALL console output, the missing piece is a message like %CNXMAN, have connection to system XXXXXX
This message would indicate that the booting node is SEEING the 'other' node via one of the cluster communication LAN paths, in this case one of the LAN failover sets (LLc0) or a physical LAN interface. This currently does NOT seem to be the case, and that's preventing the 2nd node from joining the cluster with the other node.
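The check described above (quorum disk reached, but no "connection to system" message) can be sketched as a small script run over a saved console log. This is only an illustration in Python, not an OpenVMS tool: the function name `summarize_cnxman` is hypothetical, and the message patterns are assumptions based on the wording quoted in this thread, so adjust them to the exact text your console shows.

```python
import re

# Assumed message shapes, taken from the console text quoted in this thread.
# The wording on a real console may differ; adjust the patterns if needed.
QUORUM_DISK_RE = re.compile(r'%CNXMAN,.*connection"? to quorum disk', re.IGNORECASE)
SYSTEM_RE = re.compile(r'%CNXMAN,.*connection"? to system\s+(\S+)', re.IGNORECASE)


def summarize_cnxman(console_text):
    """Return (saw_quorum_disk, systems_seen) for one boot's console output."""
    saw_quorum_disk = False
    systems_seen = []
    for line in console_text.splitlines():
        if QUORUM_DISK_RE.search(line):
            saw_quorum_disk = True
        match = SYSTEM_RE.search(line)
        if match:
            systems_seen.append(match.group(1))
    return saw_quorum_disk, systems_seen


# The hanging node's console output as posted in this thread.
hanging_boot = '''\
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
%CNXMAN, Have "connection" to quorum disk
'''

saw_qdisk, systems = summarize_cnxman(hanging_boot)
if saw_qdisk and not systems:
    print("Quorum disk reached, but no other cluster member was seen")
```

Running the same function over the console log of a known-good boot should list the other node's name, which makes the difference between the two boots easy to spot.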
Please try to answer the following questions by providing detailed data:
1) What EXACTLY happened when the problem started, which you described as 'One of the nodes in the cluster was down'? Please provide the console output from BOTH systems from the time 'when that node went down' - you have now learned how to scroll the console output.
2) try a conversational boot and look at the relevant cluster system parameters of the 'hanging' node
In one of your posts, you showed:
SYSBOOT> set STARTUP_P2 "YES"
SYSBOOT> continue
Although setting STARTUP_P2 "YES" does NOT help in this case, try to repeat whatever commands you've entered to get to the SYSBOOT> prompt (scroll back through the console log to review your commands) and issue the necessary SHOW ... commands to view the critical cluster system parameters (same syntax as with SYSGEN> prompt)
3) find successful previous boot events of both nodes in the console logs
Try to find - and save ! - console messages from the most recent successful boot attempts of both nodes. Keep them as a reference and compare the contents to the current situation
4) find the documentation of the LAN configuration for this cluster
As these seem to be Blade systems - I have no practical experience with Blades, those arrived after my 25 years at Digital/Compaq/HP - the LAN configuration may play a crucial role in this problem.
Please also think about the location of those 2 Blades. Are they in the same rack or at different sites? This information may influence further troubleshooting.
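For question 2, once the values shown at the SYSBOOT> prompt have been written down for each node, a quick consistency check can help spot an obvious mismatch. The sketch below is in Python and only illustrates the arithmetic. It assumes the documented rule that quorum is (EXPECTED_VOTES + 2) / 2, rounded down. VOTES, EXPECTED_VOTES, DISK_QUORUM and QDSKVOTES are real system parameters, but the node names, the device name and all values here are made up for illustration, since this cluster's actual settings were not posted.

```python
def quorum(expected_votes):
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) // 2."""
    return (expected_votes + 2) // 2


def check_cluster(nodes):
    """Return a list of warnings for parameter values noted per node."""
    warnings = []

    # The quorum disk settings are expected to match on every node.
    for key in ("DISK_QUORUM", "QDSKVOTES"):
        values = {str(params[key]).strip() for params in nodes.values()}
        if len(values) > 1:
            warnings.append(f"{key} differs between nodes: {sorted(values)}")

    node_votes = sum(params["VOTES"] for params in nodes.values())
    for name, params in nodes.items():
        # The quorum disk only contributes votes if DISK_QUORUM is set.
        has_qdisk = bool(str(params["DISK_QUORUM"]).strip())
        qdisk_votes = params["QDSKVOTES"] if has_qdisk else 0
        total = node_votes + qdisk_votes
        if params["EXPECTED_VOTES"] != total:
            warnings.append(
                f"{name}: EXPECTED_VOTES={params['EXPECTED_VOTES']} "
                f"but configured votes add up to {total}"
            )
        needed = quorum(params["EXPECTED_VOTES"])
        if params["VOTES"] + qdisk_votes < needed:
            warnings.append(
                f"{name}: cannot reach quorum {needed} alone, "
                f"even with the quorum disk"
            )
    return warnings


# Hypothetical values for a 2-node cluster with a quorum disk.
nodes = {
    "NODEA": {"VOTES": 1, "EXPECTED_VOTES": 3,
              "DISK_QUORUM": "$1$DGA100", "QDSKVOTES": 1},
    "NODEB": {"VOTES": 1, "EXPECTED_VOTES": 3,
              "DISK_QUORUM": "$1$DGA100", "QDSKVOTES": 1},
}

print(check_cluster(nodes) or "no obvious mismatch")
```

A clean result here does not rule out the LAN visibility problem described above. It only shows that the vote-related parameters are consistent with each other.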
Regards,
Volker.
04-23-2021 07:50 AM - edited 04-23-2021 08:25 AM
Make sure you engage someone who also understands the blade enclosure and all of its pieces - this could be a problem somewhere in those links.
FWIW, our core business is managing OpenVMS systems in situations just like yours: the System Manager retired or left and no one knows VMS. My contact information should be available in my profile if your manager wants to engage someone to fix this problem and/or properly care for these systems long term. And our team includes experts on the blade enclosures who have given talks on them for HPE.
I received this quote from a potential customer recently - this guy understood the situation: "Ideally we should have VMS specialists managing our systems rather than Linux and project specialists masquerading as VMS system admins on an ad-hoc basis."
Software Concepts International
04-23-2021 01:54 PM
Well, I don't need professional help on this. I fixed it by myself. Sorry, I thought I could get some help here, but most were saying I need to get help from support. If I had support, why would I come here?
But I really thank Volker. You are the best, and thank you for supporting and encouraging people like me. Thank you again, I really appreciate it.
04-23-2021 02:05 PM
I'm glad that you resolved the problem. I would request, however, that you post the solution here (leaving out any site-specific information) so that someone else can benefit from it in the future.
Dan
04-23-2021 09:54 PM - edited 04-24-2021 12:10 AM
VMScheck,
I'm glad I could help you solve your problem.
For the benefit of others - and also myself - could you please describe the problem and your solution?
Thanks,
Volker.