04-21-2021 11:12 AM
Two node cluster, but only one at a time is up
Hello, we have a 2-node Integrity cluster, and only one server at a time will come up. What might be missing? Or what needs to be checked? Please help!
Thank you!
04-21-2021 12:05 PM
Re: Two node cluster, but only one at a time is up
One of the nodes in the cluster was down, so I tried to bring it up, but only one server at a time will come up. I am sure some setting is missing or misconfigured somewhere, so can anyone please help?
These are the settings I can see from the server end:
MSCP_LOAD 1 0 0 16384 Coded-valu
TMSCP_LOAD 1 0 0 3 Coded-valu
MSCP_SERVE_ALL 7 4 0 -1 Bit-Encode
TMSCP_SERVE_ALL 1 0 0 -1 Bit-Encode
MSCP_BUFFER 1024 1024 256 -1 Coded-valu
MSCP_CREDITS 32 32 2 1024 Coded-valu
MSCP_CMD_TMO 0 0 0 2147483647 Seconds D
$
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
WBM_MSG_UPPER 80 80 0 -1 msgs/int D
WBM_MSG_LOWER 20 20 0 -1 msgs/int D
WBM_OPCOM_LVL 0 0 0 2 mode D
AUTO_DLIGHT_SAV 1 0 0 1 Boolean D
DELPRC_EXIT 5 5 0 7 Coded-valu D
SHADOW_REC_DLY 20 20 20 65535 Seconds D
SHADOW_HBMM_RTC 150 150 60 65535 Seconds D
MULTITHREAD 8 1 0 256 KThreads D
SHADOW_PSM_RDLY 30 30 0 65535 Seconds D
EXECSTACKPAGES 3 3 2 768 Pages D
GB_CACHEALLMAX 50000 50000 100 -1 Blocks D
GB_DEFPERCENT 35 35 0 1000 Percent D
CPU_POWER_MGMT 2 2 0 -1 Coded-valu D
CPU_POWER_THRSH 50 50 0 100 Percent D
IO_PRCPU_BITMAP 0-1023 0-1023 0 1023 CPU bitmap D
LOCKRMWT 5 5 0 10 Pure-numbe D
SSIO_SYNC_INTVL 30 30 5 65535 Seconds D
SCH_SOFT_OFFLD (none set) (none set) 0 1023 CPU bitmap D
SCH_HARD_OFFLD (none set) (none set) 0 1023 CPU bitmap D
PAGED_LAL_SIZE 0 512 0 2560 Bytes D
$
04-21-2021 12:27 PM - edited 04-21-2021 12:28 PM
Re: Two node cluster, but only one at a time is up
SYSMAN> PARAMETERS SHOW/LGI
%SYSMAN-I-USEACTNOD, a USE ACTIVE has been defaulted on node XXXX
Node XXXX: Parameters in use: ACTIVE
Parameter Name Current Default Minimum Maximum Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
LGI_CALLOUTS 0 0 0 255 Count D
LGI_BRK_TERM 1 1 0 1 Boolean D
LGI_BRK_DISUSER 0 0 0 1 Boolean D
LGI_PWD_TMO 30 30 0 255 Seconds D
LGI_RETRY_LIM 3 3 0 255 Tries D
LGI_RETRY_TMO 20 20 2 255 Seconds D
LGI_BRK_LIM 5 5 1 255 Failures D
LGI_BRK_TMO 300 300 0 5184000 Seconds D
LGI_HID_TIM 300 300 0 1261440000 Seconds D
SYSMAN>
04-21-2021 12:48 PM
Re: Two node cluster, but only one at a time is up
Does anyone have any suggestions? Please help!
04-21-2021 09:15 PM
Re: Two node cluster, but only one at a time is up
> [...] we have integrity 2 node cluster [...]
Hardware model(s)? VMS version(s)? Cluster interconnect?
> [...] and one server at a time is coming up. [...]
When you do what, exactly?
Console output?
> Following settings I can see from server end: [...]
_Which_ "server"?
Copy+paste with white space works better here with the "</>"
("Insert/Edit code sample") tool.
04-21-2021 11:51 PM
Re: Two node cluster, but only one at a time is up
Has this been working before?
What has been changed?
What has happened?
And what happens when you do what, exactly?
The parameters you've shown have nothing to do with basic clustering. Consider showing the values of:
VAXCLUSTER, VOTES, EXPECTED_VOTES, DISK_QUORUM, QDSKVOTES, SCSSYSTEMID, SCSNODENAME
from BOTH nodes.
Console messages ?
If it's urgent, consider logging a call with your OpenVMS support organization. This is just a forum...
Volker.
04-22-2021 06:11 AM
Re: Two node cluster, but only one at a time is up
Yes, it was working before. The admin who was maintaining this server left, and I was asked to look into it. I am more of a UNIX person, and we don't have support.
Console Log:
%LLC0, Logical LAN failover device added to failset, EWK0
%LLC0, Logical LAN event at 20-APR-2021 18:39:53.38
%LLC0, Logical LAN failset device connected to physical device EWD0
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
2,1,2,0 5404006349E10000 0000000000000000 EVN_BOOT_START
***********************************************************
* ROM Version : 01.98
* ROM Date : Fri Sep 11 00:56:00 PDT 2015
***********************************************************
2,0,2,0 3404083709E10000 000000000002000C EVN_BOOT_CELL_JOINED_PD
2,1,2,0 340400B149E10000 000000480205000C EVN_MEM_DISCOVERY
2,0,2,0 340400B109E10000 000000080205000C EVN_MEM_DISCOVERY
2,0,2,0 Start memory test ...... 0/100
.......
2,0,2,0 Memory test progress.... 33/100
.......
CL:hpiLO (+, -, <CR>, C, D, F, L, ?, Q or Ctrl-B to Quit){Pg 1 of 93}->
====================================================================
04-22-2021 06:22 AM - edited 04-22-2021 09:08 AM
Re: Two node cluster, but only one at a time is up
The values of the cluster system parameters are crucial here. Post them from both nodes, as asked above.
$ MC SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SHOW <...>
Also, both systems MUST be able to communicate directly via the LAN.
The console output you've posted is from the 2nd node, right? Did you wait long enough (let's say 5 minutes)? Is the 1st node up and running? Are there any messages on the console of the 1st node when you start (or stop) the 2nd node? If not, it looks like the LAN communication may not be working correctly.
The systems are using LAN failover (at least 3 LAN failover sets: LLA, LLB, LLC). If none of the physical devices in a LAN failover set is connected to a LAN segment which allows cluster communication with the other node, the local system cannot join the cluster. After some time, it may report the other node - based on the data in QUORUM.DAT - but it won't be able to join the cluster without cluster communication via the LAN.
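One way to see which physical device each failover set has currently selected is the LANCP utility (a sketch, not verified on your system; the failover set names LLA0/LLB0/LLC0 are taken from your console output):

```
$ MCR LANCP
LANCP> SHOW DEVICE/CHARACTERISTICS LLA0
LANCP> SHOW DEVICE/CHARACTERISTICS LLB0
LANCP> SHOW DEVICE/CHARACTERISTICS LLC0
LANCP> EXIT
```

Compare the selected physical device on each node with the cabling that actually reaches the other node.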
Volker.
04-22-2021 09:35 AM
Re: Two node cluster, but only one at a time is up
Hi, can you please let me know exactly what commands I should type to get the system parameters? I am not sure.
$ MC SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SHOW <...>
Thank you..
04-22-2021 10:34 AM
Re: Two node cluster, but only one at a time is up
SYSMAN> PARAMETERS SHOW/LGI
%SYSMAN-I-USEACTNOD, a USE ACTIVE has been defaulted on node XXXX
Node XXXX: Parameters in use: ACTIVE
Parameter Name Current Default Minimum Maximum Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
LGI_CALLOUTS 0 0 0 255 Count D
LGI_BRK_TERM 1 1 0 1 Boolean D
LGI_BRK_DISUSER 0 0 0 1 Boolean D
LGI_PWD_TMO 30 30 0 255 Seconds D
LGI_RETRY_LIM 3 3 0 255 Tries D
LGI_RETRY_TMO 20 20 2 255 Seconds D
LGI_BRK_LIM 5 5 1 255 Failures D
LGI_BRK_TMO 300 300 0 5184000 Seconds D
LGI_HID_TIM 300 300 0 1261440000 Seconds D
SYSMAN>
04-22-2021 10:38 AM - edited 04-22-2021 10:43 AM
Re: Two node cluster, but only one at a time is up
$ MC SYSGEN
SYSGEN> SHOW VAXCLUSTER
SYSGEN> SHOW EXPECTED
SYSGEN> SHOW VOTES
SYSGEN> SHOW DISK_QUORUM
SYSGEN> SHOW QDSKVOTES
SYSGEN> SHOW NISCS_LOAD_PEA0
SYSGEN> EXIT
And please post this data from BOTH nodes - if possible.
Under normal circumstances (2 nodes with a quorum disk), you should have: VOTES=1, QDSKVOTES=1, EXPECTED_VOTES=3, of course VAXCLUSTER=2 and NISCS_LOAD_PEA0=1
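If it helps to see the arithmetic behind those values: OpenVMS derives the quorum from EXPECTED_VOTES as floor((EXPECTED_VOTES + 2) / 2). A quick sketch (Python, just to illustrate the arithmetic; the helper names are mine):

```python
# Standard OpenVMS quorum rule: quorum = floor((EXPECTED_VOTES + 2) / 2).
# The cluster can form or continue only while the votes it can count
# (node VOTES plus QDSKVOTES from the quorum disk) reach that quorum.

def quorum(expected_votes: int) -> int:
    """Quorum value derived from EXPECTED_VOTES."""
    return (expected_votes + 2) // 2

def can_form(votes_present: int, expected_votes: int) -> bool:
    """True if the contributed votes are enough to form/keep the cluster."""
    return votes_present >= quorum(expected_votes)

# With VOTES=1 on each node, QDSKVOTES=1 and EXPECTED_VOTES=3:
print(quorum(3))               # 2
print(can_form(1 + 1, 3))      # True  - one node plus the quorum disk
print(can_form(1 + 1 + 1, 3))  # True  - both nodes plus the quorum disk
print(can_form(1, 3))          # False - one node without the quorum disk
```

So a single node plus the quorum disk reaches quorum on its own, which is consistent with either node being able to boot alone; the vote settings themselves are not the problem.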
If you wait long enough (2-5 minutes!) and the 2nd node can't see the 1st (running) node, you should still get 'Have connection to...' messages if cluster communication via the LAN works but the new node is not allowed to join the cluster. Please carefully check the LAN cabling between the 2 nodes. Consider that LAN failover may automatically switch to a physical device which presents a carrier signal but is not correctly connected to the other node.
Volker.
04-22-2021 10:47 AM
Re: Two node cluster, but only one at a time is up
From the node which is up:
$ mc sysgen
SYSGEN> SHOW VAXCLUSTER
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
VAXCLUSTER 2 1 0 2 Coded-valu
SYSGEN> SHOW EXPECTED
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
EXPECTED_VOTES 3 1 1 127 Votes
SYSGEN> SHOW VOTES
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
VOTES 1 1 0 127 Votes
SYSGEN> SHOW DISK_QUORUM
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
DISK_QUORUM "$1$DGA299 " " " " " "ZZZZ" Ascii
SYSGEN> SHOW QDSKVOTES
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
QDSKVOTES 1 1 0 127 Votes
SYSGEN> SHOW NISCS_LOAD_PEA0
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
NISCS_LOAD_PEA0 1 0 0 1 Boolean
SYSGEN>
04-22-2021 10:54 AM - edited 04-22-2021 10:56 AM
Re: Two node cluster, but only one at a time is up
As you can see, these are exactly the values I expected for a 2-node OpenVMS SAN cluster with a quorum disk!
Now boot the 2nd node and wait for 5 minutes, then post the complete OpenVMS console output - starting at the OpenVMS banner message:
HP OpenVMS Industry Standard 64 Operating System, Version ...
of the 2nd node and also the console output of the 1st node, if there is any during these 5 minutes.
Volker.
04-22-2021 11:11 AM
Re: Two node cluster, but only one at a time is up
The Itanium console has a large output buffer. Consider scrolling back to the most recent successful boot of those nodes and carefully checking which physical LAN interfaces were used in the various LAN failover sets the last time it worked, e.g.:
%LLC0, Logical LAN failset device connected to physical device EWD0
Note the physical device for each of the LAN failover sets LLA0, LLB0 and LLC0 and compare them to the physical devices used now.
Volker.
04-22-2021 11:41 AM
Re: Two node cluster, but only one at a time is up
From the console, I see the following via the iLO:
%LLC0, Logical LAN failover device added to failset, EWK0
%LLC0, Logical LAN event at 20-APR-2021 18:39:53.38
%LLC0, Logical LAN failset device connected to physical device EWD0
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
===================================================================
And from Console Log I see the following:
%LLC0, Logical LAN failover device added to failset, EWK0
%LLC0, Logical LAN event at 20-APR-2021 18:39:53.38
%LLC0, Logical LAN failset device connected to physical device EWD0
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
2,1,2,0 5404006349E10000 0000000000000000 EVN_BOOT_START
***********************************************************
* ROM Version : 01.98
* ROM Date : Fri Sep 11 00:56:00 PDT 2015
***********************************************************
2,0,2,0 3404083709E10000 000000000002000C EVN_BOOT_CELL_JOINED_PD
2,1,2,0 340400B149E10000 000000480205000C EVN_MEM_DISCOVERY
2,0,2,0 340400B109E10000 000000080205000C EVN_MEM_DISCOVERY
2,0,2,0 Start memory test ...... 0/100
.......
2,0,2,0 Memory test progress.... 33/100
.......
CL:hpiLO (+, -, <CR>, C, D, F, L, ?, Q or Ctrl-B to Quit){Pg 1 of 93}->
04-22-2021 11:48 AM
Re: Two node cluster, but only one at a time is up
Using '+' or '-', you can scroll back and forward through the console log. The OpenVMS boot starts at the OpenVMS banner message - as shown above...
Volker.
04-22-2021 11:51 AM
Re: Two node cluster, but only one at a time is up
With +, I see the following:
2,0,2,0 Memory test progress.... 66/100
.......
2,0,2,0 Memory test progress.... 100/100
2,0,2,0 1404002609E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,3,0 140400260DE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,2,0 1404002649E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,3,0 140400264DE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,3,1 140400260FE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,2,1 140400264BE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,3,1 140400264FE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,2,1 140400260BE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,2,0 5404020709E10000 000000000011000C EVN_EFI_START
Press Ctrl-C now to bypass loading option ROM UEFI drivers.
2,0,2,0 3404008109E10000 000000000007000C EVN_IO_DISCOVERY_START
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
CL:hpiLO (+, -, <CR>, C, D, F, L, ?, Q or Ctrl-B to Quit){Pg 2 of 93}->
============================================================================
04-22-2021 11:55 AM
Re: Two node cluster, but only one at a time is up
How can I boot from here?
04-22-2021 11:57 AM
Re: Two node cluster, but only one at a time is up
CL:hpiLO (+, -, <CR>, C, D, F, L, ?, Q or Ctrl-B to Quit){Pg 2 of 93}->
As you can see, there are lots of pages in the console log. You need to find and post the most relevant data. I'm sorry to say, but this may be hard if you don't have enough OpenVMS and Itanium knowledge...
Volker.
04-22-2021 12:13 PM
Re: Two node cluster, but only one at a time is up
This is what I see when I reboot the node which is not up:
SYSBOOT> set STARTUP_P2 "YES"
SYSBOOT> continue
%RAD-I-ENABLED, RAD Support is enabled for 2 RADs
HP OpenVMS Industry Standard 64 Operating System, Version V8.4
▒ Copyright 1976-2019 Hewlett-Packard Development Company, L.P.
PGQBT-I-INIT-UNIT, boot driver, PCI device ID 0x2532, FW 4.04.04
PGQBT-I-BUILT, version X-33, built on Jul 19 2011 @ 16:12:20
PGQBT-I-LINK_WAIT, waiting for link to come up
PGQBT-I-TOPO_WAIT, waiting for topology ID
%DECnet-I-LOADED, network base image loaded, version = 05.17.02
%CNXMAN, Using remote access method for quorum disk
%SMP-I-CPUTRN, CPU #1 has joined the active set.
%SMP-I-CPUTRN, CPU #5 has joined the active set.
%SMP-I-CPUTRN, CPU #2 has joined the active set.
%SMP-I-CPUTRN, CPU #3 has joined the active set.
%SMP-I-CPUTRN, CPU #7 has joined the active set.
%SMP-I-CPUTRN, CPU #6 has joined the active set.
%SMP-I-CPUTRN, CPU #4 has joined the active set.
%VMScluster-I-LOADSECDB, loading
the cluster security database
%EWA0, Link up: 10 gbit, full duplex, flow control disabled
%EWE0, Function is disabled
%EWF0, Function is disabled
%EWC0, Link up: 10 gbit, full duplex, flow control disabled
%EWG0, Function is disabled
%EWH0, Function is disabled
%EWB0, Link up: 10 gbit, full duplex, flow control disabled
%EWD0, Link up: 10 gbit, full duplex, flow control disabled
%EWI0, Link up: 10 gbit, full duplex, flow control disabled
%EWM0, Function is disabled
%EWN0, Function is disabled
%EWO0, Function is disabled
%EWP0, Function is disabled
%EWK0, Link up: 10 gbit, full duplex, flow control disabled
%EWJ0, Link up: 10 gbit, full duplex, flow control disabled
%EWL0, Link up: 10 gbit, full duplex, flow control disabled
%EWA0, Jumbo frames enabled
%EWJ0, Jumbo frames enabled
%LLA0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLA0, Logical LAN failset device created
%LLA0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLA0, Logical LAN failover device added to failset, EWC0
%LLA0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLA0, Logical LAN failover device added to failset, EWL0
%LLA0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLA0, Logical LAN failset device connected to physical device EWL0
%LLB0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLB0, Logical LAN failset device created
%LLB0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLB0, Logical LAN failover device added to failset, EWB0
%LLB0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLB0, Logical LAN failover device added to failset, EWI0
%LLB0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLB0, Logical LAN failset device connected to physical device EWI0
%LLC0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLC0, Logical LAN failset device created
%LLC0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLC0, Logical LAN failover device added to failset, EWD0
%LLC0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLC0, Logical LAN failover device added to failset, EWK0
%LLC0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLC0, Logical LAN failset device connected to physical device EWD0
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
%CNXMAN, Have "connection" to quorum disk
04-22-2021 02:07 PM - edited 04-22-2021 02:12 PM
Re: Two node cluster, but only one at a time is up
I just tried to reset this node; the detailed log of how I did the reset follows, and it is stuck at the same place as mentioned before:
- - - - - - - - - - Prior Console Output - - - - - - - - - -
%LLC0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLC0, Logical LAN failover device added to failset, EWK0
%LLC0, Logical LAN event at 20-APR-2021 19:02:02.30
%LLC0, Logical LAN failset device connected to physical device EWD0
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
%CNXMAN, Have "connection" to quorum disk
- - - - - - - - - - - - Live Console - - - - - - - - - - - -
MP MAIN MENU:
CO: Console
VFP: Virtual Front Panel
CM: Command Menu
CL: Console Log
SL: Show Event Logs
HE: Main Help Menu
X: Exit Connection
[ilo-xxxx-04]</> hpiLO-> cm
(Use Ctrl-B to return to MP main menu.)
[ilo-xxxx-04] CM:hpiLO-> rs
RS
Execution of this command irrecoverably halts all system processing and
I/O activity and restarts the computer system.
Type Y to confirm your intention to restart the system: (Y/[N]) y
y
-> SPU hardware was successfully issued a reset.
[ilo-xxxx-04] CM:hpiLO->
MP MAIN MENU:
CO: Console
VFP: Virtual Front Panel
CM: Command Menu
CL: Console Log
SL: Show Event Logs
HE: Main Help Menu
X: Exit Connection
[ilo-xxxx-04]</> hpiLO-> co
[Use Ctrl-B or ESC-( to return to MP main menu.]
- - - - - - - - - - Prior Console Output - - - - - - - - - -
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
2,1,2,0 5404006349E10000 0000000000000000 EVN_BOOT_START
***********************************************************
* ROM Version : 01.98
* ROM Date : Fri Sep 11 00:56:00 PDT 2015
***********************************************************
2,0,2,0 3404083709E10000 000000000002000C EVN_BOOT_CELL_JOINED_PD
- - - - - - - - - - - - Live Console - - - - - - - - - - - -
2,1,2,0 340400B149E10000 000000480205000C EVN_MEM_DISCOVERY
2,0,2,0 340400B109E10000 000000080205000C EVN_MEM_DISCOVERY
2,0,2,0 Start memory test ...... 0/100
.......
2,0,2,0 Memory test progress.... 33/100
.......
2,0,2,0 Memory test progress.... 66/100
.......
2,0,2,0 Memory test progress.... 100/100
2,0,2,0 1404002609E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,3,0 140400260DE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,2,0 1404002649E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,3,0 140400264DE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,3,1 140400260FE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,3,1 140400264FE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,1,2,1 140400264BE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,2,1 140400260BE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
2,0,2,0 5404020709E10000 000000000011000C EVN_EFI_START
Press Ctrl-C now to bypass loading option ROM UEFI drivers.
2,0,2,0 3404008109E10000 000000000007000C EVN_IO_DISCOVERY_START
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
Dual Port Flex10 10GbE BL8XXc i2 Embedded CNIC is detected
HP PCIe 2Port 8Gb Fibre Channel Adapter (driver 2.27, firmware 5.06.006)
HP PCIe 2Port 8Gb Fibre Channel Adapter (driver 2.27, firmware 5.06.006)
2,0,2,0 5404020B09E10000 0000000000000006 EVN_EFI_LAUNCH_BOOT_MANAGER
(C) Copyright 1996-2010 Hewlett-Packard Development Company, L.P.
Note, menu interfaces might only display on the primary console device.
The current primary console device is:
Serial PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x5)
The primary console can be changed via the 'conconfig' UEFI shell command.
Press: ENTER - Start boot entry execution
B / b - Launch Boot Manager (menu interface)
D / d - Launch Device Manager (menu interface)
M / m - Launch Boot Maintenance Manager (menu interface)
S / s - Launch UEFI Shell (command line interface)
I / i - Launch iLO Setup Tool (command line interface)
*** User input can now be provided ***
Automatic boot entry execution will start in 1 second(s).
Booting xxxx Normal Boot $1$DGA300: FGB0.2012-0002-AC00-3D42
PGQBT-I-INIT-UNIT, IPB, PCI device ID 0x2532, FW 4.04.04
PGQBT-I-BUILT, version X-33, built on Jan 16 2015 @ 12:02:52
PGQBT-I-LINK_WAIT, waiting for link to come up
PGQBT-I-TOPO_WAIT, waiting for topology ID
%RAD-I-ENABLED, RAD Support is enabled for 2 RADs
HP OpenVMS Industry Standard 64 Operating System, Version V8.4
▒ Copyright 1976-2019 Hewlett-Packard Development Company, L.P.
PGQBT-I-INIT-UNIT, boot driver, PCI device ID 0x2532, FW 4.04.04
PGQBT-I-BUILT, version X-33, built on Jul 19 2011 @ 16:12:20
PGQBT-I-LINK_WAIT, waiting for link to come up
PGQBT-I-TOPO_WAIT, waiting for topology ID
%DECnet-I-LOADED, network base image loaded, version = 05.17.02
%CNXMAN, Using remote access method for quorum disk
%SMP-I-CPUTRN, CPU #1 has joined the active set.
%SMP-I-CPUTRN, CPU #5 has joined the active set.
%SMP-I-CPUTRN, CPU #2 has joined the active set.
%SMP-I-CPUTRN, CPU #6 has joined the active set.
%SMP-I-CPUTRN, CPU #4 has joined the active set.
%SMP-I-CPUTRN, CPU #3 has joined the active set.
%SMP-I-CPUTRN, CPU #7 has joined the active set.
%VMScluster-I-LOADSECDB, loading
the cluster security database
%EWA0, Link up: 10 gbit, full duplex, flow control disabled
%EWE0, Function is disabled
%EWF0, Function is disabled
%EWG0, Function is disabled
%EWC0, Link up: 10 gbit, full duplex, flow control disabled
%EWH0, Function is disabled
%EWD0, Link up: 10 gbit, full duplex, flow control disabled
%EWB0, Link up: 10 gbit, full duplex, flow control disabled
%EWI0, Link up: 10 gbit, full duplex, flow control disabled
%EWM0, Function is disabled
%EWN0, Function is disabled
%EWO0, Function is disabled
%EWP0, Function is disabled
%EWK0, Link up: 10 gbit, full duplex, flow control disabled
%EWJ0, Link up: 10 gbit, full duplex, flow control disabled
%EWL0, Link up: 10 gbit, full duplex, flow control disabled
%EWA0, Jumbo frames enabled
%EWJ0, Jumbo frames enabled
%LLA0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLA0, Logical LAN failset device created
%LLA0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLA0, Logical LAN failover device added to failset, EWC0
%LLA0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLA0, Logical LAN failover device added to failset, EWL0
%LLA0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLA0, Logical LAN failset device connected to physical device EWL0
%LLB0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLB0, Logical LAN failset device created
%LLB0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLB0, Logical LAN failover device added to failset, EWB0
%LLB0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLB0, Logical LAN failover device added to failset, EWI0
%LLB0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLB0, Logical LAN failset device connected to physical device EWI0
%LLC0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLC0, Logical LAN failset device created
%LLC0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLC0, Logical LAN failover device added to failset, EWD0
%LLC0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLC0, Logical LAN failover device added to failset, EWK0
%LLC0, Logical LAN event at 22-APR-2021 16:44:10.94
%LLC0, Logical LAN failset device connected to physical device EWD0
%SYSINIT-I- found a valid OpenVMS Cluster quorum disk
%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Using local access method for quorum disk
%CNXMAN, Established "connection" to quorum disk
%CNXMAN, Have "connection" to quorum disk
04-22-2021 02:15 PM
Re: Two node cluster, but only one at a time is up
>> Yes, was working before. Admin who was maintaining this server left and I was asked to look into it. I am more UNIX person and we don't have support .
It's very gracious of Volker to try to help, but I would urge you to go back to your management and tell them this is beyond basic operations and a simple Google search. You have done a good job reaching out and finding this forum, but now the time has come to pay up.
Let them hire a consultant for good money, and use the opportunity to learn about the system. They got away with it this far, and now let them pay for their sins, so to speak. They probably/possibly saved tens of thousands in maintenance and/or in hiring/training a person with the right skill set. Now it is time to pay a few thousand to a deserving consultant (not me!)
Good luck,
Hein
04-22-2021 02:20 PM
Re: Two node cluster, but only one at a time is up
Management is in the process of doing that, but a renewal takes weeks; it's just bad timing that it broke now. I thought this was the forum to discuss things and help each other, as I am interested in debugging and fixing it. Responses like this will be discouraging for UNIX people who want to learn OpenVMS.
04-22-2021 03:16 PM
Re: Two node cluster, but only one at a time is up
Hi, is it safe to say it is "hanging" right after it mentions the quorum disk? You did give it several minutes (maybe up to 5) to re-establish quorum, right?
I believe the quorum disk needs to be "VMS initialized" before it is used by the clustering software -- I know it will re-create the quorum.dat file in the top-level directory if someone accidentally deletes it (very early on in their career).
Again, not a simple fix for someone not well-versed in VMS, but I'd suggest booting one node "conversationally" or perhaps off the install DVD (both are not up running VMS now, right?) and making sure that quorum disk unit is a-okay, i.e. it can be mounted as a VMS (ODS-2 or ODS-5) volume.
It should not matter if the disk unit is thin or thick provisioned on the SAN storage array, VMS really doesn't care (or know).
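A minimal sketch of that check (using the $1$DGA299 DISK_QUORUM value posted earlier in the thread; do this only from a node that is NOT part of the running cluster, and note that the exact qualifiers are an illustration, not verified on this system):

```
$ MOUNT/OVERRIDE=IDENTIFICATION/NOWRITE $1$DGA299:
$ DIRECTORY $1$DGA299:[000000]QUORUM.DAT
$ DISMOUNT $1$DGA299:
```

If the volume mounts cleanly and QUORUM.DAT is present in the top-level directory, the quorum disk itself is probably fine.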
Did you say your site will have a position for VMS system manager? Where is it located?
04-22-2021 03:47 PM
Re: Two node cluster, but only one at a time is up
Yes, I waited long enough; in fact it has been sitting at that stage for a day. I did a reset, and it came back and hung there again. When the SAN was converted between thick and thin provisioning, could that have broken anything at the system level and caused the issue we are seeing?