- Re: VMS Puzzle (Boot Time) for your Christmas enjo...
12-25-2007 12:45 AM
We moved our datacenter last weekend, and after the move we rebooted the two-member cluster.
The puzzle is to explain the apparent inconsistency between the times at which the systems formed and joined the cluster and the times reported by f$getsyi("boottime") and show system/noprocess (uptime).
View of Cluster from system ID 1046 node: OMEGA 25-DEC-2007 02:56:52
+-----------------------+-----------------------------+
| SYSTEMS | MEMBERS |
+--------+--------------+---------+-------------------+
| NODE | SOFTWARE | STATUS | TRANSITION_TIME |
+--------+--------------+---------+-------------------+
| OMEGA | VMS V7.3-2 | MEMBER | 21-DEC-2007 22:46 |
| SIGMA | VMS V7.3-2 | MEMBER | 21-DEC-2007 22:36 |
+--------+--------------+---------+-------------------+
+----------------------------------------------------------------------------------------------+
| CLUSTER |
+--------+-----------+----------+---------+------------+-------------------+-------------------+
| CL_EXP | CL_QUORUM | CL_VOTES | QF_VOTE | CL_MEMBERS | FORMED | LAST_TRANSITION |
+--------+-----------+----------+---------+------------+-------------------+-------------------+
| 3 | 2 | 3 | YES | 2 | 21-DEC-2007 22:36 | 21-DEC-2007 22:46 |
+--------+-----------+----------+---------+------------+-------------------+-------------------+
From the above, it appears that SIGMA formed the cluster as its founding member and OMEGA joined 10 minutes later. That is in fact what happened: SIGMA was the first node booted into the cluster, and OMEGA was booted approximately 10 minutes later, after we did some preliminary checks that all disks were present.
Now the puzzle:
$ sysman set env/cluster
%SYSMAN-I-ENV, current command environment:
Clusterwide on local cluster
Username JON will be used on nonlocal nodes
SYSMAN> do show time
%SYSMAN-I-OUTPUT, command execution on node SIGMA
25-DEC-2007 03:01:34
%SYSMAN-I-OUTPUT, command execution on node OMEGA
25-DEC-2007 03:01:36
SYSMAN> do write sys$output f$getsyi("boottime")
%SYSMAN-I-OUTPUT, command execution on node SIGMA
21-DEC-2007 22:35:11.73
%SYSMAN-I-OUTPUT, command execution on node OMEGA
22-DEC-2007 05:59:16.00
SYSMAN> do show system/noprocess
%SYSMAN-I-OUTPUT, command execution on node SIGMA
OpenVMS V7.3-2 on node SIGMA 25-DEC-2007 03:02:12.10 Uptime 2 19:15:37
%SYSMAN-I-OUTPUT, command execution on node OMEGA
OpenVMS V7.3-2 on node OMEGA 25-DEC-2007 03:02:14.75 Uptime 2 19:03:40
SYSMAN> Exit
$ write sys$output f$cvtime("25-DEC-2007:03:02:12.10-2-19:15:37","ABSOLUTE") ! SIGMA current time minus uptime
22-DEC-2007 07:46:35.10
$ write sys$output f$cvtime("25-DEC-2007:03:02:14.75-2-19:03:40","ABSOLUTE") ! OMEGA current time minus uptime
22-DEC-2007 07:58:34.75
$
Normally the boot times are close to the times reported by show cluster, and also close to the "current time" minus the uptime of the node. However, the values reported above do not correspond.
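For readers without a VMS system at hand, the "current time minus uptime" arithmetic can be cross-checked with a minimal Python sketch. The helper names (`parse_vms_time`, `parse_uptime`, `format_vms_time`, `estimated_boot`) are made up for this illustration; only the input values come from the SHOW SYSTEM output above.

```python
from datetime import datetime, timedelta

MONTHS = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split()


def parse_vms_time(text):
    """Parse an absolute time such as '25-DEC-2007 03:02:12.10'."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    hours, minutes, seconds = time_part.split(":")
    whole, _, hundredths = seconds.partition(".")
    return datetime(
        int(year), MONTHS.index(month.upper()) + 1, int(day),
        int(hours), int(minutes), int(whole),
        int(hundredths.ljust(2, "0")[:2]) * 10_000,  # hundredths -> microseconds
    )


def parse_uptime(text):
    """Parse an uptime such as '2 19:15:37' (days hh:mm:ss)."""
    days, clock = text.split()
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(days), hours=hours, minutes=minutes, seconds=seconds)


def format_vms_time(moment):
    """Format a datetime in the dd-MMM-yyyy hh:mm:ss.cc style used above."""
    return "{:02d}-{}-{} {:%H:%M:%S}.{:02d}".format(
        moment.day, MONTHS[moment.month - 1], moment.year,
        moment, moment.microsecond // 10_000)


def estimated_boot(current, uptime):
    """Return the current time minus the uptime as a formatted string."""
    return format_vms_time(parse_vms_time(current) - parse_uptime(uptime))


print("SIGMA", estimated_boot("25-DEC-2007 03:02:12.10", "2 19:15:37"))
print("OMEGA", estimated_boot("25-DEC-2007 03:02:14.75", "2 19:03:40"))
```

The results should match the two f$cvtime values shown in the session.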
No exec or kernel mode software was used to write into the system time (EXE$GQ_SYSTIME), boottime (EXE$GQ_BOOTTIME), or uptime (EXE$GL_ABSTIM*) cells, with the exception of the normal updating done by the VMS operating system. There was something "special" done using software from the freeware disk, but it did not write to any of the above data cells. The time was not changed with $ SET TIME.
The question is, what could have been done to get the apparently inconsistent result reported above?
Those are the clues I will give at the start. If no one has any guesses, I will give more clues.
Have fun,
Jon
12-25-2007 08:47 AM
One way to get into this state is:
boot OMEGA with SETTIME=1 and manually enter a time in the future during boot.
EXE$GQ_BOOTTIME will be set when the hardware clock is initialized and this happens before becoming a cluster member. Once OMEGA joins the cluster, the clock will automatically be updated from the time value of an existing cluster member.
This scenario may also happen if the BBW clock on OMEGA had been set to a future date by a previous boot before booting into the cluster. A failure of the BBW chip may also cause this behaviour.
Volker.
12-25-2007 07:15 PM
Re: VMS Puzzle (Boot Time) for your Christmas enjoyment.
Do you have any other servers on your intranet that are used for time synchronization? Is your VMS system time synchronized with other servers?
Merry Christmas and Happy New Year
12-25-2007 08:06 PM
SYSMAN> do write sys$output f$getsyi("boottime")
%SYSMAN-I-OUTPUT, command execution on node SIGMA
21-DEC-2007 22:35:11.73
%SYSMAN-I-OUTPUT, command execution on node OMEGA
22-DEC-2007 05:59:16.00
However, that does not explain the discrepancy between the cluster transition times and the times derived from show system/noprocess. Specifically, why is there such a discrepancy between SIGMA's 21-DEC-2007 22:36 (cluster transition) and 22-DEC-2007 07:46:35.10 (current time minus uptime), and between OMEGA's 21-DEC-2007 22:46 (cluster transition) and 22-DEC-2007 07:58:34.75 (current time minus uptime)?
In tabular form with another clue (NODE_SWINCARN):
Node   Cluster Transition  NODE_SWINCARN         BOOTTIME                 Current - uptime
SIGMA  21-DEC-2007 22:36   22-DEC-2007 05:47:00  21-DEC-2007 22:35:11.73  22-DEC-2007 07:46:35.10
OMEGA  21-DEC-2007 22:46   22-DEC-2007 05:59:01  22-DEC-2007 05:59:16.00  22-DEC-2007 07:58:34.75
RE: "Do you have any other servers on your intranet that are used for time synchronization? Is your VMS system time synchronized with other servers?"
No, there is no time synchronization with other servers, other than that at boot time the node joining the cluster has its software clock set to the value from another node in the cluster. In this case there is only one other node, so immediately after the second node boots into the cluster, the clocks are very close to being synchronized.
PS. I will award points after the scenario has been described.
PPS. The systems have not been rebooted yet, so if you have any other questions related to this puzzle that you want me to display output for, let me know.
12-26-2007 09:40 AM
12-26-2007 10:12 AM
Is TCPIP in use or another IP provider?
12-28-2007 03:51 AM
Wim
12-29-2007 01:01 AM
Just to be clear, I KNOW why there is a discrepancy; by "puzzle" I meant "brain teaser".
RE: "What are the votes on each system? Was quorum adjusted?" Each ES47 has 1 vote, and the quorum disk has one vote. Expected votes is 3. Quorum was not adjusted.
RE: "Daylight savings time" Not related, and AUTO_DLIGHT_SAV is 0
RE: "Is TCPIP in use or another IP provider?" TCPIP
RE: "If you have settime enabled in audit you can check if something unexpected happened during the lifetime of the cluster." Good thing to check, we have time auditing enabled. The only entries were from the shutdown where it does a
$ SET TIME="''f$time()'"
Which results in audit alarms like this:
Security audit (SECURITY) on SIGMA, system id: 1045
Auditable event: System time set
Event time: 21-DEC-2007 21:33:19.73
PID: 2040264B
Process name: SHUTDOWN
Username: JON
Process owner: [SYSTEM]
Image name: DSA1407:[SYS0.SYSCOMMON.][SYSEXE]SET.EXE
New system time: 21-DEC-2007 21:33:19.72
Old system time: 21-DEC-2007 21:33:19.72
Posix UID: -2
Posix GID: -2 (%XFFFFFFFE)
Ok, here's another clue:
Before moving anything, we shut down the cluster; it was down for about two hours. While it was down, we unpresented units from the ES40 and presented them to the ES47s. We then rebooted the cluster with only the two processors being moved in phase I, and set the time to 5 minutes after we had shut the cluster down (there were many scheduled batch jobs that we did not want to begin processing). After verifying that all devices were present and that the applications were working, we shut the cluster down one last time in the original location. The time from the cluster's point of view was 21-DEC-2007 22:30; wall clock time was around 22-DEC-2007 00:30. When the systems were shut down, the Battery Backed-up Watch (BBW) chip was updated to 22:30, about two hours behind wall clock time. This accounts for the difference between the NODE_SWINCARN times and the (current time minus uptime) values. It does not explain the discrepancy between the cluster transition times and the (current time minus uptime) values, a difference of over 9 hours and 10 minutes.
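The two gaps described in this clue can be checked against the tabular summary posted earlier in the thread. The following Python sketch is only an illustration: the dictionary layout and the `gaps` helper are invented here, and the timestamps are copied from that summary.

```python
from datetime import datetime

# Values copied from the tabular summary earlier in the thread.
# Hundredths of a second are expressed as microseconds.
NODES = {
    "SIGMA": {
        "transition": datetime(2007, 12, 21, 22, 36),
        "swincarn": datetime(2007, 12, 22, 5, 47, 0),
        "now_minus_uptime": datetime(2007, 12, 22, 7, 46, 35, 100_000),
    },
    "OMEGA": {
        "transition": datetime(2007, 12, 21, 22, 46),
        "swincarn": datetime(2007, 12, 22, 5, 59, 1),
        "now_minus_uptime": datetime(2007, 12, 22, 7, 58, 34, 750_000),
    },
}


def gaps(node):
    """Return (gap to NODE_SWINCARN, gap to cluster transition) for a node."""
    row = NODES[node]
    return (
        row["now_minus_uptime"] - row["swincarn"],
        row["now_minus_uptime"] - row["transition"],
    )


for node in NODES:
    swincarn_gap, transition_gap = gaps(node)
    print(node, "NODE_SWINCARN gap:", swincarn_gap,
          "| transition gap:", transition_gap)
```

For both nodes the first gap comes out just under two hours and the second a little over 9 hours and 10 minutes, in line with the figures quoted above.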
Here is another clue: after the equipment was physically moved and reconnected, we booted SIGMA first with a conversational boot, set SETTIME to 1, and set the time to 21-DEC-2007 22:35, so the cluster believed it had been down for only 5 minutes. After the cluster booted, I did something to cause the time to be correct by Monday morning, but I did not use the SET TIME command.
Jon
12-29-2007 08:48 AM
Change the values of EXE$GL_TICKLENGTH and EXE$GL_TIMEADJUST perhaps?
12-29-2007 09:25 AM
Did you use TBO (Freeware V6)?
Volker.
12-29-2007 01:21 PM
RE: "did you use TBO (Freeware V6) ?" Yes. Specifically, the command was:
$ tbo /direction=forward /range=99360 /delta=33120 /info ! drift the clock forward by 33120 seconds (9 hours 12 minutes)
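As a rough illustration of the numbers in that command, the following Python sketch converts the two values to hours and minutes. It assumes, without confirmation from the TBO documentation, that /range is the interval in seconds over which the /delta adjustment is spread at a constant rate; the `hms` helper and constant names are invented for this example.

```python
from fractions import Fraction

DELTA_SECONDS = 33120   # /delta value from the command above
RANGE_SECONDS = 99360   # /range value from the command above


def hms(seconds):
    """Split a whole number of seconds into (hours, minutes, seconds)."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return hours, minutes, secs


# Assumption: the adjustment is applied evenly across the whole range.
adjust_per_second = Fraction(DELTA_SECONDS, RANGE_SECONDS)

print("delta:", hms(DELTA_SECONDS))
print("range:", hms(RANGE_SECONDS))
print("adjustment per second of range:", adjust_per_second)
```

Under that assumption the clock would gain a third of a second for every second of the range, which is a far larger rate than a typical drift correction.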
EXE$GL_ABSTIM_TICS is the number of 1/100th second "soft ticks" since system initialization, and it is incremented based on the number of hardware clock interrupts needed to average 0.01 second. The system uptime displayed in the header line of SHOW SYSTEM is based on EXE$GL_ABSTIM_TICS (for the local system), and it will be correct even if the system time is changed.
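If the tick-based uptime is unaffected by the drift, then (current time minus uptime) reflects the corrected clock while BOOTTIME still holds the time that was entered at boot, and the difference between them should be close to the TBO delta. The Python sketch below checks this for SIGMA only, using values posted earlier in the thread; OMEGA is left out because its BOOTTIME follows a different clock value, as discussed above. The variable names are invented for this illustration.

```python
from datetime import datetime, timedelta

TBO_DELTA = timedelta(seconds=33120)  # /delta value from the TBO command

# SIGMA values posted earlier in the thread.
sigma_boottime = datetime(2007, 12, 21, 22, 35, 11, 730_000)
sigma_now_minus_uptime = datetime(2007, 12, 22, 7, 46, 35, 100_000)

# How far the clock moved beyond what the uptime counter accounts for.
apparent_shift = sigma_now_minus_uptime - sigma_boottime

# How far that shift is from the requested delta.
residual = TBO_DELTA - apparent_shift

print("apparent shift:", apparent_shift)
print("requested delta:", TBO_DELTA)
print("residual:", residual)
```

The residual is well under a minute, so the apparent shift is consistent with the 9 hours 12 minutes of drift that was requested.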
The reason we did this was that we had batch processing that needed to run at the "correct" time. There were no external dependencies, so this was the easiest way to achieve the goal. NTP or any other software that adjusts the clock must be disabled while this is being done.
For those interested in how all this works, if you have access to the VMS source listings, see [.SYS.LIS]INTERVAL_TIMER.LIS. In the Alpha VMS 7.3-2 listing kit it is on CD 4/5
DISK$AXPVMS732LS4:[V732.SYS.LIS]INTERVAL_TIMER.LIS
The description in the AXP V1.5 IDSM Chapter 12, Section 12.3 System Timekeeping is still basically correct as well.
Jon