System Uptime Calculation

 
Kevin Raven (UK)
Frequent Advisor

I just noticed the behaviour below. Is this a bug, maybe?
Notice that the node that runs the command $SHOW SYS/CL/NOPROC calculates its own uptime to be less than the other nodes report for it.
Look at the example below:

$ mc sysman
SYSMAN> set env/cl
%SYSMAN-I-ENV, current command environment:
Clusterwide on local cluster
Username USERNAME will be used on nonlocal nodes

SYSMAN> do sh sys/noproc/clu
%SYSMAN-I-OUTPUT, command execution on node NODEA
OpenVMS V7.3-2 on node NODEA 2-JAN-2008 14:13:11.58 Uptime 26 18:34:48
OpenVMS V7.3-2 on node NODEB 2-JAN-2008 14:13:11.59 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODEC 2-JAN-2008 14:13:11.59 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODED 2-JAN-2008 14:13:11.60 Uptime 26 18:36:03
%SYSMAN-I-OUTPUT, command execution on node NODEB
OpenVMS V7.3-2 on node NODEA 2-JAN-2008 14:13:11.62 Uptime 26 18:36:03 <
OpenVMS V7.3-2 on node NODEB 2-JAN-2008 14:13:11.63 Uptime 26 18:34:40
OpenVMS V7.3-2 on node NODEC 2-JAN-2008 14:13:11.63 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODED 2-JAN-2008 14:13:11.63 Uptime 26 18:36:03
%SYSMAN-I-OUTPUT, command execution on node NODEC
OpenVMS V7.3-2 on node NODEA 2-JAN-2008 14:13:11.62 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODEB 2-JAN-2008 14:13:11.63 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODEC 2-JAN-2008 14:13:11.63 Uptime 26 18:33:45
OpenVMS V7.3-2 on node NODED 2-JAN-2008 14:13:11.63 Uptime 26 18:36:03
%SYSMAN-I-OUTPUT, command execution on node NODED
OpenVMS V7.3-2 on node NODEA 2-JAN-2008 14:13:11.70 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODEB 2-JAN-2008 14:13:11.71 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODEC 2-JAN-2008 14:13:11.71 Uptime 26 18:36:03
OpenVMS V7.3-2 on node NODED 2-JAN-2008 14:13:11.71 Uptime 26 18:35:15
SYSMAN> ex
$

I have changed the node names and username in the output above, but the numbers are exactly as the command produced them.

Comments?
Volker Halle
Honored Contributor

Re: System Uptime Calculation

Kevin,

The local node's uptime is correctly obtained using EXE$GL_ABSTIM_TICS. The uptime of a remote node is calculated by subtracting that node's boot time from the current time of the local node. This value will be skewed if a SET TIME command has been issued on the remote node.

This is the coded and documented behaviour in [CLIUTL]SHOWSYS.
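
The two calculation paths described above can be sketched in Python. This is only an illustration of the arithmetic, not the SHOWSYS code: the function names, the 10 ms tick unit and the 75-second clock correction are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

def local_uptime(ticks_10ms: int) -> timedelta:
    """Local node: uptime comes from a tick counter (10 ms units assumed)."""
    return timedelta(milliseconds=10 * ticks_10ms)

def remote_uptime(local_now: datetime, remote_boottime: datetime) -> timedelta:
    """Remote node: uptime is derived as local 'now' minus the recorded boot time."""
    return local_now - remote_boottime

# Hypothetical scenario: the node has really been up for 26 18:34:48, but its
# clock read 75 seconds slow when the boot time was recorded and was corrected later.
true_uptime = timedelta(days=26, hours=18, minutes=34, seconds=48)
now = datetime(2008, 1, 2, 14, 13, 11)
clock_error_at_boot = timedelta(seconds=-75)
recorded_boottime = (now - true_uptime) + clock_error_at_boot

as_seen_locally = local_uptime(int(true_uptime.total_seconds()) * 100)
as_seen_remotely = remote_uptime(now, recorded_boottime)
skew = as_seen_remotely - as_seen_locally

print(as_seen_locally, as_seen_remotely, skew)
```

The tick counter is unaffected by the clock change, while the timestamp difference carries the whole correction, so the two views disagree by exactly the amount the clock was moved.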

Volker.
Kevin Raven (UK)
Frequent Advisor

Re: System Uptime Calculation

All the servers in the cluster have the same time. We run NTP to keep the times in sync, and NTP on all 4 servers in the cluster uses the same time source. We never use the SET TIME command. Maybe NTP has done the equivalent when it corrects the clocks for drift.

It is just weird the way it behaves.
A technical support person who does not work on OpenVMS noticed it and commented on it. To be honest, I had never noticed it.
I just run $SHOW SYS/CL/NOPROC from any one node in the cluster and take the results as correct. The support person, however, ran the command from each node in the cluster. Why? It was in some instructions they had.
Petr Spisek
Regular Advisor

Re: System Uptime Calculation

Have you mapped the node names to NODEA, B, C and D consistently? My guess is that the differing uptime always belongs to the same node.
Check with this command:
SYSMAN> do write sys$output f$getsyi("boottime")
Petr
Kevin Raven (UK)
Frequent Advisor

Re: System Uptime Calculation

This time I have left the output intact, apart from deleting the username. See below:

$ mc sysman
SYSMAN> set env/cl
%SYSMAN-I-ENV, current command environment:
Clusterwide on local cluster
Username XXXXX will be used on nonlocal nodes

SYSMAN> do sh sys/noproc/clu
%SYSMAN-I-OUTPUT, command execution on node RDTSA
OpenVMS V7.3-2 on node RDTSA 3-JAN-2008 12:02:25.07 Uptime 27 16:23:57
OpenVMS V7.3-2 on node RDTSB 3-JAN-2008 12:02:25.08 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSC 3-JAN-2008 12:02:25.08 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSD 3-JAN-2008 12:02:25.09 Uptime 27 16:25:16
%SYSMAN-I-OUTPUT, command execution on node RDTSB
OpenVMS V7.3-2 on node RDTSA 3-JAN-2008 12:02:25.12 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSB 3-JAN-2008 12:02:25.12 Uptime 27 16:23:48
OpenVMS V7.3-2 on node RDTSC 3-JAN-2008 12:02:25.13 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSD 3-JAN-2008 12:02:25.13 Uptime 27 16:25:16
%SYSMAN-I-OUTPUT, command execution on node RDTSC
OpenVMS V7.3-2 on node RDTSA 3-JAN-2008 12:02:25.15 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSB 3-JAN-2008 12:02:25.15 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSC 3-JAN-2008 12:02:25.15 Uptime 27 16:22:52
OpenVMS V7.3-2 on node RDTSD 3-JAN-2008 12:02:25.16 Uptime 27 16:25:16
%SYSMAN-I-OUTPUT, command execution on node RDTSD
OpenVMS V7.3-2 on node RDTSA 3-JAN-2008 12:02:25.21 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSB 3-JAN-2008 12:02:25.22 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSC 3-JAN-2008 12:02:25.22 Uptime 27 16:25:16
OpenVMS V7.3-2 on node RDTSD 3-JAN-2008 12:02:25.23 Uptime 27 16:24:24
SYSMAN> do write sys$output f$getsyi("boottime")
%SYSMAN-I-OUTPUT, command execution on node RDTSA
6-DEC-2007 19:36:05.00
%SYSMAN-I-OUTPUT, command execution on node RDTSB
6-DEC-2007 19:36:03.00
%SYSMAN-I-OUTPUT, command execution on node RDTSC
6-DEC-2007 19:36:15.00
%SYSMAN-I-OUTPUT, command execution on node RDTSD
6-DEC-2007 19:36:03.00
SYSMAN> ex
$
Petr Spisek
Regular Advisor

Re: System Uptime Calculation

Hm, it really does look like a bug in the uptime calculation. Each node sees itself with a different uptime than the other nodes report for it.
I see the same behaviour, but with a difference of about 50 minutes :-) when I execute show system/noproc/cluster on each cluster node at the same time.
Jon Pinkley
Honored Contributor

Re: System Uptime Calculation

I guess it depends on your definition of bug.

By my definition, a bug is behavior the programmer did not expect. This does not fall into that category. It is based on a design limitation.

If you want consistent results with what is reported on the local node, then you will need to use something that gets the uptime value in the context of the local node, for example with

nodeA$ mc sysman set env/node=nodeb
SYSMAN> do sho system/noprocess
SYSMAN> exit

$ show system/cluster/noprocess

uses $getsyi to obtain the information for other nodes in the cluster.

There is no "uptime" item available via $getsyi. To approximate this value, it retrieves the "boottime" from the other node. Note that this boottime is not even "accurate": it is the CSB$Q_REFTIME cell from the node's Cluster System Block, and this isn't normally exactly the same as EXE$GQ_BOOTTIME. To make matters worse, the derived uptime is calculated using the local node's EXE$GQ_SYSTIME (local system time), and therefore the uptime reported for nodea by nodeb will be different from the uptime reported for nodea by nodec if nodeb's and nodec's clocks are not synchronized.
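
As a worked example, the figures Kevin posted for RDTSA can be pushed through the same subtraction. The Python sketch below only restates the posted numbers (hundredths of a second dropped); the variable names and the `vms_delta` helper are made up for the illustration, and it does not show which cell each figure came from.

```python
from datetime import datetime, timedelta

def vms_delta(td: timedelta) -> str:
    """Format a timedelta like the SHOW SYSTEM header: 'days hh:mm:ss'."""
    hours, rest = divmod(td.seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{td.days} {hours:02d}:{minutes:02d}:{seconds:02d}"

# Figures copied from the thread for node RDTSA.
now = datetime(2008, 1, 3, 12, 2, 25)                # SHOW SYSTEM header time
getsyi_boottime = datetime(2007, 12, 6, 19, 36, 5)   # F$GETSYI("BOOTTIME") run on RDTSA
uptime_seen_locally = timedelta(days=27, hours=16, minutes=23, seconds=57)
uptime_seen_remotely = timedelta(days=27, hours=16, minutes=25, seconds=16)

# Uptime you would get by subtracting the locally reported boot time from "now".
naive_uptime = now - getsyi_boottime

# Start instants implied by the two uptimes that SHOW SYSTEM actually displayed.
implied_start_local = now - uptime_seen_locally
implied_start_remote = now - uptime_seen_remotely

print("now - BOOTTIME        :", vms_delta(naive_uptime))
print("implied start (local) :", implied_start_local)
print("implied start (remote):", implied_start_remote)
```

The three results do not agree with one another, which fits the point that the displayed uptimes are derived from different time sources rather than from a single boot timestamp.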

Also, while looking at the source code for 7.3-2, I noticed that the comments claim the uptime for the local node is derived from EXE$GL_ABSTIM_TICS, but in fact it is EXE$GL_ABSTIM. The only difference between those time cells is the resolution; EXE$GL_ABSTIM_TICS is updated 100 times more frequently than EXE$GL_ABSTIM. The uptime field displayed in SHOW SYSTEM's header has 1-second resolution, so you won't be able to see the effect of this difference.
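
A small Python sketch of why the resolution difference is invisible in the header. It assumes the finer counter ticks every 10 ms and the coarser one every second; the tick values and the helper name are hypothetical.

```python
def vms_delta_from_seconds(total_seconds: int) -> str:
    """Format whole seconds like the SHOW SYSTEM header: 'days hh:mm:ss'."""
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days} {hours:02d}:{minutes:02d}:{seconds:02d}"

# Hypothetical 10 ms tick counts covering one full second of uptime.
first_tick = 239_183_700
headers = {
    vms_delta_from_seconds(tick // 100)   # truncate ticks to whole seconds
    for tick in range(first_tick, first_tick + 100)
}

print(headers)
```

All 100 tick values inside the same second produce one and the same header string, so a display with 1-second resolution cannot tell the two counters apart.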

See attachment for the comment and the code I am referring to. Also, note the comment specifically states that the reported uptime will be incorrect for a remote node if the time was changed on the remote node. No mention is made of inaccuracies due to differences in the local time and the remote time.

Bottom line: if you want consistent uptimes, get the time from the node you are collecting data for. That can be done with SYSMAN or with a batch job that runs on a node-specific queue.

For more info related to boottime and uptime, see the following thread.
http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1187673
it depends
Jon Pinkley
Honored Contributor

Re: System Uptime Calculation

Just curious, how was the cluster booted? It is unusual to see uptime values so close to each other (at least in my experience).

Jon
it depends
Kevin Raven (UK)
Frequent Advisor

Re: System Uptime Calculation

The cluster was rebooted by answering YES to the reboot question while running the shutdown script, taking the options Cluster,reboot_check.
Thus all 4 nodes shut down and rebooted at the same time.
However, I agree that the servers being up within a second of each other is a bit weird!

Thanks for the answers everyone.

John Gillings
Honored Contributor

Re: System Uptime Calculation

Kevin,

Jon's explanation is correct. When SHOW SYSTEM is used to determine the uptime of another node, the value of EXE$GQ_BOOTTIME from the remote node is subtracted from the current time of the LOCAL node. Depending on the state of the clocks at boot time and at the time the SHOW SYSTEM command is executed, this can give anomalous results.

Here's some DCL that approximates the mechanism used by SHOW SYSTEM which may help reveal exactly what's going on:

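$! Show this node's name and its current time
$ WRITE SYS$OUTPUT F$GETSYI("NODENAME")," ",F$TIME()
$ ctx = ""
$NodeLoop:
$ csid = F$CSID(ctx)     ! next cluster member; null string when done
$ IF csid .NES. ""
$ THEN
$   bt = F$GETSYI("BOOTTIME",,csid)
$!  Boot time of that node subtracted from the LOCAL current time
$   WRITE SYS$OUTPUT F$GETSYI("NODENAME",,csid), -
      " ",bt," ",F$DELTA_TIME(bt,F$TIME())
$   GOTO NodeLoop
$ ENDIF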



This behaviour has been reported to OpenVMS engineering, but it's unlikely anything will change (apart from perhaps improving the documentation).

The very concept of "uptime" is problematic as there are numerous potential definitions, each with their own arguments for and against. The one that's used is a simple difference in timestamps.
A crucible of informative mistakes