
NTPD wierdness

 
Fredrik.eriksson
Valued Contributor


Hi all,

I have some issues getting my NTP service running as it should on 2 clustered Alpha 7.3-2 nodes. For some reason the first node (node1) works as it should when discovering NTP peers, but the other node (node2) can't find the same NTP servers.

node2>ntpdate -q ntp01
Server ntp01 (172.xx.xx.xx), stratum 1
offset +0.9998752, delay +0.0275512
node2>ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 node1           0.0.0.0         16 u    -   64    0    0.000    0.000 4000.00
node2>ntptrace -n ntp01
172.xx.xx.xx: stratum 1, offset 0.110786, synch distance 0.00027, refid 'PPS'
node2>ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.xx.xx.xx     172.xx.xx.xx     2 u   52   64   77    0.488  -79.204  26.446

This is how it comes out on the node that isn't working.
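
For reading the "reach" column above (0 in the first listing, 77 in the second): reach is an 8-bit shift register printed in octal, with the lowest bit set when the most recent poll got an answer. A minimal Python sketch of that decoding, where the function name reach_history is only an illustration and not part of any NTP tool:

```python
from typing import List


def reach_history(reach_octal: str) -> List[bool]:
    """Expand an octal reach value into eight poll results.

    Index 0 is the most recent poll, index 7 the oldest.
    """
    value = int(reach_octal, 8) & 0xFF  # reach is kept in 8 bits
    return [bool((value >> bit) & 1) for bit in range(8)]


for shown in ("0", "77", "377"):
    answered = sum(reach_history(shown))
    print(f"reach {shown:>3}: {answered} of the last 8 polls answered")
```

So a reach of 0 means no poll has been answered yet, 77 means the last six were answered, and 377 means all of the last eight were.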

This is what my 2 configuration files look like. The local stratum is just a test to see if it was working at all.

Configuration for node2
node2>typ sys$specific:[tcpip$ntp]*.conf
SYS$SPECIFIC:[TCPIP$NTP]TCPIP$NTP.CONF;30

server ntp01 prefer
server ntp02

peer 10.xx.xx.xx prefer

driftfile SYS$SPECIFIC:[TCPIP$NTP]TCPIP$NTP_node2.DRIFT
logfile sys$specific:[tcpip$ntp]ntp_log_node2.log


Configuration for node1 follows
node2>typ dsa0:[sys1.tcpip$ntp]*.conf

DSA0:[SYS1.TCPIP$NTP]TCPIP$NTP.CONF;14

server 127.127.1.0
fudge 127.127.1.0 stratum 10

server ntp01 prefer
server ntp02

driftfile SYS$SPECIFIC:[TCPIP$NTP]TCPIP$NTP_node1.DRIFT
logfile sys$specific:[tcpip$ntp]NTP_LOG_node1.LOG

The log files tell me nothing except for a lot of lines like this one:
30 Mar 08:25:00 ntp[661222038]: offset: 0.000000 sec freq: 0.000 ppm poll: 16 sec error: 0.000488

Any suggestions appreciated.

Best regards
Fredrik Eriksson
12 REPLIES
Martin Vorlaender
Honored Contributor

Re: NTPD wierdness

Fredrik,

>>>
server ntp01 prefer
server ntp02
<<<

Does the BIND resolver on node2 have the same information on ntp01 and ntp02 as the one on node1 (e.g. $ TCPIP SHOW HOST ntp01)?

cu,
Martin
Fredrik.eriksson
Valued Contributor

Re: NTPD wierdness

Hi Martin,

Yes, both machines resolve the same IPs for ntp01 and ntp02.

The configuration for the BIND resolver is identical on both machines (compared it via diff even).

Best regards
Fredrik Eriksson
Martin Vorlaender
Honored Contributor

Re: NTPD wierdness

Fredrik,

did you try the troubleshooting tips in http://h71000.www7.hp.com/doc/83final/6526/6526pro_036.html#trouble_ntp ?

What are the results?

(I don't claim to be an NTP expert, but the output of the associations and readvar commands could be telling.)

cu,
Martin
Fredrik.eriksson
Valued Contributor

Re: NTPD wierdness

The assoc from node1 (the working one) gives me:

ntpq> assoc
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 19836  9014   yes   yes  none    reject   reachable  1
  2 19837  9614   yes   yes  none  sys.peer   reachable  1
  3 19838  9414   yes   yes  none  candidat   reachable  1

These are the local clock, ntp01 and ntp02.
readvars for ntp01 (from node1):
readvar 19837
status=9614 reach, conf, sel_sys.peer, 1 event, event_reach,
srcadr=ntp01, srcport=123, dstadr=10.101.129.31, dstport=123,
leap=00, stratum=1, precision=-19, rootdelay=0.000,
rootdispersion=0.366, refid=PPS, reach=377, unreach=0, hmode=3, pmode=4,
hpoll=10, ppoll=10, flash=00 ok, keyid=0, offset=-9.268, delay=1.925,
dispersion=19.162, jitter=4.544,
reftime=cd7b5c24.0ac74114 Mon, Mar 30 2009 16:56:36.042,
org=cd7b5c2c.7fe4c6b6 Mon, Mar 30 2009 16:56:44.499,
rec=cd7b5c2c.81794a6e Mon, Mar 30 2009 16:56:44.505,
xmt=cd7b5c2c.80b94530 Mon, Mar 30 2009 16:56:44.502,
filtdelay= 2.90 1.92 1.92 1.92 1.89 1.92 0.94 1.92,
filtoffset= -4.72 -9.27 -13.39 -16.67 -15.08 -11.48 -7.04 -1.62,
filtdisp= 0.49 15.85 31.24 46.63 62.02 77.40 92.76 108.15
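
A side note on the hex values in reftime/org/rec/xmt: they are NTP timestamps, 32 bits of seconds since 1900-01-01 UTC followed by a 32-bit binary fraction. The Python sketch below converts one to UTC. The helper name ntp_hex_to_utc is made up for illustration, and it assumes the timestamp lies in the current NTP era (before the 2036 rollover).

```python
from datetime import datetime, timedelta, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_TO_UNIX_SECONDS = 2_208_988_800


def ntp_hex_to_utc(stamp: str) -> datetime:
    """Convert 'ssssssss.ffffffff' (hex seconds.fraction) to a UTC datetime."""
    seconds_hex, fraction_hex = stamp.split(".")
    unix_seconds = int(seconds_hex, 16) - NTP_TO_UNIX_SECONDS
    fraction = int(fraction_hex, 16) / 2**32  # fraction of one second
    whole = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return whole + timedelta(seconds=fraction)


# reftime from the readvar output above
print(ntp_hex_to_utc("cd7b5c24.0ac74114").isoformat())
```

For the reftime above this gives 14:56:36.042 UTC on 30 March 2009, which lines up with the 16:56:36.042 that ntpq prints if the node's local time is UTC+2 (an assumption on my part).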

Assoc from node2 is as follows:
ntpq> assoc
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 44732  9014   yes   yes  none    reject   reachable  1

That one is the peer entry that points to my node1.
readvars as follows:
ntpq> readvar 44732
status=9014 reach, conf, 1 event, event_reach,
srcadr=node1, srcport=123, dstadr=10.101.129.32, dstport=123, leap=00,
stratum=2, precision=-11, rootdelay=1.923, rootdispersion=44.907,
refid=ntp01, reach=007, unreach=0, hmode=1, pmode=2, hpoll=6,
ppoll=6, flash=00 ok, keyid=0, offset=-19.749, delay=0.488,
dispersion=1938.945, jitter=0.728,
reftime=cd7b582b.d527e521 Mon, Mar 30 2009 16:39:39.832,
org=cd7b5ecb.56ec8d5c Mon, Mar 30 2009 17:07:55.339,
rec=cd7b5ecb.5bfad299 Mon, Mar 30 2009 17:07:55.359,
xmt=cd7b5ecb.5bbacb42 Mon, Mar 30 2009 17:07:55.358,
filtdelay= 0.49 0.98 0.49 0.00 0.00 0.00 0.00 0.00,
filtoffset= -19.75 -18.99 -20.44 0.00 0.00 0.00 0.00 0.00,
filtdisp= 0.98 1.92 2.87 16000.0 16000.0 16000.0 16000.0 16000.0

This didn't tell me all that much though... I'm pretty much stuck at the same place as earlier.
I'm wondering if my NTP service is trying to reuse the same resources as my other cluster node. But if that were the case, it should have worked when I stopped NTP on node1 and started it on node2.

Best regards
Fredrik Eriksson
Martin Vorlaender
Honored Contributor

Re: NTPD wierdness

Fredrik,

>>>
ntpq> assoc
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 44732  9014   yes   yes  none    reject   reachable  1

That one is the peer entry that points to my node1.
<<<

If I understand the ntpq page at http://www.eecis.udel.edu/~mills/ntp/html/ntpq.html and the status word description at http://www.eecis.udel.edu/~mills/ntp/html/decode.html#peer correctly, a condition of "reject" means node1's time was "discarded as not valid" - whatever the reason.
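
To make the status word less opaque, here is a small Python sketch of how I read the layout on that page: the top 5 bits are flags, the next 3 bits the selection code, then a 4-bit event count and a 4-bit event code. The function name is made up, and I have only filled in the flag and selection values that actually appear in the outputs in this thread (conf, reach; reject, candidat, sys.peer); the other codes are deliberately left as plain numbers.

```python
# Only the values seen in this thread are named; everything else stays numeric.
FLAG_BITS = {0x80: "conf", 0x10: "reach"}
SELECT_NAMES = {0: "reject", 4: "candidat", 6: "sys.peer"}


def decode_peer_status(word_hex: str) -> dict:
    """Split a 16-bit ntpq peer status word (given as hex text) into fields."""
    word = int(word_hex, 16)
    high = word >> 8                      # flags (5 bits) + selection (3 bits)
    select = high & 0x07
    return {
        "flags": [name for bit, name in FLAG_BITS.items() if high & bit],
        "select": SELECT_NAMES.get(select, f"code {select}"),
        "event_count": (word >> 4) & 0x0F,
        "event_code": word & 0x0F,
    }


for status in ("9014", "9614", "9414"):
    print(status, decode_peer_status(status))
```

With the three status values from the assoc listings, 9014 decodes to conf + reach with selection "reject", 9614 to "sys.peer" and 9414 to "candidat", each with one event of code 4 (shown by readvar as event_reach).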

HTH,
Martin
Fredrik.eriksson
Valued Contributor

Re: NTPD wierdness

Thanks again Martin

That entry doesn't affect the outcome of my problem. It is just a test to see if I could get anything to show up at all, which it apparently did.

But I guess I should try to get it to work properly since it seems like the only method I have at the moment.

Best regards
Fredrik Eriksson
Martin Vorlaender
Honored Contributor

Re: NTPD wierdness

Fredrik,

sorry for having been distracted.

From the assoc list it seems node2 is not even considering ntp01 and ntp02.

One command which could tell more could be

node2> ntpq -c sysinfo ntp01 ! and/or ntp02

to see whether it can talk at all to ntp01/02.

cu,
Martin
Fredrik.eriksson
Valued Contributor

Re: NTPD wierdness

Since I'm not at work full time I can't test this right away, but...

node2>ntpdate -q ntp01
Server ntp01 (172.xx.xx.xx), stratum 1
offset +0.9998752, delay +0.0275512

That should mean I can talk to the ntp01 server, since it reports back and tells me that I'm 1 second behind it.

Best regards
Fredrik Eriksson
Martin Vorlaender
Honored Contributor

Re: NTPD wierdness

Okay, let's try another approach:

Looking into sys$startup:tcpip$ntp_startup.com, I see there are some possibilities to manipulate the service:

- if the logical tcpip$ntp_conf is defined, the file it points to is used instead of sys$specific:[tcpip$ntp]tcpip$ntp.conf.

- if sys$startup:tcpip$ntp_systartup.com exists, it's executed before the service is started (but then the logfile should contain something like "%TCPIP-I-INFO, executing site-specific startup").

And some other thoughts as to what could go wrong:

- SYS$SPECIFIC:[TCPIP$NTP] and all files therein must be owned by TCPIP$NTP.

- SYS$SPECIFIC:[TCPIP$NTP]TCPIP$NTP_*.DRIFT; could have reached version 32767.

cu,
Martin