Dalin Bruns
Occasional Advisor

NTP monitoring

Is it common practice to monitor NTP status? If so, from ntpq -p, what is the most tell-tale indicator that things are good/bad, i.e. is there one parameter that I can watch such that when that parameter reaches a specific threshold, user intervention is required?

Thanks in advance,

Dalin
Pete Randall
Outstanding Contributor

Re: NTP monitoring

Dalin,

NTP should be pretty much self-healing. You might want to review the syslog entries just to assure yourself that all is well:

"grep -i ntp /var/adm/syslog/syslog.log"


Pete
Uday_S_Ankolekar
Honored Contributor

Re: NTP monitoring

NTP is pretty simple and reliable.
If you have any doubts about your setup, use this command to monitor the connection:
ntpq -p
The "reach" value will move towards 377 if the connection is reliable.
Also check the /var/adm/syslog/syslog.log file for ntp log entries.

xntpd is the daemon, and it should always be running.
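
The "reach" column is an octal value that holds the results of the last eight polls as bits, so 377 (binary 11111111) means all eight were answered. As a rough sketch, a small POSIX shell helper (the name reach_count is made up for illustration) can turn that value into a count of answered polls:

```shell
#!/bin/sh
# reach_count OCTAL
# Hypothetical helper: print how many of the last 8 polls were answered,
# given the octal "reach" value shown by ntpq -p.
reach_count() {
    # A leading 0 makes printf read the argument as an octal constant.
    value=$(printf '%d' "0$1")
    count=0
    while [ "$value" -gt 0 ]; do
        count=$((count + value % 2))   # add the lowest bit
        value=$((value / 2))           # shift right by one bit
    done
    echo "$count"
}

reach_count 377   # prints 8: all eight polls answered
reach_count 0     # prints 0: no replies at all
```

A count that stays well below 8, or keeps falling, suggests the source is only intermittently reachable.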

-Uday
Good Luck..
Robert-Jan Goossens
Honored Contributor

Re: NTP monitoring

Hi,

Look for entries in your syslog like "Lost connection to stratum". If you do not find any, you are pretty safe.

Hope it helps,

Robert-Jan.
Dalin Bruns
Occasional Advisor

Re: NTP monitoring

I agree that NTP takes care of itself, but recently our time source, some type of satellite link device, began to fail, and we didn't realize it until the time on several of our servers had drifted enough to cause us some database issues. In hindsight, the device had been failing for a week or so before the database problems alerted us, and I was hoping to head this sort of thing off at the pass.

Dalin
Robert-Jan Goossens
Honored Contributor

Re: NTP monitoring

Hi Dalin,

How many servers are connected to your NTP GPS device?

Robert-Jan.
Dalin Bruns
Occasional Advisor

Re: NTP monitoring

Interestingly, we routinely see "synchronisation lost" messages in the syslogs of our machines, followed by a resync minutes later. More specifically, I was wondering whether the dispersion value from ntpq -p would be a good indicator of NTP status if I monitored it.

Dalin
Sergejs Svitnevs
Honored Contributor

Re: NTP monitoring

Check out the following link:

http://nic-ks.greatplains.net/ntp/debug.htm


Regards,
Sergejs
Robert-Jan Goossens
Honored Contributor

Re: NTP monitoring

Dalin,


What do the "ntpq -p" output fields mean?
DocId: KBRC00001347 Updated: 2/14/00 9:51:55 AM
PROBLEM
The details of the "ntpq -p" output fields

http://www4.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000063235965

Robert-Jan.
Pete Randall
Outstanding Contributor

Re: NTP monitoring

Dalin,

We see synch lost and resynch messages as well. I suspect that this is completely normal. For ongoing monitoring, I would suggest a cron job that greps syslog for ntp messages once a day and emails the output to you. You could get fancy and check for a synch lost without a corresponding resynch, but a little manual review would also work without too much effort.
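
The "synch lost without a corresponding resynch" check could look roughly like the sketch below. It assumes the loss and resync messages contain the strings passed in as arguments; the wording of xntpd's log lines varies by version, so check your own syslog before relying on either pattern. The function name is hypothetical.

```shell
#!/bin/sh
# last_sync_state LOST_PATTERN SYNC_PATTERN
# Hypothetical helper: read syslog lines on stdin and print the state implied
# by the most recent matching message: LOST, SYNCED, or UNKNOWN if neither
# pattern was seen. Patterns are matched as plain substrings.
last_sync_state() {
    awk -v lost="$1" -v sync="$2" '
        index($0, lost) { state = "LOST" }
        index($0, sync) { state = "SYNCED" }
        END { print (state == "" ? "UNKNOWN" : state) }
    '
}

# Example use from a daily cron job (both patterns are assumptions):
#   grep -i ntp /var/adm/syslog/syslog.log |
#       last_sync_state "synchronisation lost" "synchronized to"
```

If this prints LOST, the last thing the daemon logged was a loss of sync with no recovery, which is the case worth an email.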


Pete
Dalin Bruns
Occasional Advisor

Re: NTP monitoring

Well, because of the cost to replace that device, we've switched to getting our time from the internet. But to answer your question, we've got 27-28 machines getting time from 3 servers that are peers. Our hierarchy is sort of complex because we also have all of our 100+ network devices getting time off two distribution switches, from which our servers also get their time. Hope that makes sense -- a picture would help.
Steven E. Protter
Exalted Contributor

Re: NTP monitoring

It is a good idea to check it once in a while.

Our ntp server at work does not respond properly to the ntpq -p command, but the clocks are maintained.

I set up a D320 at home and told it to get time from a Linux server and didn't check it.

Well, I ran the date command last night and saw it was running 7 hours ahead of central time.

The strange part is that at home I leave that firewall port open and the secondary server is reachable on the public internet but not updating time.

It's on my once-a-month checklist just to make sure the clocks aren't drifting.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Jaris Detroye
Frequent Advisor

Re: NTP monitoring

Just a thought here....
Do all your systems get their NTP signals from the GPS? Do they have other sources as well? My experience has been that if you designate one internal system to be a lower stratum source using its internal clock as a last resort, then even if you lose your GPS source your systems will not drift from each other, only from the world.
I personally only have one system designated to reference external time sources; the rest sync off the primary internally. Also, I have IT/O monitoring the NTP daemon. (Network guys complained about the concept of 150 servers doing external NTP queries vs. one.)
Michael Kelly_5
Valued Contributor

Re: NTP monitoring

Dalin,
you could try running
ntpdate -q server_name
which only queries the server and reports the offset; it does not set the clock.
The nice thing about computers is that they do exactly what you tell them. The problem with computers is that they do EXACTLY what you tell them.
Angus Crome
Honored Contributor
Solution

Re: NTP monitoring

Here is a short script. It is not very sophisticated, but it will always tell you when your time source has gone south, such as when a lightning strike occurs (thank people like 'Clay' for lightning arrestors!).

Use cron to schedule it at sensible intervals.
#0,15,30,45 * * * * /usr/local/admin/etc/timecheck > /dev/null 2>&1
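
As a rough illustration of this kind of check (a sketch only, not the script referred to above), the function below reads ntpq -p output and complains when no peer is marked with "*" (the peer the daemon is currently synchronised to) or when that peer's offset exceeds a limit. The function name and the 128 ms limit in the example are assumptions; the sketch also assumes offset is the ninth column, in milliseconds, as in the usual ntpq -p layout.

```shell
#!/bin/sh
# check_peers LIMIT_MS
# Hypothetical helper: read `ntpq -p` output on stdin.
# Prints OK (exit 0), a WARN line (exit 1) if the selected peer's offset
# is larger than LIMIT_MS, or a CRIT line (exit 2) if no peer is selected.
check_peers() {
    awk -v limit="$1" '
        substr($1, 1, 1) == "*" {        # "*" marks the selected sync peer
            found = 1
            offset = $9 + 0              # offset column, in milliseconds
            if (offset < 0) offset = -offset
            if (offset > limit) { print "WARN offset " $9 " ms"; bad = 1 }
        }
        END {
            if (!found) { print "CRIT no selected peer"; exit 2 }
            if (bad) exit 1
            print "OK"
        }
    '
}

# Example use from cron, mailing only when something is wrong:
#   ntpq -p | check_peers 128 > /dev/null || echo "NTP problem" | mailx -s ntp root
```

Checking for a selected peer catches the failed-source case directly, since a dead source eventually loses its "*" marker even while the daemon keeps running.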
There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown
Bill Hassell
Honored Contributor

Re: NTP monitoring

The most important feature of NTP reliability is often overlooked: multiple time servers! Never use a single source. As a minimum, use a preferred time server and then fall back to a local machine that will supply NTP from its own clock. Whether it drifts a bit or not is not important, since all clients will follow the system that is working. When multiple clocks are used, NTP has a very complex but accurate way to gauge how to pick the right time. If your machines have the option to get NTP sources from the Internet (you can test this with ntpq -p server_name), then put 3 to 6 servers in every ntp.conf file.
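
As a sketch of that advice, an ntp.conf along these lines lists several sources plus an internal fallback. All host names here are placeholders, not real servers:

```
# Placeholder host names; substitute servers you are permitted to use.
server ntp1.example.com prefer
server ntp2.example.com
server ntp3.example.com
# Internal machine that serves time from its own clock if all else fails.
server timehost.example.com
```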



Bill Hassell, sysadmin
Angus Crome
Honored Contributor

Re: NTP monitoring

Just an addition, since Bill pointed it out.

For those of us who are not allowed to go to the internet for such things, you can do a little trick to keep time from dying completely in the case of a single point of failure.

You can set up a fudge entry on each of your strata servers. If your main source goes south, the system will lock in on its own clock and keep adjusting. I just have it set up on the GPS-attached server, so that all the systems' time degrades together. You can also set this up on your stratum 1 timeservers if you want. By giving the local clock a stratum number much higher (that is, lower quality) than your actual stratum, you ensure that only a complete failure will cause it to be used.

Here is a copy of what I have in ntp.conf:

# Main GPS clock configuration
server 127.127.29.0 minpoll 6 prefer
fudge 127.127.29.0 time1 0.00028125
#
## Temporary config during hardware malfunction
#
# Local clock : Allows the server to synchronize to its own clock.
#
server 127.127.1.1
fudge 127.127.1.1 stratum 8 # show poor quality
#
#
There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown