Re: NTP monitoring

Dalin Bruns · ‎07-28-2003

Is it common practice to monitor NTP status? If so, from ntpq -p, what is the most tell-tale indicator that things are good/bad, i.e. is there one parameter that I can watch such that when that parameter reaches a specific threshold, user intervention is required?

Thanks in advance,

Dalin

Pete Randall · ‎07-28-2003

Dalin,

NTP should be pretty much self healing. You might want to review syslog entries just to assure yourself that all is well:

"grep -i ntp /var/adm/syslog/syslog.log"

Pete

Uday_S_Ankolekar · ‎07-28-2003

NTP is pretty simple and reliable.
If you have any doubts about your setup, use this command to monitor the connection:
ntpq -p
The "reach" value will move towards 377 if the connection is reliable (reach is an octal record of the last eight polls, so 377 means all eight were answered).
Also check the /var/adm/syslog/syslog.log file for ntp log entries.

The xntpd daemon should always be running.

-Uday

Good Luck..
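
A minimal sketch of the "reach" check described above, for anyone who wants to script it. It assumes the usual ntpq -p column layout, where reach is the seventh whitespace-separated field; the function name ntp_unreachable_peers is made up for illustration. It only flags peers whose reach is 0, meaning none of the last eight polls were answered.

```shell
# Reads `ntpq -p` output on stdin and prints every peer whose reach is 0.
# Returns 1 if at least one such peer was found, 0 otherwise.
# Assumption: reach is the 7th field, as in the standard ntpq -p layout.
ntp_unreachable_peers() {
    awk '
        $7 !~ /^[0-7]+$/ { next }                  # skip header and separator lines
        $7 + 0 == 0 { print $1, "reach=" $7; bad = 1 }
        END { exit bad + 0 }
    '
}
```

Typical use would be something like: ntpq -p | ntp_unreachable_peers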

Robert-Jan Goossens · ‎07-28-2003

Hi,

Look for entries in your syslog like "Lost connection to stratum". If you do not find any, you are pretty safe.

Hope it helps,

Robert-Jan.

Dalin Bruns · ‎07-28-2003

I agree that NTP largely takes care of itself, but recently our time source, some type of satellite-link device, began to fail, and we didn't realize it until the clocks on several of our servers had drifted far enough apart to cause us some database issues. In hindsight, the device had been failing for a week or so before the database alerted us to the problem, and I was hoping to head this sort of thing off at the pass.

Dalin

Robert-Jan Goossens · ‎07-28-2003

Hi Dalin,

How many servers are connected to your NTP GPS device?

Robert-Jan.

Dalin Bruns · ‎07-28-2003

Interestingly, we routinely see "synchronisation lost" messages in the syslogs of our machines, followed by a resync minutes later. More specifically, I was wondering whether the dispersion value from ntpq -p would be a good indicator of NTP status if I monitored it.

Dalin
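
For watching a single value from ntpq -p, here is a rough sketch that pulls out the currently selected peer (the line tagged with "*") along with its offset and its last column. It assumes the standard column layout, with offset in milliseconds as the ninth field; the last column is labelled disp on xntpd 3.x and jitter on NTPv4. The function name ntp_syspeer is hypothetical. It does not settle whether dispersion is the right thing to alarm on; it just makes the numbers easy to collect.

```shell
# Reads `ntpq -p` output on stdin. Prints the selected peer (line starting
# with "*"), its offset in ms (9th field) and the last column.
# Returns 1 if no peer is selected, i.e. the daemon is not synchronized.
ntp_syspeer() {
    awk '
        /^\*/ {
            print substr($1, 2), "offset_ms=" $9, "disp_or_jitter=" $NF
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

A non-zero return (no "*" line at all) is probably the simplest single signal to alert on, since it means no source is currently selected.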

Sergejs Svitnevs · ‎07-28-2003

Check out the following link:

http://nic-ks.greatplains.net/ntp/debug.htm

Regards,
Sergejs

Robert-Jan Goossens · ‎07-28-2003

Dalin,

What do the "ntpq -p" output fields mean?
DocId: KBRC00001347 Updated: 2/14/00 9:51:55 AM
PROBLEM
The details of the "ntpq -p" output fields

http://www4.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000063235965

Robert-Jan.

Pete Randall · ‎07-28-2003

Dalin,

We see synch lost and resynch messages as well. I suspect that this is completely normal. For ongoing monitoring, I would suggest a cron job that greps syslog for ntp messages once a day and emails the output to you. You could get fancy and check for a synch lost without a corresponding resynch, but a little manual review would also work without too much effort.

Pete
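
The "synch lost without a corresponding resynch" idea above could look roughly like this. It is only a sketch: it assumes the daemon logs the strings "synchronisation lost" and "synchronized to" (check your own syslog for the exact wording), and the function name ntp_sync_state is made up.

```shell
# Reads syslog lines on stdin and reports the most recent NTP sync state.
# Assumption: the daemon logs "synchronisation lost" when it loses sync
# and "synchronized to <peer>, ..." when it regains it.
ntp_sync_state() {
    awk '
        /synchronisation lost/ { state = "lost"; last = $0 }
        /synchronized to/      { state = "ok" }
        END {
            if (state == "lost") { print "UNSYNCED since: " last; exit 1 }
            print "OK"
        }
    '
}
```

From a daily cron job you could feed it the output of grep -i ntp /var/adm/syslog/syslog.log and mail yourself the result only when it returns non-zero.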


Dalin Bruns · ‎07-28-2003

Well, because of the cost to replace that device, we've switched to getting our time from the internet. But to answer your question, we've got 27-28 machines getting time from 3 servers that are peers. Our hierarchy is sort of complex because we also have all of our 100+ network devices getting time from two distribution switches, from which our servers also get their time. Hope that makes sense -- a picture would help.

Steven E. Protter · ‎07-28-2003

It is a good idea to check it once in a while.

Our ntp server at work does not respond properly to the ntpq -p command, but the clocks are maintained.

I set up a D320 at home and told it to get time from a Linux server and didn't check it.

Well, I ran the date command last night and saw it was running 7 hours ahead of central time.

The strange part is that at home I leave that firewall port open and the secondary server is reachable on the public internet but not updating time.

It's on my once-a-month checklist just to make sure the clocks aren't drifting.

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Jaris Detroye · ‎07-29-2003

Just a thought here....
Do all your systems get their NTP signals from the GPS? Do they have other sources as well? My experience has been that if you designate one internal system to be a lower stratum source using its internal clock as a last resort, then even if you lose your GPS source your systems will not drift from each other, only from the world.
I personally only have one system designated to reference external time sources; the rest sync off the primary internally. Also, I have IT/O monitoring the NTP daemon. (The network guys complained about the concept of 150 servers doing external NTP queries vs. one.)

Michael Kelly_5 · ‎07-29-2003

Dalin,
you could try doing
ntpdate -q ...
This will give you an indication of the difference between the local time and the time on the server(s).
See man 1m ntpdate for more detail.

HTH,
Michael.

The nice thing about computers is that they do exactly what you tell them. The problem with computers is that they do EXACTLY what you tell them.
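
To turn the ntpdate -q output into a yes/no check, something along these lines might work. It is a sketch that assumes the usual "server ADDR, stratum N, offset X, delay Y" lines that ntpdate -q prints per server; the function name ntp_offset_check and the threshold are illustrative only.

```shell
# Reads `ntpdate -q` output on stdin; $1 is the allowed offset in seconds.
# Prints the largest absolute offset seen and returns 1 if it exceeds $1,
# or 2 if no "server ..." lines could be parsed.
# Assumption: lines look like
#   server 10.0.0.1, stratum 2, offset 0.001234, delay 0.02567
ntp_offset_check() {
    awk -v limit="$1" '
        $1 == "server" && $5 == "offset" {
            off = $6; sub(/,$/, "", off); off += 0   # strip comma, force numeric
            if (off < 0) off = -off
            if (off > max) max = off
            seen = 1
        }
        END {
            if (!seen) { print "no server lines found"; exit 2 }
            printf "max_abs_offset=%.6f\n", max
            exit (max > limit + 0 ? 1 : 0)
        }
    '
}
```

You would pipe the output of ntpdate -q for your own server(s) into it, for example with a limit of 0.5 seconds, and alert on a non-zero return.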

Angus Crome · ‎07-29-2003

Here is a short script (it is not very sophisticated, but it will always tell you when your time source has gone south), like when a lightning strike occurs. Thank people like 'Clay' for lightning arrestors!!!

Use cron to schedule it out at sensible time intervals.
#0,15,30,45 * * * * /usr/local/admin/etc/timecheck > /dev/null 2>&1

There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown

Bill Hassell · ‎07-29-2003

The most important feature of NTP reliability is often overlooked: multiple time servers! Never use a single source. As a minimum, use a preferred time server and then fall back to a local machine that will supply NTP from its own clock. Whether it drifts a bit or not is not important, since all clients will follow the system that is working. When multiple clocks are used, NTP has a very complex but accurate way to pick the right time. If your machines can reach NTP sources on the Internet (you can test this with ntpq -p server_name), then put 3 to 6 servers in every ntp.conf file.

Bill Hassell, sysadmin
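
A quick way to audit the "3 to 6 servers" advice across many machines is to count the server lines in each ntp.conf. The sketch below is just that: the function name ntp_server_count is hypothetical, it counts only "server" lines, and it deliberately ignores 127.127.x.x reference-clock addresses (local clock, GPS drivers) so that only network sources are counted.

```shell
# Reads an ntp.conf on stdin and prints the number of network "server"
# lines, skipping 127.127.x.x reference-clock pseudo-addresses.
# $1 is the minimum wanted (default 3); returns 1 if there are fewer.
ntp_server_count() {
    awk -v want="${1:-3}" '
        $1 == "server" && $2 !~ /^127\.127\./ { n++ }
        END { print n + 0; exit (n + 0 < want + 0 ? 1 : 0) }
    '
}
```

Feed it the contents of the ntp.conf you want to check on standard input; a return of 1 means that machine has fewer network sources than the minimum you asked for.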

Angus Crome · ‎07-31-2003

Just an additional note, since Bill pointed it out.

For those of us who are not allowed to go to the internet for such things, you can do a little trick to keep time from dying completely in the case of a single point of failure.

You can set up a fudge entry on each of your strata servers. If your main source goes south, the system will lock in on its own clock and keep adjusting. I just have it set up on the GPS-attached server, so that all the systems' times degrade together. You can also set this up on your stratum 1 timeservers if you want. By setting the stratum of the local clock to a much higher number (that is, lower quality) than your actual stratum, you ensure that only a complete failure will cause it to be used.

Here is a copy of what I have in ntp.conf:

# Main GPS clock configuration
server 127.127.29.0 minpoll 6 prefer
fudge 127.127.29.0 time1 0.00028125
#
## Temporary config during hardware malfunction
#
# Local clock : Allows the server to synchronize to its own clock.
#
server 127.127.1.1
fudge 127.127.1.1 stratum 8 # show poor quality
#
#

There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown
