Carl Houseman
Super Advisor

Continually losing sync with Windows NTP server

I've read a couple threads about NTP sync problems and the answer was always "specify multiple time sources to avoid loss of sync due to network issues." But this is not that problem.

I've got both 11.00 and 11.11 configured with two different Windows 2003 NTP servers (both are domain controllers). And all exhibit the same problem, with syslog entries as follows:

Mar 27 21:09:16 hpt xntpd[18283]: tickadj = 625, tick = 10000, tvu_maxslew = 61875
Mar 27 21:09:16 hpt xntpd[18283]: precision = 11 usec
Mar 27 21:15:10 hpt xntpd[18283]: synchronized to 10.1.1.172, stratum=3
Mar 27 21:16:58 hpt xntpd[18283]: time reset (step) 107.671497 s
Mar 27 21:16:58 hpt xntpd[18283]: synchronisation lost
Mar 27 21:22:18 hpt xntpd[18283]: synchronized to 10.1.1.172, stratum=3
Mar 27 21:22:18 hpt xntpd[18283]: time reset (step) 0.356259 s
Mar 27 21:22:18 hpt xntpd[18283]: synchronisation lost

All these systems are on the same LAN, the same IP network, and there's NO other network issues between these systems. Is there something that can be tweaked either in Windows or HP-UX to get them to sync reliably and quietly?
Bill Hassell
Honored Contributor

Re: Continually losing sync with Windows NTP server

This isn't normal, so something is wrong with the communication. There is nothing to tweak in HP-UX, so I would add both NTP servers to your ntp.conf file and NOT specify a prefer server. Let NTP sort out which one is working. Run ntpq -p on a regular basis and check that the 'reach' value is always 377 and that the dispersion numbers stay low (single digits, 0.0 to 9.0). However, the 107 second step adjustment is very, very unusual, so something is wrong with the configuration on the HP-UX side. Start with NTP patches and then make sure you have an ultra simple ntp.conf file:

# cat /etc/ntp.conf
server 0.us.pool.ntp.org # US pool 0
server 1.us.pool.ntp.org # US pool 1
server 2.us.pool.ntp.org # US pool 2
server 127.127.1.1 # local clock fallback
fudge 127.127.1.1 stratum 10 # report the local clock as stratum 10
driftfile /etc/ntp.drift # monitor drift

(You can delete all the comments; the original file is safely stored in /usr/newconfig/etc.) You need both servers so that xntpd can reliably ride out a problem with either one. You may have to switch to xntpd -x in your startup config script. See the man page for xntpd and look especially at the section on SLEW. While -x is not recommended because better accuracy is available from reliable sources, that 107 second step is more than one minute, which is not good at all for today's interlocked database systems.

As far as networking issues go, you'll really need Wireshark, or a lot of time spent tracing xntpd, to determine whether the NTP packets are reliably making it to the HP-UX boxes. The Windows boxes are notoriously silent about internal problems with serving as an NTP source. By running ntpq -p regularly, you will probably see a dropout in responses to NTP queries as reach and dispersion start degrading.
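A rough sketch of the regular ntpq check suggested above, in Python (an assumption; it would run on any admin workstation that has Python 3, not necessarily on the HP-UX boxes themselves). It parses `ntpq -pn` style text and lists peers whose reach is below 377 or whose dispersion is above a threshold. The function names and the 10 ms default threshold are hypothetical, and the sample text is the 22:25 output posted later in this thread.

```python
# Sketch only: parse_peers/unhealthy are hypothetical helper names.
SAMPLE = """\
 remote          refid           st t when poll reach   delay   offset    disp
==============================================================================
#10.1.1.172      10.1.1.1         3 u   44   64    37  -15.35   95.800  987.75
+10.1.1.176      10.1.1.172       4 u    5   64    37    0.27  316.071  957.73
"""


def parse_peers(text):
    """Return one dict per peer line of `ntpq -pn` style output."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Peer lines have 10 whitespace-separated fields; skip header and ruler.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        remote, tally = fields[0], ""
        # With -n the remote is a numeric address, so a leading non-digit
        # is the tally character (*, +, #, ...) glued to the address.
        if not remote[0].isdigit():
            tally, remote = remote[0], remote[1:]
        peers.append({
            "tally": tally,
            "remote": remote,
            "stratum": int(fields[2]),
            "reach": int(fields[6], 8),      # ntpq prints reach in octal
            "offset_ms": float(fields[8]),
            "disp_ms": float(fields[9]),
        })
    return peers


def unhealthy(peers, max_disp_ms=10.0):
    """Peers with missed polls (reach != 377 octal) or high dispersion."""
    return [p["remote"] for p in peers
            if p["reach"] != 0o377 or p["disp_ms"] > max_disp_ms]


if __name__ == "__main__":
    for peer in parse_peers(SAMPLE):
        print(peer["remote"], format(peer["reach"], "08b"), peer["disp_ms"])
    print("needs attention:", unhealthy(parse_peers(SAMPLE)))
```

Feeding it the saved output of a cron-driven `ntpq -pn` run would give a quick history of when reach started dropping, which is easier to line up against the syslog entries than eyeballing the raw tables.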


Bill Hassell, sysadmin
Carl Houseman
Super Advisor

Re: Continually losing sync with Windows NTP server

I should have mentioned that those syslog entries were from a server which had not been syncing with anything, so it really was off by 107 seconds to start with.

All of the HP-UX servers exhibit the same symptoms. Here are syslog and ntpq -pn from another (11.11) server. Anything obvious jump out at you?

Mar 27 22:05:20 hpr2 xntpd[12088]: synchronized to 10.1.1.176, stratum=4
Mar 27 22:05:20 hpr2 xntpd[12088]: time reset (step) 0.269780 s
Mar 27 22:10:01 hpr2 xntpd[12088]: synchronized to 10.1.1.172, stratum=3
Mar 27 22:10:02 hpr2 xntpd[12088]: time reset (step) 0.359086 s
Mar 27 22:10:02 hpr2 xntpd[12088]: synchronisation lost
Mar 27 22:10:55 hpr2 above message repeats 3 times
Mar 27 22:14:57 hpr2 xntpd[12088]: synchronized to 10.1.1.176, stratum=4
Mar 27 22:14:57 hpr2 xntpd[12088]: time reset (step) 0.281457 s
Mar 27 22:14:57 hpr2 xntpd[12088]: synchronisation lost
Mar 27 22:20:18 hpr2 xntpd[12088]: time reset (step) 0.362454 s
Mar 27 22:24:59 hpr2 xntpd[12088]: synchronized to 10.1.1.172, stratum=3
Mar 27 22:20:17 hpr2 xntpd[12088]: synchronized to 10.1.1.176, stratum=4

at (approx) 22:23:
 remote          refid           st t when poll reach   delay   offset    disp
==============================================================================
 10.1.1.172      10.1.1.1         3 u   50   64    17  -15.35   95.800 1902.37
 10.1.1.176      10.1.1.172       4 u   11   64    17    0.27  275.501 1964.40

at 22:25
 remote          refid           st t when poll reach   delay   offset    disp
==============================================================================
#10.1.1.172      10.1.1.1         3 u   44   64    37  -15.35   95.800  987.75
+10.1.1.176      10.1.1.172       4 u    5   64    37    0.27  316.071  957.73

at 22:27
 remote          refid           st t when poll reach   delay   offset    disp
==============================================================================
#10.1.1.172      10.1.1.1         3 u   58   64    77  -15.35   95.800  518.01
+10.1.1.176      10.1.1.172       4 u   19   64    77    0.26  327.720  427.66
Jollyjet
Valued Contributor

Re: Continually losing sync with Windows NTP server

HP-UX NTP Client
======

/sbin/init.d/xntpd start

Error Message
/usr/sbin/ntpdate: no server suitable for synchronization found


Solution

HP-UX is automatically a server when xntpd is running. But before you start, make sure your system can talk to the NTP servers you are going to use. Most firewalls deny access on port 123 so you must enable the port on your firewall. To test access, use ntpq -p.

You should see a status report. If you see an error such as "Can't find host" (a DNS problem) or get no response, you'll have to fix your networking. Once you get responses, you can edit /etc/ntp.conf and then enable xntpd as follows:

broadcastclient yes
server 206.236.134.116 version 3 prefer



/sbin/init.d/xntpd stop

cd /etc/rc.config.d

# vi netdaemons

export NTPDATE_SERVER=206.236.134.116
export XNTPD=1


ntpdate -d 206.236.134.116

/sbin/init.d/xntpd stop
/sbin/init.d/xntpd start


The returned message should look like this:
/usr/sbin/ntpdate: adjust time server 206.236.134.116 offset -0.1744986

A. Clay Stephenson
Acclaimed Contributor

Re: Continually losing sync with Windows NTP server

It appears that your ntpq -p output was taken too soon after startup of the NTP daemon to be of much use. Can you post the ntpq -p output after the daemon has been running for at least a few tens of minutes? If your reach values don't peg at 377 after a few tens of minutes of operation, then NTP will never work reliably because of network problems. Note that your dispersion values are also very high, again because not enough time has elapsed since the start of the daemon.
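To illustrate why reach needs time to peg at 377, here is a small Python sketch (the language and the function names are assumptions for illustration). The reach column is an 8-bit shift register printed in octal: each poll shifts it left by one and sets the low bit if a reply arrived. At the 64 second poll interval shown in the posted output, eight consecutive replies take roughly 8 x 64 = 512 seconds.

```python
def reach_after(polls):
    """Octal reach string after a sequence of poll results (True = reply received)."""
    reg = 0
    for ok in polls:
        # Shift in the newest result and keep only the last eight polls.
        reg = ((reg << 1) | (1 if ok else 0)) & 0xFF
    return format(reg, "o")


def seconds_to_full(poll_interval=64):
    """Approximate time for eight consecutive replies at a fixed poll interval."""
    return 8 * poll_interval


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, "replies ->", reach_after([True] * n))
    print("full after about", seconds_to_full(64), "seconds")
```

The successive values are 1, 3, 7, 17, 37, 77, 177 and 377, so the 17, 37 and 77 in the three posted samples correspond to four, five and six consecutive replies. A single missed poll after that shows up as 376.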

On the other hand, I have never used Windows boxes as time servers for my shops. I always turn the world around and use UNIX boxes as timeservers and Windows boxes as clients. I would never underestimate Microsoft's ability to "improve" standards that have been in place for decades.

One thing to look for is a possible denial of service attack on the NTP port.
If it ain't broke, I can fix that.
Carl Houseman
Super Advisor

Re: Continually losing sync with Windows NTP server

I'm fairly sure there's no DOS against the Windows servers. After running for some time, the ntpq -pn reports on the 11.00 server:

 remote          refid           st t when poll reach   delay   offset    disp
==============================================================================
 10.1.1.172      10.1.1.1         3 u   57   64     3    1.02   68.121 7898.42
 10.1.1.176      10.1.1.172       4 u    2   64     3    0.93  144.156 7912.00

I switched one of the 11.11 servers to go to 10.1.1.1 which is a firewall that operates its own ntp server. My syslog is quiet, and the ntpq -pn result after more than 30 minutes is:

 remote          refid           st t when poll reach   delay   offset    disp
==============================================================================
 10.1.1.1        0.0.0.0         16 -    -   64     0    0.00    0.000 16000.0

I've never seen the reach number get to 377.

As the syslog indicates the time does get synchronized - and immediately after synchronizing the connection is lost. I've never had "way out of sync" timekeeping problems. It's more of a nuisance that my syslog fills up with entries that aren't indicative of a timekeeping problem.
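One way to put numbers on that syslog noise is to tally the step entries. The Python sketch below is illustrative only: the function names are hypothetical, and the 0.128 s constant is the CLOCK.MAX step threshold from RFC 1305, which is assumed (not confirmed in this thread) to be what this xntpd uses when deciding to step rather than slew. The sample lines are copied from the hpr2 log posted earlier.

```python
import re

# Matches lines such as: "... xntpd[12088]: time reset (step) 0.269780 s"
STEP_RE = re.compile(r"xntpd\[\d+\]: time reset \(step\) (-?\d+\.\d+) s")

# Assumption: RFC 1305 CLOCK.MAX (128 ms); offsets beyond it are stepped.
STEP_THRESHOLD_S = 0.128

LOG = """\
Mar 27 22:05:20 hpr2 xntpd[12088]: synchronized to 10.1.1.176, stratum=4
Mar 27 22:05:20 hpr2 xntpd[12088]: time reset (step) 0.269780 s
Mar 27 22:10:02 hpr2 xntpd[12088]: time reset (step) 0.359086 s
Mar 27 22:10:02 hpr2 xntpd[12088]: synchronisation lost
Mar 27 22:14:57 hpr2 xntpd[12088]: time reset (step) 0.281457 s
"""


def step_sizes(log_text):
    """Step magnitudes in seconds, in the order they appear in the log."""
    return [float(m.group(1)) for m in STEP_RE.finditer(log_text)]


def summarize(log_text):
    """Count steps and sync losses and report the largest step."""
    steps = step_sizes(log_text)
    return {
        "steps": len(steps),
        "lost": log_text.count("synchronisation lost"),
        "max_step_s": max(steps, default=0.0),
        "all_over_threshold": all(abs(s) > STEP_THRESHOLD_S for s in steps),
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

In the posted logs every step is between roughly 0.27 and 0.36 seconds, which is above 128 ms. If the threshold assumption holds, that would be consistent with the daemon stepping the clock on each update instead of slewing it.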

As for "why don't you sync with the 10.1.1.1 server and be done with it", the 10.1.1.1 device is being phased out. I obviously have multiple workarounds, and I'm not here to find out about those. I'm here to find out if anyone else has seen and solved this problem when using Windows (2003 SP1) as the NTP server.

So, if you want points, from this point forward, tell me you've seen this problem and what you did about it.
Carl Houseman
Super Advisor

Re: Continually losing sync with Windows NTP server

Oh yeah, I realize the 10.1.1.1 result pretty much indicates it's never synced... not sure what's up with that. In any event, it's unimportant to this discussion.
A. Clay Stephenson
Acclaimed Contributor

Re: Continually losing sync with Windows NTP server

I really couldn't care less if I never get a point from you but here's at least a place to start with diagnosing your problem (and I think I know exactly where it lies):

1) Download and install this Perl module, the installation is in the README file and is trivially simple:

http://search.cpan.org/~willmojg/Net-NTP-1.2/NTP.pm

2) Detach the small Perl script, ntp.pl, and make it executable.


Now, you can execute it like this:

ntp.pl timeserver

It will display the NTP statistics for 1000 NTP responses from the designated server. If no server is supplied, the localhost is probed. I'm betting that you are going to see times when the display stops or there are large differences in the receive and transmit timestamps. If you are interested in the various metrics then consult RFC 1305.

http://www.faqs.org/rfcs/rfc1305.html
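For readers without the attachment, here is a minimal Python sketch of the same idea. It is not the ntp.pl script and does not use Net::NTP; it uses only the Python standard library, and the function names are hypothetical. It builds a 48-byte client request, extracts the receive and transmit timestamps from a reply, and computes the clock offset and round-trip delay from the four timestamps (client send T1, server receive T2, server transmit T3, client receive T4) as defined in RFC 1305.

```python
import socket
import struct
import time

NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request():
    """48-byte NTP request: LI=0, VN=3, Mode=3 (client), other fields zero."""
    return bytes([0x1B]) + bytes(47)


def parse_reply(packet):
    """Return (stratum, receive_ts, transmit_ts) as Unix-epoch floats."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    stratum = packet[1]
    # Receive timestamp is bytes 32-39, transmit timestamp is bytes 40-47;
    # each is 32-bit seconds plus a 32-bit binary fraction since 1900.
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", packet[32:48])
    return (stratum,
            rx_s - NTP_UNIX_DELTA + rx_f / 2**32,
            tx_s - NTP_UNIX_DELTA + tx_f / 2**32)


def offset_and_delay(t1, t2, t3, t4):
    """Clock offset and round-trip delay in seconds."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def query(host, timeout=2.0):
    """Send one request to host:123 and return (stratum, (offset, delay))."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(), (host, 123))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    stratum, t2, t3 = parse_reply(packet)
    return stratum, offset_and_delay(t1, t2, t3, t4)
```

Calling `query("10.1.1.172")` in a loop and printing each result would approximate the repeated probe described above. A `socket.timeout` exception, or a large gap between the server's receive and transmit timestamps, would be the symptom to look for.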

I am also willing to bet that if you modify the /etc/ntp.conf file on the HP-UX boxes so that it looks something like this:
server 0.us.pool.ntp.org
server 1.us.pool.ntp.org
server 2.us.pool.ntp.org
server 127.127.1.1 # local clock fallback
fudge 127.127.1.1 stratum 10
and restart the xntpd daemon, you will see no strange syslog messages, because you will be talking to fully NTP-compliant timeservers and reach will be pegged at 377.

If these timeservers work for you then you can now shift your efforts to the Microsoft side of the world and begin looking for patches and workarounds.

If it ain't broke, I can fix that.
Steven E. Protter
Exalted Contributor

Re: Continually losing sync with Windows NTP server

Carl,

ntp is tried and true on HP-UX and other unixes. You will find little you can do to change your hpux box.

I've been through what you went through.

In my prior job, I was told to time sync through windows ntp servers. My boxes kept losing sync and there was no explanation.

After a lot of persuasion the windows time server admin either replaced the time service program or patched the system or both.

Then my HP-UX boxes synced very well.

Moral of this story: It's almost always the Microsoft server at fault here.

The only other thing I can think of that would cause this is a switch, router or firewall blocking port 123, or some similar issue.

Best bet is to talk to the Microsoft server admin, show them some ntpq data and ask them to look into patching or replacing the Windows NTP server. If your firewall admin will open up port 123, you can sync directly to the time sources on the net, but that's not likely to happen.

Also note there is some limit to how far off your time can be and still have ntp work. If you are off by more than an hour you will need to use the date command to alter your time (hopefully forward) to get it close to the time from the time source server.

Good Luck,

SEP
Bill Hassell
Honored Contributor

Re: Continually losing sync with Windows NTP server

As mentioned, reach must be 377. If not, you have a very unstable NTP source. Dump the Winjunk and talk to your network admin. You probably have a firewall and most commercial firewalls have the ability to provide NTP services inside using nice reliable NTP sources outside.

If you want to get rid of the messages, that's real easy. Just change a line in /etc/rc.config.d/netdaemons:

from:
export XNTPD_ARGS=""
to:
export XNTPD_ARGS="-l /var/adm/ntp.log"

then stop and start xntpd with:

/sbin/init.d/xntpd stop
/sbin/init.d/xntpd start

Then you can ignore the messages in /var/adm/ntp.log ...


Bill Hassell, sysadmin
A. Clay Stephenson
Acclaimed Contributor

Re: Continually losing sync with Windows NTP server

Sometimes it pays to consult with the guys who developed NTP and see what their support pages turn up. I strolled over to www.ntp.org, found a reference to synchronizing a client that speaks real NTP to a Win2003 server, and it pointed to this:

http://support.microsoft.com/?kbid=830092

If it ain't broke, I can fix that.
Carl Houseman
Super Advisor

Re: Continually losing sync with Windows NTP server

For A. Clay Stephenson:
I downloaded the perl script but there was no ntp.pl contained therein that would run 1000 queries against an NTP server. Feel free to attach it if you have it and I'll give it a try. There was a test_ntp.pl which I ran repeatedly from the prompt with the Windows server as parameter, and it never failed to produce a result.

As for the MS KB article you referenced, it refers to the NTP client on Windows, not the NTP server, and it applies to no-service-pack Windows Server 2003, which few still run, and is not the case here. The search page at ntp.org must be broken as it returns nothing for searching for very common words. And googling for 'site:ntp.org "windows 2003"' turns up nothing useful. Got a link to the discussion you found?

For Bill Hassell:
I had already noticed the option to log to a different file, but thanks. That's a solution of last resort.

Everybody:
I enabled UDP 123 through the firewall and set up two stratum 2 timeservers in ntp.conf, and it works much better. So I'm checking to see if any of the MS MVPs in some other forums have a response. If I find out something I'll post here.
Patrick Wallek
Honored Contributor
Solution

Re: Continually losing sync with Windows NTP server

The script Clay was talking about was attached to HIS POST, not part of the Perl module from CPAN. Click the little paper clip in the post and that will give you the perl script he was talking about.
A. Clay Stephenson
Acclaimed Contributor

Re: Continually losing sync with Windows NTP server

Actually this link specifically mentions applying a fix to the Win2003 server:

http://ntp.isc.org/bin/view/Support/TroubleshootingNTP

See Section 9.10 where it links to the Microsoft patch.

The ntp.pl Perl script is the attachment (the paperclip) of the above post and does 1000 NTP requests; I threw that together in about 3 minutes. The script does require the Net::NTP Perl module, which is what the link references.

If it ain't broke, I can fix that.
Carl Houseman
Super Advisor

Re: Continually losing sync with Windows NTP server

Re: ntp.pl attached to post. Got it, thanks. Ran it. No, there was no lag or timeout in responses.

Re: section 9.10 at ntp.org. The description is right but the MS KB article referenced doesn't match the problem.