rp2450 clock gains time. Thinks it's a 9000/800/A500-44 not a 9000/800/A500-5X

 
Mike Cresswell
Occasional Advisor

Hi All

As I've benefited from browsing this forum on many occasions, I thought I'd post one that may help someone else. Sorry if it's too wordy, but I wanted to let you know what I found.

The initial problem statement was that the Real Time Clock ("date" etc) on an rp2450 running HPUX 11 was gaining time constantly. Approximately 15 to 20 minutes per hour! It was supposed to use ntp to synchronise to either a GPS time source or the system's central server but couldn't synchronise to either one.

The problem seemed to have started after installation of various patches which had required a kernel rebuild and a reboot. However this was one of 14 identical processors patched at the same time and was the only one showing the problem.

I found that:

The processor gained time regardless of whether ntp was on or off.

The clock was gaining exactly 1 minute every 4 minutes, i.e. it was running exactly 25% fast.

Running "model" on this processor gave 9000/800/A500-44 whereas all the others gave 9000/800/A500-5X. Checking on these models indicated that the "44" has a 440 MHz CPU clock frequency where the "5X" has a 550 MHz CPU Clock.

I began to think: "If this CPU thinks it has a 440 MHz CPU clock but really has 550 MHz, then it would get 550/440 = 1.25 times as many clocks per second as it's expecting. That would fit with the real time clock running 25% fast."
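A minimal Python sketch of that arithmetic follows. It assumes the kernel derives elapsed time by dividing counted CPU ticks by the frequency it believes the CPU runs at; the function names are hypothetical and only there for illustration.

```python
def clock_rate_ratio(actual_mhz, assumed_mhz):
    """Speed of the software clock relative to real time (1.0 = correct)."""
    return actual_mhz / assumed_mhz


def minutes_gained(real_minutes, actual_mhz, assumed_mhz):
    """Minutes the software clock gains over real_minutes of real time."""
    return real_minutes * (clock_rate_ratio(actual_mhz, assumed_mhz) - 1.0)


if __name__ == "__main__":
    # Figures from this thread: real 550 MHz CPU, kernel believes 440 MHz.
    print(f"rate ratio:      {clock_rate_ratio(550, 440):.2f}")
    print(f"gain per hour:   {minutes_gained(60, 550, 440):.1f} min")
    print(f"gain per 4 min:  {minutes_gained(4, 550, 440):.1f} min")
```

On those assumptions the clock gains 1 minute in every 4 and 15 minutes in every hour, which matches what was observed.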

Running the following command to check the processor frequency gave "440" on this CPU versus "550" on all the others.

echo itick_per_usec/D | adb -k /stand/vmunix /dev/mem

But I then realised that the output from the above command and from "model" both come from interrogating the kernel, not from directly checking the hardware.

As the processor was on a remote site, I couldn't do a physical check to see what was actually in the processor. But based on all the above, I began to suspect that the kernel had somehow been built against the wrong model / CPU clock speed.

So I did a kernel rebuild and reboot. Problem solved. The "model" command now returns 9000/800/A500-5X, the clock is keeping correct time, ntp synchronised successfully with the GPS and Central Server and the application software is working correctly.

I guess the morals of the story are:

Don't always believe what the kernel is telling you.

Remember that many / most commands are actually reading what the kernel says, not communicating directly with the hardware.

Make sure you have someone on site when you rebuild the kernel and reboot the CPU. It didn’t reboot successfully and I had to get a local engineer to go in and power cycle the CPU. After that, it rebooted OK.

If anyone has any idea how / why it had picked up the wrong model / clock speed in the original kernel rebuild, I'd like to hear from you.

Otherwise, I just hope this helps someone else.

Cheers

Mike
2 REPLIES
Steven E. Protter
Exalted Contributor

Re: rp2450 clock gains time. Thinks it's a 9000/800/A500-44 not a 9000/800/A500-5X

Shalom Mike,

Computer clocks are notorious for their inability to keep time.

The best way to deal with the time issue is to configure ntp.

As far as the CPU issue goes, it's possible that a CPU that is not rated for your system board is installed. The system should not run at all, but it does.

If you want the system to be reliable, either the system board or the CPU may need to go.

As far as the kernel goes, perhaps you or someone else copied the /stand filesystem or the system files from a system with a different model or CPU. Perhaps this is from a system replication done by Ignite with different hardware, which can be problematic unless Golden Software imaging with Ignite is used.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Mike Cresswell
Occasional Advisor

Re: rp2450 clock gains time. Thinks it's a 9000/800/A500-44 not a 9000/800/A500-5X

Hi Steven

Thanks!

I've put my replies to your suggestions below.

SEP: Computer clocks are notorious for their inability to keep time.

MJC: True but I've generally found that the clocks in HP processors are pretty good. I've certainly never seen one gain as much as 25% before.

SEP: The best way to deal with the time issue is to configure ntp.

MJC: It did have ntp configured but, as far as I can tell, the clock was gaining so rapidly that ntp just couldn't handle it.
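To put a rough number on "couldn't handle it", here is a small Python sketch that converts the rate error into parts per million and compares it with a frequency-correction limit. The 500 ppm default is an assumption: it is the figure usually quoted for current ntpd, and the older xntpd shipped with HP-UX 11 may use a different limit. The function names are hypothetical.

```python
def drift_ppm(actual_mhz, assumed_mhz):
    """Frequency error of the software clock in parts per million."""
    return (actual_mhz / assumed_mhz - 1.0) * 1_000_000


def within_ntp_range(error_ppm, max_correction_ppm=500.0):
    """True if a daemon limited to max_correction_ppm could absorb the error.

    max_correction_ppm is an assumed limit, not a value read from this system.
    """
    return abs(error_ppm) <= max_correction_ppm


if __name__ == "__main__":
    error = drift_ppm(550, 440)  # real 550 MHz, kernel believes 440 MHz
    print(f"clock error: {error:.0f} ppm")
    print(f"correctable within the assumed limit: {within_ntp_range(error)}")
```

A 25% error is 250,000 ppm, hundreds of times larger than the assumed limit, so the exact limit hardly matters: no reasonable setting would let ntp discipline a clock that far off.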

SEP: As far as the CPU issue goes, it's possible that a CPU that is not rated for your system board is installed. The system should not run at all, but it does.

MJC: Good point! The processor is supposed to be one of 14 identical "5X" units with 550 MHz CPUs. But I guess it's possible that, in a previous hardware repair, someone installed the wrong system board. I'll get it checked.

SEP: If you want the system to be reliable, either the system board or the CPU may need to go.

MJC: Yep. As above.

SEP: As far as the kernel goes, perhaps you or someone else copied the /stand filesystem or the system files from a system with a different model or CPU. Perhaps this is from a system replication done by Ignite with different hardware, which can be problematic unless Golden Software imaging with Ignite is used.

MJC: As far as I know, the problem started when the kernel was rebuilt automatically by swinstall after installation of patches. I wasn't involved in this, so I can't be sure. Also, as far as I know, the same patches were applied using the same procedure to all 14 processors, and this was the only one that had a problem. So "It wasn't me, your honour", but anything's possible.

Thanks Again and Best Regards

Mike