
syslogd timer delays

 
Martin Wells
Frequent Advisor


We had the following message from one of our L-class servers running HP-UX 11:

/usr/sbin/syslogd(1M) Syslog OS cmcld: : timers delayed 2.60 seconds

Is this a problem?
Massimo Bianchi
Honored Contributor

Re: syslogd timer delays

Hi,
Yes, as far as I know, it might be a problem.

You are using Serviceguard, and the message comes from cmcld, its cluster daemon.

The daemon has a timer that it uses to poll the other server, keep track of it, and check that it is alive.

If the server is overloaded, the daemon can miss some updates of this timer, which in the worst case can cause an unexpected TOC.

When it finds that it has missed some polls, it prints the above message.
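
To see how often this happens, the delay figures can be pulled out of a saved copy of the log. The following is only a minimal sketch: the pattern is based on the single message quoted in this thread, and the assumption that other cmcld delay messages use the same "timers delayed N seconds" wording is mine. The function name `timer_delays` is hypothetical.

```python
import re

# Pattern built from the one message quoted in this thread.
# Assumption: other cmcld delay messages share this wording.
DELAY_RE = re.compile(r"cmcld.*timers delayed (\d+(?:\.\d+)?) seconds")

def timer_delays(lines):
    """Return the reported delay, in seconds, for each matching log line."""
    delays = []
    for line in lines:
        match = DELAY_RE.search(line)
        if match:
            delays.append(float(match.group(1)))
    return delays

sample = [
    "/usr/sbin/syslogd(1M) Syslog OS cmcld: : timers delayed 2.60 seconds",
    "some unrelated line",
]
print(timer_delays(sample))
```

A list that keeps growing, or delays that keep getting longer, would be a reason to look at the load on the server sooner rather than later.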


It would be better to:

- change your node_timeout, when you have the chance to stop your cluster

- check why the server is so loaded

- check for the latest patch; this has been logged as a defect, depending on your version of MC/SG. Apply at least patch 23511 or whichever patch has superseded it.

HTH,
Massimo
Michael Tully
Honored Contributor

Re: syslogd timer delays

Have a look at this discussion. I would also look into patching.

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0xaab153921f1ad5118fef0090279cd0f9,00.html
Anyone for a Mutiny ?
Martin Wells
Frequent Advisor

Re: syslogd timer delays

This message was issued only once, during a heavy step in our overnight batch. Our node timeout is set to 8000000, so am I correct in thinking we would be OK unless the delays exceeded this time?
Massimo Bianchi
Honored Contributor

Re: syslogd timer delays

Yes, with 8 seconds you should be O.K., unless your HEARTBEAT_INTERVAL is too high, say 3-4 seconds.
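
As a rough illustration of the margin involved, the configured value of 8000000 (microseconds, i.e. 8 seconds) can be compared with the 2.60 second delay from the message. This is only a sketch of the arithmetic; the simple subtraction is my own simplification and is not how cmcld itself evaluates timeouts, and the name `timeout_headroom` is hypothetical.

```python
NODE_TIMEOUT_US = 8_000_000   # value quoted in this thread, in microseconds
OBSERVED_DELAY_S = 2.60       # delay reported in the syslog message

def timeout_headroom(node_timeout_us, delay_s):
    """Seconds left between an observed delay and the node timeout."""
    return node_timeout_us / 1_000_000 - delay_s

headroom = timeout_headroom(NODE_TIMEOUT_US, OBSERVED_DELAY_S)
print(f"headroom: {headroom:.2f} s")
```

With these figures the delay is well inside the timeout, which matches the answer above.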

If you know that the system was under heavy load and that this may recur, consider my suggestions.

HTH,
Massimo

Martin Wells
Frequent Advisor

Re: syslogd timer delays

Our heartbeat interval is set to 2 seconds. What is the relationship between the heartbeat interval and the node timeout?
Massimo Bianchi
Honored Contributor
Solution

Re: syslogd timer delays

Hi,

HEARTBEAT_INTERVAL: how often, in seconds, we poll for the state of the other server.

NODE_TIMEOUT: after how many seconds without a response should we assume we are the only server alive, and so take the appropriate action?


So, if you miss enough heartbeats, you will reach the node timeout, and the failover procedure will begin.

To be accurate, there are some "security timers", but I don't know exactly how they are calculated. So if your second node does not answer for NODE_TIMEOUT, there is an additional grace period, but I can't say how long it is.

HTH,
Massimo
Jean-Louis Phelix
Honored Contributor

Re: syslogd timer delays

Hi,

The node will be considered unreachable if no heartbeat has been received from it for 'node_timeout' seconds. That's why the heartbeat interval should be 3 or 4 times smaller than the node timeout, so that you can miss some heartbeats.
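
Applied to the figures in this thread (heartbeat interval 2 seconds, node timeout 8 seconds), the rule of thumb can be checked with a few lines. This is a minimal sketch; the function names and the default threshold of 3 are my own choices for illustration, taken from the "3 or 4 times" guideline above rather than from any Serviceguard tool.

```python
def heartbeats_per_timeout(heartbeat_interval_s, node_timeout_s):
    """How many heartbeat intervals fit into one node timeout."""
    return node_timeout_s / heartbeat_interval_s

def follows_rule_of_thumb(heartbeat_interval_s, node_timeout_s, minimum_ratio=3):
    """True if the timeout is at least minimum_ratio heartbeat intervals long."""
    return heartbeats_per_timeout(heartbeat_interval_s, node_timeout_s) >= minimum_ratio

ratio = heartbeats_per_timeout(2, 8)
print(ratio)                         # 4.0
print(follows_rule_of_thumb(2, 8))   # True
```

With a ratio of 4, the settings in this thread sit at the upper end of the guideline.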

Regards.
It works for me (© Bill McNAMARA ...)