<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: NODE_TIMEOUT in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434148#M703546</link>
    <description>In your case, you are running the minimum allowed value for NODE_TIMEOUT of 2 x HEARTBEAT_INTERVAL, which puts you on the hairy edge even though your total timeout (6 seconds) seems reasonable. You are essentially as vulnerable as someone running the absolute minimum of NODE_TIMEOUT = 2 s and HEARTBEAT_INTERVAL = 1 s. The speed of the CPUs should have little to do with this; indeed, it is quite common in MC/SG land to have very asymmetrical servers making up a cluster, especially if old klunkers are used for failover.&lt;BR /&gt;&lt;BR /&gt;My rule (and it's just mine) is never to go below 3 heartbeat misses; I prefer to send heartbeats more frequently and tolerate more misses.&lt;BR /&gt;&lt;BR /&gt;Finally, just because you (and q4) think this is the reason for the TOC doesn't mean that it is. For example, an operator might have pushed the little button.&lt;BR /&gt;</description>
    <pubDate>Wed, 01 Dec 2004 12:00:06 GMT</pubDate>
    <dc:creator>A. Clay Stephenson</dc:creator>
    <dc:date>2004-12-01T12:00:06Z</dc:date>
    <item>
      <title>NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434143#M703541</link>
      <description>This weekend one side of my 2-node cluster had a TOC. q4 suggested that the cause might be that the NODE_TIMEOUT period is too low, and that I set it to 8 seconds. I currently have four 875 CPUs in the same cell on one side and four 650 CPUs in the same cell on the other side.&lt;BR /&gt;&lt;BR /&gt;Is the above setting correct for my system configuration? Also, should I raise the HEARTBEAT_INTERVAL?</description>
      <pubDate>Wed, 01 Dec 2004 11:13:41 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434143#M703541</guid>
      <dc:creator>Jonathan H.</dc:creator>
      <dc:date>2004-12-01T11:13:41Z</dc:date>
    </item>
    <item>
      <title>Re: NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434144#M703542</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;We have our NODE_TIMEOUT set for 8 seconds and our HEARTBEAT_INTERVAL set for 2 seconds.  Those values seem to work well and we haven't had any random TOCs when the network was busy.&lt;BR /&gt;&lt;BR /&gt;JP&lt;BR /&gt;</description>
      <pubDate>Wed, 01 Dec 2004 11:21:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434144#M703542</guid>
      <dc:creator>John Poff</dc:creator>
      <dc:date>2004-12-01T11:21:14Z</dc:date>
    </item>
    <item>
      <title>Re: NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434145#M703543</link>
      <description>Well, unless I use The Force I have no way of knowing what your current settings are, so that makes it a little difficult to make intelligent comments.&lt;BR /&gt;&lt;BR /&gt;I can say that I use a HEARTBEAT_INTERVAL of 1000000 (1 s) and a NODE_TIMEOUT of 8000000 (8 s) and have never had a TOC; of course, I've never had an MC/SG failover in over 5 years that was not manually (and intentionally) triggered.&lt;BR /&gt;&lt;BR /&gt;If you are using the default NODE_TIMEOUT of 2 s, you are really asking for incidents like yours. I do assume you have multiple HEARTBEAT_IPs defined.&lt;BR /&gt; &lt;BR /&gt;</description>
      <pubDate>Wed, 01 Dec 2004 11:23:50 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434145#M703543</guid>
      <dc:creator>A. Clay Stephenson</dc:creator>
      <dc:date>2004-12-01T11:23:50Z</dc:date>
    </item>
    <item>
      <title>Re: NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434146#M703544</link>
      <description>I have the HEARTBEAT_INTERVAL set at 3000000&lt;BR /&gt;and the NODE_TIMEOUT set at 6000000.&lt;BR /&gt;&lt;BR /&gt;We are currently running several clusters throughout the country and never had this problem until we upgraded the CPUs on one side. Do your systems have the same size CPUs?</description>
      <pubDate>Wed, 01 Dec 2004 11:34:19 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434146#M703544</guid>
      <dc:creator>Jonathan H.</dc:creator>
      <dc:date>2004-12-01T11:34:19Z</dc:date>
    </item>
    <item>
      <title>Re: NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434147#M703545</link>
      <description>I don't think it is so much a function of how fast your CPUs are, but the combination of your settings.  With HB at 3 seconds and TO at 6 seconds, that means you only have to miss two heartbeats and it is TOC time.  Our settings of HB at 2 and TO at 8 means you have to miss 4 heartbeats.  With Clay's settings you have to miss 8 heartbeats.&lt;BR /&gt;&lt;BR /&gt;JP&lt;BR /&gt;</description>
      <pubDate>Wed, 01 Dec 2004 11:37:59 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434147#M703545</guid>
      <dc:creator>John Poff</dc:creator>
      <dc:date>2004-12-01T11:37:59Z</dc:date>
    </item>
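    <!--
    Editorial sketch of the arithmetic in the reply above: the number of
    heartbeats that must be missed before a timeout is taken here to be
    NODE_TIMEOUT divided by HEARTBEAT_INTERVAL, both in microseconds as quoted
    in the thread. The function name and the SETTINGS table are hypothetical
    illustrations; this mirrors the posters' reasoning only and is not a
    description of how Serviceguard counts heartbeats internally.

    ```python
    def missed_heartbeats(node_timeout_us, heartbeat_interval_us):
        """Whole heartbeat intervals that fit inside the timeout window."""
        if heartbeat_interval_us <= 0:
            raise ValueError("heartbeat interval must be positive")
        return node_timeout_us // heartbeat_interval_us


    # (NODE_TIMEOUT, HEARTBEAT_INTERVAL) in microseconds, as quoted in the thread.
    SETTINGS = {
        "Jonathan H.": (6_000_000, 3_000_000),
        "John Poff": (8_000_000, 2_000_000),
        "A. Clay Stephenson": (8_000_000, 1_000_000),
    }


    def report(settings):
        """Map each poster to the heartbeat misses their settings tolerate."""
        return {
            name: missed_heartbeats(timeout, interval)
            for name, (timeout, interval) in settings.items()
        }


    if __name__ == "__main__":
        for name, misses in report(SETTINGS).items():
            print(f"{name}: {misses} heartbeat intervals per timeout window")
    ```

    Under this simple ratio, the three configurations tolerate 2, 4 and 8
    misses respectively, which matches the figures given in the reply.
    -->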
    <item>
      <title>Re: NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434148#M703546</link>
      <description>In your case, you are running the minimum allowed value for NODE_TIMEOUT of 2 x HEARTBEAT_INTERVAL, which puts you on the hairy edge even though your total timeout (6 seconds) seems reasonable. You are essentially as vulnerable as someone running the absolute minimum of NODE_TIMEOUT = 2 s and HEARTBEAT_INTERVAL = 1 s. The speed of the CPUs should have little to do with this; indeed, it is quite common in MC/SG land to have very asymmetrical servers making up a cluster, especially if old klunkers are used for failover.&lt;BR /&gt;&lt;BR /&gt;My rule (and it's just mine) is never to go below 3 heartbeat misses; I prefer to send heartbeats more frequently and tolerate more misses.&lt;BR /&gt;&lt;BR /&gt;Finally, just because you (and q4) think this is the reason for the TOC doesn't mean that it is. For example, an operator might have pushed the little button.&lt;BR /&gt;</description>
      <pubDate>Wed, 01 Dec 2004 12:00:06 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434148#M703546</guid>
      <dc:creator>A. Clay Stephenson</dc:creator>
      <dc:date>2004-12-01T12:00:06Z</dc:date>
    </item>
    <item>
      <title>Re: NODE_TIMEOUT</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434149#M703547</link>
      <description>Suggest:&lt;BR /&gt;NODE_TIMEOUT = 8 seconds&lt;BR /&gt;HEARTBEAT_INTERVAL = 1 second&lt;BR /&gt;  (sends up to 8 heartbeat packets before NODE_TIMEOUT expires)&lt;BR /&gt;&lt;BR /&gt;Consider creating redundant heartbeat paths:&lt;BR /&gt;Review the cluster configuration file and look for STATIONARY_IP. If that entry belongs to an Ethernet NIC, change it to HEARTBEAT_IP.&lt;BR /&gt;Then, with the cluster down, run&lt;BR /&gt;  # cmapplyconf -C &lt;CLUSTER.ASCII&gt;&lt;BR /&gt;&lt;BR /&gt;-StephenD.&lt;BR /&gt;</description>
      <pubDate>Thu, 02 Dec 2004 09:26:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/node-timeout/m-p/3434149#M703547</guid>
      <dc:creator>Stephen Doud</dc:creator>
      <dc:date>2004-12-02T09:26:14Z</dc:date>
    </item>
  </channel>
</rss>

