
tcp connections not timing out quick enough

 
Sajith V Mannadiar
Frequent Advisor


Hi,

We have an application that runs on HP servers and accesses Oracle Internet Directory (OID) servers (also HP) through a Cisco CSS (load balancer).

The current configuration is such that if the first OID goes down, the CSS immediately fails over the LDAP services to the new OID, and any new requests are redirected.

Even though the service status on the CSS indicates that the LDAP/OID services have failed over to the new server, our application server is unable to connect to the new server because the existing connections (TCP_ESTABLISHED) to the failed OID do not time out.

If we close all these existing sockets using ndd, the application works (fails over) immediately. Otherwise it takes almost 15 minutes to fail over.

In short, we want the existing TCP connections to close (automatically) as soon as the OID server fails.

Are there any parameters that can be set in /etc/rc.config.d/nddconf or in the kernel to achieve this?

Please help.

Thanks,
Sajith
Steven E. Protter
Exalted Contributor

Re: tcp connections not timing out quick enough

I think you should develop a cron script that detects a condition requiring failover.

That script should run at regular intervals and, when it detects valid conditions for failover, carry out the manual process you are running now with the ndd commands and such.
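
A minimal sketch of such a cron-driven check, under stated assumptions: PROBE_CMD and CLEANUP_CMD are hypothetical placeholders for your own health check and for a wrapper around the manual ndd steps you already run, and the state file path is arbitrary. The function runs the cleanup once per outage rather than on every cron tick.

```sh
#!/bin/sh
# Hypothetical placeholders; replace with your own commands.
PROBE_CMD=${PROBE_CMD:-/usr/local/bin/check_oid}            # exits 0 when OID is healthy
CLEANUP_CMD=${CLEANUP_CMD:-/usr/local/bin/close_oid_conns}  # wraps your manual ndd steps
STATE_FILE=${STATE_FILE:-/var/tmp/oid_failover.state}

# Run the cleanup once per outage: only when the probe fails and
# no cleanup has been recorded yet for this outage.
check_and_cleanup() {
    if $PROBE_CMD >/dev/null 2>&1; then
        rm -f "$STATE_FILE"      # server is healthy again: re-arm
        return 0
    fi
    if [ -f "$STATE_FILE" ]; then
        return 0                 # this outage was already handled
    fi
    # Record the outage only if the cleanup itself succeeded.
    $CLEANUP_CMD && : > "$STATE_FILE"
}

# When used as a standalone cron script, end the file with:
# check_and_cleanup
```

A crontab entry such as `* * * * * /usr/local/bin/oid_failover_check` (script name hypothetical) would run it every minute, which also bounds how quickly it can react.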

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Kent Ostby
Honored Contributor
Solution

Re: tcp connections not timing out quick enough

Document KBRC00006395 discusses how to change the timeout values of TCP.

PROBLEM
How to shorten the time it takes for an application to be notified
that a TCP connection has timed out or that a new connection cannot be made.

Is it possible to change the time it takes for TCP to give up when trying to
send data?

Is it possible to change how long it takes TCP to give up when trying to make a
new connection?

RESOLUTION
The ndd command allows you to adjust both of these TCP timeouts on 11.X.

The relevant ndd tunables are shown below (see man ndd for more details):
# /usr/bin/ndd -h tcp_ip_abort_interval

tcp_ip_abort_interval:

Second threshold timer for established connections.

When it must retransmit packets because a timer has expired,
TCP first compares the total time it has waited against two
thresholds, as described in RFC 1122, 4.2.3.5. If it has waited
longer than the second threshold, TCP terminates the connection.
[500,-] Default: 600000 (10 minutes)

# /usr/bin/ndd -h tcp_ip_abort_cinterval

tcp_ip_abort_cinterval:

Second threshold timer during connection establishment.

When it must retransmit the SYN packet because a timer has
expired, TCP first compares the total time it has waited
against two thresholds, as described in RFC 1122, 4.2.3.5.
If it has waited longer than the second threshold, TCP
terminates the connection. [1000,-]
Default: 75000 (75 seconds)

To set them use:

/usr/bin/ndd -set /dev/tcp tcp_ip_abort_interval 120000
/usr/bin/ndd -set /dev/tcp tcp_ip_abort_cinterval 25000


This will only impact connections made AFTER the change. It will not impact
existing connections. To make the changes take effect at boot, edit:

/etc/rc.config.d/nddconf

This would look similar to:

TRANSPORT_NAME[0]=tcp
NDD_NAME[0]=tcp_ip_abort_interval
NDD_VALUE[0]=120000

TRANSPORT_NAME[1]=tcp
NDD_NAME[1]=tcp_ip_abort_cinterval
NDD_VALUE[1]=25000
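
The stanzas above follow a fixed pattern, so they can be generated rather than typed by hand. This is only a sketch: the function name nddconf_stanzas is made up for illustration, and it prints to standard output instead of editing /etc/rc.config.d/nddconf. Pass the first unused index as the start value if the file already contains entries, so that existing entries are not reused.

```sh
# Print nddconf stanzas for tcp tunables given as name=value pairs.
# Usage: nddconf_stanzas START_INDEX name=value [name=value ...]
nddconf_stanzas() {
    i=$1
    shift
    for pair in "$@"; do
        printf 'TRANSPORT_NAME[%d]=tcp\n' "$i"
        printf 'NDD_NAME[%d]=%s\n' "$i" "${pair%%=*}"     # text before the first '='
        printf 'NDD_VALUE[%d]=%s\n\n' "$i" "${pair#*=}"   # text after the first '='
        i=$((i + 1))
    done
}
```

For example, `nddconf_stanzas 0 tcp_ip_abort_interval=120000 tcp_ip_abort_cinterval=25000` prints the two stanzas shown above. Review the output before appending it to the file.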

BEWARE that setting these values too low may cause TCP to fail
to deliver data, or make new connections due to a busy or slow
network when it might succeed, given a bit more time. If you
experience connection problems, try restoring the settings to
their default values which should be optimal in most cases.
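
Since restoring the previous settings may be needed, one option is to record each tunable's current value before changing it. The following is a rough sketch, not part of the KB document: the function name set_tcp_tunable is invented here, and it assumes ndd lives at /usr/bin/ndd and that the caller is root.

```sh
NDD=${NDD:-/usr/bin/ndd}

# Set a /dev/tcp tunable and print the value it had before the change,
# so that the old value can be put back later if problems appear.
set_tcp_tunable() {
    name=$1
    new_value=$2
    old_value=$("$NDD" -get /dev/tcp "$name") || return 1
    "$NDD" -set /dev/tcp "$name" "$new_value" || return 1
    echo "$name: $old_value -> $new_value"
}
```

Calling `set_tcp_tunable tcp_ip_abort_interval 120000 >> /var/tmp/ndd_changes.log` (log path arbitrary) leaves a record of what to restore.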

"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Sajith V Mannadiar
Frequent Advisor

Re: tcp connections not timing out quick enough

Kent,

Thanks very much for your help.
That has resolved our issue (almost).

The fail-over time has now improved from 15 minutes to 2 minutes.

We have set tcp_ip_abort_interval to 60000 and tcp_ip_abort_cinterval to 10000. We are now testing to see if this has any other implications.
rick jones
Honored Contributor

Re: tcp connections not timing out quick enough

Sigh - that tcp_discon junk in ndd is really not supposed to be used for that sort of stuff.

As for TCP connection timeouts - someone else has already made a rather thorough post on those ndd settings - just one thing to amplify though - keep in mind that those are _system wide_ and will affect all TCP connections.

When an application desires a quick connection disconnect, it _really_ should be implementing an application-level keepalive/timeout mechanism of its own, and not rely on the stack settings. So, please file a defect/enhancement request with Oracle so they can improve their product in this area.
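
To illustrate the idea only: an application-level timeout means the caller enforces its own deadline instead of waiting for the stack to give up. A real application would do this in its own code, for example with per-request timers. The shell sketch below shows the same principle for a single command; the name run_with_deadline is made up for this example.

```sh
# Run a command, but give up after DEADLINE seconds.
# Returns the command's exit status, or a non-zero status if it was killed.
run_with_deadline() {
    deadline=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: kill the command if it is still running at the deadline.
    ( sleep "$deadline"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    dog_pid=$!
    wait "$cmd_pid"
    status=$?
    # The command finished (or was killed): stop the watchdog.
    kill "$dog_pid" 2>/dev/null
    wait "$dog_pid" 2>/dev/null
    return "$status"
}
```

With a hypothetical health check, `run_with_deadline 5 /usr/local/bin/check_oid` would report failure after at most about 5 seconds, regardless of the system-wide tcp_ip_abort_interval.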

One might also ask why the CSS isn't terminating TCP connections upon failover...
there is no rest for the wicked yet the virtuous have no pillows