
SOLVED
Marek Podmaka
Frequent Advisor

connection limit

What is the TCP/IP connection limit on HP-UX 11.11? We have a strange problem on one server (with the latest STREAMS and ARPA patches installed). A few times a day, no outgoing connection is possible from this server (BUT incoming connections work fine).
Every attempt looks like this:

bbnh2:(/root/home/root)(root)#telnet localhost 113
Trying...
telnet: Unable to connect to remote host: Can't assign requested address

bbnh2:(/root/home/root)(root)#telnet bbnh3 22
Trying...
telnet: Unable to connect to remote host: Can't assign requested address

### but incoming connection to bbnh2 from other machine works:
bbnh3:(/root/home/root)(root)#telnet bbnh2 113
Trying...
Connected to bbnh2...

Since it also fails for localhost, it can't be a problem with network cables or LAN cards. Nothing is printed to syslog/dmesg.

The number of open network connections (netstat -n|wc -l) is not high, about 800. I have seen it at 2000 while everything was working.

What other parameters affect the number of connections?

I have checked with sar, and the only difference is the number of open files:
file-sz 5836/194058 (when it works)
file-sz 18677/194058 (when it DOESN'T work)
It is a lot higher, but still under the limit...

Can you advise what else to check?
11 REPLIES
Jeeshan
Honored Contributor

Re: connection limit

Did you check the routing table?
a warrior never quits
TTr
Honored Contributor

Re: connection limit

Are you using any add-on port TCP filtering?

Are you using any special name resolution method? Does it work for IP addresses instead of names?

Either of these may be malfunctioning intermittently (probably because of load) and you get the failures.
skt_skt
Honored Contributor

Re: connection limit

check the logs under /var/adm/nettl*

Here is the startup script which enables this logging:

/sbin/rc2.d/S300nettl.

I don't suspect a cable/card problem, though.

What about trying ssh while it is failing? That will try connecting through port 22.

# ssh -vvv root@localhost

It is good to start the nettl trace logging, then attempt a failing connection, then stop the nettl trace logging. This will provide enough logs to check.

Below is the method I used when I had an ssh connectivity issue (only port 22).

On recurrence

1) attach tusc to the running sshd (pass the sshd PID from the ps output as the last argument) and put it in the background with &

# ps -ef|grep sshd

# tusc -Eeaf -p -v -rall -wall -vall -T '' -o /tmp/sshd_tusc.txt <sshd_pid> &

2) start nettl trace

# nettl -tn all -tm 10M -e ns_ls_ip -f /tmp/sshd

3) attempt ssh to root@localhost in verbose mode with tusc attached

# tusc -Eeaf -p -v -rall -wall -vall -T '' -o /tmp/ssh_tusc.txt ssh -vvv root@localhost

4) when it fails, stop the nettl trace

# nettl -tf -e all

5) bring the tusc of sshd back to the foreground

# fg

6) detach tusc of sshd

Marek Podmaka
Frequent Advisor

Re: connection limit

ahsan: The routing table is OK and did not change.

TTr: No, IPFilter is not installed. We use /etc/hosts resolution mainly, but at least localhost should work. I will try the IP address also.

Santhosh: I tried logging with tcpdump, but did not capture any packet. Probably the connect() call failed with the error shown above, so no packet was ever sent.
I am not familiar with tusc (only strace on Linux), but I will try it.

And do you know of any kernel parameter or something that could be causing this behaviour?
Jeeshan
Honored Contributor

Re: connection limit

Can you ping any host that can connect to your host?
a warrior never quits
Marek Podmaka
Frequent Advisor

Re: connection limit

Very good question (it could isolate the issue to IP rather than TCP). I did not try it. I can also try nslookup (UDP/IP) next time.
But pinging this machine from another one certainly works, because otherwise the monitoring would report the machine as down/not accessible. Also, all other incoming connections work.
rick jones
Honored Contributor
Solution

Re: connection limit

There is no fixed limit to the number of TCP connections in HP-UX. There are limits to the number of open file descriptors at both the system and per-process level, but I would _expect_ the error message to be different.

The error message sounds like telnet was trying to bind to an IP/port pair that was already in use. netstat -n will only show established connections; what you should be using is netstat -an | wc -l. Based on the much larger number of open files when it doesn't work, I suspect you will see lots of connections in TIME_WAIT.
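
To check the TIME_WAIT suspicion, the netstat -an output can be grouped by state instead of just counted. This is only a minimal sketch: count_tcp_states is a made-up helper name, not a system command, and it assumes the state is the last column of each tcp line.

```shell
# Count TCP sockets per state from `netstat -an` output read on stdin.
# Assumption: the state is the last field of every line starting with "tcp".
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$NF]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Usage on the affected host:
#   netstat -an | count_tcp_states
```

A TIME_WAIT count in the thousands while the failures are happening would support the theory.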

The suggestion to tusc things is good, but I'd tusc your telnet commands rather than the sshd.

I suspect what is happening is that something is exhausting the supply of "anonymous" (aka ephemeral) ports on the system. That port range runs from tcp_smallest_anon_port to tcp_largest_anon_port, which by default are 49152 and 65535 respectively. That is 16384 ports, which happens to be a number very close to the number of file descriptors in use when things fail.
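
The arithmetic, and a rough way to see how much of that range is taken, could be sketched as follows. The helper names anon_range_size and anon_ports_in_use are hypothetical, and the parsing assumes netstat -an prints the local address in the fourth column with the port after the last dot (as in 10.0.0.1.49152).

```shell
# Size of the anonymous (ephemeral) port range, both bounds inclusive.
anon_range_size() {
    echo $(( $2 - $1 + 1 ))
}

# Count distinct local TCP ports within [lo, hi] in `netstat -an` output on stdin.
# Assumption: column 4 is the local address and the port follows its last dot.
anon_ports_in_use() {
    awk -v lo="$1" -v hi="$2" '
        $1 ~ /^tcp/ {
            n = split($4, part, /\./)
            port = part[n] + 0
            if (port >= lo && port <= hi && !(port in seen)) {
                seen[port] = 1
                used++
            }
        }
        END { print used + 0 }'
}

# Usage on the affected host:
#   ndd -get /dev/tcp tcp_smallest_anon_port
#   ndd -get /dev/tcp tcp_largest_anon_port
#   anon_range_size 49152 65535
#   netstat -an | anon_ports_in_use 49152 65535
```

If the second number approaches the first during a failure, the anonymous range is the bottleneck.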

The bind() call is probably what is failing on the telnet commands (see your tusc output). Telnet may be relying on the implicit bind() that connect() performs when no explicit bind() call has been made. That implicit bind then relies on the anonymous port space.

A TCP connection is fully named by the four-tuple of local/remote IP and local/remote port. The incoming connections work because the local IP and port have already been picked, and the remotes are supplying the other half.

The outgoing ones aren't working because there are no more anonymous ports to select for the "client" half. The stack (well, when bind() is called rather than connect()) has no idea what the remote IP/port will be, so it cannot make sure that the new connection will have a unique name, and it fails the call.

Ways to work around this:

*) Find what is churning through so many connections and get it to stop.

*) tune tcp_smallest_anon_port to something like 32768 or lower

*) get applications to start making "explicit" bind() calls that select a local IP and port in the full range of say 5000 to 65535 and from among the more than one (?) IP addresses on the system
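
For the second option, the tunable can be changed on the running system with ndd. A hedged sketch: set_anon_port_floor is a hypothetical wrapper, and its 1024 lower bound is this sketch's own choice to stay clear of reserved ports, not something the stack requires.

```shell
# Lower the start of the anonymous port range on the running system.
# Rejects anything that is not a number between 1024 and 65535.
set_anon_port_floor() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: set_anon_port_floor PORT" >&2
            return 2
            ;;
    esac
    if [ "$1" -lt 1024 ] || [ "$1" -gt 65535 ]; then
        echo "port out of range: $1" >&2
        return 2
    fi
    # Apply the new value, then read it back to confirm.
    ndd -set /dev/tcp tcp_smallest_anon_port "$1" &&
        ndd -get /dev/tcp tcp_smallest_anon_port
}

# Usage (as root):
#   set_anon_port_floor 32768
#
# ndd changes are lost at reboot. To keep the setting, add an entry to
# /etc/rc.config.d/nddconf (index 0 assumes there are no earlier entries):
#   TRANSPORT_NAME[0]=tcp
#   NDD_NAME[0]=tcp_smallest_anon_port
#   NDD_VALUE[0]=32768
```

With a floor of 32768 the range doubles to 32768 ports, which buys headroom but does not remove whatever is churning through connections.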
there is no rest for the wicked yet the virtuous have no pillows
skt_skt
Honored Contributor

Re: connection limit

"The suggestion to tusc things is good, but I'd tusc your telnet commands rather than the sshd."

That was an example; in his scenario I would say inetd.
rick jones
Honored Contributor

Re: connection limit

If the problem were with inbound connections, then tuscing inetd would be indicated. However since the failures are demonstrated on the telnet _client_ side, tuscing the telnet client is indicated.
there is no rest for the wicked yet the virtuous have no pillows