Operating System - HP-UX

CLOSE_WAIT status, how to properly close the connection

 

Our customer has a problem with a very old application that has been running for 2 weeks on 11.31 Itanium under Aries. On 11.11 this problem did not exist.

After 2-3 days we see connections in CLOSE_WAIT state, today on port 3002. Clients are still able to open new connections to that port, but:

The software has to be shut down each week for maintenance, and this is done with fuser -cuk (!)

After that the software cannot be started again because the TCP port, e.g. 3002, shows CLOSE_WAIT.

We tried:
/usr/bin/ndd -set /dev/tcp tcp_discon_by_addr $CONN
The application still does not start.

I do not have much experience with the TCP stack, but I thought perhaps some sockets are not released, or something is left in a cache?

In this case we have to reboot the server; then we can use port 3002 again and start the application.

But that cannot be a solution.

Do you know how I can close the ports without a reboot?

regards
Gabi
10 REPLIES
Ganesan R
Honored Contributor

Re: CLOSE_WAIT status, how to properly close the connection

Hi,

We have the same kind of issue with one of our applications. Whenever the application has a problem, we have to forcibly terminate all the connections opened by that application in order to restart it. We follow these steps:

netstat -an |grep

Convert all socket values to their hexadecimal values,

then run this command to terminate the sockets:

#ndd -set /dev/tcp tcp_discon_by_addr
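
The "convert to hex" step could be sketched roughly as below. This is only an illustration in Python, and it assumes the commonly described 24-hex-digit layout for the tcp_discon_by_addr value (local IP, local port, remote IP, remote port); please verify the format with `ndd -h tcp_discon_by_addr` on your own system. The function name `discon_arg` and the sample addresses are made up for the example.

```python
import socket
import struct


def discon_arg(local_ip, local_port, remote_ip, remote_port):
    """Build the hex string for one connection.

    ASSUMPTION: the value is local IP (8 hex digits), local port (4),
    remote IP (8), remote port (4). Check this against your ndd help text.
    """
    def part(ip, port):
        # inet_aton gives the 4 address bytes; the port is packed as
        # 2 bytes in network (big-endian) order.
        return (socket.inet_aton(ip) + struct.pack("!H", port)).hex()

    return part(local_ip, local_port) + part(remote_ip, remote_port)


# Hypothetical connection taken from a netstat -an line.
print(discon_arg("10.1.1.5", 3002, "10.1.1.77", 51234))
```

The printed string is what would be passed as the last argument of the ndd command, one call per connection.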

Hope you also follow the same steps..
Best wishes,

Ganesh.

Re: CLOSE_WAIT status, how to properly close the connection

Yes, this is done. We have written a little script to do the tcp_discon_by_addr.

Afterwards if you do a netstat -an you don't see the CLOSE_WAIT anymore.

But then the application cannot be restarted.

I found something in the forum:
As for "excessive" I suspect that an HP-UX system could handle thousands and thousands of them without problem. The only problem would be the loss of file descriptors in the applications, and even if you use the massive tcp_discon kludge (which one should almost _never_ use...) the socket (file descriptor) will still be allocated until the application calls close. So, you still need to get the application to call close()...

If the problem is application restart, then the fix is to make sure the application is setting SO_REUSEADDR before trying to bind().
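
For readers who can change their server code, the ordering mentioned in the quote looks roughly like this. It is a minimal sketch in Python rather than the language of the application in question, and `make_listener` is a name invented for the example.

```python
import socket


def make_listener(port, host="0.0.0.0"):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind(); setting it afterwards
    # does not help the bind that has already been attempted.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(5)
    return s
```

Something like `make_listener(3002)` would then be called once at startup, in place of a plain socket/bind/listen sequence.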

But we cannot change the application to set SO_REUSEADDR. We have to find another workaround.

Gabi

Re: CLOSE_WAIT status, how to properly close the connection

Does anyone know anything? Please help.

Re: CLOSE_WAIT status, how to properly close the connection

If we start the application with tusc, we get the following error message:

{2508480} #1 bind (5, 0x7b014dd8, 38) ERR#226 EADDRINUSE
sin_family: AF_UNIX
sun_path: /var/spool/sockets/pwgr/client17053
rick jones
Honored Contributor
Solution

Re: CLOSE_WAIT status, how to properly close the connection

AF_UNIX is not TCP; that would be a Unix domain socket. So tcp_discon_by_addr would not be involved directly. It suggests that when you use the tcp_discon_by_addr "mechanism" (which I consider a massive kludge, but it is there anyway), it is probably causing the application to take a path where it is not cleaning up one or more AF_UNIX sockets.
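
To illustrate why a leftover AF_UNIX socket can produce EADDRINUSE on bind(): the path of a bound Unix domain socket stays in the filesystem after close(), and a later bind() to the same path fails until the file is unlinked. The sketch below is in Python, uses a temporary path made up for the demo, and only assumes (it does not establish) that a stale socket file is what is happening with the application here. Do not unlink a path that another running process still uses.

```python
import errno
import os
import socket
import tempfile


def bind_unix(path):
    """Bind a new AF_UNIX stream socket to path, or raise OSError."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.bind(path)
    except OSError:
        s.close()
        raise
    return s


def demo():
    """Return the errno seen when binding to a stale socket path."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")  # hypothetical path
    bind_unix(path).close()       # close() does NOT remove the file
    try:
        bind_unix(path).close()
        seen = None
    except OSError as e:
        seen = e.errno            # the stale file blocks the second bind
    os.unlink(path)               # removing the file frees the name
    bind_unix(path).close()       # now the bind succeeds again
    os.unlink(path)
    os.rmdir(tmpdir)
    return seen


print(demo() == errno.EADDRINUSE)
```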

CLOSE_WAIT is the state a TCP endpoint enters when it has received a FINished segment from the remote, indicating the remote will be sending no more data. The local application is notified of this by making the associated socket readable, and a read against said socket returning zero bytes.

While CLOSE_WAIT is a perfectly valid "send only" state for a TCP connection, 99 times out of 10 it really just means that TCP is now waiting for the application to call close() (or at least shutdown(SHUT_WR)) against the socket.
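
What a well-behaved application does at that point can be sketched as follows. This is a simplified Python illustration, not the code of the application being discussed, and `serve_until_eof` is a hypothetical name.

```python
import socket


def serve_until_eof(conn):
    """Read from conn until the peer signals end of data, then close.

    Returns the number of bytes received.
    """
    total = 0
    while True:
        data = conn.recv(4096)
        if not data:
            # A zero-byte read means the remote sent its FIN; for a TCP
            # socket the local endpoint is in CLOSE_WAIT from here on.
            break
        total += len(data)
    # Calling close() is what lets the endpoint leave CLOSE_WAIT.
    conn.close()
    return total
```

An application that never reaches the close() call after the zero-byte read is the case that leaves CLOSE_WAIT entries behind.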

If the application is buggy, perhaps with a timing window, and misses or otherwise ignores the read return of zero, it is unlikely to close its TCP endpoint and CLOSE_WAIT could remain there for quite some time. Unless we are talking about thousands of these things, the only effect this has is to consume a file descriptor.

If one terminates the process(es) of the application, all its TCP endpoints will be closed without having to resort to the (IMO) kludge of tcp_discon. You do need to make sure you get all the processes - if the application fork()s there may be more than one process with a reference on the socket, and the close() only happens when the last reference is gone...
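
The "last reference" point can be seen in a small experiment. In this Python sketch, dup() stands in for a descriptor inherited across fork() (an assumption made to keep the demo in one process), and a socketpair stands in for a real TCP connection; the helper names are invented.

```python
import socket


def eof_seen(sock):
    """True if the peer has fully closed, False if no EOF has arrived yet."""
    sock.setblocking(False)
    try:
        return sock.recv(1) == b""
    except BlockingIOError:
        return False


def demo():
    ours, peer = socket.socketpair()
    extra = ours.dup()        # second reference to the same socket
    ours.close()              # one reference gone, socket still open
    before = eof_seen(peer)   # peer sees no close yet
    extra.close()             # last reference gone, real close happens
    after = eof_seen(peer)    # now the peer sees end of data
    peer.close()
    return before, after


print(demo())
```

With a forked server the same thing applies per process: every process still holding the descriptor has to exit or close it.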

If you do get all the processes with a reference to the socket, the CLOSE_WAIT should go away.

If indeed you have gotten all the processes with a reference to the socket, and it remains in CLOSE_WAIT, that indicates a bug in the stack.
there is no rest for the wicked yet the virtuous have no pillows

Re: CLOSE_WAIT status, how to properly close the connection

Thank you very much, this makes it much clearer to me. I have also sent your answer to our customer. I don't know when they will next shut down the application so that I can check all processes.

Regarding the patch: I have already opened a case at HP, but we have "cheap support", which means I have been waiting about 2 weeks for some answers and I am still hoping .... and waiting :-))

Re: CLOSE_WAIT status, how to properly close the connection

Does anyone know if there are any known problems with lsof?

We have the latest patch and everything else looks fine, but lsof -nP does not even show all established ports.
Turgay Cavdar
Honored Contributor

Re: CLOSE_WAIT status, how to properly close the connection

Use "lsof -i tcp" command.

Re: CLOSE_WAIT status, how to properly close the connection

The case at HP has now been escalated to level 3.
We took a TOC dump last weekend and they want to help us with a workaround. I hope I get an answer soon.