
CLOSE_WAIT issues

 
Joydeep_1
Occasional Advisor


Hi,

I am facing CLOSE_WAIT issues on four servers. Two of them are in an 11i environment and the other two are in an 11 environment.

We have configured two WebLogic clusters and deployed our application on them.

But when users access the application through the HTTP port, after some time a large number of connections pile up in CLOSE_WAIT, which hangs that port and makes the application inaccessible. At times we have seen more than 2000 connections in CLOSE_WAIT.

I have had the servers' patch levels checked by HP, and they have certified those environments.

I tuned some networking parameters (tcp_keepalive_detached_interval, tcp_fin_wait_2_timeout, tcp_time_wait_interval) and set each of them to 10 seconds.

This did not get rid of the CLOSE_WAIT connections.

Please advise me how I can get rid of this problem.

U.SivaKumar_2
Honored Contributor

Re: CLOSE_WAIT issues

Stefan Farrelly
Honored Contributor

Re: CLOSE_WAIT issues


Take a look at this question, and in particular the .doc attachment on one of the replies, which explains FIN_WAIT and CLOSE_WAIT in a bit more detail.

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x14eaf841489fd4118fef0090279cd0f9,00.html

Don't forget that ndd changes do not survive a reboot unless you also set them in /etc/rc.config.d/nddconf

In my opinion this is typically an application issue. Either a remote connection has crashed or rebooted, or it has not been closed properly (e.g. a PC or a user who shuts down without closing their applications). The only things you can do on the HP end are kill these connections using a script and ndd (there is a script for this at the link above), set the timeouts low, which you've done, or find the users responsible, see what they're doing, and clobber them if you can.
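
To help with the "find the users responsible" part, here is a rough sketch that tallies CLOSE_WAIT connections per remote host from saved `netstat -an` output. The function name and the sample lines are made up for illustration, and the column layout (state last, remote address just before it) is an assumption you should check against your own netstat output.

```python
from collections import Counter

def close_wait_by_remote(netstat_text):
    """Count CLOSE_WAIT connections per remote host in netstat-style text."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumption: the state is the last column and the remote
        # address (host plus port) is the column just before it.
        if len(fields) < 2 or fields[-1] != "CLOSE_WAIT":
            continue
        remote = fields[-2]
        # Drop the trailing port, whether written host.port or host:port.
        cut = max(remote.rfind("."), remote.rfind(":"))
        host = remote[:cut] if cut > 0 else remote
        counts[host] += 1
    return counts

# Hypothetical sample input, not taken from the servers in this thread.
sample = """\
tcp        0      0  192.0.2.10.7001    192.0.2.50.3321    CLOSE_WAIT
tcp        0      0  192.0.2.10.7001    192.0.2.50.3322    CLOSE_WAIT
tcp        0      0  192.0.2.10.7001    192.0.2.51.4100    ESTABLISHED
tcp        0      0  192.0.2.10.7001    192.0.2.52.1055    CLOSE_WAIT
"""

for host, n in close_wait_by_remote(sample).most_common():
    print(host, n)
```

If one or two remote hosts account for most of the entries, that is a reasonable place to start asking what the client is doing.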


I'm from Palmerston North, New Zealand, but somehow ended up in London...
Joydeep_1
Occasional Advisor

Re: CLOSE_WAIT issues

Thanks to everyone who has replied.

I tried to clear all the CLOSE_WAIT connections with a script I got from Shiva's URL.

Unfortunately it could not clear them.

I also set the tcp_keepalive_interval parameter to 10 seconds. It did not help.

Is it a WebLogic issue? I am using WebLogic 6.1.
U.SivaKumar_2
Honored Contributor
Solution

Re: CLOSE_WAIT issues

Hi,

From the WebLogic newsgroup, I understand that this is a known problem in WebLogic.

Some fixes are also available:
http://edocs.bea.com/wls/docs61/notes/bugfixes2.html

regards,
U.SivaKumar
Innovations are made when conventions are broken
rick jones
Honored Contributor

Re: CLOSE_WAIT issues

The CLOSE_WAIT state is the state a TCP connection enters when it has received and ACKnowledged a FIN from the remote and is now waiting for the local application to call close() or shutdown().

99 times out of 10, a connection "stuck" in CLOSE_WAIT means the application at that end has a bug: it is either ignoring, or forgetting, that it was told the remote has initiated a shutdown of the connection.
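
As a minimal sketch of what "not ignoring it" looks like, here is a loopback example in Python (not WebLogic's code, and the `serve_one` name is made up). A zero-length read is how the application is told about the remote's FIN, and the connection leaves CLOSE_WAIT only once the application closes its own end.

```python
import socket

def serve_one(conn):
    """Echo data until the peer sends FIN, then close our side too."""
    with conn:  # closing on exit is what takes the socket out of CLOSE_WAIT
        while True:
            data = conn.recv(4096)
            if not data:
                # recv() returning b"" means the remote's FIN has arrived.
                # From here the connection sits in CLOSE_WAIT until we close.
                break
            conn.sendall(data)

# Loopback demonstration on an ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

client.sendall(b"ping")
client.shutdown(socket.SHUT_WR)   # client sends its FIN
serve_one(server_side)            # echoes, sees end-of-file, closes
reply = client.recv(4096)
client.close()
listener.close()
print(reply, server_side.fileno())
```

An application that breaks out of the read loop but never reaches the close (or never reads at all) is the usual way connections pile up in CLOSE_WAIT.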

The other, much more rare case, is that this is an application that is using TCP to transfer data in one direction only. CLOSE_WAIT is a perfectly valid "send only" state for a TCP connection, assuming that the remote side, in FIN_WAIT_2, got there by doing a shutdown(SHUT_WR) and not a SHUT_RD or SHUT_RDWR or close().
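
A small loopback sketch of that "send only" case, again in Python with made-up variable names: one side does shutdown(SHUT_WR), the other side sees end-of-file and is then in CLOSE_WAIT, yet it can still send data that the first side receives.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
requester = socket.create_connection(listener.getsockname())
sender, _ = listener.accept()

# The requester is done sending but keeps its receive side open.
requester.shutdown(socket.SHUT_WR)

# The sender sees end-of-file; its end of the connection is now in CLOSE_WAIT...
eof = sender.recv(1024)
# ...yet it may keep sending for as long as it needs to.
sender.sendall(b"still sending")
sender.close()

# Read until the sender's own FIN shows up as end-of-file.
received = b""
while True:
    chunk = requester.recv(1024)
    if not chunk:
        break
    received += chunk
requester.close()
listener.close()
print(eof, received)
```

Had the requester called close() instead of shutdown(SHUT_WR), it would have had no way to receive that data, which is why the distinction matters.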

This is why I am not terribly fond of the arbitrary tcp_fin_wait_2_timeout for dealing with FIN_WAIT_2. I much prefer to let the "normal" tcp_keepalive_detached_interval do the job, since it also deals with a different sort of client bug: the client using an abortive (RST instead of FIN) close of the TCP connection. That is a doubleplusungood thing to do: the RST is not retransmitted, so if it is lost the server can be left stuck in FIN_WAIT_2. Once the server calls close(), tcp_keepalive_detached_interval takes over: the server sends keepalives, and if the keepalives get no response, or elicit a RST from the remote, it terminates the FIN_WAIT_2 connection.

Do not mess with the tcp_time_wait_interval. It should stay at 60 seconds or more. It is only for connections in TIME_WAIT, and TIME_WAIT is an integral part of TCP's correctness heuristics. I also would not suggest a tcp_fin_wait_2_timeout of 10 seconds - if you must use it, keep it at least as long as tcp_time_wait_interval.

I'd also probably not make tcp_keepalive_detached_interval a mere 10 seconds. The two minute default should suffice - again 99 times out of 10 :) and if you do need to make it shorter, again I'd not make it any shorter than tcp_time_wait_interval.
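
The rules of thumb above can be written down as a quick sanity check. This is only a sketch: `check_timers` is a hypothetical helper, the values are in seconds as quoted in this thread (convert first if your tool reports another unit), and `None` is just this sketch's way of saying the FIN_WAIT_2 timeout is not being used.

```python
def check_timers(time_wait, fin_wait_2=None, keepalive_detached=120):
    """Return warnings for settings that go against the advice above (seconds)."""
    warnings = []
    if time_wait < 60:
        warnings.append("tcp_time_wait_interval is below 60 seconds")
    # None is this sketch's convention for "FIN_WAIT_2 timeout not in use".
    if fin_wait_2 is not None and fin_wait_2 < time_wait:
        warnings.append(
            "tcp_fin_wait_2_timeout is shorter than tcp_time_wait_interval")
    if keepalive_detached < time_wait:
        warnings.append(
            "tcp_keepalive_detached_interval is shorter than tcp_time_wait_interval")
    return warnings

# TIME_WAIT left at 60 seconds, the other two cut to 10 seconds.
for message in check_timers(60, fin_wait_2=10, keepalive_detached=10):
    print(message)
```

With everything set to 10 seconds, as in the original post, only the first warning fires, because the other two timers are then no shorter than the (too short) TIME_WAIT interval.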
there is no rest for the wicked yet the virtuous have no pillows