Operating System - HP-UX

SOLVED
Ralph Grothe
Honored Contributor

Proactive reaping of sockets from potential SIGBUS throwers

Hello,

we have a problem with SIGBUS core dumps on a webserver to which clients connect from the internet.
Apparently the webserver then establishes a connection to a database server behind a firewall and sends queries to it on behalf of the clients' requests.
Sorry for being so vague, but the application on both servers is a black box to me since I have no access to the sources.
(Even if I had, I wouldn't want to wade through the weeds of Java/C or whatever third-party code.)
To monitor this I've written a Perl module (really just a tied array into which HP-UX's netstat output for AF_INET sockets is dumped).
What looks suspicious to me is the sheer number of sockets in FIN_WAIT_2 state compared with rather few in ESTABLISHED
(the ratio is about 15:1 on average).
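A minimal sketch of that kind of tally, written as a shell function around awk rather than as the Perl module mentioned above. The function name `count_tcp_states` is made up for illustration, and the sketch assumes the socket state is the last whitespace-separated field of every line that starts with "tcp", which should be checked against the actual netstat layout on the box:

```sh
#!/bin/sh
# Tally TCP socket states from netstat-style output read on stdin.
# Assumption: lines for TCP sockets start with "tcp" and carry the
# state (ESTABLISHED, FIN_WAIT_2, ...) in their last field.
count_tcp_states() {
    awk '
        $1 ~ /^tcp/ { n[$NF]++ }          # count each state once per socket
        END {
            printf "ESTABLISHED %d\n", n["ESTABLISHED"]
            printf "FIN_WAIT_2 %d\n", n["FIN_WAIT_2"]
            # Only print a ratio when there is something to divide by.
            if (n["ESTABLISHED"] > 0)
                printf "ratio %.1f\n", n["FIN_WAIT_2"] / n["ESTABLISHED"]
        }
    '
}
```

It would be fed from netstat, e.g. `netstat -an | count_tcp_states`, and could be run from cron to see how the ratio develops over time.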
Someone suspected that the firewall is the culprit, since it might sever connections after a certain period of inactivity.
I have no idea what kind of packet filtering the firewall admins are doing, or whether they are doing stateful inspection at all.
(sometimes I get the impression that they are entangled in their own rule sets ;-)
I thought I could come up with a quick and dirty Perl script that scans the sockets' states and kills processes on the other end (viz. the db server) that still hold a broken or shut-down socket open.
I'm convinced that the application developers should have taken care of this with an appropriate signal handler that shuts the sockets down safely, rather than leaving it to anything I could come up with.
Anyway, my problem is making sure I reap the "right" sockets, i.e. those that would sooner or later raise the dreaded SIGBUS.

Regards
Ralph
Madness, thy name is system administration
4 REPLIES
John Poff
Honored Contributor

Re: Proactive reaping of sockets from potential SIGBUS throwers

Hi Ralph,

We ran into a similar problem in-house with an application that was leaving lots of sockets in the FIN_WAIT_2 state. We weren't getting anything like SIGBUS errors, but our application was having trouble reconnecting on a particular socket when there were old socket connections in the FIN_WAIT_2 state. The real answer was to get the application fixed [just as you have mentioned], but since it is third party software we have to wait a few months for the next version.

Our workaround was to set 'tcp_fin_wait_2_timeout' to 11 minutes. That helps clean up the old sockets. You'll have to put an entry in /etc/rc.config.d/nddconf to make it permanent. Do a 'man ndd' to see how to set it. One of the threads I read mentions not setting it any lower than 11 minutes or you'll run into some other problems.
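As a rough sketch of what such an nddconf entry could look like on a release that ships ndd (HP-UX 11.x). Two assumptions here are not stated above: that index 0 is still unused in the file, and that the parameter is given in milliseconds, so `ndd -h tcp_fin_wait_2_timeout` should be checked for the unit before using the value:

```sh
# /etc/rc.config.d/nddconf fragment (sketch only)
# Assumption: index 0 is free; use the next unused index otherwise.
# Assumption: value is in milliseconds, 11 min = 11 * 60 * 1000 = 660000.
TRANSPORT_NAME[0]=tcp
NDD_NAME[0]=tcp_fin_wait_2_timeout
NDD_VALUE[0]=660000
```

The same value can be tried on the running system first with `ndd -set /dev/tcp tcp_fin_wait_2_timeout 660000`, which does not survive a reboot.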


JP
Ralph Grothe
Honored Contributor

Re: Proactive reaping of sockets from potential SIGBUS throwers

John,

thanks for your hint.
I had already thought about some driver setting through ndd myself.
Unfortunately I cannot find such an executable, manpage, or init config file on this box.
Perhaps it is done through another tool on this dated OS?

# uname -srv
HP-UX B.10.20 A
Madness, thy name is system administration
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Proactive reaping of sockets from potential SIGBUS throwers

The ndd equivalent for 10.20 is 'nettune'. You should be able to 'man nettune', although nettune itself (like ndd) has plenty of online help.
If it ain't broke, I can fix that.
John Poff
Honored Contributor

Re: Proactive reaping of sockets from potential SIGBUS throwers

Ralph,

For 10.20 you use the 'nettune' command. I haven't used 'nettune' before, but here is a thread where it is discussed:

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x6a9b03bbece8d5118ff40090279cd0f9,00.html


JP