Cause of TCP Resets and how to prevent them?
07-14-2003 02:31 AM
Having lured you into this thread, I have to confess that the problem affects a Solaris box of ours.
But I guess (hope) that the underlying TCP drivers of the two Unices (HP-UX and Solaris) don't differ too much from one another.
This seems somewhat justified to me, as most of the parameter names used by the ndd utilities of both OSes are exactly the same.
Thus I relied on HP-UX for its far better documentation (I have to stress once again that the HP-UX manpages are the best of all the Unices I have seen so far) while fiddling with the Solaris ndd (the Solaris manpages are rather spartan).
The problem is that the Solaris box is an Informix DB server to which many clients connect.
While the vast majority of clients work normally, a few have their sessions aborted when they leave a session open and idle for too long.
At first I thought this was due to some client-side protocol idiosyncrasies (viz. Windoze clients, protocols such as SMB, NetBIOS, who knows what unique protocols M$ has implemented?).
But a packet dump taken from somewhere in the middle (or from the clients' end?) showed explicit TCP resets initiated by the DB server.
So I produced my own dump with snoop, the Solaris sniffer tool (similar to tcpdump).
From that I could confirm that the DB server was emitting TCP resets, as indicated by the RST control bit in the flags field of the TCP header.
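For readers who want to check a dump by hand, here is a minimal Python sketch of how the control bits in the TCP flags byte can be decoded. The bit values are those defined in RFC 793; the function names are made up for illustration and are not part of snoop or tcpdump.

```python
# Bit masks for the control bits in the TCP header flags field (RFC 793).
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_tcp_flags(flags_byte):
    """Return the names of the control bits set in a TCP flags byte."""
    return [name for name, mask in TCP_FLAGS.items() if flags_byte & mask]


def is_reset(flags_byte):
    """True if the RST bit is set, i.e. the segment is a TCP reset."""
    return bool(flags_byte & TCP_FLAGS["RST"])


if __name__ == "__main__":
    # 0x14 has both the RST and ACK bits set.
    print(decode_tcp_flags(0x14), is_reset(0x14))
```

A segment whose flags byte has the 0x04 bit set is a reset, whatever other bits accompany it.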
Because I cannot influence the clients' working habits (viz. make them close their sessions when there is no need to talk to the DB server), I looked for another remedy.
Looking at what Solaris' TCP driver has to offer, I came across these (remember that the syntax applies to Solaris):
# uname -srv
SunOS 5.7 Generic_106541-11
# ndd /dev/tcp \?|grep abort
tcp_ip_abort_cinterval (read and write)
tcp_ip_abort_linterval (read and write)
tcp_ip_abort_interval (read and write)
For the sake of clarity I consulted the HP-UX counterpart:
# uname -srv
HP-UX B.11.00 U
# ndd -h tcp_ip_abort_interval
tcp_ip_abort_interval:
Second threshold timer for established connections.
When it must retransmit packets because a timer has expired,
TCP first compares the total time it has waited against two
thresholds, as described in RFC 1122, 4.2.3.5. If it has waited
longer than the second threshold, TCP terminates the connection.
[500,-] Default: 600000 (10 minutes)
This seems to me to be the proper screw to turn.
I found out that the Solaris default (from 2.5 onwards) is 480000 ms, or 8 minutes.
So I increased it to 20 minutes, or 1200000 ms, by issuing
# ndd -set /dev/tcp tcp_ip_abort_interval 1200000
# ndd /dev/tcp tcp_ip_abort_interval
1200000
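Since ndd takes these intervals in milliseconds, it is easy to slip a digit. The following Python sketch only does the unit conversion and builds the command string for review; it does not run ndd. The helper names are hypothetical.

```python
def minutes_to_ms(minutes):
    """Convert minutes to milliseconds, the unit used for these ndd timers."""
    return int(minutes * 60 * 1000)


def ndd_set_command(parameter, minutes, device="/dev/tcp"):
    """Build (but do not execute) an 'ndd -set' command line for review."""
    return f"ndd -set {device} {parameter} {minutes_to_ms(minutes)}"


if __name__ == "__main__":
    # The 20-minute setting from the post above.
    print(ndd_set_command("tcp_ip_abort_interval", 20))
```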
Then we reran the client abort test while I had another snoop dump running.
This time the number of TCP resets dropped dramatically (only one).
So this seems to work somehow.
But before increasing the interval further to satisfy even the most inert client, I would like to know up to what value this is advisable.
What else could be causing the server's TCP resets?
The clients claim that they have only had the aborts since June 26th, whereas we last rebooted the machine on July 2nd, when manually applied driver settings could have been overwritten by the defaults.
The reason for that reboot was that I had been fiddling with the Emulex driver settings for the FC HBAs to allow SAN connectivity
(so I wouldn't have been surprised if disks had no longer been visible).
Thanks for your patience
Ralph
07-14-2003 02:52 AM
Solution
There is a really good Solaris document on tuning your TCP stack that should cover this well; take a look:
http://www.sean.de/Solaris/soltune.html
07-14-2003 03:57 AM
Re: Cause of TCP Resets and how to prevent them?
Your abort timeout only comes into play when a packet has to be retransmitted and no response has been seen for the timer interval. It is there to allow TCP to drop a nonresponsive connection. So if this fixes your problem, it indicates a bad network connection somewhere, such as a bad serial link, an overfilled queue, or a transient routing problem. Look at the packets sent just before the reset. Is it the same packet being resent at increasingly longer intervals (each interval about twice the previous one)? Why didn't the client respond?
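To give a feel for the doubling pattern to look for in the dump, here is a rough Python sketch of a retransmission schedule with exponential backoff that gives up once the abort interval would be exceeded. The initial timeout of 1 second and the 60 second cap are assumed values for illustration only. Real stacks derive the timeout from the measured round-trip time, and the exact abort check may differ.

```python
def retransmit_schedule(initial_rto_ms, max_rto_ms, abort_interval_ms):
    """Return (elapsed_ms, rto_ms) pairs, one per retransmission.

    The timeout doubles after each retransmission, capped at max_rto_ms.
    The loop stops when the next retransmission would fall at or beyond
    abort_interval_ms, which is where the connection would be aborted.
    """
    schedule = []
    elapsed = 0
    rto = initial_rto_ms
    while elapsed + rto < abort_interval_ms:
        elapsed += rto
        schedule.append((elapsed, rto))
        rto = min(rto * 2, max_rto_ms)
    return schedule


if __name__ == "__main__":
    # Assumed values: 1 s initial timeout, 60 s cap, 8 min abort interval.
    for elapsed, rto in retransmit_schedule(1000, 60000, 480000):
        print(f"retransmit at {elapsed / 1000:.0f} s (waited {rto / 1000:.0f} s)")
```

If the dump shows the same segment repeated with gaps growing in this fashion before the RST, the client (or something on the path) simply stopped answering.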
Note that there is a bug in NT's TCP/IP driver (tcpip.sys), but it usually only comes into play when a session is closing normally and the last ACK packet gets lost. NT forgets to resend it, so the NT box stays in LAST_ACK and the server has a socket stuck in FIN_WAIT_2. There is a fix for it, but MS never released it. You have to beg them for it. (I have it if you need it.)
http://av.stanford.edu/books/tcpip/tcp_keep.htm
is a nice article on idle TCP connections. Note that TCP doesn't normally care whether the connection is idle. It's the application that is supposed to worry about it.
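As an illustration of the application taking responsibility for idle connections, here is a minimal Python sketch that turns on TCP keepalive for a socket. SO_KEEPALIVE is a standard socket option. The per-socket tuning options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are platform-specific, so the sketch only sets those the local Python build exposes. The timing values are arbitrary examples, and whether a given client or the Informix server can be configured this way is not something this thread establishes.

```python
import socket


def enable_keepalive(sock, idle_s=300, interval_s=60, probes=5):
    """Enable TCP keepalive probes on a socket.

    idle_s, interval_s and probes are example values; they are applied
    only where the platform exposes the corresponding socket options.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # gap between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```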
Ron
07-14-2003 04:18 AM
Re: Cause of TCP Resets and how to prevent them?
Is there a chance you have SunScreen installed on your Solaris box?
This program terminates all connections after a timeout when the remote IP addresses are not explicitly listed in the config.
I struggled with this issue for quite a while, some time ago.
Please let me know.
Regs David
07-15-2003 11:23 AM
Re: Cause of TCP Resets and how to prevent them?
Second, while it is indeed the case that most of the ndd tunables in HP-UX 11 and Solaris are named the same, and that most of those even behave the same, the match is not 100%, so do be rather careful when using our most excellent docs to tune your inferior Solaris systems :)
The best analogy I can think of to describe the two TCP/IP stacks is that they are (increasingly) distant cousins. They share a common ancestor in the same TCP/IP supplier, but Sun did a one-time thing and has been going their own way ever since. HP has maintained a continuing relationship with the TCP/IP supplier. Diverging branches and all that.
Finally, I cannot add much to the discussion of tcp_ip_abort_interval - the bit about it applying when no ACKs are coming from the remote TCP is spot-on. These may help with other things:
ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_ndd.txt
ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt
07-15-2003 10:51 PM
Re: Cause of TCP Resets and how to prevent them?
I repent my sin of infiltrating HP expertise with inferior Solaris trivialities.
Sounds like another religious hatchet lurking for excavation.
I cannot judge the different TCP/IP stack implementations because I know too little about the subject.
In general I also find HP-UX to be the "better" Unix (or rather the more appealing).
But there are a few things which I feel are more state-of-the-art in Solaris.
The most obvious to me seems to be the modularization of the kernel (which of course on the downside leads to a few intricacies; just count the number of files you have to edit on Solaris in order to change the host's IP address).
But the degree of modularization in the HP-UX kernel isn't really convincing, especially when you are used to playing with Linux kernel customizations at home.