Operating System - HP-UX

Tracing cause for TCP severances

 
Ralph Grothe
Honored Contributor

Hello,

Unfortunately, my knowledge of Internet protocols and of how the devices and software that implement them work is rather limited.
We have client-server communication routes (WANs) on which clients report repeated disruptions of service.
I have checked the NICs, static routes, IP address settings etc. of the involved servers several times but couldn't discover any abnormalities or conspicuous degradation of network services.
Since all server-side socket endpoints share the same NICs (viz. so-called IP aliases or secondary IP addresses, as are common for cluster virtual IP addresses (VIPs)), to my understanding any disruptive effect should be "binary", i.e. all or nothing.
But there are many connections that are unaffected.
To get an overview of the abnormal severances, I set up a packet capture filter that I modeled on Fig. 3 of RFC 793
(http://www.faqs.org/rfcs/rfc793.html)
which depicts a packet's TCP header.
According to this layout my filter would offset 13 octets into the header to fetch the flags field and filter for the RST bit (e.g. 'tcp[13:1] & 0x4').
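To make the offset arithmetic concrete, here is a minimal Python sketch of what the filter expression tests. The function names and the sample header (ports, sequence numbers) are made up for illustration; only the flag positions come from Fig. 3 of RFC 793.

```python
import struct

# Bit masks for the flags octet at offset 13 of the TCP header (RFC 793, Fig. 3).
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def tcp_flags(tcp_header: bytes) -> set:
    """Return the names of all flags set in a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 octets long")
    flags_octet = tcp_header[13]  # the same octet the filter reads as tcp[13:1]
    return {name for name, mask in TCP_FLAGS.items() if flags_octet & mask}


def is_reset(tcp_header: bytes) -> bool:
    """Equivalent of the capture filter 'tcp[13:1] & 0x4'."""
    return "RST" in tcp_flags(tcp_header)


# Hypothetical 20-octet header: ports 40000 -> 23, data offset 5 words,
# flags octet 0x14 (RST + ACK), window/checksum/urgent pointer zeroed.
sample = struct.pack("!HHIIBBHHH", 40000, 23, 1000, 2000, 5 << 4, 0x14, 0, 0, 0)
print(sorted(tcp_flags(sample)), is_reset(sample))
```

Decoding the whole flags octet rather than only the RST bit also shows whether a reset carried an ACK, which helps to tell which side considered the connection dead.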
Now I have a dump of the packet headers (54 octets each, i.e. Ethernet, IP and TCP headers without payloads) along with the timestamps at which a reset was encountered.
However, I still lack the knowledge of what might have caused the resets.
Probably my filter was too restrictive and I should also have recorded the packets in sequential vicinity of the resets.
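A rough sketch of what recording the "vicinity" could look like in post-processing, assuming the capture were repeated with a broader filter (all TCP headers on the affected ports, say) and the records were already parsed into tuples. The `Packet` layout, the addresses and the time window are hypothetical choices, not something the capture tool dictates.

```python
from collections import namedtuple

# Hypothetical parsed record: timestamp, both endpoints, raw TCP flags octet.
Packet = namedtuple("Packet", "ts src sport dst dport flags")

RST = 0x04


def flow_key(p):
    """Direction-independent key, so both halves of a connection match."""
    return frozenset([(p.src, p.sport), (p.dst, p.dport)])


def packets_near_resets(packets, window=5.0):
    """Keep packets of a reset flow seen within `window` seconds of an RST."""
    reset_times = {}
    for p in packets:
        if p.flags & RST:
            reset_times.setdefault(flow_key(p), []).append(p.ts)
    return [p for p in packets
            if any(abs(p.ts - t) <= window
                   for t in reset_times.get(flow_key(p), []))]


trace = [
    Packet(0.0, "10.0.0.1", 40000, "10.0.0.2", 23, 0x18),   # too early
    Packet(9.0, "10.0.0.1", 40000, "10.0.0.2", 23, 0x18),   # just before the RST
    Packet(10.0, "10.0.0.2", 23, "10.0.0.1", 40000, 0x04),  # the RST itself
    Packet(10.5, "10.0.0.3", 40001, "10.0.0.2", 23, 0x10),  # unrelated flow
]
for p in packets_near_resets(trace, window=2.0):
    print(p)
```

Grouping by flow rather than by time alone keeps unrelated, healthy connections out of the picture.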
Because I suspected that some firewall or "black hole" routing component had caused half-open connections (RFC 793 has an instructive section on those),
I asked the FW/NW admins, and they confirmed that they had tracked out-of-sequence packets, which their rule sets are meant to drop.
Now I am wondering what constitutes out of sequence packets in networking argot.
Even in my naive conception I wouldn't expect every packet to be in sequence.
For instance, I could imagine that single packets could occasionally travel along deviating routes (maybe because of some sudden change of metric weights, or congestion on the usual routes).
Also, what about packets that require retransmission, perhaps because the MSS is too large for the segment to fit through a router's MTU?
In such a case, I think a well-behaved router that finds the DF bit set in the IP header of the sender's packet should respond with an ICMP "fragmentation needed" (can't fragment) message,
in order to give the originator of the packet a chance to decrease the MSS and retransmit.
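The arithmetic behind this can be sketched in a few lines of Python. The figures assume IPv4 and TCP headers without options (20 octets each); the MTU values in the example and the outcome labels are illustrative assumptions, not measurements from the affected routes.

```python
IP_HEADER = 20   # octets, IPv4 header without options (assumption)
TCP_HEADER = 20  # octets, TCP header without options (assumption)


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits into one unfragmented IP packet."""
    return mtu - IP_HEADER - TCP_HEADER


def needs_fragmentation(mss: int, hop_mtu: int) -> bool:
    """True if a full-sized segment is larger than the hop's MTU."""
    return mss + IP_HEADER + TCP_HEADER > hop_mtu


def outcome(mss: int, hop_mtu: int, df_set: bool, icmp_delivered: bool) -> str:
    """Describe what happens to a full-sized segment at one hop."""
    if not needs_fragmentation(mss, hop_mtu):
        return "forwarded"
    if not df_set:
        return "fragmented"
    # DF is set: the router must drop the packet and should send the ICMP error.
    return "sender lowers MSS" if icmp_delivered else "black hole"


print(mss_for_mtu(1500))                  # MSS on plain Ethernet
print(outcome(1460, 1400, True, True))    # PMTU discovery works
print(outcome(1460, 1400, True, False))   # ICMP lost or filtered on the way back
```

The last case is the one to worry about: the sender keeps retransmitting full-sized segments that never arrive, while small packets on the same connection still get through.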
As far as I can tell from inspecting the IP stack tunables of my servers, the so-called path MTU discovery strategy is enabled by default, so I assume that the DF flag is set in all outgoing packets.
(that of course was also true for the packets my filter captured)
Now I doubt whether the PMTU strategy is the proper one, at least for the affected routes.
On the other hand, I think that by disabling it I would also give up PMTU discovery on the well-behaved routes
(there isn't a way to toggle PMTU on a per-IP-address basis, is there?).
What about reducing MTU per IP address,
hoping to hit an acceptable compromise that on the one hand would cut down the connection loss while on the other hand leaving the frame size wide enough not to impair throughput?
Are there any "recommended" MTUs for ill behaved routes?
What else could I do by way of the servers' network settings
(these are the only knobs at my disposal)?
Do you know of further methods to trace the cause of the severances?
What filters could I employ to get a clearer picture?
Your suggestions are very welcome.

Kind Regards
Ralph
Madness, thy name is system administration
2 REPLIES
Arunvijai_4
Honored Contributor

Re: Tracing cause for TCP severances

Hi Ralph,

Will this link be helpful?

http://alive.znep.com/~marcs/mtu/

Just check it out.

-Arun
"A ship in the harbor is safe, but that is not what ships are built for"
Ralph Grothe
Honored Contributor

Re: Tracing cause for TCP severances

Thanks for the link Arun.

Meanwhile I received an excerpt from a dump from our FW admin.
So I have to correct my words,
the exact phrase in their log was "TCP packet out of state".
I presume this relates to one of their stateful packet inspection filters
where they drop any packet that doesn't appear in an established or related state.
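To illustrate what I mean by that presumption, here is a toy Python model of a state table. It is only a sketch of the general idea and not any vendor's actual logic; in particular, the idle timeout as the reason for an entry vanishing is purely my assumption, and the class name, flow tuple and timeout value are made up.

```python
class ToyStateTable:
    """Toy model of stateful inspection; not a real firewall's rule engine."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout  # seconds (hypothetical value)
        self.last_seen = {}               # flow -> time of last accepted packet

    def check(self, flow, ts, syn=False):
        """Accept a packet if it opens a flow or belongs to a live entry."""
        last = self.last_seen.get(flow)
        if last is not None and ts - last > self.idle_timeout:
            del self.last_seen[flow]      # entry expired while the flow was idle
            last = None
        if last is None and not syn:
            return "drop: out of state"
        self.last_seen[flow] = ts
        return "accept"


fw = ToyStateTable(idle_timeout=3600)
flow = ("10.0.0.1", 40000, "10.0.0.2", 23)
print(fw.check(flow, 0, syn=True))   # connection setup creates the entry
print(fw.check(flow, 100))           # traffic within the timeout passes
print(fw.check(flow, 100 + 3601))    # first packet after a long idle period
```

In this model both endpoints still believe the connection is established after the idle period, yet the next packet is dropped, which would look like a severance from the client's side.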
I also presume that their FW make is a Checkpoint.
If anyone has some knowledge or experience with that kind of FW I would be glad to hear from them what that could indicate,
or how FW filters and Unix servers' TCP/IP stacks could best be adjusted to cooperate harmoniously.
Madness, thy name is system administration