02-18-2003 04:42 PM
HP-UX 11.0 TCP Problem
The abc host is the HP-UX host. The xyz host is a Solaris host.
18:31:13.585376 abc.51519 > xyz.7001: S 840842314:840842314(0) win 32768
18:31:13.585599 xyz.7001 > abc.51519: S 3949685170:3949685170(0) ack 840842315 win 24820 (DF)
18:31:13.587426 abc.51519 > xyz.7001: P 1:240(239) ack 1 win 32768
18:31:13.587503 abc.51519 > xyz.7001: P 240:1700(1460) ack 1 win 32768
18:31:13.595863 abc.51519 > xyz.7001: P 1700:2328(628) ack 1 win 32768
18:31:13.595881 xyz.7001 > abc.51519: . ack 240 win 24581 (DF)
18:31:13.595895 xyz.7001 > abc.51519: . ack 1700 win 24820 (DF)
18:31:13.611238 xyz.7001 > abc.51519: P 1:307(306) ack 2328 win 24820 (DF)
18:31:13.611572 xyz.7001 > abc.51519: . 307:1767(1460) ack 2328 win 24820 (DF)
18:31:13.611606 abc.51519 > xyz.7001: . ack 1767 win 32768
18:31:13.613850 xyz.7001 > abc.51519: F 3176:3176(0) ack 2328 win 24820 (DF)
18:31:17.871423 xyz.7001 > abc.51519: FP 1767:3176(1409) ack 2328 win 24820 (DF)
18:31:17.871481 abc.51519 > xyz.7001: . ack 3177 win 32768
18:31:17.872428 abc.51519 > xyz.7001: F 2328:2328(0) ack 3177 win 0
18:31:17.872450 xyz.7001 > abc.51519: . ack 2329 win 24820 (DF)
After the HP box sends an ACK with RCV.NXT set to 1767, it receives a FIN from the Solaris box with a sequence number of 3176. I looked through the TCP spec and it seems that this should be handled by the TCP stack on HP. It's strange that the Solaris box is sending a sequence number of 3176.
About 4 seconds after the HP host receives the first FIN, a second segment arrives with FIN and PUSH set, covering 1767:3176. The HP host immediately responds.
Is this a bug?? I am by no means a TCP expert.
Thanks
02-19-2003 04:24 PM
Re: HP-UX 11.0 TCP Problem
Four seconds seems like a long time to be lost in space, so most likely a packet got lost and was resent. That would probably account for the second FIN. The missing packet was probably just a PUSH, followed by a packet with just the empty FIN (note that the sequence number is really a count of the payload bytes). When no reply was received, a combined FIN/PUSH was sent to replace both the missing packet and the FIN packet, which was received but not ACK'd.
Ron
02-20-2003 11:06 AM
Re: HP-UX 11.0 TCP Problem
In the TCP header, the sequence number is the sequence number at which the data begins (TCP basically assigns a sequence number to every byte it sends).
Tcpdump, by showing the starting sequence number, ending sequence number, and length as start:end(length), is simply doing some math for you. As you can tell, it is also doing other math by subtracting the ISN (Initial Sequence Number) so us mere mortals can deal with data streams that appear to start with sequence number 1.
BTW, where was this trace taken? HP-UX end, Solaris end, or somewhere in the middle? Also, next time, using "HP" and "Sol" as the hostnames instead of abc and xyz will make things a trifle easier to follow :)
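For anyone who wants to see the math tcpdump is doing, here is a minimal Python sketch. The function names are made up for illustration, and the raw numbers are borrowed from the Solaris-side trace posted further down in this thread (ISN 1024955216 from its packet 2). It is only a sketch of the arithmetic, not how tcpdump is actually implemented.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def relative(seq: int, isn: int) -> int:
    """Sequence number relative to the connection's ISN."""
    return (seq - isn) % MOD


def tcpdump_range(seq: int, length: int, isn: int) -> str:
    """Format a segment the way tcpdump does: start:end(length)."""
    start = relative(seq, isn)
    end = (start + length) % MOD
    return f"{start}:{end}({length})"


ISN = 1024955216  # Solaris ISN, taken from the SYN in the later trace

print(tcpdump_range(1024955217, 306, ISN))   # 1:307(306)
print(tcpdump_range(1024955523, 1460, ISN))  # 307:1767(1460)
```

The SYN itself uses up one sequence number, which is why the first data byte shows up as 1 rather than 0.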
02-20-2003 11:16 AM
Re: HP-UX 11.0 TCP Problem
The trace was done on the HP machine. This problem consistently happens on every request in the same place. Does that seem like a network problem?? It seems that a network problem would be more random in where the packets are dropped.
02-20-2003 11:27 AM
Re: HP-UX 11.0 TCP Problem
If the trace is on the HP, I'm a _trifle_ surprised that there was not an immediate ACK 1767 after the out-of-order bare FIN arrived. However, I do seem to recall some stuff changing in how out-of-order bare FINs were dealt with, so being up on the latest HP-UX transport patches might be good. I seem to remember similar things for Solaris, so be sure to be up to date there too.
Even ignoring that (not that the single immediate ACK could have triggered a quicker retransmit...), there is still the issue of the lost segments themselves.
02-20-2003 01:02 PM
Re: HP-UX 11.0 TCP Problem
Go out and get every TCP patch on the UX side and install them. Make sure the December 2002 Gold Quality Pack is installed first.
Then go out and do the Solaris equivalent steps.
If there is a bug on either side, these steps should squash it.
Somebody isn't playing by the rules.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
02-21-2003 08:18 AM
Re: HP-UX 11.0 TCP Problem
There is apparently a bit of ambiguity in RFC 793 about whether to ack an out of order packet or not. See the discussion at:
http://www-mice.cs.ucl.ac.uk/multimedia/misc/tcp_ip/8804.mm.www/0066.html
Mark,
What is the normal ping response time for this link?
Do a standard ping and then a larger one of about 1460 bytes.
Are there any known bottlenecks? (Slow serial links perhaps?) What sort of queuing is used on these links? Are you dropping packets at the queues? I'm wondering if something is queuing based on packet size and letting little packets through while big packets have to sit in a long, overfilled queue. That would explain how your empty FIN gets through while the fat 1409 gets lost. You might get the same effect on a bad link with high collisions or noise, since a smaller packet has a much better chance of getting through than a larger one. That wouldn't explain why it is always the next-to-last packet that gets lost, though (if that is what you are saying).
So it could still be some sort of bug in the Solaris TCP/IP stack. Perhaps something causes it to lose the next-to-last packet of a conversation.
Ron
02-21-2003 09:57 AM
Re: HP-UX 11.0 TCP Problem
Here is a TCP dump from the Sun box perspective.
1 0.00000 HPUX -> SOLARIS TCP D=7001 S=59410 Syn Seq=3769744967 Len=0 Win=32768 Options=
2 0.00006 SOLARIS -> HPUX TCP D=59410 S=7001 Syn Ack=3769744968 Seq=1024955216 Len=0 Win=24820 Options=
3 0.00324 HPUX -> SOLARIS TCP D=7001 S=59410 Ack=1024955217 Seq=3769744968 Len=239 Win=32768
4 0.00004 SOLARIS -> HPUX TCP D=59410 S=7001 Ack=3769745207 Seq=1024955217 Len=0 Win=24581
5 0.00019 HPUX -> SOLARIS TCP D=7001 S=59410 Ack=1024955217 Seq=3769745207 Len=1460 Win=32768
6 0.00004 HPUX -> SOLARIS TCP D=7001 S=59410 Ack=1024955217 Seq=3769746667 Len=795 Win=32768
7 0.00022 SOLARIS -> HPUX TCP D=59410 S=7001 Ack=3769746667 Seq=1024955217 Len=0 Win=24820
8 0.02617 SOLARIS -> HPUX TCP D=59410 S=7001 Ack=3769747462 Seq=1024955217 Len=306 Win=24820
9 0.00010 SOLARIS -> HPUX TCP D=59410 S=7001 Ack=3769747462 Seq=1024955523 Len=1460 Win=24820
10 0.00004 SOLARIS -> HPUX TCP D=59410 S=7001 Ack=3769747462 Seq=1024956983 Len=1412 Win=24820
11 0.00039 HPUX -> SOLARIS TCP D=7001 S=59410 Ack=1024956983 Seq=3769747462 Len=0 Win=32768
12 0.03148 SOLARIS -> HPUX TCP D=59410 S=7001 Fin Ack=3769747462 Seq=1024958395 Len=0 Win=24820
13 4.22073 SOLARIS -> HPUX TCP D=59410 S=7001 Fin Ack=3769747462 Seq=1024956983 Len=1412 Win=24820
14 0.00047 HPUX -> SOLARIS TCP D=7001 S=59410 Ack=1024958396 Seq=3769747462 Len=0 Win=32768
15 0.00237 HPUX -> SOLARIS TCP D=7001 S=59410 Fin Ack=1024958396 Seq=3769747462 Len=0 Win=0
16 0.00004 SOLARIS -> HPUX TCP D=59410 S=7001 Ack=3769747463 Seq=1024958396 Len=0 Win=24820
So the delay also shows up on the Sun box, between lines 12 and 13. I guess the question is: should the HP box be sending an ACK to the first FIN, or is the Solaris box screwing things up by sending an out-of-order FIN, expecting an ACK to that FIN, and not transmitting the remaining payload until it gets one?
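To make the trace above easier to reason about, here is a rough Python sketch that replays the SOLARIS -> HPUX segments (packets 8 through 13) and the two HPUX ACKs (packets 11 and 14) and tracks how much sequence space is still unacknowledged after each one. The event tuples are transcribed by hand from the trace, the function name is made up, and 32-bit wraparound is ignored for simplicity.

```python
ISN = 1024955216  # Solaris initial sequence number, from packet 2

# Events in trace order, transcribed from the snoop output.
# ("seg", seq, payload length, FIN flag) is a SOLARIS -> HPUX segment;
# ("ack", ack number) is an HPUX -> SOLARIS acknowledgment.
events = [
    (8,  "seg", 1024955217, 306,  False),
    (9,  "seg", 1024955523, 1460, False),
    (10, "seg", 1024956983, 1412, False),
    (11, "ack", 1024956983),
    (12, "seg", 1024958395, 0,    True),
    (13, "seg", 1024956983, 1412, True),
    (14, "ack", 1024958396),
]


def unacked_after_each_event(events, isn):
    """Return (packet, sent_up_to, acked_up_to, outstanding) per event.

    All values are relative sequence numbers; a FIN counts as one.
    """
    sent_up_to = 1   # next relative sequence number not yet sent
    acked_up_to = 1  # highest cumulative ACK seen from the peer
    rows = []
    for ev in events:
        num, kind = ev[0], ev[1]
        if kind == "seg":
            seq, length, fin = ev[2], ev[3], ev[4]
            end = seq - isn + length + (1 if fin else 0)
            sent_up_to = max(sent_up_to, end)
        else:
            acked_up_to = max(acked_up_to, ev[2] - isn)
        rows.append((num, sent_up_to, acked_up_to, sent_up_to - acked_up_to))
    return rows


for num, sent, acked, outstanding in unacked_after_each_event(events, ISN):
    print(f"pkt {num:2d}: sent up to {sent}, acked up to {acked}, "
          f"outstanding {outstanding}")
```

After packet 12 the sketch reports 1413 outstanding sequence numbers (the 1412 payload bytes starting at relative 1767, plus the FIN), and that only drops to 0 with the ACK in packet 14.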
02-21-2003 10:19 AM
Re: HP-UX 11.0 TCP Problem
PHNE_22216:
1. 100BT is randomly missing packets.
2. System panics due to instruction page fault.
3. ER: Display capabilities of the switch after auto
negotiation completion in nettl log (informative).
4. ER: Provide driver revision in btlan4_ift_t_2 structure
for Q4 support.
5. ER:for Q4 (dump reading tool) : change the name of the
internal btlan structure.
6. ER: Startup scripts and conf files should show the cards
that they support.
7. netfmt gets SIGSEGV signal when HP_APA LAN_MONITOR is
active.
8. Cable disconnect for cards with AUI sends 3 events to APA
as a result of which APA fails.
9. 100BT card receive engine hangs resulting in discard of
all incoming packets.
Ron
02-21-2003 10:24 AM
Re: HP-UX 11.0 TCP Problem
You said that a FIN packet was lost. I don't see that. I see the same FIN's on both ends (HP and Sun) with about the same delay in between the first and second FIN. Am I missing what you're talking about?
Thanks
02-21-2003 11:53 AM
Re: HP-UX 11.0 TCP Problem
The HP side should never send an ACK for the "other" side of a hole in the sequence space. When the Solaris system sends that FIN that jumps ahead in the sequence space, the HP system must not ack that sequence number until it has received all the prior sequence numbers (modulo selective ACK, which isn't being used here...)
That we also see the packet missing in the Solaris trace implies a couple of possibilities:
1) The Solaris box dropped a segment somewhere between TCP and the tracing point (presumably in the driver). I'd check TCP and IP and link stats (netstat -k
2) The "missing" data was queued, waiting for the congestion window to open, the application called close(), and that triggered a FIN to be sent without also making sure that the queued, unsent data was "flushed" out to the network.
Some ways to explore that might include setting the initial congestion window to a larger value. I know that UX can be configured that way; look for a Solaris ndd setting similar to tcp_cwnd_initial and increase it. That could conceivably work around the bug with the FIN being sent before the queued data, but you would still want to find an appropriate Solaris patch.
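The "never ACK past a hole" rule can be sketched as a toy receiver in Python. This is only an illustration of cumulative ACKs without SACK, not how the HP-UX stack is written; the class and method names are invented, overlapping segments are not handled, and the segment numbers are the relative ones from the HP-side tcpdump in the first post.

```python
class ToyReceiver:
    """Tracks RCV.NXT and answers every segment with a cumulative ACK."""

    def __init__(self, rcv_nxt=1):
        self.rcv_nxt = rcv_nxt
        self.pending = {}  # start seq -> (length, fin) for buffered segments

    def receive(self, seq, length, fin=False):
        """Accept one segment and return the ACK number to send back."""
        if seq >= self.rcv_nxt:
            self.pending[seq] = (length, fin)
        # Advance only while the next expected sequence number is present.
        while self.rcv_nxt in self.pending:
            seg_len, seg_fin = self.pending.pop(self.rcv_nxt)
            self.rcv_nxt += seg_len + (1 if seg_fin else 0)
        # Forget buffered segments the cumulative ACK has already passed.
        self.pending = {s: v for s, v in self.pending.items()
                        if s >= self.rcv_nxt}
        return self.rcv_nxt


hp = ToyReceiver()
print(hp.receive(1, 306))               # 307
print(hp.receive(307, 1460))            # 1767
print(hp.receive(3176, 0, fin=True))    # 1767, the bare FIN is past a hole
print(hp.receive(1767, 1409, fin=True)) # 3177, the hole is filled and FIN counted
```

The third call is the interesting one: the bare FIN at 3176 cannot move the ACK past 1767, which matches the "ack 3177" only appearing once the 1767:3176 data finally arrives.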
02-21-2003 08:55 PM
Re: HP-UX 11.0 TCP Problem
You are right. I was just looking at the two FINs and didn't notice that the second FIN had data in it. I got used to the HP's tcpdump output with the nice P when it was sending data. Here I need to look at the Len. So you have narrowed it down to the Solaris screwing up since it appears to be generating an extra unneeded FIN before it finishes sending its data. It is time to hunt for a Solaris patch. You need to find out what version of Solaris you are running and what driver you are using and visit their site and see what you can find in patches for Ethernet, ARPA, network, TCP/IP or the driver.
For what it is worth TCP/IP is supposed to close after the data is all sent by
FIN
ack
fin
ACK
where the caps indicate one talker and the small letters indicate the other talker. The last data packet can start the closing sequence by including a FIN but you don't send a FIN until you have sent all of your data.
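The FIN / ack / fin / ACK exchange above corresponds to the close-side states in RFC 793. Here is a small Python sketch of those transitions. The table layout and function name are just one way to write it down, and it leaves out simultaneous close and the TIME_WAIT timeout.

```python
# (current state, event) -> next state, for an orderly close.
TRANSITIONS = {
    # The side that closes first (the "caps" talker above).
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # it also sends the final ACK here
    # The side that closes second (the "small letters" talker).
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # it also sends its ack here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(state, events):
    """Apply events in order; raise KeyError on a transition not in the table."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


print(run("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN"]))  # TIME_WAIT
print(run("ESTABLISHED", ["recv FIN", "send FIN", "recv ACK"]))  # CLOSED
```

In the traces in this thread the Solaris box is the first talker and the HP box is the second: HP acks the FIN, sends its own FIN, and gets the last ACK back.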
Ron
02-23-2003 05:12 AM
Re: HP-UX 11.0 TCP Problem
Ron
02-24-2003 11:22 AM