Operating System - HP-UX


 
DeafFrog
Valued Contributor

timeouts and help needed in understanding tcpdump o/p

Hi ,

 

 

We sometimes see the message “timeout detected” in one of our applications running on HP-UX v3 IA (server B). This application connects to another application running on a different server (server A, also HP-UX v3 IA) on port 17010. I ran tcpdump on server A:

 

/usr/sbin/tcpdump -i lan0 -vv -X -n -s5000 ip host <SERVER-B_IP> | tee /tmp/tcpdump1306.txt

I don’t know how to interpret the output:

 

15:56:33.409272 IP (tos 0x0, ttl 63, id 50006, offset 0, flags [DF], proto TCP (6), length 343) 10.0.7.2.56475 > 10.0.0.3.17010: P, cksum 0x9ac8 (correct), 305:608(303) ack 372 win 32768

        0x0000:  4500 0157 c356 4000 3f06 5c46 0a00 0702  E..W.V@.?.\F....

        0x0010:  0a00 0003 dc9b 4272 3f41 97cb c188 c078  ......Br?A.....x

        0x0020:  5018 8000 9ac8 0000 3032 3939 4953 4f35  P.......0299ISO5

        0x0030:  3130 3030 3030 3031 3230 30f6 7561 01a8  10000001200.ua..

        0x0040:  e0a0 0000 0000 0004 0000 0831 3634 3634  ...........16464

        0x0050:  3432 3637 3030 3632 3735 3032 3730 3130  4267006275027010

        0x0060:  3030 3030 3030 3030 3030 3130 3030 3030  0000000000100000

        0x0070:  3030 3030 3030 3130 3030 3030 3632 3931  0000001000006291

        0x0080:  3135 3631 3230 3030 3030 3030 3131 3132  1561200000001112

        0x0090:  3236 3031 3130 3632 3931 3535 3630 3931  2601106291556091

        0x00a0:  3831 3230 3632 3936 3031 3135 3132 3230  8120629601151220

        0x00b0:  3030 3630 3330 3330 3130 3935 3132 3538  0060303010951258

        0x00c0:  3937 3430 3334 3436 3434 3236 3730 3036  9740344644267006

        0x00d0:  3237 3530 3237 3d31 3831 3231 3231 3136  275027=181212116

        0x00e0:  3532 3335 3739 3431 3030 3030 3030 3030  5235794100000000

        0x00f0:  3039 3430 3030 3030 3030 3030 3031 3233  0940000000000123

        0x0100:  3435 3637 3839 3132 3334 3534 3048 424d  4567891234540HBM

        0x0110:  455c 5341 4c41 4c41 4820 4252 204f 4e53  E\SALALAH.BR.ONS

        0x0120:  4954 4520 2020 5c53 414c 2032 2020 2020  ITE...\SAL.2....

        0x0130:  2020 2020 5c35 3132 3531 3231 3430 3130  ....\51251214010

        0x0140:  3430 3036 3237 3530 3030 3130 3039 3531  4006275000100951

        0x0150:  3235 3839 3436 34                        2589464

15:56:33.470025 IP (tos 0x0, ttl 64, id 57794, offset 0, flags [DF], proto TCP (6), length 40) 10.0.0.3.17010 > 10.0.7.2.56475: ., cksum 0x9b84 (correct), 372:372(0) ack 608 win 32768

 

1)     How can I get the tcpdump output above into a more readable format? Is there a better tcpdump syntax that writes human-readable output to a text file?

2)     Is there a better way to monitor the established TCP/IP connection and log why “timeout detected” sometimes happens?

 

The server load is normal (>60% idle CPU and enough memory), and the network team says the network is clean (as always).

 

 

Regards,

FrogIsDeaf
donna hofmeister
Trusted Contributor

Re: timeouts and help needed in understanding tcpdump o/p

it's probably a wee bit early to be running tcpdump.

 

instead, please do "netstat -p tcp" followed by a wait and then another "netstat -p tcp".

 

send back the results of the two netstats.  if you can do this when you're experiencing timeouts that would be good to do (and let us know).

 

btw -- have you brought this up with your network folks?  have they looked at the switch your machine is going through?

DeafFrog
Valued Contributor

Re: timeouts and help needed in understanding tcpdump o/p

 hi Donna ,

 

                    thanks for the reply .

 

 

$ netstat -p tcp
tcp:
        286508727 packets sent
                265879168 data packets (318488901961 bytes)
                1228840 data packets (1709353228 bytes) retransmitted
                12261656 ack-only packets (6500342 delayed)
                0 URG only packets
                4667 window probe packets
                4321 window update packets
                8367975 control packets
        149878308 packets received
                105770145 acks (for 318520746648 bytes)
                5647594 duplicate acks
                0 acks for unsent data
                59697353 packets (20943683623 bytes) received in-sequence
                0 completely duplicate packets (0 bytes)
                446 packets with some dup data (44769 bytes duped)
                47354 out of order packets (5384885 bytes)
                0 packets (0 bytes) of data after window
                2470 window probes
                6259983 window update packets
                52850 packets received after close
                0 segments discarded for bad checksum
                0 bad TCP segments dropped due to state change
        592013 connection requests
        3611891 connection accepts
        4203904 connections established (including accepts)
        4342456 connections closed (including 138879 drops)
        23417 embryonic connections dropped
        100546474 segments updated rtt (of 100546474 attempts)
        19110 retransmit timeouts
                0 connections dropped by rexmit timeout
        4667 persist timeouts
        20908 keepalive timeouts
                19883 keepalive probes sent
                0 connections dropped by keepalive
        0 connect requests dropped due to full queue
        2522 connect requests dropped due to no listener
        0 suspect connect requests dropped due to aging
        0 suspect connect requests dropped due to rate


============= after some time again ========
tcp:
        286567392 packets sent
                265920001 data packets (318508964322 bytes)
                1228853 data packets (1709357880 bytes) retransmitted
                12273600 ack-only packets (6509102 delayed)
                0 URG only packets
                4667 window probe packets
                4322 window update packets
                8373863 control packets
        149937897 packets received
                105806745 acks (for 318540801614 bytes)
                5647594 duplicate acks
                0 acks for unsent data
                59733725 packets (20955990091 bytes) received in-sequence
                0 completely duplicate packets (0 bytes)
                446 packets with some dup data (44769 bytes duped)
                47386 out of order packets (5386245 bytes)
                0 packets (0 bytes) of data after window
                2470 window probes
                6262497 window update packets
                52996 packets received after close
                0 segments discarded for bad checksum
                0 bad TCP segments dropped due to state change
        592736 connection requests
        3614083 connection accepts
        4206819 connections established (including accepts)
        4345501 connections closed (including 139011 drops)
        23449 embryonic connections dropped
        100580286 segments updated rtt (of 100580286 attempts)
        19120 retransmit timeouts
                0 connections dropped by rexmit timeout
        4667 persist timeouts
        20921 keepalive timeouts
                19894 keepalive probes sent
                0 connections dropped by keepalive
        0 connect requests dropped due to full queue
        2528 connect requests dropped due to no listener
        0 suspect connect requests dropped due to aging
        0 suspect connect requests dropped due to rate

FrogIsDeaf
donna hofmeister
Trusted Contributor

Re: timeouts and help needed in understanding tcpdump o/p

The thing with network statistics is to compare state #1 with state #2.  So using what you supplied, I get:

        58665 packets sent
                40833 data packets (20062361 bytes)
                13 data packets (4652 bytes) retransmitted
                11944 ack-only packets (8760 delayed)
                0 URG only packets
                0 window probe packets
                1 window update packets
                5888 control packets
        59589 packets received
                36600 acks (for 20054966 bytes)
                0 duplicate acks
                0 acks for unsent data
                36372 packets (12306468 bytes) received in-sequence
                0 completely duplicate packets (0 bytes)
                0 packets with some dup data (0 bytes duped)
                32 out of order packets (1360 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                2514 window update packets
                146 packets received after close
                0 segments discarded for bad checksum
                0 bad TCP segments dropped due to state change
        723 connection requests
        2192 connection accepts
        2915 connections established (including accepts)
        3045 connections closed (including 132 drops)
        32 embryonic connections dropped
        33812 segments updated rtt (of 33812 attempts)
        10 retransmit timeouts
                0 connections dropped by rexmit timeout
        0 persist timeouts
        13 keepalive timeouts
                11 keepalive probes sent
                0 connections dropped by keepalive
        0 connect requests dropped due to full queue
        6 connect requests dropped due to no listener
        0 suspect connect requests dropped due to aging
        0 suspect connect requests dropped due to rate

 the number of retransmit packets is very small relative to the number of transmitted packets.

 

to my partially-trained eyes, the only thing that's somewhat interesting is the number of packets retransmitted (13) to the number of retransmit timeouts (10).  with the numbers being so small it's hard to say if they're meaningful or not.
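For anyone who wants to repeat the state #1 / state #2 subtraction without doing it by hand, here is a minimal Python sketch. It assumes both snapshots are saved `netstat -p tcp` text with the same counters in the same order; the function names `parse_counters` and `diff_counters` are made up for this example, and the three sample lines are taken from the two snapshots posted above.

```python
import re

# Matches every integer counter on a line of `netstat -p tcp` output.
_NUM = re.compile(r"\d+")

def parse_counters(text):
    """Return an ordered list of (label, [numbers]) for each counter line."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        nums = [int(n) for n in _NUM.findall(line)]
        if nums:  # skips the "tcp:" header and blank lines
            # Blank out the numbers so the label is identical in both snapshots.
            rows.append((_NUM.sub("#", line), nums))
    return rows

def diff_counters(before, after):
    """Subtract snapshot 1 from snapshot 2, line by line (order must match)."""
    out = []
    for (label_a, nums_a), (label_b, nums_b) in zip(parse_counters(before),
                                                    parse_counters(after)):
        if label_a != label_b:
            raise ValueError(f"snapshots differ in layout: {label_a!r} vs {label_b!r}")
        out.append((label_a, [y - x for x, y in zip(nums_a, nums_b)]))
    return out

# Three counters copied from the two snapshots in this thread.
before = """tcp:
        286508727 packets sent
                1228840 data packets (1709353228 bytes) retransmitted
        19110 retransmit timeouts
"""
after = """tcp:
286567392 packets sent
1228853 data packets (1709357880 bytes) retransmitted
19120 retransmit timeouts
"""
for label, delta in diff_counters(before, after):
    print(label, delta)
# # packets sent [58665]
# # data packets (# bytes) retransmitted [13, 4652]
# # retransmit timeouts [10]
```

The rows are compared by position rather than by label because "window update packets" appears under both the sent and the received sections, so a label alone is not a unique key.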

 

is there anything in /etc/rc.config.d/nddconf?

have you talked to your network people?

DeafFrog
Valued Contributor

Re: timeouts and help needed in understanding tcpdump o/p

 

Thanks(again).

 

Yes, I did have a chat with the network people; the port the server NIC is connected to looks good to them. The timeouts are unpredictable but they do happen, though not very frequently. One suggested solution was to run tcpdump in the background, with a cron job to null out the resulting log file until we are hit by a timeout again. But the question is: is there a way to decipher the tcpdump output into a somewhat human-readable form (conversion from hex to ?).

 

Regards,

FrogIsDeaf
donna hofmeister
Trusted Contributor

Re: timeouts and help needed in understanding tcpdump o/p

you might be able to read it in Ethereal/Wireshark.  i know that nettl output can be read with this tool.
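If Wireshark is not at hand, the hex column of a `tcpdump -X` text dump can also be turned back into bytes and decoded by hand. The following is only a sketch under a few assumptions: the dump is IPv4 carrying TCP, it was taken with `-X` (so there is no link-level header in front of the IP header), and the helper names `hexdump_to_bytes` and `decode_ipv4_tcp` are invented for this example. The sample rows are the first three rows of the packet posted at the top of the thread.

```python
import re
import struct

# One row of `tcpdump -X`: offset, hex groups separated by single spaces, then ASCII.
_ROW = re.compile(r"^\s*0x[0-9a-f]{4}:\s+((?:[0-9a-f]{2,4}\s)+)", re.I)

def hexdump_to_bytes(dump):
    """Collect the hex columns of a -X dump back into raw packet bytes."""
    out = bytearray()
    for line in dump.splitlines():
        m = _ROW.match(line)
        if m:
            out += bytes.fromhex("".join(m.group(1).split()))
    return bytes(out)

def decode_ipv4_tcp(pkt):
    """Decode the fixed IPv4 and TCP header fields; return them with the payload."""
    ver_ihl, _tos, total_len, ident, _frag, ttl, proto = struct.unpack("!BBHHHBB", pkt[:10])
    if ver_ihl >> 4 != 4 or proto != 6:
        raise ValueError("this sketch only handles IPv4 carrying TCP")
    ihl = (ver_ihl & 0x0F) * 4                      # IP header length in bytes
    sport, dport, seq, ack, off_flags, window = struct.unpack(
        "!HHIIHH", pkt[ihl:ihl + 16])
    data_off = (off_flags >> 12) * 4                # TCP header length in bytes
    return {
        "src": ".".join(str(b) for b in pkt[12:16]),
        "dst": ".".join(str(b) for b in pkt[16:20]),
        "sport": sport, "dport": dport,
        "ttl": ttl, "ident": ident, "total_len": total_len,
        "seq": seq, "ack": ack,                     # absolute, not tcpdump's relative numbers
        "flags": off_flags & 0x3F,                  # 0x10 = ACK, 0x08 = PSH
        "window": window,
        "payload": pkt[ihl + data_off:total_len],
    }

# First three rows of the packet from the original post (payload is cut short).
sample = r"""
        0x0000:  4500 0157 c356 4000 3f06 5c46 0a00 0702  E..W.V@.?.\F....
        0x0010:  0a00 0003 dc9b 4272 3f41 97cb c188 c078  ......Br?A.....x
        0x0020:  5018 8000 9ac8 0000 3032 3939 4953 4f35  P.......0299ISO5
"""
fields = decode_ipv4_tcp(hexdump_to_bytes(sample))
print(f'{fields["src"]}:{fields["sport"]} -> {fields["dst"]}:{fields["dport"]} '
      f'ttl={fields["ttl"]} id={fields["ident"]} len={fields["total_len"]}')
print(fields["payload"].decode("ascii", errors="replace"))
# 10.0.7.2:56475 -> 10.0.0.3:17010 ttl=63 id=50006 len=343
# 0299ISO5
```

The decoded values line up with the summary line tcpdump printed for that packet (ttl 63, id 50006, length 343, ports 56475 and 17010), which is a handy sanity check that the hex is being read correctly. Everything after byte 40 is the application's own message, so making sense of it needs knowledge of the application protocol rather than of TCP/IP.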

Doug O'Leary
Honored Contributor

Re: timeouts and help needed in understanding tcpdump o/p

Hey;

 

There is a way to use tcpdump to monitor your network traffic; however, you *really* want to limit what you're capturing.  With the arguments you gave above, you're basically collecting everything from the IP layer up for every host your system can see or hear.  You don't necessarily want all that - particularly since it will produce a *huge* amount of noise that you'll have to go through.

 

So, try this out first:

 

at now 

tcpdump -w ${dir}/timeout -C 500 -s 1564 "(host ${host_a} or host ${host_b}) and tcp port 17010"

^D

 

Run that tcpdump via an at/now script.  Make sure ${dir} has *lots* of space.  You'll be recording files called timeout, timeout1, timeout2, timeout3 ... timeout## where each file will be roughly 500 meg.  The -s 1564 snap length keeps up to 1564 bytes of each packet, and the capture filter matches traffic coming from or going to either host A or host B with either a destination or source port of 17010.   That *should* catch only the information you're looking for.  
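Since the timeouts are rare, the capture may have to run for days, and the disk-space warning above becomes the real problem. One way to handle it is a small cleanup job, run from cron, that keeps only the newest few capture files. The Python below is a sketch of that idea, not a tested HP-UX tool: `prune_captures`, the `keep` count and the file names in the demonstration are all assumptions for illustration. Newer tcpdump builds also have a `-W filecount` option that turns `-C` into a ring buffer; whether your HP-UX build supports it is something to check in its man page.

```python
import os
import tempfile
from pathlib import Path

def prune_captures(directory, prefix="timeout", keep=4):
    """Delete all but the `keep` most recently modified capture files.

    Returns the sorted names of the files that were removed.
    """
    files = sorted(
        (p for p in Path(directory).iterdir()
         if p.is_file() and p.name.startswith(prefix)),
        key=lambda p: p.stat().st_mtime,
        reverse=True,                      # newest first
    )
    removed = []
    for old in files[keep:]:               # everything past the newest `keep`
        old.unlink()
        removed.append(old.name)
    return sorted(removed)

# Demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for age, name in enumerate(["timeout", "timeout1", "timeout2", "timeout3"]):
        path = Path(tmp, name)
        path.write_bytes(b"")
        os.utime(path, (1000 + age, 1000 + age))   # later files get newer mtimes
    removed = prune_captures(tmp, keep=2)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
print(removed)    # ['timeout', 'timeout1']
print(remaining)  # ['timeout2', 'timeout3']
```

Because the file tcpdump is currently writing always has the newest modification time, it is never among the ones removed as long as `keep` is at least 1. When a timeout does hit, stop the cleanup job first so the files covering that moment are not pruned.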

 

You'll also want to either install Wireshark or move these tcpdump files over to a system that has Wireshark installed.  You can read the pcap files (what the dump files are called, generically) with tcpdump -r; however, the Wireshark GUI provides a rather large number of bells and whistles that'll make your analysis life a lot easier.  
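As a first pass before opening anything in a GUI, it can help to scan a capture for long silences between packets, since an application timeout usually shows up as a pause in the conversation. The sketch below reads the classic pcap file layout (24-byte file header, then a 16-byte header in front of each packet) directly with Python's `struct` module. The function names `iter_pcap_records` and `find_gaps` and the one-second threshold are assumptions for this example, it does not handle the newer pcapng format, and the demonstration data is synthetic rather than a real capture.

```python
import struct

def iter_pcap_records(data):
    """Yield (timestamp_seconds, packet_bytes) for each record in a classic pcap file."""
    magic = data[:4]
    if magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"              # file was written by a big-endian host
    elif magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"              # file was written by a little-endian host
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    pos = 24                      # skip the 24-byte global header
    while pos + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[pos:pos + 16])
        pos += 16
        yield ts_sec + ts_usec / 1e6, data[pos:pos + incl_len]
        pos += incl_len

def find_gaps(data, threshold=1.0):
    """Return (before, after) timestamp pairs more than `threshold` seconds apart."""
    gaps, prev = [], None
    for ts, _pkt in iter_pcap_records(data):
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts))
        prev = ts
    return gaps

# Synthetic three-packet capture: packets at t=100.0, 100.5 and 103.5 seconds.
demo = struct.pack(">IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 1564, 1)
for sec, usec in [(100, 0), (100, 500000), (103, 500000)]:
    demo += struct.pack(">IIII", sec, usec, 4, 4) + b"abcd"
print(find_gaps(demo, threshold=1.0))
# [(100.5, 103.5)]
```

On a real capture you would pass `open(path, "rb").read()` for one of the timeout files and then look at the packets on either side of each reported gap to see which host went quiet.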

 

Even with that, though, the tcp/ip protocol stack is quite involved.  If you're looking for something quick, you should probably hand that off to a network admin guy or the SA that can do binary math in his head but has a large problem talking to real people.  

 

If that capture filter doesn't catch what you need, you can broaden it by dropping the port restriction.  

 

Anyway; you're about to dive into some *very* deep water; however, there's lots of cool stuff here, so go ahead and jump... just realize that you have a significant learning curve.

 

HTH;

 

Doug

 


------
Senior UNIX Admin
O'Leary Computers Inc
linkedin: http://www.linkedin.com/dkoleary
Resume: http://www.olearycomputers.com/resume.html