06-21-2016 12:56 PM
Question about tcp_keepalive_interval (user sessions dropping)
Hello,
We use Hummingbird Host Explorer to connect to an HP RP3410 running HP-UX 11.11.
User sessions are dropping. Sometimes inactive sessions, sometimes active sessions.
I'm wondering if the tcp_keepalive_interval setting would help keep these sessions from dropping?
7,200,000 is the default value, equal to 120 minutes.
Our server is set to 1,800,000, or 30 minutes.
The part I don't get (and this could be a silly question) is whether I want this value set high or low to make sure the server and the users on Host Explorer stay connected all day and don't get dropped.
Thanks much!
06-21-2016 02:40 PM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Need a bit more information. I assume that this program is a high-end terminal emulator using telnet or ssh, correct? How are the users interacting with the system: a simple shell running various commands like ps, bdf or vi? Or are they running some menu program? Did the sysadmin set a shell timeout (hint: echo $TMOUT) for automatic logout of idle sessions? Are there error messages in syslog pointing to these disconnects? Does dmesg report anything about networking?
Bill Hassell, sysadmin
06-22-2016 06:23 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
OK, Hummingbird Host Explorer is a terminal emulator that allows the users at the remote location to connect to the Unix server using telnet.
The remote users see a custom menu system that lets them run some business software. They don't get a command prompt or run Unix commands. The users log on in the morning and stay on the system all day.
I can log on as root from work and from home using the same Hummingbird software and stay on all day with no dropped connections. My laptop even went into sleep mode once and I still stayed connected to the server.
TMOUT is set to zero.
Syslog is clear, dmesg is clear. I turned on nettl logging / netfmt and that is clear. There isn't a single clue on the server itself pointing to any problems. Everything on the server looks happy. I have a feeling this might turn out to be more of a networking problem at the remote location instead of something wrong with the server.
Just thought I would try tinkering with tcp_keepalive_interval. I just can't tell if a big or small number (say 7,200,000 vs 1,800,000) would help?
Thanks.
06-22-2016 06:54 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
I found this bit of info:
"By default keepalive is set to 7,200,000. This means that every two hours the server tests the idle TCP connection by pinging the client. If the server gets no response from the client the keepalive terminates the idle connection."
Based on that I changed the value from 1,800,000 to 7,200,000. That way it will check every two hours instead of every 30 minutes, since we don't want idle connections dropped. Will see if that helps.
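For context, TCP keepalive probes are generally only sent on sockets that have asked for them with the SO_KEEPALIVE option; a system-wide interval like the one quoted above then decides how long such a connection has to sit idle before it gets probed. A minimal Python sketch of the socket side, for illustration only (it is not taken from telnetd or Host Explorer, and `make_keepalive_socket` is a made-up helper name):

```python
import socket

def make_keepalive_socket():
    """Return a TCP socket with keepalive probing switched on.

    Hypothetical helper for illustration; how long the connection must be
    idle before a probe is sent is left to the operating system's settings.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock

sock = make_keepalive_socket()
# Read the option back: a non-zero value means keepalive is enabled
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print("keepalive enabled:", keepalive_on)
```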
06-22-2016 08:04 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Using ping as a connection tester is crude at best, especially if a single missed ping counts as a failure. Ping, unlike a TCP connection, ignores dropped packets; that is, it will not retry. So a single missed ping is a bad test for connectivity, and since your problem connections are remote, occasional dropped packets may be completely normal.
So a very long tcp_keepalive_interval would be recommended for wide area connections.
Bill Hassell, sysadmin
06-22-2016 08:18 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Please run "netstat -s", wait 5 minutes, then run it again. Please post the results here.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-22-2016 09:29 AM
06-22-2016 09:29 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Here goes, netstat -s
tcp:
4447178 packets sent
3213910 data packets (786369784 bytes)
37563 data packets (4911575 bytes) retransmitted
1233466 ack-only packets (1158742 delayed)
0 URG only packets
0 window probe packets
2 window update packets
442587 control packets
4035776 packets received
2595631 acks (for 788833202 bytes)
1610 duplicate acks
0 acks for unsent data
2272119 packets (231529981 bytes) received in-sequence
1 completely duplicate packet (119 bytes)
87 packets with some dup, data (32057 bytes duped)
8151 out of order packets (5139464 bytes)
27 packets (3284241333 bytes) of data after window
0 window probes
17108 window update packets
18 packets received after close
7 segments discarded for bad checksum
0 bad TCP segments dropped due to state change
59830 connection requests
17612 connection accepts
77442 connections established (including accepts)
123247 connections closed (including 45820 drops)
44865 embryonic connections dropped
2428882 segments updated rtt (of 2428882 attempts)
211803 retransmit timeouts
44754 connections dropped by rexmit timeout
0 persist timeouts
158864 keepalive timeouts
152585 keepalive probes sent
87 connections dropped by keepalive
0 connect requests dropped due to full queue
1898 connect requests dropped due to no listener
0 suspect connect requests dropped due to aging
0 suspect connect requests dropped due to rate
udp:
0 incomplete headers
0 bad checksums
0 socket overflows
ip:
4307267 total packets received
0 bad IP headers
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
0 packets forwarded
0 packets not forwardable
icmp:
79 calls to generate an ICMP error message
0 ICMP messages dropped
Output histogram:
echo reply: 78
destination unreachable: 1
source quench: 0
routing redirect: 0
echo: 0
time exceeded: 0
parameter problem: 0
time stamp: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
0 bad ICMP messages
Input histogram:
echo reply: 55753
destination unreachable: 12
source quench: 0
routing redirect: 0
echo: 78
time exceeded: 0
parameter problem: 0
time stamp request: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
78 responses sent
igmp:
0 messages received
0 messages received with too few bytes
0 messages received with bad checksum
0 membership queries received
0 membership queries received with incorrect fields(s)
0 membership reports received
0 membership reports received with incorrect field(s)
0 membership reports received for groups to which this host belongs
0 membership reports sent
06-22-2016 09:38 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
And five minutes later:
tcp:
4448640 packets sent
3214860 data packets (786533237 bytes)
37569 data packets (4911581 bytes) retransmitted
1233978 ack-only packets (1159250 delayed)
0 URG only packets
0 window probe packets
2 window update packets
442623 control packets
4036936 packets received
2596376 acks (for 788996672 bytes)
1610 duplicate acks
0 acks for unsent data
2272784 packets (231531798 bytes) received in-sequence
1 completely duplicate packet (119 bytes)
87 packets with some dup, data (32057 bytes duped)
8151 out of order packets (5139464 bytes)
27 packets (3284241333 bytes) of data after window
0 window probes
17111 window update packets
18 packets received after close
7 segments discarded for bad checksum
0 bad TCP segments dropped due to state change
59833 connection requests
17615 connection accepts
77448 connections established (including accepts)
123256 connections closed (including 45822 drops)
44867 embryonic connections dropped
2429609 segments updated rtt (of 2429609 attempts)
211821 retransmit timeouts
44756 connections dropped by rexmit timeout
0 persist timeouts
158879 keepalive timeouts
152600 keepalive probes sent
87 connections dropped by keepalive
0 connect requests dropped due to full queue
1898 connect requests dropped due to no listener
0 suspect connect requests dropped due to aging
0 suspect connect requests dropped due to rate
udp:
0 incomplete headers
0 bad checksums
0 socket overflows
ip:
4308442 total packets received
0 bad IP headers
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
0 packets forwarded
0 packets not forwardable
icmp:
79 calls to generate an ICMP error message
0 ICMP messages dropped
Output histogram:
echo reply: 78
destination unreachable: 1
source quench: 0
routing redirect: 0
echo: 0
time exceeded: 0
parameter problem: 0
time stamp: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
0 bad ICMP messages
Input histogram:
echo reply: 55757
destination unreachable: 12
source quench: 0
routing redirect: 0
echo: 78
time exceeded: 0
parameter problem: 0
time stamp request: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
78 responses sent
igmp:
0 messages received
0 messages received with too few bytes
0 messages received with bad checksum
0 membership queries received
0 membership queries received with incorrect fields(s)
0 membership reports received
0 membership reports received with incorrect field(s)
0 membership reports received for groups to which this host belongs
0 membership reports sent
06-22-2016 09:53 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Thanks for the info on netstat -s, I've never tried that particular version of that command. :)
06-23-2016 08:40 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Thanks!
The thing with netstat results is that looking at them once is pretty meaningless. However, running it twice like I asked begins to give a picture of what's happening on the system. When you subtract the first set of numbers from the second (later) set, you can see the delta, i.e. the change.
The other piece of the puzzle is this white paper: http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=c02020743&lang=en-us&cc=us (Appendix A in particular).
Here are the tcp deltas:
tcp:
1462 packets sent
950 data packets (163453 bytes)
6 data packets (6 bytes) retransmitted
512 ack-only packets (508 delayed)
0 URG only packets
0 window probe packets
0 window update packets
36 control packets
1160 packets received
745 acks (for 163470 bytes)
0 duplicate acks
0 acks for unsent data
665 packets (1817 bytes) received in-sequence
0 completely duplicate packet (0 bytes)
0 packets with some dup, data (0 bytes duped)
0 out of order packets (0 bytes)
0 packets (0 bytes) of data after window
0 window probes
3 window update packets
0 packets received after close
0 segments discarded for bad checksum
0 bad TCP segments dropped due to state change
3 connection requests
3 connection accepts
6 connections established (including accepts)
9 connections closed (including 2 drops)
2 embryonic connections dropped
727 segments updated rtt (of 727 attempts)
18 retransmit timeouts
2 connections dropped by rexmit timeout
0 persist timeouts
15 keepalive timeouts
15 keepalive probes sent
0 connections dropped by keepalive
0 connect requests dropped due to full queue
0 connect requests dropped due to no listener
0 suspect connect requests dropped due to aging
0 suspect connect requests dropped due to rate
At the time you ran netstat, it appears there wasn't a lot happening (the numbers above are kinda small), so I don't think there are any "Ah! Ha!" moments. Having said that, you can see that there might be some "interesting" numbers. In particular, 2 sessions were dropped. Why? It could be that the system could no longer reach <whatever> and so closed the connection. Why was <whatever> no longer there? Someone could have closed their laptop and left the building. It could also be that someone thought their session was hung (when it was really unresponsive) and ungracefully exited ("x-ing" out of Hummingbird).
My advice is
- find a time when the system has more users on it
- run netstat -s
- <wait> (how long isn't really important, but at least 5 minutes)
- run netstat -s again
Do <something> to compare the numbers (drop them into Excel? diff?) and check the results against the discussion in the above white paper.
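One possible way to do the comparison step is a small script. The following is a rough Python sketch, not a polished tool: the function names `parse_counters` and `counter_deltas` are made up for this example, it only tracks the leading number on each line (not the byte counts in parentheses), and the sample text is a handful of lines copied from the two netstat -s outputs posted earlier in this thread.

```python
import re

def parse_counters(text):
    """Map (section, label) to the leading number of each counter line."""
    counters = {}
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":") and not line[0].isdigit():
            section = line[:-1]  # e.g. "tcp", "udp", "ip"
            continue
        match = re.match(r"(\d+)\s+(.+)", line)
        if not match:
            continue  # skips blank lines and "name: value" histogram lines
        # Replace embedded numbers so the label is identical in both runs
        label = re.sub(r"\d+", "N", match.group(2))
        key = (section, label)
        n = 1
        while key in counters:  # some labels repeat (sent vs. received)
            n += 1
            key = (section, f"{label} #{n}")
        counters[key] = int(match.group(1))
    return counters

def counter_deltas(before, after):
    """Return only the counters that changed between two netstat -s runs."""
    b, a = parse_counters(before), parse_counters(after)
    return {k: a[k] - b[k] for k in a if k in b and a[k] != b[k]}

first = """tcp:
4447178 packets sent
37563 data packets (4911575 bytes) retransmitted
211803 retransmit timeouts
87 connections dropped by keepalive
"""
second = """tcp:
4448640 packets sent
37569 data packets (4911581 bytes) retransmitted
211821 retransmit timeouts
87 connections dropped by keepalive
"""

deltas = counter_deltas(first, second)
for (section, label), change in deltas.items():
    print(f"{section}: {change:+d} {label}")
```

In practice the two strings would come from saved netstat -s output files; counters that did not change between the runs are left out of the result.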
Finally, keep in mind that on a really busy network the issue is NOT your system; rather, it's the network itself. Think of trying to get onto the interstate during rush hour: that's exactly what's happening with your packets. Armed with a netstat analysis, it may be possible to go to your networking team and say "Hey!" (As wonderful as networking teams can be, they do seem to have a reputation for saying "there are no networking problems" :-)
06-24-2016 06:15 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
Thanks for the links to the white papers. Will give them a read. This server barely gets any use (18 users max), which is another reason why I don't get why the remote user sessions drop. It isn't from a heavy load.
Will keep track of netstat -s and keep an eye on the interesting numbers you mentioned.
Here's a fresh one from today, June 24:
------------------------------------------------------------------------------------------------------------------------
tcp:
4546858 packets sent
3285808 data packets (807571499 bytes)
39086 data packets (5025188 bytes) retransmitted
1261252 ack-only packets (1184862 delayed)
0 URG only packets
0 window probe packets
2 window update packets
452435 control packets
4127369 packets received
2651991 acks (for 810062267 bytes)
1651 duplicate acks
0 acks for unsent data
2323447 packets (237362168 bytes) received in-sequence
1 completely duplicate packet (119 bytes)
89 packets with some dup, data (33129 bytes duped)
8339 out of order packets (5251771 bytes)
27 packets (3284241333 bytes) of data after window
0 window probes
17433 window update packets
18 packets received after close
7 segments discarded for bad checksum
0 bad TCP segments dropped due to state change
60942 connection requests
17996 connection accepts
78938 connections established (including accepts)
125648 connections closed (including 46724 drops)
45686 embryonic connections dropped
2480824 segments updated rtt (of 2480824 attempts)
216572 retransmit timeouts
45582 connections dropped by rexmit timeout
0 persist timeouts
162795 keepalive timeouts
156413 keepalive probes sent
149 connections dropped by keepalive
0 connect requests dropped due to full queue
2159 connect requests dropped due to no listener
0 suspect connect requests dropped due to aging
0 suspect connect requests dropped due to rate
udp:
0 incomplete headers
0 bad checksums
0 socket overflows
ip:
4405083 total packets received
0 bad IP headers
0 fragments received
0 fragments dropped (dup or out of space)
0 fragments dropped after timeout
0 packets forwarded
0 packets not forwardable
icmp:
79 calls to generate an ICMP error message
0 ICMP messages dropped
Output histogram:
echo reply: 78
destination unreachable: 1
source quench: 0
routing redirect: 0
echo: 0
time exceeded: 0
parameter problem: 0
time stamp: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
0 bad ICMP messages
Input histogram:
echo reply: 56777
destination unreachable: 14
source quench: 0
routing redirect: 0
echo: 78
time exceeded: 0
parameter problem: 0
time stamp request: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
78 responses sent
igmp:
0 messages received
0 messages received with too few bytes
0 messages received with bad checksum
0 membership queries received
0 membership queries received with incorrect fields(s)
0 membership reports received
0 membership reports received with incorrect field(s)
0 membership reports received for groups to which this host belongs
0 membership reports sent
06-24-2016 07:44 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
I don't like that you have TCP checksum errors, but the number is so low that it's likely the "frankengram" scenario.
Your retransmit rate is about 4%, which may be enough for this application to have issues. I still urge you to raise this with your networking team. Perhaps they can do some tuning on their side.
06-27-2016 06:12 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
I think the problem is on the network at the remote location, the user's PC, or something with the Hummingbird Host Explorer software. I just don't see anything on the Unix server I can fix to help with this.
Learned a lot about netstat and ndd though, so thanks everyone!
06-28-2016 02:01 AM
Re: Question about tcp_keepalive_interval (user sessions dropping)
https://confluence.eits.uga.edu/display/HDSH/Hummingbird+Issues
describes, in its "Keep Alive Signal" section, how to set up a keepalive signal sent from the client side. This would be the perfect solution, I suppose, though tcp_keepalive_interval can also be set as high as 10*24*3600000.
If the keepalive is sent from the Hummingbird side, the session should never time out for being idle.
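To put the interval values mentioned in this thread side by side, here is a small Python sketch that converts them to more readable units. It assumes the values are milliseconds, as the thread treats them; `describe_interval` is just an illustrative helper name.

```python
MS_PER_MINUTE = 60 * 1000

def describe_interval(ms):
    """Convert a millisecond interval into minutes, hours and days."""
    minutes = ms / MS_PER_MINUTE
    hours = minutes / 60
    days = hours / 24
    return {"minutes": minutes, "hours": hours, "days": days}

# The three values discussed in the thread
for value in (1_800_000, 7_200_000, 10 * 24 * 3_600_000):
    d = describe_interval(value)
    print(f"{value:>11,} ms = {d['minutes']:g} min = {d['hours']:g} h = {d['days']:g} days")
```

So 1,800,000 is 30 minutes, 7,200,000 is 2 hours, and 10*24*3600000 works out to 10 days.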