04-09-2006 05:07 PM
TCP requests dropped due to full queue
I work for a field operations group that sets up and tears down HP-UX 11i networks by the month. For years now we have had a problem with one application that operates over NFS. The application is just a GUI that reads and writes a flat DB file. It updates by reading the file every 30 seconds (I believe). So here is what happens:
As more and more computers mount to the NFS server there will come a time when the application on one of the systems hangs.
bdf will hang when trying to access this mount point. When I go to the server and look at the TCP statistics with 'netstat -ptcp' I see that there are many "connection requests dropped due to full queue". I thought that I had come up with a hard number of 51 active nfsd sockets, as shown by 'netstat | grep nfsd', when the problem occurs. But recently the problem has come up with fewer nfsd connections. These connections can be "ESTABLISHED", "CLOSE_WAIT" or "FIN_WAIT...".
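For tracking how many nfsd sockets sit in each state over time, a small script can tally the 'netstat | grep nfsd' output. This is only a sketch: the function name `count_nfsd_states` and the sample lines are made up for illustration, and it assumes the usual column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state) with the service shown by name as `.nfsd`.

```python
from collections import Counter


def count_nfsd_states(netstat_text, service="nfsd"):
    """Tally TCP socket states for sockets whose local address ends in .<service>."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_addr, state = fields[3], fields[5]
        if local_addr.endswith("." + service):
            counts[state] += 1
    return counts


# Hypothetical sample; host names and ports are invented.
SAMPLE = """\
tcp        0      0  srv.nfsd     client1.1021    ESTABLISHED
tcp        0      0  srv.nfsd     client2.1019    CLOSE_WAIT
tcp        0      0  srv.nfsd     client3.1023    CLOSE_WAIT
tcp        0      0  srv.telnet   client4.3021    ESTABLISHED
udp        0      0  *.nfsd       *.*
"""

if __name__ == "__main__":
    for state, n in sorted(count_nfsd_states(SAMPLE).items()):
        print(state, n)
```

Run periodically and logged with a timestamp, the tallies would show whether CLOSE_WAIT sockets keep accumulating before the hang.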
Originally all the mounts were NFSv3, so I tried to make all of them NFSv2 but that did not help either. I have many other servers, all with the same OS load, with many more NFS mounts to them that do not have this problem.
I have read many posts on how an application's call to listen() may not have enough 'backlog' space, but I cannot see how this would relate to an app that is just reading a file. It has no idea it is over NFS.
I tried increasing the connection_requests_max but that just means that the queue takes longer to fill before the stack throws requests away.
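The observation that a bigger queue only delays the drops can be checked with a toy model. The sketch below is not how the HP-UX stack works internally; it assumes a fixed number of connection requests arriving per tick and a fixed number accepted per tick, and all the numbers are invented.

```python
def ticks_until_first_drop(capacity, arrivals_per_tick, accepts_per_tick,
                           max_ticks=100_000):
    """Return the tick on which the queue first overflows, or None if it never does."""
    queued = 0
    for tick in range(1, max_ticks + 1):
        queued += arrivals_per_tick
        if queued > capacity:
            return tick  # at least one request had to be thrown away
        queued = max(0, queued - accepts_per_tick)
    return None


if __name__ == "__main__":
    # Arrivals outpace accepts by one per tick: a larger queue only postpones the drop.
    print(ticks_until_first_drop(20, 5, 4))
    print(ticks_until_first_drop(4096, 5, 4))
    # Accepts keep up: no drop even with the small queue.
    print(ticks_until_first_drop(20, 4, 5))
```

In this model, whenever requests arrive faster than they are serviced the queue overflows eventually for any finite size, which matches what I saw: the real question is why the requests are not being serviced.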
I did an experiment last year where I watched the sockets being established, then watched them go through all their states until they were cleared. NFSv2 took about 5 minutes and NFSv3 never cleared. That was why I initially thought it was an NFSv3 issue. But now that I am only using v2 on that system and still have the problem, I have ruled that out.
I have used many combinations of nfsstat, rpcinfo and netstat and still cannot find something that tells me why TCP is dropping all these requests. I tried increasing nfsd and biod (although I have read that setting biod to 0 on the client might be better) and neither helped. But I still think it is a server issue.
Today I discovered nfsstat -m. That shows me what version of NFS a mount is using and what protocol is being used. I found that all the NFS mounts that showed statistics from nfsstat -m said they were using UDP. All the ones that had no statistics showed TCP. I thought that under NFSv3, TCP was tried first and if a connection could not be made then NFS tried UDP. So I am confused on that one.
If anyone has made it this far in the post and thinks they have any ideas please let me know.
I would really like to know if there is a way to find out what is leaving the server TCP queue. If it is filling up, how do I find out what is in it and what the processes that are supposed to be servicing it are doing?
Anyways, that was many years of ranting. Sorry.
Thanks.
04-10-2006 04:38 AM
Re: TCP requests dropped due to full queue
You can increase the maximum number of kernel threads on the NFS server side.
Increase the "max_thread_proc" value.
Best Regards,
Sung
04-10-2006 11:37 AM
Re: TCP requests dropped due to full queue
Thanks.
04-10-2006 12:51 PM
Re: TCP requests dropped due to full queue
If you bump it and you still get connection requests being dropped, it suggests that your application is being prevented from calling accept() with sufficient frequency. Perhaps the time it takes to write() to the NFS-mounted DB file is at issue.
There is a "queue" per listen socket. It used to be possible on 10.20 to see which specific listen endpoints were filling but that information isn't in netstat with 11.0 and later. It would require unpublished internals knowledge. Still, probably worth submitting an ER via the RC.
Ah, I just went back and saw where your application is not accepting TCP connections, it is simply writing to a file. Then you can mostly disregard the above :) :( :)
Is there anything else besides NFS happening on the server? NFS mounts _should_ be fairly static - when over TCP the connections shouldn't be coming and going with any great rapidity. I don't know that an NFS server will initiate a connection close, I thought it was the client, but then the server has to deal with dead clients somehow. Just what is the rate of TCP connection establishment and tear-down?
Perhaps the disc(s) serving the filesystem(s) being exported by the NFS server are getting saturated and the nfsds et al. are backing up on that?
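One way to put numbers on the establishment and tear-down rate asked about above is to take two 'netstat -ptcp' snapshots a known interval apart and difference the counters. The sketch below assumes each statistic is printed as a count followed by a label; the labels other than "connection requests dropped due to full queue" and all sample values are assumptions to be checked against the real output, and the function names are hypothetical.

```python
import re

# Counter labels to track; adjust to match the actual netstat output.
COUNTERS = (
    "connection requests dropped due to full queue",
    "connection accepts",
    "connections closed",
)


def parse_tcp_counters(text):
    """Extract the tracked counters from 'netstat -p tcp' style text."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)", line)
        if not match:
            continue
        count, label = int(match.group(1)), match.group(2)
        for name in COUNTERS:
            if label.startswith(name):
                values[name] = count
    return values


def rates_per_second(before, after, seconds):
    """Per-second change of each counter present in both snapshots."""
    return {name: (after[name] - before[name]) / seconds
            for name in after if name in before}


# Invented snapshots taken 30 seconds apart.
BEFORE = """tcp:
\t1000 connection accepts
\t900 connections closed (including 12 drops)
\t40 connection requests dropped due to full queue
"""
AFTER = """tcp:
\t1030 connection accepts
\t915 connections closed (including 12 drops)
\t100 connection requests dropped due to full queue
"""

if __name__ == "__main__":
    rates = rates_per_second(parse_tcp_counters(BEFORE),
                             parse_tcp_counters(AFTER), 30)
    for name, rate in rates.items():
        print(f"{rate:6.2f}/s  {name}")
```

A drop rate that stays high while accepts stay near zero would point at the listener not being serviced rather than at a burst of new connections.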
04-12-2006 04:39 PM
Re: TCP requests dropped due to full queue
I found an old HP document that seems to reference my exact problem, and the date is appropriate also (we are using a 3rd party vendor load which is the Feb. 2001 release of 11i). It is titled "NFS Performance tuning for HPUX 11.0 and 11i systems" and is dated July 2002 (written by Dave Olker). On page 118 it states that there was a bug in the initial release of the 11i NFS server that makes the server stop responding to NFS/TCP requests after the max number of threads has been reached. It also references an "NFS Patch" that was released "in the Summer of 2001" to fix this. I have searched 'NFS' in the ITRC patch section and cannot find one that directly addresses this problem.
Rick, (and I am assuming that the hp logo in the forum means you work for HP) is there any way you can do a search from your side and see if you can find it? There was one in Sept. 2001 that talks about NFS deadlock and one from Apr. 2002 about threads and NFS, but neither references the TCP issue.
To Sung, there was a comment on the same page that a workaround is to increase max_thread_proc, so thank you for that suggestion. I rebuilt the kernel today and we will see what happens.
Another question if I may. Every mount to this server is done as follows:
mount -o vers=2 ... ...
BUT, the mount takes 75 seconds to complete and on the server I see 6 TCP connection requests dropped. Then, once the server times out, the connection is made with UDP. I thought that NFSv2 was only over UDP and only v3 tried TCP first, then dropped down to UDP. Why is "mount -o vers=2" trying TCP first?
Thank you.
04-12-2006 05:44 PM
Re: TCP requests dropped due to full queue
( SR:8606167053 CR:JAGad36339 )
An NFS/TCP client operation receives "NFS server not responding still trying" messages while attempting to access the server, even though the server system is up. The server displays "vmunix: WARNING: tcpd_thread_create: thread_create failed: 11" messages in /var/adm/syslog/syslog.log.
The patch with the fix does two things:
(1) prints the warning "vmunix: WARNING: tcpd_thread_create: thread_create failed: 11" in syslog.log when max_thread_proc limit is reached
(2) nfsktcpd doesn't stop servicing requests
This means that even with the patch installed (on your NFS server and all clients) you still need to adjust max_thread_proc (per-process limit) and possibly nkthread (system-wide limit).
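With the patch installed, the warning in syslog.log becomes the signal that max_thread_proc has been hit. A minimal sketch for counting those warnings follows; the function name and the sample log lines (timestamps, host name) are invented, and only the message text and the log path come from the patch description above.

```python
NEEDLE = "tcpd_thread_create: thread_create failed"


def find_thread_create_failures(syslog_text):
    """Return the syslog lines that report a failed nfsktcpd thread creation."""
    return [line for line in syslog_text.splitlines() if NEEDLE in line]


# Invented sample in the general shape of syslog.log entries.
SAMPLE_LOG = """\
Apr 12 10:01:02 nfssrv vmunix: WARNING: tcpd_thread_create: thread_create failed: 11
Apr 12 10:01:05 nfssrv inetd[812]: ftp/tcp: Connection from client1
Apr 12 10:03:40 nfssrv vmunix: WARNING: tcpd_thread_create: thread_create failed: 11
"""

if __name__ == "__main__":
    # On a real server, read /var/adm/syslog/syslog.log instead of SAMPLE_LOG.
    hits = find_thread_create_failures(SAMPLE_LOG)
    print(len(hits), "thread_create failures")
```

If the count keeps growing after raising max_thread_proc, the system-wide nkthread limit mentioned above would be the next thing to look at.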
04-13-2006 04:38 AM
Re: TCP requests dropped due to full queue
It just happens that HP-UX didn't include support for NFS over TCP until it included support for NFS v3. IIRC.
As for the patches, I'd be inclined to simply install whatever you found for latest NFS client and/or server patches - and their dependencies of course :)
04-13-2006 04:41 AM
Re: TCP requests dropped due to full queue
Must be on vacation - this is his domain - for sure.
NPP,
Jeff
04-15-2006 05:23 PM
Re: TCP requests dropped due to full queue
I am currently examining all the patches that relate to NFS for 11.11 for dependencies.
But why is it that if I search the patch database for HP-UX on 700s with 11.11 I do not get 23502 in the list? There are other patches in the list that have been superseded.
As a note, since I changed the max_thread_proc kernel value I have not had any problems. I am hoping this is not just because it takes longer to tie up 256 threads than 64.
04-16-2006 12:20 PM
Re: TCP requests dropped due to full queue
I found PHNE_24909, which makes three references to my issues:
1. SR 8606168123, JAGad37405, which talks about sockets stuck in the CLOSE_WAIT state (this I see a lot of),
2. SR 8606167053, JAGad36339, which talks about the inability of nfsktcpd to create new threads, and
3. SR 8606144478, JAGad13818, which references clients sending FIN signals, but nothing happens.
UNFORTUNATELY, this patch is for 11.00!
In PHNE_23502 (which is for 11.11) there is only a reference to number 2 above. How can I search the patch database for these SR numbers? I would really like to find out the number of the patch for 11.11 that references number 1 above. Is there some other place (on the web) where you can search through the full patch details?
Thanks again all.