
FTP session in CLOSE_WAIT - never exits

 
J117
Occasional Advisor


I am transferring a large number of files across our WAN using FTP. The files are sent one at a time with a new ftp session created for each file (mput * does not work because of the number of files in the source directory).

The files are small, 50 KB to 90 KB each.

The script will chug along and successfully FTP anywhere from 300 to 5000 files before it hangs in the FTP session. A netstat -a reveals the following:

# netstat -a | grep tanort
tcp 118 0 ussc50.4443 tanort-ap02.thcg.net.ftp CLOSE_WAIT

It will stay in this state indefinitely. The above scenario occurs when the source system is Tru64 Unix 5.1A and the target system is Windows 2003 server. The FTP is initiated from the Tru64 side.

If, on the other hand, the source system is Tru64 and the target system is also Tru64, it doesn't hang, but the problem manifests itself in a different way: the target system ends up with an empty file and the script eventually moves on to the next FTP. Subsequent FTPs succeed for an unknown, variable number of iterations until the problem is encountered again.

A pseudo code representation of what I am experiencing looks like this:

Loop A
---Open FTP connect
---put filename
---close
---get next filename
End loop A - repeat until there are no more filenames.

Tru64 -> Windows results in indefinite CLOSE_WAIT

Tru64 -> Tru64 CLOSE_WAIT eventually closes but target file is 0 bytes.

Right now I don't know if the problem is in the network (router/switch) or on the Tru64 box and/or Windows box.

As a side note: I notice that FTP connections that are opened and then closed do not actually disappear from netstat -a for approximately 90 seconds. I tried adding a 90-second delay every 100 files, thinking that perhaps I was exhausting the connection limit. Although this did reset the number of open connections from 100 to 0 every 100 files, it did not prevent the problem from eventually occurring.

This last factoid may be irrelevant to the situation.
Steven Schweda
Honored Contributor

Re: FTP session in CLOSE_WAIT - never exits

I can't explain the problem, but why
open-close for every file instead of just
open-put-put-put-...-put-close? (Not slow
enough the easy way?)

Without more debug data from the side with
the problem, it may be difficult to determine
where the problem lies. The network hardware
is probably the least likely trouble-maker,
I'd guess.

If I had to move a large number of files,
I'd tend to tar-bzip2/gzip (or Zip) them,
ship the collection, and un-whatever the
pile at the other end. That tends to
preserve the date-times better, too. Why
are you trying to do it this way?
J117
Occasional Advisor

Re: FTP session in CLOSE_WAIT - never exits

Steven Schweda wrote:
I can't explain the problem, but why
open-close for every file instead of just
open-put-put-put-...-put-close? (Not slow
enough the easy way?)
---end of quote

The quantity of files being moved is in excess of 174,000. The open-put-put-put method might actually work except that I need to be able to process the file names individually in order to bypass the limit that results in arglist too long errors. I'm not sure how to do that from inside of ftp. There is not to my knowledge a method to specify some sort of xargs redirection from inside of ftp. If there is I'm certainly all ears.

While I could probably do some awk'ing to produce a 174,000-line ftp script file, that does seem like a bit of a brute-force method.

I also wish to identify if there is something specifically wrong in the environment.

This is not the first time I have seen behavior similar to this - the most common culprit has been a duplex mismatch between one of the hosts and the switch port - however the network group assures me that the Windows port is clean, and the unix netstat command shows clean stats as well.

Space constraints make the generation of a gzip file problematic.

Thanks for the suggestions.
Steven Schweda
Honored Contributor

Re: FTP session in CLOSE_WAIT - never exits

I'm too lazy to work out the details, but I'd
think that it ought to be possible to pipe
the output from a suitable "find" command
into an FTP client. That ought to avoid the
174000-line script, and the repeated
open-close annoyance. The feasibility here
might depend on the directory structure,
however, of which I know nothing.
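
For what it's worth, a minimal sketch of the find-into-ftp idea might look like the following. It assumes a flat source directory, an ftp client that accepts -n (no auto-login) and double-quoted arguments as BSD-derived clients do, and placeholder values USER, PASS and REMOTE_HOST that are not from this thread. The helper name make_ftp_commands is made up for illustration.

```shell
# Sketch: turn a list of file names (one per line on stdin) into FTP client
# commands for a single session. User and password are hypothetical placeholders.
make_ftp_commands() {
    # $1 = user, $2 = password
    printf 'user %s %s\n' "$1" "$2"
    printf 'binary\n'
    # One "put" per input line; names are double-quoted in case they contain spaces.
    while IFS= read -r name; do
        printf 'put "%s"\n' "$name"
    done
    printf 'bye\n'
}

# Intended use (not run here): one connection for the whole directory.
#   find . -type f -print | sed 's|^\./||' | make_ftp_commands USER PASS | ftp -n REMOTE_HOST
```

Because the names travel through a pipe rather than a command line, the "arglist too long" limit should not come into play, and no 174000-line file has to be written to disk.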

> I also wish to identify if there is
> something specifically wrong in the
> environment.

I'd prefer to blame Windows (or the FTP
server being used there), and then move on to
find a method which works. (But I'm lazy.)

I'd also say that 174000 open-close
operations is more brutish than a
174000-line file full of "put" commands, but
that's only an opinion.
Al Licause
Trusted Contributor

Re: FTP session in CLOSE_WAIT - never exits

Before placing blame on one side or the other, try running tcpdump on the session. The output will be long, so if you are going to direct it to a file, choose an area with plenty of disk space.

The good thing is you probably don't need to look at the entire output....only the end. What you want to look for is first dropped or lost packets, duplicate acks, retransmissions....all of which could indicate a network problem.

But look at the last few packets before it stops sending....see who is waiting on who.
Look for a possible 0 window size situation.
If one system sent a packet and received no response from the other system, the problem is very likely on the system that did not respond.
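
As a rough sketch of such a capture: the helper below only builds the filter expression, and the actual tcpdump invocations are left as comments because they need root. IFACE and the capture path are placeholders, and matching on the peer host rather than on port 21 alone is an assumption made here so that the FTP data connection is captured as well. The -i, -w, -r and -n options are standard tcpdump options, but check the local man page.

```shell
# Build a tcpdump filter for all TCP traffic to or from one peer.
# FTP uses port 21 for control and a separate connection for data,
# so the filter matches the peer host rather than a single port.
ftp_capture_filter() {
    # $1 = peer address or host name
    printf 'tcp and host %s\n' "$1"
}

# Intended use (needs root; IFACE and /bigdisk/ftp.pcap are placeholders):
#   tcpdump -i IFACE -w /bigdisk/ftp.pcap "$(ftp_capture_filter 10.85.125.66)"
# Afterwards, look only at the tail of the capture:
#   tcpdump -n -r /bigdisk/ftp.pcap | tail -50
```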
J117
Occasional Advisor

Re: FTP session in CLOSE_WAIT - never exits

Update: I have determined that the hang occurs whenever the initiating side uses port 4443 (possibly the problem is 4444 due to the pairing).

netstat -an reveals that:
tcp 118 0 10.85.32.207.4443 10.85.125.66.21 CLOSE_WAIT

The Recv-Q column always shows 118 (Requests? Packets?). This hang at port 4443 is consistent. The variable number of successful FTPs depended on where I jumped into the circular queue of ports. As soon as it hit 4443 it would hang in CLOSE_WAIT.
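
Recv-Q in netstat output counts bytes that have arrived but that the local application has not yet read, and CLOSE_WAIT means the remote side has closed its half of the connection while the local process still holds the socket open. A small filter along these lines could pick such connections out of netstat -an output. It assumes the six-column layout shown above, and the function name is hypothetical.

```shell
# Print local address, foreign address and Recv-Q for every TCP connection
# that sits in CLOSE_WAIT with unread bytes still queued.
# Reads netstat -an style output on stdin (columns: proto, Recv-Q, Send-Q,
# local address, foreign address, state).
stuck_close_wait() {
    awk '$1 ~ /^tcp/ && $6 == "CLOSE_WAIT" && $2 > 0 { print $4, $5, $2 }'
}

# Intended use (not run here):
#   netstat -an | stuck_close_wait
```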

I have not yet used tcpdump for any of the troubleshooting. It's active: I have the packetfilter set up and can put the interface into promiscuous mode, but I wasn't certain of the format of the command to use.

So...anyone have any insights on why this might be happening? I thought that FTP selected a range of ports that were reserved to FTP. Is there any way to restrict which ports ftp will select?

Thanks,
John
Steven Schweda
Honored Contributor

Re: FTP session in CLOSE_WAIT - never exits

Hmmm. A quick Google search offers Oracle
as a potential user of ports at and above
4443.

Selecting passive FTP changes the way ports
are chosen/used, so that might be an
interesting thing to try.

Any firewalls in between these systems which
might be blocking particular ports in this
neighborhood?
Philip Lawrence_1
Occasional Advisor

Re: FTP session in CLOSE_WAIT - never exits

Verify that there is no 'lurking' connection on port 4443 prior to running the ftp script. This sort of thing has bitten me a few times with large numbers of ftp transfers.
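
A check like this could be run before the script starts. It is only a sketch: it assumes the BSD-style addr.port layout seen in the netstat output earlier in the thread, and local_port_in_use is a made-up helper name.

```shell
# Succeed (exit status 0) if any TCP line in netstat -an output on stdin
# has the given local port, whatever state the connection is in.
local_port_in_use() {
    # $1 = local port number to look for
    awk -v port="$1" '
        # Local address is column 4; the port is the part after the last dot.
        $1 ~ /^tcp/ { n = split($4, parts, "."); if (parts[n] == port) found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Intended use (not run here):
#   netstat -an | local_port_in_use 4443 && echo "port 4443 already has a connection"
```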
J117
Occasional Advisor

Re: FTP session in CLOSE_WAIT - never exits

My research on this got placed on the back burner for a little while. However, I do have these additional notes to offer:

In order to reduce the chances of there being a process lurking out there on the UNIX port I booted the system to single user mode and then started just the basic inetd daemon so that I could run network connections without starting any other products that might be on the box.

The FTP connection was made from Tru64 (in single user mode) to Windows 2003 Server. The Windows 2003 system was running in normal mode.

End result: Even from single user mode with networking started it still chokes on port 4443/4444.

I don't see either of those ports configured in any of the configuration files in /etc/.

Can source system port selection conflict with something on the remote box??

i.e. If system A receives an ftp request from system B on a specific port does system A ever care what port the request is initiated from on system B?? I've always thought the port number was encapsulated in the initiating packet so that the system A receiving the connection request would not care what port was being used for the connection on the remote system B.

Steven Schweda
Honored Contributor

Re: FTP session in CLOSE_WAIT - never exits

> I don't see either of those ports
> configured in any of the configuration
> files in /etc/.

What, you mean "/etc/services"?
"/etc/services" maps service names to port
numbers. It tells you nothing about which
ports are (or may be) in use. "netstat"
could probably tell you which ports are in
use.

Did you ever try FTP passive mode?