Re: FTP problem

 
ianvt
Advisor

FTP problem

We are encountering a problem with FTP. When a job on a VMS host tries to ftp to another host and cannot connect, the job does not abort. It keeps trying to connect and holds up the STATUS$BATCH queue until the job is deleted.

Has anyone encountered this before?
15 REPLIES
Wim Van den Wyngaert
Honored Contributor

Re: FTP problem

If a TCP connection cannot be established within a period of time, TCP will time out the connection attempt. The default timeout value for this initial connection establishment is 75 seconds. The TCP_KEEPINIT option specifies the number of seconds to wait before the connection attempt times out. For passive connections, the TCP_KEEPINIT option value is inherited from the listening socket. The value of TCP_KEEPINIT is an integer between 1 and n, where n is the value for the systemwide parameter tcp_keepinit . The default value of the systemwide parameter tcp_keepinit , specified in half-second units, is 150 (75 seconds).
To display the values of the systemwide parameters, enter the following command at the system prompt:

$ sysconfig -q inet


Is the value set far too high at your site?

Wim
Wim Van den Wyngaert
Honored Contributor

Re: FTP problem

BTW: we have it set to 40 (in half-second units, so 20 seconds).
I tested it, and ftp does indeed give up after 22 seconds when connecting to a node that is down.
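
The arithmetic behind these figures can be checked with a small sketch. Python is used here only as a calculator, the function name is made up for illustration, and the half-second unit is the one stated in the documentation quoted earlier in the thread.

```python
def keepinit_seconds(half_second_units: int) -> float:
    """Convert a tcp_keepinit value (half-second units) to seconds."""
    return half_second_units * 0.5

# 150 is the documented default, 40 is the value used on the tested site.
for value in (150, 40):
    print(f"tcp_keepinit={value} -> {keepinit_seconds(value):.0f} s")
```

A value of 40 gives a nominal 20 seconds, which is close to the 22 seconds observed in the test; the default of 150 gives the documented 75 seconds.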

Wim
ianvt
Advisor

Re: FTP problem

Hi Wim

The TCP_KEEPINIT value is 150. Attached is the output of $sysconfig -q inet.

Thanks
Ian
Wim Van den Wyngaert
Honored Contributor

Re: FTP problem

And how long does ftp stay in the "hang"? What kind of problem did you have: was the remote node up?

You can try a tcptrace to find out what it was doing.

Could you post sys$specific:[tcpip]sysconfigtab.dat so we can see what was changed compared with the defaults?

Which VMS and TCP/IP versions?

Wim
Hoff
Honored Contributor

Re: FTP problem

There's not enough detail here for a specific response (e.g. the IP stack and stack version, the OpenVMS version and platform, and a small demonstration or reproducer with the commands could all be useful here); this could easily be some timers or some firewall or such.

As an alternative guess, this could (also) be an attempt to script an ftp transfer rather than using the DCL command COPY /FTP (which assumes the box is running V6.2 or later with a compatible IP stack), with some sort of DCL coding error in the script. There are various ways to get an ftp transfer job to hang, and timers are just one possibility.

Then there's the fact that ftp is a protocol-level problem, a security problem and a firewall problem, but that's fodder for another discussion. sftp is my preferred choice for use on an open network...
ianvt
Advisor

Re: FTP problem

Here is some more info:
The ftp "hangs" until aborted. The problem with the remote host is unknown.

HP TCP/IP Services for OpenVMS Alpha Version V5.4 on a hp AlphaServer GS1280 7/1150 running OpenVMS V7.3-2.

type sys$specific:[tcpip]sysconfigtab.dat gives an error:
%TYPE-W-SEARCHFAIL, error searching for SYS$SPECIFIC:[TCPIP]SYSCONFIGTAB.DAT;
-RMS-E-DNF, directory not found
-SYSTEM-W-NOSUCHFILE, no such file

Example of the script:
$ ON ERROR THEN GOTO ERROR
$ @ACK$PROC:NAT-BATCH-SETUP
$ FTP XXX.XX.XX.XX /USER=XXXXXX/PASSWORD="XXXXXXX"
EXIT
$ @ACK$PROC:NAT-CHECK-STATUS $STATUS SELF
$ IF STATUS .NES. "0" THEN GOTO ERROR
$ @ACK$PROC:NAT-JOB-END NONE
$ @ACK$PROC:CHANGE-JOB-STATUS LOG461U P AXP2READY
$ @ACK$PROC:CHANGE-JOB-STATUS LOG461C P AXP3READY
$ EXIT
$ERROR:
$ @ACK$PROC:NAT-JOB-ABORT
$ @ACK$PROC:CHANGE-JOB-STATUS LOG461C P AXP3READY
Wim Van den Wyngaert
Honored Contributor

Re: FTP problem

Sorry, it's SYS$SPECIFIC:[TCPIP$etc]SYSCONFIGTAB.DAT;

Wim
Wim Van den Wyngaert
Honored Contributor

Re: FTP problem

And what's on the other side of the connection (hardware, OS version)? Any firewalls?

Wim
Wim Van den Wyngaert
Honored Contributor

Re: FTP problem

Have you got this logical defined?
DEFINE /SYSTEM/EXEC TCPIP$FTPD_KEEPALIVE 1

I just tested after doing ucx set servi /sock=keep. That is accepted but doesn't work, whereas the logical does work.
I have the impression that keepalive for the initial connection always works. I would also guess that your remote site answered but the connection was then lost; without the logical it hangs forever. Note that even with the logical you still have to wait a very long time before it aborts.

This is the content of my config to avoid this long timeout.

inet:
# detect broken connection after 5 minutes instead of 2 hours
tcp_keepcnt=5
tcp_keepidle=120
tcp_keepintvl=120
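
As a rough check on the comment in the config above, the detection time can be estimated from the three values. This is only a sketch: it assumes these parameters use the same half-second units the documentation gives for tcp_keepinit, and that the connection is dropped once tcp_keepcnt probes, sent tcp_keepintvl apart after tcp_keepidle of silence, have all gone unanswered. The function name is made up for illustration, and Python is used only as a calculator.

```python
def keepalive_detection_seconds(keepidle, keepintvl, keepcnt, unit_seconds=0.5):
    """Estimate how long a dead peer goes unnoticed.

    Assumes the values are in half-second units and that the connection
    is dropped after `keepcnt` unanswered probes sent `keepintvl` apart.
    """
    idle = keepidle * unit_seconds                # quiet time before the first probe
    probing = keepcnt * keepintvl * unit_seconds  # time spent sending probes
    return idle, probing, idle + probing

idle, probing, total = keepalive_detection_seconds(keepidle=120, keepintvl=120, keepcnt=5)
print(f"idle {idle:.0f} s + probing {probing:.0f} s = {total:.0f} s")
```

Under those assumptions the values above give 60 s of idle time plus 300 s of probing, so a broken connection would be noticed after roughly 5 to 6 minutes, in line with the "5 minutes" in the comment.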

Wim