TCPIP$FTP_CLIENT hangs

Hi,

we have a problem with TCPIP Services 5.3-ECO2 running on VMS 7.3-1.

We have several applications on the OpenVMS systems that need to send ASCII files to Windows machines in remote locations (in stores, serving till systems).

In the past, we had intermittent problems sending the files: TCPIP$FTP_CLIENT enters a state where it hangs and would not die for days unless we noticed (with only one sending application). When we noticed, we just stopped the process (most of the time, STOP/ID/IMAGE helped).

Interestingly, there was no connection problem to the remote location (e.g. pings, traceroutes and subsequent FTPs worked).

Recently this problem became more severe, causing many other jobs to stall as well. We tried switching to COPY/FTP/PASSIVE, which didn't help.

Checking the process with SHOW PROC/ID/CONT shows that nothing happens (no DIO, BIO, CPU or PC activity). In SDA, SHOW PROC/CHAN shows that the process is busy on the BG device for the ftpdata connection. A TCPIP SHOW DEVICE BGnnn /FULL reveals the following:

Device_socket: bg3003    Type: STREAM

            LOCAL         REMOTE
Port:       62996         21
Host:       10.20.1.31    10.23.245.31
Service: FTP

                                     RECEIVE      SEND
              Queued I/O                   0         0
Q0LEN     0   Socket buffer bytes          0         0
QLEN      0   Socket buffer quota      61440     61440
QLIMIT    0   Total buffer alloc           0         0
TIMEO     0   Total buffer limit      491520    491520
ERROR     0   Buffer or I/O waits          1         0
OOBMARK   0   Buffer or I/O drops          0         0
              I/O completed               17        12
              Bytes transferred          658       171

Options:  None
State:    ISCONNECTED PRIV
RCV Buff: WAIT
SND Buff: None


The WAIT state puzzles me, as I don't know exactly what it means.

While the process is hanging, the connection to the remote site still works...

Does anyone have any idea what this could be? I searched Google and the ITRC forums, but couldn't find anything!

Any help is greatly appreciated!

Best regards,
Matthias DjurkoviC
H&M Germany
16 REPLIES
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Matthias,
it's very hard to understand what causes your problem. In my experience, intermittent trouble is mainly due to autonegotiation on the hub/switch or NIC.
So, because your FTP servers run on Windows PCs, you can fix the speed at 100 Mb/s (or 10 Mb/s) on the Windows NIC. It is very simple to do, and you need to reboot neither VMS nor the PC.
After this, see whether you have less trouble.

Antonio Vigliotti
Antonio Maria Vigliotti

Re: TCPIP$FTP_CLIENT hangs

Hi Antonio,

thanks for your reply! Anyhow, this wouldn't help, as the network is configured to fixed speeds, both on the OpenVMS box and on the Windows boxes.

If it had something to do with the connection speeds or duplex settings, I suppose the connections wouldn't work at all! But while TCPIP$FTP_CLIENT hangs, I can ping and even connect to the server on the other side, and there is a chance that it works the next time the transfer starts!

So long and thanks again!

Matthias
Volker Halle
Honored Contributor
Solution

Re: TCPIP$FTP_CLIENT hangs

Matthias,

I found this in Ask Compaq:

http://h18000.www1.hp.com/support/asktima/communications/CTI_SRC020328001915.html

Do you have the TCPIP$FTP_KEEPALIVE logical defined?

Volker.
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Hi Matthias,
in my experience, FTP runs into trouble while other transports work fine.
I don't know why.

However, you need more information about this problem. If your Windows servers run WinXP, Win2k or Win2k3, you can check the event log for any messages about the connection. In your message you said that you send from one VMS client to many Windows servers. Do you have trouble with every server? Do you also perform the reverse operation (from Windows to VMS)?
Do you have other operating systems on the network?

Antonio Vigliotti
Antonio Maria Vigliotti
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Hi Matthias,
if your PCs have an SMC Elite Ultra NIC, read here
http://support.microsoft.com/default.aspx?scid=kb;en-us;131865

if your PCs run Win2k without SP1, read here
http://support.microsoft.com/default.aspx?scid=kb;en-us;269425

if your PCs run NT4 without SP4, read here
http://support.microsoft.com/default.aspx?scid=kb;en-us;224793

Antonio Vigliotti
Antonio Maria Vigliotti

Re: TCPIP$FTP_CLIENT hangs

Hi Volker,

thanks for the article with TCPIP$FTP_KEEPALIVE! I couldn't find it, even though I searched the net...

It seems as if this has worked! We defined the logical as 1, and today we could see:

%TCPIP-E-FTP_NOTCOPIED, file BO$ROOT:[RTI.RX]BR006572B20.0239A1;1 not copied
-TCPIP-W-FTP_REPLYTEXT, 426 Connection closed; transfer aborted.
%TCPIP-E-FTP_NETERR, I/O error on network device
-SYSTEM-F-CONNECFAIL, connect to network object timed-out or failed
426 Connection closed; transfer aborted.


So this seems to have worked. I'll see in the coming week whether the other jobs work as they should as well!

Thanks again!
Matthias

Re: TCPIP$FTP_CLIENT hangs

Hi Antonio,

thanks again for your efforts!

I am pretty sure that it was not solely FTP. Yes, we're connecting to many different NT hosts from the VMS world, but also to some other VMS machines and IBM as well. The problem appears only with Windows machines as the partner (now that I think about it). The interesting point is: when the problem appears, the machine is reachable, and even while one FTP connection is stalling, other FTP transfers immediately before or after it do work! So we were having connection drops on single connections, not on the whole line. By the way, it couldn't have been a firewall problem, because the lines are open.

So I think the article Volker sent explains it pretty well.

Your other findings: no, I don't think we're using SMC NICs on the servers. I think there are Intel NICs in them, though I am not sure, since I have no access to the Windows boxes and do not manage them.

Thanks again!
So long,
Matthias
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Hi Matthias,
I'm confused: you see
%TCPIP-E-FTP_NOTCOPIED, file BO$ROOT:[RTI.RX]BR006572B20.0239A1;1 not copied
-TCPIP-W-FTP_REPLYTEXT, 426 Connection closed; transfer aborted.
%TCPIP-E-FTP_NETERR, I/O error on network device
-SYSTEM-F-CONNECFAIL, connect to network object timed-out or failed
426 Connection closed; transfer aborted.
and you post that it works :-?

Antonio Vigliotti
Antonio Maria Vigliotti
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Hi,
sorry for posting again ...
Do you transfer files bigger than 4 GB?
http://support.microsoft.com/default.aspx?scid=kb;EN-US;186101

You can also run into trouble if the NT server goes into sleep mode/standby; in this case, the NIC wakes up the server, but it answers the solicited request with a delay, breaking the handshake.

Antonio Vigliotti
Antonio Maria Vigliotti

Re: TCPIP$FTP_CLIENT hangs

Hi Antonio,

I can understand that you are confused ;-)

The problem is: there is a job that runs approximately 15 hours a day and checks every 5 minutes for new files to transfer (all files are between 3 and 60 blocks in size). These files have to reach the Windows machines more or less "in time".

If we now have a stalling transfer and don't notice, the job will stall forever (because of one single file) and the remaining files will not reach their target. This is obviously a problem.

Now, if we get the feedback from FTP_CLIENT that the transfer failed, the hang is gone and the job will retry the next time. And this is exactly what happened last night:

We got the message, and the transfer aborted. 5 minutes later, the job found the same file again and sent it to the target (because 5 minutes later the FTP worked again).

So, yes: it "worked"! At least it reported back the error instead of hanging around for ages!

I know that this is not the "real" solution to the problem, but at least the files will reach their target and we can concentrate on solving the "real" issue (like network problems and such).
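
The retry behaviour described above can be illustrated with a minimal Python sketch. This is not the actual job (which runs under DCL on OpenVMS); the names send_pending and send_one are hypothetical and only show the idea that a reported error on one file lets the remaining files go through, while the failed file is picked up again on the next cycle.

```python
def send_pending(files, send_one):
    """Try each file once and return (sent, failed) lists.

    send_one is a hypothetical callable that raises OSError (for example
    TimeoutError) when a transfer fails. A failure on one file does not
    stop the remaining files from being attempted.
    """
    sent, failed = [], []
    for name in files:
        try:
            send_one(name)
            sent.append(name)
        except OSError as exc:
            # Report the error and move on; the next cycle retries this file.
            print(f"{name}: not copied ({exc}); will retry next cycle")
            failed.append(name)
    return sent, failed
```

A job that calls this every 5 minutes would simply pass the failed list in again together with any new files.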

So long, and thanks very much for your efforts!

Matthias

Wim Van den Wyngaert
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Matthias,

If I (and Antonio?) understand correctly, Windows NT is not reacting as it should, with the result that a transfer hangs. You solved the hang by setting keepalive. The real problem is on NT and is not solved. Correct?

Wim
Wim

Re: TCPIP$FTP_CLIENT hangs

Hi Wim,

you are probably correct.

However, I believe it to be incorrect behaviour for TCPIP$FTP_CLIENT to stall forever (and this is really forever) if there is something wrong with the connection while waiting to receive data!
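
To illustrate the point in general terms (this is a Python sketch, not how TCPIP$FTP_CLIENT is implemented): a blocking receive without a timeout waits indefinitely if the peer never answers, whereas a receive with a timeout returns control to the caller. The helper name recv_with_timeout is hypothetical.

```python
import socket

def recv_with_timeout(sock, nbytes, timeout_seconds):
    """Read up to nbytes, or return None if nothing arrives in time."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # No data within the limit: report it instead of waiting forever.
        return None

# Demonstration with a local socket pair whose peer never sends anything.
a, b = socket.socketpair()
result = recv_with_timeout(a, 1024, 0.2)
a.close()
b.close()
```

With no timeout set, the same recv call on socket a would block until the peer sent data or closed the connection.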

So it is probably on the Windows side but could be on the network side as well. In fact, I believe it to be something with the network, because of the sudden increase of incidents. We usually had this problem maybe once or twice a month and only yesterday it started to become a real and urgent problem.

So we had to find a solution/work-around on the VMS side. I think it's the same in most companies: the Windows side is always correct and the OpenVMS side isn't...

So long,
Matthias
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Matthias,
if you are patient and have the time, you can work through the TCP/IP troubleshooting techniques described here
http://h71000.www7.hp.com/doc/732final/6631/6631pro_contents.html#toc_chapter_1

Good Luck
Antonio Vigliotti
P.S.
Obviously you can post again :-)
Antonio Maria Vigliotti
Wim Van den Wyngaert
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Matthias,

You should examine all TCP sockets for keepalive.
If you use a telnet session without keepalive, it will hang forever when the other side stops existing.
All applications should request it (also check inet subsystem attribute tcp_keepalive_default to enable keepalive functionality for programs not requesting it).
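
As a rough illustration of what "requesting keepalive" means for an application, here is a minimal Python sketch using the standard socket API. It is not OpenVMS-specific; the helper name make_keepalive_socket and its idle_seconds parameter are hypothetical, and TCP_KEEPIDLE is only set on platforms where Python exposes it.

```python
import socket

def make_keepalive_socket(idle_seconds=None):
    """Return a TCP socket with SO_KEEPALIVE enabled.

    idle_seconds is a hypothetical convenience parameter: where the
    platform exposes TCP_KEEPIDLE, it sets the idle time before the
    first keepalive probe; elsewhere the system default applies.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the TCP stack to probe an idle connection so that a vanished
    # peer is eventually reported as an error instead of hanging forever.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if idle_seconds is not None and hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock
```

Whether an application sets the option itself or relies on a system-wide default is the design choice Wim is pointing at: the per-socket option only helps programs that ask for it.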

Wim
Wim
Antoniov.
Honored Contributor

Re: TCPIP$FTP_CLIENT hangs

Hi Matthias,
on Freeware V5 http://h71000.www7.hp.com/openvms/freeware/index.html
you can find an FTP mirror; you may get some inspiration from it.

Antonio Vigliotti
Antonio Maria Vigliotti

Re: TCPIP$FTP_CLIENT hangs

This has been open for too long now. We had no real idea how to solve the problem, as it kept reappearing.

Since I have found that the hang suddenly disappears while analysing the process with SDA (at least it seems so), I have now decided to contact support to have them help us.