Varying response times for FTP file transfer process

 
John A. Beard
Regular Advisor

Varying response times for FTP file transfer process

Hi,

I am noticing an anomaly in transfer times when it comes to pushing files (FTP) between two VMS servers across our European WAN.

Anything over 40,000 blocks takes a very long time to complete. When I tested with smaller files of differing sizes, the jobs completed in a much more timely manner.

A file of c. 1,300 blocks took less than 50 seconds. A file of about 5,000 blocks took about 9 minutes. It appears that the larger the file, the greater the problem becomes. Is there some issue with FTP and the actual size of the file being transferred?

These are two old Alpha 1000s, and the version of TCP/IP is Digital TCP/IP Services for OpenVMS Alpha Version V4.1 - ECO 10. These, and many other similar European-based servers, are due for decommissioning within 12-18 months, so the business does not want these boxes upgraded. The FTP file transfer process has been functioning without incident for many years. The source machine is currently down, but here is the interface spec of the target server.

LANCP> sh dev/char

Device Characteristics EWA0:
Value        Characteristic
-----        --------------
Normal       Controller mode
External     Internal loopback mode
CSMA/CD      Communication medium
16           Minimum receive buffers
32           Maximum receive buffers
No           Full duplex enable
No           Full duplex operational
TwistedPair  Line media
10           Line speed (megabits/second)
Glacann fear críonna comhairle.
WWarren
Advisor

Re: Varying response times for FTP file transfer process

Your question isn't easy to answer; there are a lot of possibilities. FTP shouldn't care about the file size (within architectural limits).
I'd recommend looking at the network with a network analyzer to get a better picture, and find out how the WAN is constructed (if a private WAN). It may be as simple as congestion, or as difficult as IP stack bugs. Some WANs have traffic shaping devices that will shift FTP traffic down to lower priority.

I can recommend a couple of things to try, though:
- How does 'pulling' the data (FTP get) compare to 'pushing' (FTP put) the data?
- How does this compare to FTP times for data between other operating systems at these endpoints?
- Ping the remote system during a transfer and see how the times compare to when there is little activity.
- If the stack supports it, the FTP hash command can provide some idea (when run interactively) of whether the entire transfer is slow or whether the slowness is intermittent.
labadie_1
Honored Contributor

Re: Varying response times for FTP file transfer process

Hello

Maybe the route is not always the same, and sometimes you pass through a saturated router.

Can you run
$ @sys$startup:tcpip$define_commands
and then
$ traceroute ftp_node
on both nodes?

$ netstat -m
and
$ netstat -s
can also show interesting things on both nodes.

John A. Beard
Regular Advisor

Re: Varying response times for FTP file transfer process

Thanks for both your replies. I will endeavour to carry out your suggestions once the source server is brought up again.

Glacann fear críonna comhairle.
Jim_McKinney
Honored Contributor
Solution

Re: Varying response times for FTP file transfer process

And it could be that your issue isn't TCP/IP or even network related. Is it possible that your large files used for testing are extremely fragmented? Is the recipient system able to quickly create a file and continually able to extend it on disk? You might also try transferring those files to the NL: device on the remote end and see if the transfer speed remains constant regardless of file size.
Robert Gezelter
Honored Contributor

Re: Varying response times for FTP file transfer process

John,

I concur with the attempt to transfer to NL: for the purpose of guaranteeing that the problem is not file extends. I would also suggest raising the buffering and blocking factors on the local end to ensure that disk delays and fragmentation have less of a [potential] impact.

Do you have any statistics for the links on both sides?

- Bob Gezelter, http://www.rlgsc.com
Hoff
Honored Contributor

Re: Varying response times for FTP file transfer process

You indicate long times for big files, but times only for smaller files. How long for the big file?

You indicate two hosts, but do discuss the presence of other hosts. Do these transfer times involve just these two hosts, or are multiple hosts and network paths involved?

You show a 10 megabit Ethernet controller, but provide no indication of the connection between the two sites. How big is the pipe? 10 MbE all the way?

Disk fragmentation is certainly a potential cause, as will be file creation settings in the context of the target system.

Here are some old tuning documents:

http://h30266.www3.hp.com/odl/vax/network/tcpipv42/manage/6526pro_018.html

Look to something akin to this in the target system login context or (as shown) system-wide:

$ DEFINE /SYSTEM /EXEC UCX$FTP_WNDSIZ 4096
$ DEFINE /SYSTEM /EXEC UCX$FTP_FILE_ALQ 500
$ DEFINE /SYSTEM /EXEC UCX$FTP_FILE_DEQ 500

FTP is not fast to start with, has issues when firewalls are involved, and tends to degrade badly in the presence of network errors or high-latency connections.

You can up the window size, but that doesn't usually help all that much.

Since you're not swapping anything, see what a multi-step process might give you over the same link. Try Mac OS X to Mac OS X or another local client, for instance, over the same IP link, pushing the same file size around. (And the obvious one-plus, if that transfer looks good -- see if zipping the file (with "-V") and tossing it over to the Mac (or whatever) and sending the file from there makes this transfer go faster.)

You're on very old software and very slow hardware, and the no-upgrade policy basically trumps most discussions here. And if these are really AlphaServer 1000 boxes (and not 1000A boxes), do make sure nobody is using the graphics console when these FTP transfers are running.

Do you have DECnet-Plus installed on both ends? If so, DECnet over IP might be an option; another path and protocol to test.
Jon Pinkley
Honored Contributor

Re: Varying response times for FTP file transfer process

John,

Since you are running TCP/IP 4.1, what version of VMS is running (6.x)?

1300 blocks in 50 sec is 650K/50s = 13KB/sec ~ 104Kbps
5000 blocks in 540 sec is 2500K/540 = 4.6KB/sec ~ 37Kbps

Time to transmit 5000 blocks at 13KB/sec = 2500KB / (13KB/sec) = about 192 seconds
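As a cross-check of the arithmetic above, here is a minimal Python sketch. It assumes 512-byte disk blocks and 1 KB = 1024 bytes. The function name is made up for illustration, and the timings are the approximate ones from the original post.

```python
BLOCK_BYTES = 512  # one OpenVMS disk block

def throughput(blocks, seconds):
    """Return (kilobytes per second, kilobits per second) for a transfer."""
    kbytes = blocks * BLOCK_BYTES / 1024
    kb_per_s = kbytes / seconds
    return kb_per_s, kb_per_s * 8

small = throughput(1300, 50)    # the c. 1,300-block file, about 50 seconds
large = throughput(5000, 540)   # the c. 5,000-block file, about 9 minutes

# Time the larger file would need if it sustained the smaller file's rate.
expected_s = (5000 * BLOCK_BYTES / 1024) / small[0]

print(f"1300 blocks: {small[0]:.1f} KB/s, {small[1]:.0f} kbit/s")
print(f"5000 blocks: {large[0]:.1f} KB/s, {large[1]:.0f} kbit/s")
print(f"5000 blocks at the small-file rate: {expected_s:.0f} s")
```

The point of the comparison is the drop in sustained rate, not the absolute numbers, since both timings are rough.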

While fragmentation on the receiving end could contribute to the delays, it seems unlikely that it would contribute 346 seconds in a 5000 block file, since we are just adding to the end of the file. Worst case, we would have something like 70 extension file headers (assuming a 1 block cluster size and the unlucky event that each file extension is only able to get 1 block at a time). More likely would be some other disk activity at the time of the transfer (my opinion).

How long does it take to create a 5000 block file using something like this on the receiving end?

$! Write 5000 one-block records to test.dat and report the start and end times
$ open/write testing123 test.dat
$ cnt = 0
$ write sys$output "Starting at ''f$time()'"
$ rec = f$fao("!508*z") ! 508 "z"s; with 2 byte VFC + 2 byte rec length, this is 1 rec per block
$_loop:
$ write/symbol testing123 rec
$ cnt = cnt + 1
$ if cnt .lt. 5000 then goto _loop
$ close testing123
$ write sys$output "Done at ''f$time()'"
$ exit

If that takes 346 seconds (or even 200) to run, then fragmentation may be a significant factor.

Given only the limited info you have provided, and assuming that the only variable is the size of the transfer, I think that traffic shaping is a good possible explanation. The transfer of the data in FTP is in a single FTP connection, and it is reasonably easy for routers to give lower priority to "bulk" traffic like FTP. I don't think that doing a ping to the other end while the transfer is in progress will necessarily give you an accurate indication of the limiting that the FTP is seeing, since ping normally uses ICMP packets, and these would not be treated as part of the FTP data.

The only reliable way to see rate limiting is with a protocol analyzer, like the free Wireshark.

Traffic shaping is similar in concept to the "priority adjusting" programs that lower the priority of CPU hogs. FTP tends to use as much bandwidth as possible, and the packets are generally at the MTU. Traffic shaping tries to give the light users a break by letting them go to the front of the line. You may be able to get better performance by buying a better class of service, by making your traffic look like something else (encapsulation), or by reducing the size of the files (zip) and splitting the result into smaller pieces (zip followed by zipsplit), so that the transfer becomes a series of shorter TCP connections that may slip past the shaping. Depending on how the traffic shaping is done, this may or may not have any effect: it may limit based on source/destination IP, or it may know that the multiple data connections are related to the same FTP control connection.

You could try an FTP to the null device, but you should probably watch the data with Wireshark while you do that to see if the rate gets throttled back as the transfer progresses.

Jon
it depends
Wim Van den Wyngaert
Honored Contributor

Re: Varying response times for FTP file transfer process

labadie's command will not work on your UCX version.

As nothing changed on VMS, I would suspect that the network changed. Maybe they replaced the switch and you should now have full duplex enabled. If that is the case, mc lancp show dev/cou will show many collisions.

Wim
Wim
Jon Pinkley
Honored Contributor

Re: Varying response times for FTP file transfer process

RE:"As nothing changed on VMS, I would suspect the network changed. May be they replaced the switch and you should have duplex enabled now. mc lancp show dev/cou will show many collisions if that is the case."

Why would that affect large transfers more than small ones?

But that does bring up another possible test: can an FTP transfer be done between two systems on the same LAN? That would tell you whether the problem is in the WAN, or in the LAN or the system (fragmentation).

I can't remember: was LANCP available in 6.x?

Jon
it depends
Jon Pinkley
Honored Contributor

Re: Varying response times for FTP file transfer process

If I had paid more attention to the original message, I would have seen

"LANCP> sh dev/char"

So it is obvious that LANCP was available on the system John was using.
it depends
Phil.Howell
Honored Contributor

Re: Varying response times for FTP file transfer process

I would set up a test using ttcp.exe. This should show up any misconfiguration, and you can then record the throughput for various transmission sizes.
Phil
Wim Van den Wyngaert
Honored Contributor

Re: Varying response times for FTP file transfer process

"Why would that affect large transfers more than small ones?"

From memory: very small files pass well and larger files don't pass at all. I just did some tests with the interface forced to half duplex (no full duplex) while the switch is expecting full duplex.

8-block file: 3.7 MB/s
101-block file: 2.5 MB/s
1469-block file: 0.38 KB/s

Is it something caused by the number of collisions? A back-off strategy?
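For reference on the "back-off strategy" question, half-duplex Ethernet (CSMA/CD) uses truncated binary exponential backoff: after the n-th collision on a frame, the sender waits a random number of slot times between 0 and 2^min(n, 10) - 1, and gives up on the frame after 16 attempts. The sketch below only illustrates that rule for 10 Mbit/s Ethernet. The function names are invented for the example, and it does not model a duplex mismatch as such.

```python
import random

SLOT_TIME_US = 51.2   # slot time for 10 Mbit/s Ethernet (512 bit times)
MAX_ATTEMPTS = 16     # the frame is discarded after 16 failed attempts
BACKOFF_LIMIT = 10    # the exponent stops growing after 10 collisions

def backoff_slots(collision_count, rng=random):
    """Pick the random wait, in slot times, after the n-th collision."""
    exponent = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** exponent)

def max_backoff_us(collision_count):
    """Largest possible wait, in microseconds, after the n-th collision."""
    exponent = min(collision_count, BACKOFF_LIMIT)
    return (2 ** exponent - 1) * SLOT_TIME_US

for n in (1, 2, 5, 10, 15):
    print(f"collision {n:2d}: wait up to {max_backoff_us(n):8.1f} microseconds")
```

Even the largest single wait here is on the order of tens of milliseconds, so the backoff itself is unlikely to explain multi-minute slowdowns. Frames that are lost or discarded and must be recovered by TCP retransmission are a more plausible source of long stalls.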

Wim
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Varying response times for FTP file transfer process

I took a tcptrace and found that the retransmission delay increases from about a second to about 30 seconds (in a test of about 14 minutes in total), but with a randomizer picking values up to the maximum.

Wim
Wim
Jess Goodman
Esteemed Contributor

Re: Varying response times for FTP file transfer process

Jon Pinkley,

I lost your email address. Please send it to me at my last name at accuweather.com.
I want to send you a new tool I wrote that I think you will find very useful in your environment.

(Why is there no way in these forums to send a private message to another member?)

Jess Goodman
I have one, but it's personal.
John A. Beard
Regular Advisor

Re: Varying response times for FTP file transfer process

Once again, thanks to everyone for the replies.

I need to clear a few things up before I open myself up to further responses, so please excuse the additional text placed here. I am based in Canada, assisting in the support of VMS nodes located in North America. The issue at hand is occurring on legacy servers located in Europe. Owing to a number of factors (age, pending retirement, etc.), the 'local' support for these boxes has virtually disappeared. A few others here and I offer 'best effort' support when things go amiss across the pond, but we have no European VMS counterparts to communicate with. I also have to deal with a slight language barrier when I communicate with any on-site people, as the locations of the boxes range from Austria, Poland and Germany through to France. I can request that certain information (network stuff) be channeled back to me, but it is an extremely slow process, hence the delays in responding to your questions. I am also dealing with a 6-hour time difference. We have no access to any documentation on these boxes, and they have pretty well survived (14-odd years) on their original configurations up to this point. At this point, I do not have any specific information about the switch or the pipe connecting the various locations.

Back to the main issue. There are a total of three servers located in different countries, connected through a private network. One machine, MIAXP1, receives daily export files that are zipped before being FTP'd to it. The source servers are AULIMS (the problem location) and EFVAXB (everything appears pretty normal).

The EFVAXB file is about 100MB and takes approximately 15 minutes to transfer. I cannot quantify at this moment (Europe gone for the day) how long the transfer from AULIMS normally took, but the 34MB job was still running after several hours this morning before the operator cancelled the run.

As for some of your recommendations, I used Jon's procedure to create 5,000- and 10,000-block files. The 5K file took about 90 seconds to create and the 10K one took about 130 seconds. I then reversed the FTP direction and copied the resulting 10K file to AULIMS; it took 80 minutes to complete.
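To put the figures quoted in this post side by side, here is a rough Python sketch of the average rates. It assumes 512-byte blocks and that "MB" means 2**20 bytes, and it uses the approximate timings given above. The labels and helper name are only for illustration.

```python
BLOCK_BYTES = 512          # one OpenVMS disk block
MB = 1024 * 1024           # assumption: "MB" in the post means 2**20 bytes

def rate_kb_per_s(size_bytes, seconds):
    """Average rate in kilobytes (1024 bytes) per second."""
    return size_bytes / 1024 / seconds

rates = {
    "FTP EFVAXB -> MIAXP1, ~100 MB in ~15 min": rate_kb_per_s(100 * MB, 15 * 60),
    "FTP 10,000 blocks to AULIMS in 80 min": rate_kb_per_s(10_000 * BLOCK_BYTES, 80 * 60),
    "create 5,000 blocks on disk in ~90 s": rate_kb_per_s(5_000 * BLOCK_BYTES, 90),
    "create 10,000 blocks on disk in ~130 s": rate_kb_per_s(10_000 * BLOCK_BYTES, 130),
}

for label, rate in rates.items():
    print(f"{label}: {rate:.1f} KB/s")
```

On these numbers the 80-minute transfer averages roughly 1 KB/s, far below both the EFVAXB transfer and the local file-creation rate. That suggests, without proving it, that writing the file to disk is not the dominant cost.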

I am going to post twice here, as it appears I cannot attach more than one file at a time.
I have included some brief explanations at the beginning of each file, though I'm sure you will probably want to obtain more info along the way.

I am on a course tomorrow (Wednesday), so there will be a slight delay before I can get back to you again. I have asked the operator in Europe to see if he can take the file from AULIMS and FTP it to a different server altogether. There are several similar VMS servers that also have files being FTP'd all over the place.

Thanks again for your patience and assistance... John
Glacann fear críonna comhairle.
John A. Beard
Regular Advisor

Re: Varying response times for FTP file transfer process

This is the output associated with the file creation process recommended by Jon
Glacann fear críonna comhairle.
John A. Beard
Regular Advisor

Re: Varying response times for FTP file transfer process

The attached is the zipped export file from EFVAXB .. the one that only took about 15 minutes.
Glacann fear críonna comhairle.
Jon Pinkley
Honored Contributor

Re: Varying response times for FTP file transfer process

Since the problem seems to only apply to the transfers from AULIMS to MIAXP1 and not to transfers from EFVAXB to MIAXP1, it seems we can focus on AULIMS and its connection to MIAXP1.

Is the "private network" a shared network (like frame relay, MPLS, X.25) where the bandwidth is being shared with other entities? It does seem that something has changed in the link between AULIMS and MIAXP1, unless it has always been slow, or the amount of data has recently increased substantially. Does anyone have any access to reports of utilization of the link at the AULIMS site? Is it used for other systems besides the AULIMS server? If so, is there some other system at the site that is using it for a new use? Pending retirement of AULIMS suggests there is something that is replacing it. Is this new system doing "remote backups" or file synchronization over the same link?

Is this slowdown only affecting the AULIMS system? Does your company have any visibility into the "WAN" routers, or are they "owned" by the provider? Since you can't upgrade the VMS system, I am not aware of any tools you can use from UCX 4.1. As far as I remember, tcpdump isn't available; if it were, you could at least capture a trace file to analyze with Wireshark.

If the link is overutilized, the answer is to buy more bandwidth. You are already compressing the data with zip, so any attempt to use a device to get additional throughput via compression isn't going to help with the zipped data.

Re:"Small file 8 block file : 3.7 MB per sec. 101 blocks : 2.5 MB/s 1469 blocks file : 0.38 KB/s. Something caused by the number of collisions ? A back-off strategy ?"

By "small" I meant 1300 blocks. I would expect a 1300 block file to reach "steady state" as far as Ethernet timers are concerned, and even the TCP connection should have passed the transitory effects, as long as there isn't some other traffic shaping in effect. Also, my experience with really small files is that the numbers reported are pretty bogus, as the VMS clock resolution becomes significant (i.e. it limits the ability to accurately measure the time for the transfer).

So it is definitely worth verifying that there isn't a duplex mismatch, but I will be surprised if that turns out to be the cause of the problem.

Cisco routers can be configured with congestion avoidance, a fancy name for dropping packets to cause TCP connections to back off before things melt down. Here's a pointer to a Cisco page about congestion avoidance, and how Cisco implements it. http://www.cisco.com/en/US/docs/ios/12_2/qos/configuration/guide/qcfconav_ps1835_TSD_Products_Configuration_Guide_Chapter.html Perhaps something like WRED or DWRED is in play, and affecting the large consumers more than the small consumers.
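As a rough illustration of the idea behind RED-style congestion avoidance, the sketch below computes a drop probability from the average queue depth. The probability is zero below a minimum threshold, rises linearly to a configured maximum at the maximum threshold, and everything is dropped beyond that. The thresholds and probability used here are invented for the example and are not taken from any router in this network. The function is a simplification and not Cisco's implementation.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Simplified RED curve: 0 below min_th, linear up to max_p, then 1.0."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Hypothetical settings: thresholds in packets, 10% drop at the max threshold.
MIN_TH, MAX_TH, MAX_P = 20, 40, 0.10

for depth in (10, 20, 30, 39, 40):
    p = red_drop_probability(depth, MIN_TH, MAX_TH, MAX_P)
    print(f"average queue {depth:2d} packets: drop probability {p:.3f}")
```

A long bulk transfer keeps the queue fuller for longer than a short one does, so under a scheme like this it would be expected to see more drops and more TCP back-off. Whether anything of the kind is configured on this WAN is not known from the thread.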

Jon
it depends
Wim Van den Wyngaert
Honored Contributor

Re: Varying response times for FTP file transfer process

John,

Could you do "ucx sho prot" before and after the slow FTP and post the output? (I don't remember if UCX showed all protocols, so test the syntax first.)

Wim
Wim
John A. Beard
Regular Advisor

Re: Varying response times for FTP file transfer process

Hi again,

Just got back into work and had the following e-mail message from the European operator. He essentially FTP'd the same-sized daily ZIP file to a different server (Austria->France).

"Hello John,

I copied the file from AULIMS to FCAXP1 and the time is much better (17min to complete). So I've updated my process to set as AULIMS destination copy the server FCAXP1.
I have to check the next days this closely, as I already have to update the destination in the past (but I don't remember if it was with AULIMS again or some other server)."

I have attached the dump output from the test target server, FCAXP1. At a glance, the fragmentation on FCAXP1 appears as bad as on MIAXP1, but obviously it did not hold up the output creation to any major degree.

I will need to ask the operator to apply the UCX show prot request in a few minutes (using the original source/target machines).
Glacann fear críonna comhairle.