Operating System - OpenVMS

Jan van den Ende
Honored Contributor

The everlasting joys of FTP

This morning we have been having a lot of "big fun".

Let me start with some background info (those parts that I guess might be relevant).

The node in question had an uptime of 294 days, running VMS 7.3-2, TCPIP 5.4 ECO-2.
This node does its share of our cluster's workload, plus one rather heavy application restricted to a single node (a Unix db port).
No issues worth mentioning.

There was a requirement for a new version of OVSAM, which required some higher patches than we had loaded (and a user-friendly reboot cycle of the cluster takes all of a week).
So, all available patches (except SYS-V0600 of course!) were done in one go, together with the implementation of IPfailsafe.
(IPfailsafe had been activated on one node 2 months ago, when that node was down for maintenance.
No issues whatsoever found for that.)

The night before yesterday we moved our single-node applic to another node that already had been upgraded. Again, no issues.
Yesterday we upgraded and IPfailsafe-d this particular node, and last night we moved the db engine back.

-- relevant implementation detail:
users on OA desktops (that means nearly all users) choose their printer and their applic at the desktop. That kicks off a script that uses ftp to get that info to the VMS node offering the applic (DNS steered), and then telnets to that node. SYS$SYLOGIN recognises the files ftp'd from the desktop, sets the user's printer, and starts the applic.
Just to show the importance of ftp for us.

And now for the big fun:

just about when most users had started work, we got complaints about NOT being able to start the single-node applic.

We found that ftp _TO_ our node just times out.

On the node, we found 15 processes TCPIP$FTP_ for 15 useraccounts, and
TCPIP$FTP_1 ====> in MUTEX!

Killing the user's FTP's made no difference.
Killing the MUTEXed process did not work (as expected).
SDA showed "Timer entries remaining 0/15 "
Shutting down FTP services made no difference.
Starting again gives process TCPIP$FTP_2.
That does not result in new connectivity, and the process quickly died.

I intended to try and remove the MUTEX with DELTA, but, when I broke off a process that still hung trying to ftp from another node, the MUTEX process suddenly disappeared (coincidence, or not?).

For the time being we decided to raise the TQELM of the TCPIP$FTP account in the UAF from 15 to 50.
We restarted ftp, and up till now everything is running again.

But,

-- anybody recognise this?
-- can this be caused by FTP ECO 4?
-- can this be caused by UPDATE_V0400?
-- can this be caused by any patch after UPDATE 4, released before 12 May?
-- can this be caused by IPfailsafe?

-- anyone any other ideas?

Until it is explained, I have the unpleasant feeling of sitting on a bomb with a burning fuse.

Proost. (I need a stiff one now!)

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Ian Miller.
Honored Contributor

Re: The everlasting joys of FTP

As you probably know processes shown as MUTEX state by SHOW SYSTEM are generally found to be in a MWAIT state waiting for a pooled quota (quota shared between all processes in a job). TQELM is one of those quotas. So by increasing TQELM you may have fixed it.

SDA SHOW SUMMARY in recent versions of VMS does a better job of interpreting the MWAIT state.

I don't recognise the problem and there is nothing much in the ECO 4 release notes.

What were you going to do with DELTA to remove the MUTEX state - increase the quota?
Have you considered using AMDS/Availability Manager for this sort of thing? It is a bit less error prone, as the user interface is slightly more friendly than DELTA (more than one error message for a start - Eh?)
____________________
Purely Personal Opinion
Jan van den Ende
Honored Contributor

Re: The everlasting joys of FTP

Ian,

-- increasing TQELM may have fixed it.
Yes, that is my best guess also, and that is why I did it. But "may have" is a bit meagre for sleeping quietly when in Nashua...

-- SHOW SUMM in SDA etc. Well, that and a lot more is what I hope to learn next week in the Nashua SDA training!

-- nothing like that in the release notes.
Do you think that if I found something like that, I would even have considered installing them? 8-(

AMDS/Avail better/easier for the job?
Yeah, probably, but I _DO_ have experience with the DELTA route, and I still want to get more at ease with the AMDS manipulations.

As an aside, I DO think that DELTA has a much too innocent-looking name.
Way back in my (DME, that is pre-UX) ICL life, we had a more or less comparable utility: online patching of images or memory locations.
Just to let you know what you were playing with, the utility was called DYNAMITE.

So far so good, but:
ANYBODY ANY ANSWERS, PLEASE??

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: The everlasting joys of FTP

Haven't seen this problem as I don't use VMS much these days, but many months ago I ran into what, after quite some diagnosis, turned out to be a name resolver bug, so I know that feeling and can only offer my sympathy.
.
Jan van den Ende
Honored Contributor

Re: The everlasting joys of FTP

Uwe,

unless you know of name resolver problems that _DO_ affect ftp, but at the same time, from the same remote (multiples of that), _DO NOT_ affect telnet, then it does not apply here!

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: The everlasting joys of FTP

No, I don't think so - this was a letter of condolence ;-)

You can have my dose of alcohol - I need to work tomorrow all day long.
Have a nice and quiet evening!
.
labadie_1
Honored Contributor

Re: The everlasting joys of FTP

Hello

It seems that the settings shown by
$ tcpip show service ftp /full
and
$ mc authorize show tcpip$ftp
are not kept consistent with each other.

I have long asked that
$ tcpip set service ftp/limit=xxx
should also modify the settings of the tcpip$ftp username, but it seems I have not asked long enough :-)

Here is an extract from a post I did in comp.os.vms (google groups , +comp.os.vms +labadie +ftp +mutex should give only one hit)

I had done some tests with Tcpip 5.3 Eco 1
the Ftp process alone (just started) consumes
196 096 in Bytlm
8512 in Pgflquota
1 in Tqelm

each other user will need
about 66 000 in Bytlm
128 in pgflquota
at least 1 in tqelm
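
To turn the figures above into an estimate for a given number of concurrent users, a minimal sketch follows. It assumes consumption grows linearly per session, which the measurements suggest but do not prove; the per-session values are quoted as "about" and "at least", so the result is a lower bound. The function name `estimate_quota_use` is made up for this illustration.

```python
# Costs taken from the measurements quoted above (TCPIP 5.3 ECO 1).
# Other TCPIP versions may differ; treat these as rough inputs only.
BASE = {"BYTLM": 196_096, "PGFLQUOTA": 8_512, "TQELM": 1}
PER_SESSION = {"BYTLM": 66_000, "PGFLQUOTA": 128, "TQELM": 1}


def estimate_quota_use(sessions):
    """Estimate quota consumed by the FTP server with `sessions` users connected.

    Assumes a simple linear model: base cost plus a fixed cost per session.
    """
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    return {name: BASE[name] + sessions * PER_SESSION[name] for name in BASE}


if __name__ == "__main__":
    for n in (10, 15, 50):
        print(n, estimate_quota_use(n))
```

Comparing the result with the quotas of the tcpip$ftp username shows which quota would run out first for a given service limit.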


I remember a good laugh at the Itanium porting days: the intermediate Itanium version we had on the RX2600 showed a default of 1000 users for the FTP service, but could not go beyond 9 or 10 because of the settings of the Tcpip$ftp username...
Jan van den Ende
Honored Contributor

Re: The everlasting joys of FTP

Merci beaucoup, Gerard.

So, you do have an experience like ours, and in another version at that!
So it might have been a coincidence that we now for the first time bumped into that quota?
Improbable, but not impossible.
There HAD been an interruption, and shortly before the reported problems probably quite a few people were trying to reconnect.
But that HAS happened before (several times, alas!)

When back at work I will look into these settings.

John, Hein, Volker:
Is it a correct assumption that setting the account TQELM higher than the ftp service limit will prevent recurrence of this issue, or is that thinking too simplistic?

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
labadie_1
Honored Contributor

Re: The everlasting joys of FTP

I think an FTP connection may consume different resources depending on whether you do a simple LIST, PUT, GET, or DELETE. With pquota from the freeware you can monitor this easily (of course AMDS or Availability Manager have the advantage of letting you fix a quota problem online!).

For HP, here is an SPR: make tcpip set service FTP/limit=xxx also modify the quotas of the tcpip$ftp user (and issue a message about Channelcnt and others, if concerned...)
Wim Van den Wyngaert
Honored Contributor

Re: The everlasting joys of FTP

Jan

That's why I monitor all quota of all processes. An alarm is easier than investigating.
Maybe you should do it too.

Have a few on me.

Wim
(btw : why jpe ?)
Wim
Jan van den Ende
Honored Contributor

Re: The everlasting joys of FTP

Wim,


"That's why I monitor all quota of all processes."

The idea sounds nice. But monitoring all quotas for 1000+ processes (regular peak value) in itself requires quite some power.
And, if you are to catch this kind of runaway situation, you have to do it very often as well! I do not think the cost vs gain balance would be positive.

Re jpe: way back when, in primary school, one of my classmates was also called Jan van den Ende. To differentiate, the teachers decided to add my second name, which I did not particularly like.
In high school the classmate issue went away, but the naming stuck. Until someone abbreviated the double first name to the initials JP. In those days, that was rather cool, and my (then) girlfriend (now wife) liked it very much. And when she found an elegant suit lapel pin with a stylised "JPE" and gave that to me, it stuck forever.

Proost.

Have one on me.

J.P. van den Ende (aka jpe)
Don't rust yours pelled jacker to fine doll missed aches.
Volker Halle
Honored Contributor

Re: The everlasting joys of FTP

Jan,

it looks like the TCPIP$FTP_1 process consumes 1 TQE on its own and then another one for each current FTP user session.

If your TCPIP$FTP_1 process is now running with Timer entries remaining nn/50 (after changing the TQELM of the TCPIP$FTP account to 50), you should not be able to deplete the TQEs again, if the service limit stays below 49. Assuming that there would be no TQE leak inside the FTP server...
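
The arithmetic behind that statement can be sketched as below. This assumes exactly one TQE for the server process and one per session, and no leak; if a session can use more than one TQE, the check is optimistic. The function names are invented for this illustration.

```python
def tqe_headroom(tqelm, sessions, server_own=1, per_session=1):
    """TQEs left over under the 'one for the server, one per session' model."""
    return tqelm - (server_own + sessions * per_session)


def limit_is_safe(tqelm, service_limit):
    """True if at least one TQE remains with every allowed session connected."""
    return tqe_headroom(tqelm, service_limit) >= 1


if __name__ == "__main__":
    print(limit_is_safe(50, 48))  # service limit below 49
    print(limit_is_safe(50, 49))  # full limit would use all 50 entries
    print(limit_is_safe(15, 15))  # TQELM 15 with a limit of 15
```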

Volker.
labadie_1
Honored Contributor

Re: The everlasting joys of FTP

Volker, I would say at least 1 TQE.
Jan van den Ende
Honored Contributor

Re: The everlasting joys of FTP

Volker,

"looks like" is as far as I could get myself..
Now, I would like to KNOW for sure, if this IS the issue, and the WHOLE issue.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
labadie_1
Honored Contributor
Solution

Re: The everlasting joys of FTP

Jan

to answer your questions

-- anybody recognise this?

yes, see previous posts

-- can this be caused by FTP ECO 4?
not seen up to now, and I would say, yes but improbable

-- can this be caused by UPDATE_V0400?
same as previous

-- can this be caused by any patch after UPDATE 4, released before 12 may?
same as previous

-- can this be caused by IPfailsafe?
maybe, but never seen, and I do not see the relationship

-- anyone any other ideas?
yes
take a Vms node with no activity, then:
- note the settings of the tcpip$ftp username and the ftp service limit
- monitor the quota of the ftp process
- set the ftp timeout parameter to 45 minutes or, better, 10 hours
- connect to the Vms node (ftp, open, then list or put or any command)
- note the quotas used/remaining of the ftp process
- connect again, and repeat until you get your process in Mutex
You now have a reproducer for this bug :-)
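
The client side of such a reproducer could be scripted roughly as follows, using Python's standard ftplib. This is only a sketch: the host name and credentials in the usage note are placeholders, the helper names are invented here, and watching the quotas of the ftp process still has to be done on the VMS side. Run it only against a test node.

```python
import ftplib


def open_sessions_until_failure(connect, max_sessions):
    """Open sessions one by one, keep them all open, stop at the first failure.

    `connect` is any zero-argument callable returning an object with close().
    Returns (sessions opened, exception that stopped the loop or None).
    """
    sessions = []
    error = None
    try:
        for _ in range(max_sessions):
            try:
                sessions.append(connect())
            except (OSError, EOFError, ftplib.Error) as exc:
                error = exc  # e.g. a connect timeout once the server stalls
                break
        return len(sessions), error
    finally:
        # Always release the sessions again, even if something went wrong.
        for session in sessions:
            try:
                session.close()
            except Exception:
                pass


def ftp_connector(host, user, password, timeout=30):
    """Build a connect() callable for a real FTP server (placeholder values)."""
    def connect():
        ftp = ftplib.FTP(host, timeout=timeout)  # connects on construction
        ftp.login(user, password)
        ftp.pwd()  # issue one command so the session is fully established
        return ftp
    return connect
```

A call such as `open_sessions_until_failure(ftp_connector("testnode.example", "someuser", "secret"), 60)` would then report how many sessions were accepted before a connection failed or timed out.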
Volker Halle
Honored Contributor

Re: The everlasting joys of FTP

Jan,

if you want to KNOW, then you have to ask those who KNOW, in this case TCPIP Engineering.

This issue has caused some problems in your production environment, so why not escalate it to TCPIP engineering, thereby creating the 'SPR' (nowadays those things are called PTRs) suggested by Gerard?

That's the way the system is supposed to work...

Volker.
Jeremy Stubbs
Occasional Advisor

Re: The everlasting joys of FTP

IMHO this is NOT a bug. For many years now it's been standard practice to increase the quotas of the TCPIP$FTP account if the FTP service limit is increased above the default of 10. (Admittedly this "standard practice" has yet to appear in the docs.)

In general the following rules apply:

Increase the BYTLM, TQELM, and ASTLM on the UCX$FTP account on UCX
or TCPIP$FTP account on TCPIP

TQELM should be set above the FTP service limit (rule of thumb is TQELM = 2 * service limit)
ASTLM should be equal to or greater than TQELM
BYTLM should be increased by a similar factor to TQELM
e.g.

UAF>mod tcpip$ftp /BYTLM=500000 /TQELM=50
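
As a worked form of these rules of thumb, here is a small sketch that derives target values from a service limit and the account's current quotas. The function name is invented for this illustration, and the current values must be read from your own UAF record; none are assumed here. "A similar factor" is interpreted as scaling BYTLM by the same ratio as TQELM, which is one reading of the rule, not a documented formula.

```python
def recommend_ftp_quotas(service_limit, current_tqelm, current_bytlm, current_astlm):
    """Apply the rules of thumb above; never lowers an existing quota."""
    if service_limit < 1 or current_tqelm < 1:
        raise ValueError("service_limit and current_tqelm must be positive")
    # Rule 1: TQELM = 2 * service limit.
    tqelm = max(current_tqelm, 2 * service_limit)
    # Rule 2: ASTLM equal to or greater than TQELM.
    astlm = max(current_astlm, tqelm)
    # Rule 3: scale BYTLM by the same ratio as TQELM, rounded up.
    bytlm = (current_bytlm * tqelm + current_tqelm - 1) // current_tqelm
    return {"TQELM": tqelm, "ASTLM": astlm, "BYTLM": bytlm}
```

The resulting numbers are what would go into the UAF> modify command shown above; a quota change in the UAF only takes effect for processes created afterwards, so the FTP service has to be restarted.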
Jan van den Ende
Honored Contributor

Re: The everlasting joys of FTP

Jeremy,


"it's been standard practice to increase the quotas of the TCPIP$FTP account if the FTP service limit is increased above the default of 10"

I LOVE the term "standard"!!!
_HOW_ am I supposed to find out about such "standard", when (in your own words) "it still has to appear in the docs"?

Yes, in hindsight it is not all that illogical, but others like me (and how many before me?) are bound to get hurt by this.
Should it not be automatic, or at the very least be very visibly signalled when it becomes a risk, so that potential problems are avoided in advance?
_THAT_ would be VMS style, this is more like "welcome to the world of unpleasant surprises!".

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
labadie_1
Honored Contributor

Re: The everlasting joys of FTP

Hello Jed

Thanks for giving us this pertinent info

Géra
Ian Miller.
Honored Contributor

Re: The everlasting joys of FTP

Jeremy, if you have a software support contract, see if you can log a call to get this added to the documentation, as it should be there.

Or perhaps use the documentation feedback page
http://h71000.www7.hp.com/doc/fb_doc.html

only by a hp customer telling hp directly does the documentation get changed.
____________________
Purely Personal Opinion
Galen Tackett
Valued Contributor

Re: The everlasting joys of FTP

"The everlasting joys of FTP" -- gotta love that title, JP!

While I haven't run into this problem myself [yet] the discussion here has been instructive. It's got me (and probably others) thinking about potential similar issues involving resource utilization, and that's a good thing. Maybe it will help us all to avoid them.

Thanks to all who've contributed to this discussion!

Galen
labadie_1
Honored Contributor

Re: The everlasting joys of FTP

Ian

Jeremy does not have a contract, as he is from HP, Vms/Tcpip support

:-)
Paul Janssen
Advisor

Re: The everlasting joys of FTP

One of the purposes of coming to Nashua is to find out the solution to such a problem, but now, by coincidence(?) this gets solved.

I suffered from such a problem in August last year. More than 8 connections per sec made the ftp server stop responding. There was no way to stop and start the service. Only a reboot helped!

So I asked around (also support), posted on itrc, upgraded from 7.3 to 7.3-2, added ECOs, increased memory, ran autogen and finally decided to change the app. Only the last of these solved the issue.

I think more of this should go in autogen (and in the docs), so we do not have to go through the unpleasant feeling of having to reboot a VMS server!

Paul Janssen



Anton van Ruitenbeek
Trusted Contributor

Re: The everlasting joys of FTP

Paul,

For this you don't have to reboot your system. Only stop/start the FTP process.
yk:
$ @SYS$STARTUP:TCPIP$FTP_SHUTDOWN
$ @SYS$STARTUP:TCPIP$FTP_STARTUP.COM

AvR
NL: Meten is weten, maar je moet weten hoe te meten! - UK: Measurement is knowledge, but you need to know how to measure!
Paul Janssen
Advisor

Re: The everlasting joys of FTP

Well, that did not help. You could stop and start the service, but the server still would not pick up connections, so a reboot was needed...