
Slow SMTP

 
SOLVED
Richard W Hunt
Valued Contributor

Slow SMTP

I've got a US Dept. of Defense system which is on an isolated network because (of course) the DoD network gurus don't trust anything that isn't Windows. Typical of the government to let the fox guard the chicken coop, eh?

Anyway, it's an OpenVMS 7.3-2 cluster with TCP/IP servers 5.4 ECO 7. I have no incoming SMTP traffic due to a firewall in the way, but I can send SMTP to a Mail Relay server.

I searched the forums for issues relating to slow SMTP performance. We are talking 4 to 8 hour backlogs in some cases, like today. I've already made the queues balance against each other so that one cluster member can pass jobs to the other cluster member. Load balancing helps a little. But sadly, not enough.

One highly relevant article relates to having zones and alternate gateways defined when you must use an external mail relay server. Before anyone asks, I have the zone and alternate gateway defined.

I'm pretty sure the problem is related to a slow Mail Relay server that is not under my control. I'm thinking that a possible way to improve matters is to open up more SMTP queues on my cluster. I'm currently running 3 "SMTP execution" queues per node. I know that there will be a trade-off at some point where the queues compete with each other for attention on the relay server and eventually saturate it. If I have to play with tuning it that way, I will.

Before I twiddle the number of execution queues, I'm wondering if anyone recalls an OpenVMS limit or other issue they have found with SMTP "execution" queues. What sort of SMTP queue widths have you folks implemented safely with your OpenVMS systems and clusters? Any other thoughts you might offer would be greatly appreciated.
Sr. Systems Janitor
7 REPLIES
Jim_McKinney
Honored Contributor

Re: Slow SMTP

Maybe best not to assume that the problem is with the gateway system (though it likely is). You might use TCPDUMP to watch a transfer or two and see who is being slow. Or you might even just use telnet and simulate an SMTP session to get a feel for where the slowness is.

$ TELNET/PORT=25 gateway.domain
helo
mail from:
rcpt to:
data
put all of your
message text here
.
quit

How quickly do you get a HELO back in response to your HELO? (This might indicate a backlog of connections on the server side.) After you get the HELO, is it still slow? (This might indicate a lack of resources on the server side.) If it's really slow, then there's not much you can do about it except complain.
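
If typing the session by hand gets tedious, the same check can be scripted. This is only a rough Python sketch, not anything specific to TCP/IP Services: it connects, waits for the greeting, sends HELO and QUIT, and records how long each step takes. The function name, the `helo_name` default and the 30-second timeout are assumptions made for illustration; it stops before MAIL FROM so that no message is actually sent.

```python
import socket
import time


def time_smtp_stages(host, port=25, helo_name="client.example", timeout=30.0):
    """Return a list of (stage, seconds, reply) for connect, greeting, HELO, QUIT.

    helo_name is a placeholder; use whatever name the relay expects from you.
    """
    results = []
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    results.append(("connect", time.monotonic() - start, ""))

    with sock, sock.makefile("rwb") as stream:

        def read_reply():
            # Multi-line replies use "250-..." on every line except the last.
            lines = []
            while True:
                line = stream.readline().decode("ascii", "replace").rstrip("\r\n")
                lines.append(line)
                if len(line) < 4 or line[3] != "-":
                    return " | ".join(lines)

        def timed(stage, command=None):
            # Time from sending the command (if any) to the end of the reply.
            begin = time.monotonic()
            if command is not None:
                stream.write(command.encode("ascii") + b"\r\n")
                stream.flush()
            reply = read_reply()
            results.append((stage, time.monotonic() - begin, reply))

        timed("greeting")
        timed("HELO", f"HELO {helo_name}")
        timed("QUIT", "QUIT")

    return results
```

Calling `time_smtp_stages("gateway.domain")` and printing the tuples shows which step eats the time: a slow "connect" or "greeting" points at a connection backlog, a slow "HELO" at something the server does once it knows who you are.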
Richard W Hunt
Valued Contributor

Re: Slow SMTP

Jim, thanks for replying. I've done a manual operation a couple of times and the delays are measured in single-digit seconds tops. Rarely more than 2 or 3, usually 1 or just a fraction thereof.

I've turned up debugging to see what I can see. I was going to analyze that output tomorrow after the backlog got eaten away overnight. I'm at log level 2, with TCPIP$SMTP_SYMB_TRACE defined and TCPIP$SMTP_NOSEY defined. I should see something of value.

I'm having an interesting time trying to decipher the timestamps I've seen so far in the logs. They don't appear to be linear with respect to processing, which makes me wonder where the logged time comes from. Maybe they are logging the time the messages were enqueued as opposed to when the processing in the "execution queue" started. But that would be a bit bizarre even for OpenVMS. Anyway, tomorrow is another day and by then, my log files should be full of perfectly confusing data to sample.
Sr. Systems Janitor
John Gillings
Honored Contributor
Solution

Re: Slow SMTP

Richard,

This may be way off the mark... but I've found the most common cause of mysterious delays in any TCPIP service is DNS lookups. These can appear in all kinds of unexpected places. Make sure every system in the chain can see a valid DNS somewhere. This can be a problem in isolated networks - you can't just point to an external source, so you need to actively make certain your systems can make sense of random addresses.

Also remember there may be both forward and backward translations, and in complex networks, and systems with multiple adapters, requests may appear from unexpected paths. For example, suppose your system is sending the request to the Mail Relay on a secondary adapter, for which there is no back translation. That may introduce a DNS timeout in the Relay for every message you send.
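
One way to test the theory from any box that has Python is to time the forward and the backward translation separately. A minimal sketch, assuming only the standard `socket` resolver calls; the function names are made up for illustration, and it tells you what the machine running it sees, which is not necessarily what the Relay sees when it looks you up.

```python
import socket
import time


def timed_lookup(resolver, argument):
    """Call resolver(argument); return (answer_or_None, seconds, error_or_None)."""
    start = time.monotonic()
    try:
        answer, error = resolver(argument), None
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        answer, error = None, exc
    return answer, time.monotonic() - start, error


def check_forward_and_reverse(hostname):
    """Time name -> address, then address -> name if the first step worked."""
    address, seconds, error = timed_lookup(socket.gethostbyname, hostname)
    report = {"forward": (address, seconds, error)}
    if address is not None:
        def reverse(addr):
            # gethostbyaddr returns (primary_name, aliases, addresses)
            return socket.gethostbyaddr(addr)[0]
        report["reverse"] = timed_lookup(reverse, address)
    return report
```

A lookup that fails quickly is usually harmless; the one to worry about is a lookup that takes many seconds before it fails, because that is a resolver timeout being paid on every message.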
A crucible of informative mistakes
Jim_McKinney
Honored Contributor

Re: Slow SMTP

> delays are measured in single-digit seconds tops

Is this true from beginning to end of a whole SMTP mail delivery or just in the handshaking? What is the volume of mail messages that you're handling?

Is your primary goal to just get them to depart from your system quickly or to get them delivered to their destination quickly - if the former then I suppose that you could just forward them to some other local system that isn't configured to deny relaying and then it becomes their problem ;)
Richard W Hunt
Valued Contributor

Re: Slow SMTP

My goal here is to minimize the time involved in sending messages when I have a backlog.

I have to connect to another department's network for security reasons. The detailed configuration is too complex to describe in words. Let's just say they put me in my own little world where I can't do diddley-squat without going through external proxies, relays, or other servers. So what I've done for e-mail is pointed to the ONLY mail relay server available to me that has access to external sites.

OK, as to timing: With debug level 2 enabled, I can get information from the SMTP log files to let me determine certain timing intervals and trends.

1. When no backlog, latency from creating the message to getting the execution queue to start transmitting it is on the order of 0.05 to 0.15 seconds. Including the fact that ALL messages go through the generic queue even if backlog is zero. So it isn't a queue latency issue on my system.

2. When there IS a backlog, I am able to see how fast each transmission queue accepts the next job. I was flat-out astonished to note that with a backlog, the time between jobs in ANY of my execution queues was 06:24 or 06:25 (mm:ss) like bloody clockwork.
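
To put a number on item 2: 06:24 is 384 seconds, so each execution queue moves at most one message every 384 seconds while the backlog lasts. A small Python sketch of that arithmetic follows. The start times are hypothetical values spaced like the pattern above, not real log lines, and the count of 6 queues assumes two cluster members with 3 execution queues each.

```python
from datetime import datetime


def gaps_in_seconds(start_times, fmt="%H:%M:%S"):
    """Seconds between consecutive job starts on one execution queue.

    Assumes all times fall on the same day (no wrap past midnight).
    """
    parsed = [datetime.strptime(t, fmt) for t in start_times]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(parsed, parsed[1:])]


def messages_per_hour(queue_count, seconds_per_job):
    """Upper bound on throughput if every queue sends one message per interval."""
    return queue_count * 3600 / seconds_per_job


# Hypothetical job start times, spaced like the 06:24 / 06:25 pattern.
sample = ["09:00:00", "09:06:24", "09:12:49", "09:19:13"]
print(gaps_in_seconds(sample))      # [384.0, 385.0, 384.0]
print(messages_per_hour(6, 384))    # 56.25
```

At roughly 56 messages an hour across the whole cluster, a few hundred queued messages is enough to produce a multi-hour backlog. If the interval is a fixed per-job cost rather than relay saturation, throughput should scale more or less linearly with the number of execution queues.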

I'm not going to disagree that there is a problem here and wouldn't bet against it being a DNS issue at the relay server level. However, my engineers and I are not sure whether this near-constant delay is a timeout for a sequence of failing DNS lookups or a denial of connection from the relay server itself. From the log file, I don't THINK I'm seeing denials of connection but can't be sure.

The engineer also asked me a question that I cannot answer, and therefore I'll toss it up for consideration. Does anyone know SMTP's behavior when connecting to a relay server and a backlog of messages exists?

My engineer's specific question was "Does VMS try to keep an SMTP connection open between transmissions or does it close the channel after each message and open a new channel (to the same relay server)?"

I'm not certain but I think it does a "keep open" strategy. Thing is, I can't find anything in the manuals or RFC 821 to define this particular implementation detail. Any comments? Any ideas how I would be able to tell?
Sr. Systems Janitor
Jim_McKinney
Honored Contributor

Re: Slow SMTP

> SMTP's behavior when connecting to a relay server and a backlog of messages exists?

It isn't really an SMTP behavior in play here. It's a configuration issue that depends upon how the server side is configured. I would expect that they would have configured their SMTP service to accept connections beyond what they can actually service and hold them in a sort-of queued state to be serviced on a FIFO basis (their "backlog"). They could alternatively not permit any backlog and just reject connections beyond their capacity. The delay that you describe suggests that they have it configured to accommodate a "backlog".

> keep an SMTP connection open between transmissions
> Any ideas how I would be able to tell?

I suspect not - I'm more familiar with the MultiNet implementation and it does not. You can determine this if you have access to a sniffer or TCPDUMP program - just watch traffic between these two systems and see if you maintain a local static port assignment as multiple messages are passed.
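
Once you have noted the local port used for each message in the capture, the decision is mechanical. A tiny Python sketch of that reasoning; the function name and the example port numbers are made up, and it assumes you have already pulled one local port per delivered message out of the trace by hand.

```python
def connection_strategy(local_ports):
    """Classify a per-message list of local TCP ports seen in a capture.

    local_ports holds one entry per delivered message, in order.
    """
    if len(local_ports) < 2:
        return "not enough messages to tell"
    distinct = len(set(local_ports))
    if distinct == 1:
        return "one connection reused for every message"
    if distinct == len(local_ports):
        return "new connection for every message"
    return "mixed: some connections carried more than one message"


# Hypothetical ports read off a trace of three consecutive messages.
print(connection_strategy([49152, 49153, 49154]))
```

The same local port across several messages means the channel is being kept open; a fresh port each time means a new connection (and a new trip through the server's accept queue) per message.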
Richard W Hunt
Valued Contributor

Re: Slow SMTP

Jim, thanks for the comment about "open ports." It rang a bell.

I have a utility that will tell me what ports are open outbound as well as inbound. So if I run my port scanner and find that I have open ports to SMTP while at the same time I have no SMTP backlog, I should be able to answer the question definitively for my engineering team.

I should have thought of that myself. Guess I'm on the wrong side of the forest/trees conundrum, eh?
Sr. Systems Janitor