
Re: SMTP mail issue after TCPIP v5.7 ECO 5 Applied

 
Rick Dyson
Valued Contributor

SMTP mail issue after TCPIP v5.7 ECO 5 Applied

I have an I64 OpenVMS v8.3-1H1 system that has been running TCPIP v5.7 ECO 5 since last Monday.  Its outbound e-mail is limited, mostly automated notifications of problems or successes of batch jobs.  No users (except me) ever use VMSMail to send anything outbound.  This morning, 72 hours after the patch and reboot, e-mail sent this morning is bouncing.  It reports this error:

 

%TCPIP-E-SMTP_UNREACHABL, cannot connect to remote host, healthcare.uiowa.edu
-SYSTEM-F-UNREACHABLE, remote node is not currently reachable

 

That is an Exchange server that I am sending through via their explicit Relay server.  My SMTP uses a relay for all outbound traffic and has been working since I went live in 2012, until Monday.

 

Anyone know what would have changed in ECO 5 to break this?

 

Thanks,

Rick

Steven Schweda
Honored Contributor


 > [...] now has TCPIP v5.7 ECO 5 [...]

 

   Coming from what?  Any other configuration changes?

 

> %TCPIP-E-SMTP_UNREACHABL, cannot connect to remote host, [...]

 

   I know nothing, but, to me, "unreachable" suggests a low-level
network problem, like, say, no route, or a DNS problem.  Can you connect
to there manually?:

 

      telnet /port = 25 healthcare.uiowa.edu

 

(It fails for me (SYSTEM-F-TIMEOUT), but I'm in Minnesota, so I might
expect that.  How selective is that system about SMTP connections?)
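
For a scripted version of the same check, here is a minimal Python sketch (not from the thread; the function name can_connect is hypothetical). It only attempts the TCP connection that the telnet test makes, and it separates a DNS failure from a timeout or a refused connection, which is the distinction that matters when the error says "unreachable".

```python
import socket
from typing import Tuple


def can_connect(host: str, port: int = 25, timeout: float = 10.0) -> Tuple[bool, str]:
    """Attempt a plain TCP connection and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            peer = sock.getpeername()
            return True, f"connected to {peer[0]}:{peer[1]}"
    except socket.gaierror as exc:
        # Name resolution failed before any connection was tried.
        return False, f"DNS lookup failed: {exc}"
    except socket.timeout:
        # No answer at all, e.g. a firewall silently dropping packets.
        return False, "timed out"
    except OSError as exc:
        # Refused, no route to host, and similar low-level errors.
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    # Host names taken from the thread; run this from the affected network.
    for name in ("relay.healthcare.uiowa.edu", "healthcare.uiowa.edu"):
        print(name, can_connect(name, 25))
```

Running it against both the relay name and the bare domain name shows whether the two behave differently from the sending host.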

Hoff
Honored Contributor


While it may well be ECO5 here, it could also be some other OpenVMS or local network error that surfaced secondary to the patch and restart...

 

Use ping, traceroute and dig from the OpenVMS box to check connectivity and DNS translations and addresses, do check the NIC speed and duplex settings (just saw a VMS box misnegotiate), check the SMTP send queues for any backlogged messages, check the server logs on both ends for any relevant errors, enable available diagnostics on the remote server and on the OpenVMS server, then ring up HP support...  Then depending on how fast you hear back, ponder falling back to an earlier ECO kit.

 

Reposting...  Hello...  "Your post has been changed because invalid HTML was found in the message body. The invalid HTML has been removed. Please review the message and submit the message when you are satisfied."   Um, what HTML?

Rick Dyson
Valued Contributor


We were TCPIP v5.7 ECO 4 prior to the Monday downtime.

 

The SMTP config on my VMS box was set up 5 years ago.  SMTP was set up to use a relay server for all outbound mail, and that relay is part of an Exchange-based system.

 

!TCPIP$SMTP.CONF file created at : Fri Jul 29 08:20:00 2011

 

!TCPIP SMTP configuration data:

Initial-Interval : 0 00:30:00.00
Retry-Interval : 0 01:00:00.00
Retry-Maximum : 3 00:00:00.00
Receive-Timeout : 5
Retry-Address : 16
Hop-Count : 16
Send-Timeout-Init : 5
Send-Timeout-Mail : 5
Send-Timeout-Rcpt : 5
Send-Timeout-Data : 3
Send-Timeout-Term : 10
Header-Placement : Bottom
Eightbit : FALSE
Relay : FALSE
Alternate-Gateway : relay.healthcare.uiowa.edu
!General-Gateway : < not defined >
!Zone : < not defined >
Substitute-Domain : lis.healthcare.uiowa.edu

 

 

The mail jobs are hanging in the TCPIP$SMTP*<node>  queues and then dropped back into the generic TCPIP$SMTP queue, but only if the address is for <any user>@healthcare.uiowa.edu.  Any other user/destination I have tried works fine.

 

I have stopped all the SMTP services on my cluster, deleted all the pending batch queue jobs and deleted the mail messages.  Then started the SMTP services on the nodes and sent one test and it still does not work.

 

Generic server queue TCPIP$SMTP

  Entry Jobname Username Blocks Status
  ----- ------- -------- ------ ------

   1307 15061114060992_DYSON-21E00437
                DYSON        5  Holding until 11-JUN-2015 14:38:44.21

 

Server queue TCPIP$SMTP_FERB_1, idle, on FERB::, mounted form DEFAULT

 

Server queue TCPIP$SMTP_FINEAS_1, idle, on FINEAS::, mounted form DEFAULT

 

 

If I telnet to the relay server from a DCL command line and hand-enter a message using the SMTP protocol, it works fine.  FWIW, "healthcare.uiowa.edu" is not an SMTP server; it is a round-robin DNS pointer to our Domain Controllers.

 

$ telnet relay 25
%TELNET-I-TRYING, Trying ... 129.255.116.89
%TELNET-I-SESSION, Session 01, host relay, port 25
220 relay.healthcare.uiowa.edu Microsoft ESMTP MAIL Service ready at Thu, 11 Jun 2015 14:25:49 -0500
ehlo relay.healthcare.uiowa.edu
250-relay.healthcare.uiowa.edu Hello [129.255.116.5]
250-SIZE 52428800
250-PIPELINING
250-DSN
250-ENHANCEDSTATUSCODES
250-AUTH
250-8BITMIME
250-BINARYMIME
250-CHUNKING
250-XEXCH50
250 XSHADOW
mail from:dyson@lis.healthcare.uiowa.edu
250 2.1.0 Sender OK
rcpt to:dysonr@healthcare.uiowa.edu
250 2.1.5 Recipient OK
data
354 Start mail input; end with <CRLF>.<CRLF>
subject: manual test

Hello world

.

250 2.6.0 <1cce9580-f961-435a-b600-8ed16950fc69@HC-EDGE1.healthcare.uiowa.edu> [InternalId=180392432] Queued mail for delivery
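
The hand-typed session above can be approximated with Python's standard smtplib, which is handy for repeating the test from another host. This is a hedged sketch: build_test_message and send_via_relay are hypothetical helper names, the addresses are the ones from the transcript, and no authentication or TLS is attempted because the manual session used neither.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str,
                       subject: str = "manual test",
                       body: str = "Hello world\n") -> EmailMessage:
    """Build the same small message that was typed by hand."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_relay(msg: EmailMessage, relay_host: str,
                   port: int = 25, timeout: float = 30.0) -> dict:
    """Hand the message to the relay; returns the dict of refused recipients."""
    with smtplib.SMTP(relay_host, port, timeout=timeout) as smtp:
        smtp.ehlo()  # same greeting step as the manual EHLO
        return smtp.send_message(msg)


# Example (needs network access to the relay):
# msg = build_test_message("dyson@lis.healthcare.uiowa.edu",
#                          "dysonr@healthcare.uiowa.edu")
# print(send_via_relay(msg, "relay.healthcare.uiowa.edu"))
```

An empty dict from send_via_relay means the relay accepted every recipient; connection problems surface as exceptions instead.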

 

Is there any way to turn on more troubleshooting info for SMTP?  Some logical names perhaps?

 

Rick

Hoff
Honored Contributor
Solution


I'd ring up HP support here, as this whole area is somewhat of a morass — the V5.7 bits are not at all well-documented, and I've gotten bagged as have others.   (VSI is reportedly replacing the IP stack soon, so...)


> We were TCPIP v5.7 ECO 4 prior to the Monday downtime.


You were running TCP/IP Services ECO 4 prior to the problems that commenced on Monday, following the downtime for the patch.   You will hopefully have a backup of the older release prior to the upgrade, as you may need to roll that back in or manually downgrade to ECO 4.

> The SMTP config on my VMS box was set up 5 years ago.


You have been running this OpenVMS configuration for the last five years or so.   Which unfortunately means little — it's not working NOW, and why is not clear.

> SMTP was set up to use a relay server for all outbound, which is part of an Exchange-based system.


You are relaying from OpenVMS to a Microsoft Windows Server running Exchange Server.   Or not, as it appears that there are connectivity issues or DNS issues here.

> If I telnet to the relay server and hand enter a msg using the SMTP protocol it works fine from a DCL command line.  FWIW, "healthcare.uiowa.edu" is not an SMTP server, it is a round robin DNS pointer to our Domain Controllers.

 


The healthcare.uiowa.edu name is also an alias domain for your mail activities.  It's the MX record associated with that domain that matters here, not the Microsoft Windows Server Active Directory servers (hopefully you're not still using domain controllers).   In this case (if everything is working), your mail goes to your relay host SMTP server relay.healthcare.uiowa.edu, and then to the hosts associated with the MX record.

DNS looks OK from out here, but you'll want to repeat some basic checks on your internal network, on the off chance there's a difference between your internal DNS and the public DNS:

$ dig +short -x 129.255.116.89
relay.healthcare.uiowa.edu.
$ dig +short MX healthcare.uiowa.edu
5 hc-edge1.healthcare.uiowa.edu.
5 hc-edge2.healthcare.uiowa.edu.
$ dig +short hc-edge1.healthcare.uiowa.edu.
129.255.126.22
$ dig +short -x 129.255.126.22
hc-edge1.healthcare.uiowa.edu.
$ dig +short hc-edge2.healthcare.uiowa.edu.
129.255.126.23
$ dig +short -x 129.255.126.23
hc-edge2.healthcare.uiowa.edu.
$

Presumably those colon HTML tags in your posting are forum artifacts, and not in the actual file.

As the installer does not appear to update the template file (at least I'm looking at a stale copy here), extract a fresh copy of the configuration file template:

$ lib sys$share:TCPIP$TEMPLATES.TLB;/extr=TCPIP$SMTP_CONF/out=x.x

And have a look at the relay checks comments and the logging settings, and specifically these two:

Try-A-Records: IFNOMX
Altgate-Always: TRUE

And have a look at the debugging settings in the configuration file, too; see if enabling those and restarting the SMTP server gets more details.

Also check your local MX record translations, as you may have a different set than what I'm seeing here.


> Is there any way to turn on more troubleshooting info for SMTP?  Some logical names perhaps?


From the previous reply: the link to the TCP/IP Services SMTP logging information.  Whether that's what is used or the settings in the CONF file, I've not checked.  The handling of the updated SMTP server is unfortunately largely undocumented, beyond what little is included in the release notes.  (Other than the TCPIP$SMTP_FROM logical name, reportedly all of the TCPIP SMTP commands and all of the logical names related to SMTP are now ignored.)

 

Rick Dyson
Valued Contributor


Thanks a lot Hoff for the continued help!

 

I have experimented a little on a test server I have.  BTW, my Conf template was identical to the one I pulled out of the library. 

 

I also added those other two conf options and cycled the SMTP server.

 

Try-A-Records : IFNOMX
Altgate-Always : TRUE

 

Now I am getting e-mail through again!!!!

 

Not sure which one fixed it, but the help is greatly appreciated!!!  HP Support was next but they are not always as quick as venues like this...

 

 

Hoff
Honored Contributor


FWIW...  I suspect this is a difference between the "gateway" and "alternate gateway" mechanisms, where the mail server here used the alternate gateway when delivering outside your domain, and used direct delivery when within.

 

If your domain was something.example.com, then the mail server assumes that anything example.com can be directly delivered, but that messages outside your domain such as one sent to example.org would go via the alternate gateway.  

 

Something then is likely blocking direct delivery within your network, hence the error you were encountering.  In short, you probably want the gateway setting for your mail, and not the alternate gateway setting.  

 

This matches your observed behavior — that mail everywhere else works, but internal mail traffic fails.
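
To make that suspicion concrete, here is a small Python model of the routing decision as I understand it from the description above. It is a guess at the behavior, not the TCPIP$SMTP implementation: the names parent_domain and choose_route are hypothetical, and treating "local" as "the host's domain with its first label removed" is an assumption taken from the something.example.com example.

```python
def parent_domain(domain: str) -> str:
    """Drop the first label: something.example.com -> example.com."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[1:]) if len(labels) > 2 else ".".join(labels)


def choose_route(recipient: str, local_domain: str, alternate_gateway: str,
                 altgate_always: bool = False) -> tuple:
    """Return ("direct", domain) or ("gateway", host) for one recipient."""
    rcpt_domain = recipient.rsplit("@", 1)[1].lower().rstrip(".")
    if altgate_always:
        # Modeled effect of Altgate-Always : TRUE, everything goes to the relay.
        return ("gateway", alternate_gateway)
    local = parent_domain(local_domain)
    if rcpt_domain == local or rcpt_domain.endswith("." + local):
        # Treated as local: delivery is attempted directly (MX or A lookup).
        return ("direct", rcpt_domain)
    return ("gateway", alternate_gateway)


if __name__ == "__main__":
    relay = "relay.healthcare.uiowa.edu"
    me = "lis.healthcare.uiowa.edu"
    print(choose_route("dysonr@healthcare.uiowa.edu", me, relay))
    print(choose_route("someone@example.org", me, relay))
    print(choose_route("dysonr@healthcare.uiowa.edu", me, relay, altgate_always=True))
```

Under this model, only addresses inside healthcare.uiowa.edu take the direct path, which is the path that was failing, and forcing the gateway for everything removes that path.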

 

Looking at all this again, I also don't see any way to specify authentication and an alternate port (e.g. TCP 587), so this whole mechanism probably won't work all that well in the general case.  Not without some added pieces.  But then VSI is reportedly replacing the whole IP stack, too.

 

The "gateway" and "alternate gateway" names are comparatively poor names for these settings now, as this is seemingly more commonly called a relay in other contexts.   It'd make more sense (to me) to have just a gateway specification and a flag or a setting to attempt direct, local delivery — or just send everything via the gateway, as that's probably where you have malware filtering and the rest implemented, and sending direct probably won't save all that much except in very large and very busy networks from a very busy OpenVMS mail server.   (A combination which probably isn't common.)  

 

Postfix allows setting up a transport map here, so you have rather more control over your relays than any of this. 

 

The comparative lack of documentation in the configuration file and the relative lack of mapping from the old settings to the new UI doesn't help. 

 

I'm guessing this whole V5.7 SMTP change was implemented and this parser was written because somebody didn't want to deal with reorganizing the data file.  Can't blame them for that, as the RMS record stuff is pretty ugly to deal with.  The downside is that the wholesale replacement of the UI with an entirely different one is extra work on the end-users, and the project typically requires documentation to explain the new UI.  (That and the default behavior here — when the configuration file is not found — is no error messages and an open relay, which is really bad news for anyone running a mail server.)

 

In short, you probably want to specify the gateway here and not the alternate gateway.  But explicitly sending everything via the gateway does appear to solve this, as you've found.

 

Reposting...  Hello my old friend "Your post has been changed because invalid HTML was found in the message body. The invalid HTML has been removed. Please review the message and submit the message when you are satisfied."  Glad to see you again.  Might this be UTF-8 or some other detritus and not HTML?  Because I didn't insert any HTML here. 

Dennis Handly
Acclaimed Contributor


>what HTML?

 

You may want to look at this topic:

http://h30499.www3.hp.com/t5/Community-Feedback-Suggestions/Invalid-HTML-Removed/m-p/6128857

 

(But still no solutions.)

Craig A Berry
Honored Contributor


FWIW we had similar symptoms with the same ECO.  The sequence here was that once upon a time we had an MX record for gateway selection and this was working fine until some TCP/IP services change broke it.  It might've been the original move to the configuration file in 5.7, though I don't remember for sure.  Back then we got things working again by adding the Alternate-Gateway and General-Gateway options in the configuration file and from that point forward only maintained the configuration file.

 

Then ECO 5 came along, which has the following in the release notes:

 

        19.2 SMTP mail always uses alternate gateway even in local domain.

29-Aug-13        Integrity servers and Alpha

        Problem:

        SMTP is ALWAYS using the alternate gateway setting even if SMTP mail is
        sent to local domain nodes.

        Deliverables:

        [SYSLIB]TCPIP$SMTP_MAILSHR.EXE

        Reference:

        QXCM1001304009

 

The fix to stop using the alternate gateway for local domain nodes effectively resurrected the old MX record, but that record did not point anywhere valid since there had been various other changes to the environment during intervening years and the old mail server wasn't there anymore.

 

Once diagnosed, this was pretty easily fixed by deleting the old MX record and creating a new one pointing to the current mail server.  From reading this thread it looks like setting Altgate-Always : TRUE might've also gotten us working again by restoring the broken behavior we were depending on.