
Markus Waldorf_1
Regular Advisor

What's wrong with SMTP in TCP/IP 5.1 eco 5?

Hello,

Before I turn on debugging and go through a lot of data... has anyone had the problem below and knows how to fix it?

Problem:

I'm running TCP/IP 5.1 ECO 5 on OpenVMS 7.2-1 with all the latest patches. For the last couple of days I have been trying to get SMTP working. The local mail system works fine and I can send local mail between the nodes in the cluster, but SMTP does not work well.

Mail from System@host, or from any other user account, to any account (or the same account) on the same machine takes a long time to process, 10 to 20 minutes; so does mail coming in from outside. There is no delay when entering the user@host name at the "mail to" prompt, so name resolution does not seem to be the issue. SMTP mail from outside fills up the SMTP queues, and I always have messages pending and on hold. Sending SMTP mail to the outside seems to work.

I tried the host name, the mail exchanger, and also the IP address as the host in user@host, but it made no difference. I checked DNS, the mail exchanger, and nslookup; everything works and resolves.

Running Analyze mail shows that it does not find queue entries for some files, and the number of files with no queue entries keeps growing.

What I have done so far:

I stopped the SMTP server, cleaned out all the mail files and restarted SMTP, which recreated the queues. Analyze mail then showed no errors, but as soon as mail comes in it's the same story. I also deleted all the TCPIP accounts in the system and the TCPIP configuration files. Since I had copied RIGHTSLIST.DAT, SYSUAF.DAT and VMSMAIL_PROFILE.DAT from another system after I first set up TCPIP, I thought maybe something was missing. I ran TCPIP$CONFIG again, which recreated all the accounts (such as TCPIP$SMTP) and the queues, but the same problem persists.


Thanks,
Markus

P.S.

SYSTEM@NW2>@TCPIP$EXAMPLES:TCPIP$RESTART_SMTPQ.COM
SUB_Q = "SYS$BATCH"
SUB_TIME = "+00:10:00.00"
PAGFILCNT_MIN = 1000 Hex = 000003E8 Octal = 00000001750
TERMINATE_SRCHSTR = ""
SMTP_EXEC_Q = "TCPIP$SMTP_NW2_01"
SMTP_EXEC_Q = "TCPIP$SMTP_NW2_02"
There are 1 SMTP execution queues.
SMTP_EXEC_Q = "TCPIP$SMTP_NW2_01"
There are 0 stopped SMTP execution queues.
Queue(s) running.
SMTP_SYMBIONT = "SMTP_NW2_01"
1-JUL-2009 15:59:27.09 PID of SMTP_NW2_01 2540029E Pages left: 4241616
%PURGE-W-SEARCHFAIL, error searching for SYS$COMMON:[SYSMGR]TCPIP$RESTART_SMTPQ.LOG;*
-RMS-E-FNF, file not found
%SUBMIT-F-OPENIN, error opening SYS$SPECIFIC:[TCPIP$SMTP]TCPIP$RESTART_SMTPQ.COM; as input
-RMS-E-FNF, file not found
SYSTEM@NW2>


TCPIP> sho conf smtp

SMTP Configuration
Options
Initial interval: 0 00:30:00.00 Address_max: 16 EIGHT_BIT
Retry interval: 0 01:00:00.00 Hop_count_max: 16 RELAY
Maximum interval: 3 00:00:00.00 NOHEADERS

Timeout     Initial   Mail   Receipt   Data   Terminate
  Send:        5        5       5        3       10
  Receive:     5

Alternate gateway: not defined
General gateway: not defined

Substitute domain: not defined
Zone: not defined

Postmaster: TCPIP$SMTP
Log file: SYS$SPECIFIC:[TCPIP$SMTP]TCPIP$SMTP_LOGFILE.LOG

Generic queue         Queues   Participating nodes

TCPIP$SMTP_NW2_00     1        NW2


%TCPIP-I-ANA_NOENTR, no queue entry found for file SYS$SPECIFIC:[TCPIP$SMTP]09070115455478_SYSTEM-2540029E.TCPIP_NW2;1

%TCPIP-I-ANA_COMPLE, ANALYZE completed on host NW2

%TCPIP-I-ANA_FEPAIR, found 74 file-queue entry pairs
%TCPIP-I-ANA_DELQEN, deleted 4 queue entries
%TCPIP-I-ANA_FILNOQ, found 17 files with no queue entries
%TCPIP-I-ANA_FILHLD, holding 0 files in directory
%TCPIP-I-ANA_FILDEL, deleted 0 files from the Postmaster directory
%TCPIP-I-ANA_SUBFIL, submitted 0 files to the generic queue
%TCPIP-I-ANA_FILACE, encountered 0 file access errors
%TCPIP-I-ANA_NONCFF, found 0 files in Postmaster directory
%TCPIP-I-ANA_FILCOR, found 0 corrupted control files in Postmaster directory
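
To track whether the "files with no queue entries" count keeps growing between runs, the summary lines above can be reduced to numbers. The following is only a minimal sketch: it assumes the output of Analyze mail has been captured to a string, and the function name `parse_analyze_summary` and the sample text are illustrative, not part of TCP/IP Services.

```python
import re

# Matches summary lines such as
# "%TCPIP-I-ANA_FILNOQ, found 17 files with no queue entries".
# Per-file lines (ANA_NOENTR) and ANA_COMPLE carry no count and are skipped.
SUMMARY_LINE = re.compile(
    r"^%TCPIP-I-(ANA_[A-Z]+), "
    r"(?:found|deleted|holding|submitted|encountered) (\d+) "
)

def parse_analyze_summary(text):
    """Return {message_id: count} for every summary line that carries a count."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts

# Sample text copied from the output shown in this post.
SAMPLE = """\
%TCPIP-I-ANA_COMPLE, ANALYZE completed on host NW2

%TCPIP-I-ANA_FEPAIR, found 74 file-queue entry pairs
%TCPIP-I-ANA_DELQEN, deleted 4 queue entries
%TCPIP-I-ANA_FILNOQ, found 17 files with no queue entries
%TCPIP-I-ANA_FILHLD, holding 0 files in directory
"""

print(parse_analyze_summary(SAMPLE))
```

Comparing the `ANA_FILNOQ` value from successive runs would show whether orphaned control files are still accumulating.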

SYSTEM@NW2>type tcpip$smtp_logfile.log

%%%%%%%%%%%% 1-JUL-2009 16:08:32.35 %%%%%%%%%%%%
%TCPIP-I-SMTP_LOGSUC, using log file SYS$SPECIFIC:[TCPIP$SMTP]TCPIP$SMTP_LOGFILE.LOG

%%%%%%%%%%%% 1-JUL-2009 16:08:32.36 %%%%%%%%%%%%
%TCPIP-I-SMTP_SYMBRUN, symbiont is running the queue TCPIP$SMTP_NW2_01
SYSTEM@NW2>type tcpip$smtp_logfile.log


SYSTEM@NW2>nslookup
Default Server: ng2lan.yy.xxx.org
Address: 192.168.100.201

> nw2.cz.rferl.org
Server: ng2lan.yy.xxx.org
Address: 192.168.100.201

Name: nw2.yy.xxx.org
Address: 172.18.152.129

> exit
SYSTEM@NW2>mail

MAIL> s
To: system@nw2.yy.xxx.org
CC:
Subj: test
Enter your message below. Press CTRL/Z when complete, or CTRL/C to quit:
test
Exit

MAIL> exit

SYSTEM@NW2>
... 10 min. later
New mail on node NW2 from SMTP%"system@nw2.yy.xxx.org"
SYSTEM@NW2>


SYSTEM@NW2>ucx sho ver

Compaq TCP/IP Services for OpenVMS Alpha Version V5.1 - ECO 5
on a COMPAQ AlphaServer DS20E 833 MHzP running OpenVMS V7.2-1



SYSTEM@NW2>ucx sho int
                                                      Packets
Interface   IP_Addr           Network mask       Receive      Send     MTU

IE1         192.168.100.200   255.255.255.0         2891      2426    1500
LO0         127.0.0.1         255.0.0.0               32        32    4096
WE1         172.18.152.129    255.255.255.0        20229     16343    1500

SYSTEM@NW2>type TCPIP$SMTP_LOGFILE.LOG

%%%%%%%%%%%% 1-JUL-2009 16:08:32.35 %%%%%%%%%%%%
%TCPIP-I-SMTP_LOGSUC, using log file SYS$SPECIFIC:[TCPIP$SMTP]TCPIP$SMTP_LOGFILE.LOG

%%%%%%%%%%%% 1-JUL-2009 16:08:32.36 %%%%%%%%%%%%
%TCPIP-I-SMTP_SYMBRUN, symbiont is running the queue TCPIP$SMTP_NW2_01

Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

I don't know why, but I made some progress.

Sending from nw1.xx.yyy.org as System to system@nw1.xx.yyy.org or to nw2.xx.yyy.org works now, although I don't get a new-mail notification for the same account on the other node.

But while going through some of the previously received files I found the returned mail below. Why is this? The 192.168.100 interfaces are the other interfaces in the same machines, which connect to the DNS server; they are called nw1lan and nw2lan. The nw1 and nw2 interfaces are in the 172.18.152 network. The default gateway is 172.18.152.1.


From: SMTP%"TCPIP$SMTP@nw2.xx.yyy.org"
To: system@nw1.xx.yyy.org
CC:
Subj: Returned mail


---- Transcript of session follows ----

%TCPIP-E-SMTP_EXCMAXHOP, maximum number of hops exceeded; mail loop suspected

---- Unsent message follows ----

Return-Path: system@nw1.xx.yyy.org
Received: from nw2lan (192.168.100.200)
by nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 16:35:29 +0200
Received: from nw2lan (192.168.100.200)
by nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 16:35:27 +0200
Received: from nw2lan (192.168.100.200)
by nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 16:35:24 +0200
Received: from nw2lan (192.168.100.200)
by nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 16:35:22 +0200
Received: from nw2lan (192.168.100.200)
by nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 16:35:21 +0200
Received: from nw2lan (192.168.100.200)
by nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
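
The bounce above shows the same "from nw2lan ... by nw2" stamp over and over, which is what a mail loop looks like in the headers. As a rough sketch (not how TCP/IP Services itself counts hops, which this thread does not establish), the Received: lines of a saved message can be counted and grouped. The function names and the three-stamp sample are illustrative; the sample assumes the continuation lines are indented, as they would be in the raw message.

```python
from collections import Counter
from email import message_from_string

def received_headers(raw_message):
    """Return the Received: headers with folding whitespace collapsed."""
    msg = message_from_string(raw_message)
    return [" ".join(value.split()) for value in msg.get_all("Received", [])]

def hop_count(raw_message):
    """One Received: header is added per relay, so this approximates the hops."""
    return len(received_headers(raw_message))

def relay_counts(raw_message):
    """Count identical 'from X by Y' stamps, ignoring the timestamp after ';'."""
    return Counter(
        header.split(";", 1)[0].strip() for header in received_headers(raw_message)
    )

# Three stamps rebuilt from the bounce in this post (timestamps differ only in seconds).
STAMP = (
    "Received: from nw2lan (192.168.100.200)\n"
    "\tby nw2.xx.yyy.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);\n"
    "\tWed, 1 Jul 2009 16:35:{sec} +0200\n"
)
SAMPLE = (
    "Return-Path: system@nw1.xx.yyy.org\n"
    + "".join(STAMP.format(sec=s) for s in ("29", "27", "24"))
    + "\ntest\n"
)

print(hop_count(SAMPLE))
print(relay_counts(SAMPLE).most_common(1))
```

A single relay pair that accounts for nearly every hop, as here, suggests the host keeps handing the message back to itself rather than passing it along a chain of different servers.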

Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

Anyway, it's still showing

...
%TCPIP-I-ANA_NOENTR, no queue entry found for file SYS$SPECIFIC:[TCPIP$SMTP]09070116150498_TCPIP$SMTP-25400336.TCPIP_NW2;1

%TCPIP-I-ANA_NOENTR, no queue entry found for file SYS$SPECIFIC:[TCPIP$SMTP]09070116151038_TCPIP$SMTP-25400337.TCPIP_NW2;1

%TCPIP-I-ANA_COMPLE, ANALYZE completed on host NW2

%TCPIP-I-ANA_FEPAIR, found 0 file-queue entry pairs
%TCPIP-I-ANA_DELQEN, deleted 0 queue entries
%TCPIP-I-ANA_FILNOQ, found 10 files with no queue entries
%TCPIP-I-ANA_FILHLD, holding 0 files in directory
%TCPIP-I-ANA_FILDEL, deleted 0 files from the Postmaster directory
%TCPIP-I-ANA_SUBFIL, submitted 0 files to the generic queue
%TCPIP-I-ANA_FILACE, encountered 0 file access errors
%TCPIP-I-ANA_NONCFF, found 0 files in Postmaster directory
%TCPIP-I-ANA_FILCOR, found 0 corrupted control files in Postmaster directory
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

I configured TCP/IP 5.3 ECO 2 on another machine, which has its first interface in a different network, but its second interface is also in the 192.168.100 network, like host nw1.

HOST NW1:

nw1.xx.yyy.org - 172.18.152.128
nw1lan.xx.yyy.org - 192.168.100.101

HOST NG2:

ng2.xx.yyy.org - 172.18.136.65
ng2lan.xx.yyy.org - 192.168.100.201

There is no connection between nw1 and nw2 because they are on separate switches protected by a firewall.

nw1lan and ng2lan are connected to the same switch, with no firewall or anything else in between.

on NG2
mail to: system@nw1lan.xx.yy.org

mail returned:

Return-Path:
Received: from nw1lan.xx.yyyrl.org (192.168.100.100)
by ng2lan.xx.yyyrl.org (V5.3-18E, OpenVMS V7.3-1 Alpha);
Wed, 1 Jul 2009 20:00:36 +0200 (MET DST)
Date: Wed, 1 Jul 2009 20:02:29 +0200
Message-Id: <09070120022971@nw1.xx.yyyrl.org>
From: TCPIP$SMTP@nw1.xx.yyyrl.org
To: system@ng2lan.xx.yyyrl.org
Subject: Returned mail


---- Transcript of session follows ----

%TCPIP-E-SMTP_EXCMAXHOP, maximum number of hops exceeded; mail loop suspected

---- Unsent message follows ----

Return-Path: system@ng2lan.xx.yyyrl.org
Received: from nw1lan (192.168.100.100)
by nw1.xx.yyyrl.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 20:02:28 +0200
Received: from nw1lan (192.168.100.100)
by nw1.xx.yyyrl.org (V5.1-15Q, OpenVMS V7.2-1 Alpha);
Wed, 1 Jul 2009 20:02:27 +0200
Received: from nw1lan (192.168.100.100)
xx.yyy


Bind looks correct on NG2 and same on NW1:

Default Server: localhost
Address: 127.0.0.1

> nw1lan
Server: localhost
Address: 127.0.0.1

Name: nw1lan.xx.yyy.org
Address: 192.168.100.100

> nw1
Server: localhost
Address: 127.0.0.1

Name: nw1.xx.yyy.org
Address: 172.18.152.128

> ng2lan
Server: localhost
Address: 127.0.0.1

Name: ng2lan.xx.yyy.org
Address: 192.168.100.201

> ng2
Server: localhost
Address: 127.0.0.1

Name: ng2.xx.yyy.org
Address: 172.18.136.65


Can someone explain?
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

New info:


I'm still playing around with this, btw. I think I have figured something out.

I turned off RELAY on NW2 (tcpip set conf smtp /option=norelay), which is the default. Now when I send from either NG2 or NW2 it no longer loops, but comes right back with NOSUCHUSER.

When I send from NW2 to itself, to system@nw2lan.yy.xxx.org or system@test.xxx.org, it bounces with NOSUCHUSER too. But when I send to system@nw2.yy.xxx.org it works fine.

NW2 and NW2LAN are both on the same machine, just different IP.

I do have an MX record for nw2lan, though, in the authoritative xxx.org domain.

test    IN MX 10 nw2lan.yy.xxx.org.
        IN MX 20 nw1lan.yy.xxx.org.   ; (it's offline)
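
For reference, here is a minimal sketch of how a mailer conventionally uses MX records like these: hosts are tried in ascending preference order, and under the classic RFC 974 rule a host that finds itself in the list discards itself and every record with an equal or higher preference value, which prevents loops. Whether TCP/IP Services 5.1 applies this rule, and which names it treats as "itself", is not established in this thread; the function `candidates` and the `local_names` sets are illustrative assumptions.

```python
def candidates(mx_records, local_names):
    """Return the MX hosts to try, in order, after RFC 974 style loop pruning.

    mx_records  -- iterable of (preference, host) pairs
    local_names -- names the sending host recognises as itself (an assumption here)
    An empty list means "deliver locally".
    """
    ordered = sorted(mx_records)
    local = {name.rstrip(".").lower() for name in local_names}
    for preference, host in ordered:
        if host.rstrip(".").lower() in local:
            # Keep only hosts strictly better (lower value) than ourselves.
            return [h for p, h in ordered if p < preference]
    return [h for _, h in ordered]

# The two records quoted in this post.
MX_FOR_TEST = [(10, "nw2lan.yy.xxx.org."), (20, "nw1lan.yy.xxx.org.")]

# Case 1: the host knows nw2lan is one of its own names.
print(candidates(MX_FOR_TEST, {"nw2.yy.xxx.org", "nw2lan.yy.xxx.org"}))

# Case 2: the host only knows itself as nw2, so nw2lan looks like a remote relay.
print(candidates(MX_FOR_TEST, {"nw2.yy.xxx.org"}))
```

In the second case the first candidate is the sending machine's own second interface, so a relaying host would connect to itself. That is one possible explanation for a loop with RELAY on and a rejection with NORELAY, offered here as a hypothesis only.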


SYSTEM@NW2>ucx sho int
                                                      Packets
Interface   IP_Addr           Network mask       Receive      Send     MTU

IE1         192.168.100.200   255.255.255.0          313       241    1500
LO0         127.0.0.1         255.0.0.0              103       103    4096
WE1         172.18.152.129    255.255.255.0         3495      2346    1500

SYSTEM@NW2>ucx sho host /local

LOCAL database

Host address Host name

127.0.0.1 LOCALHOST, localhost
192.168.100.101 dns1
192.168.100.201 dns2
172.18.152.1 gw
172.18.152.128 nw1, NW1
192.168.100.100 nw1lan, NW1LAN
172.18.152.129 nw2, NW2
192.168.100.200 nw2lan, NW2LAN


I never had such difficulties ever before in setting up SMTP in such a simple way. This is crazy.
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

Btw, email to NW2 from some other computer in our network is working OK; at least the problem I had at first seems to have disappeared, for whatever reason. I don't know if it performs well, but at least nothing is queueing up. NG2, as I found out, had a wrong default gateway for the 172.18.136 network, one that was not even in the subnet, but it should never have been used anyway.

What remains is that I cannot send email between the machines connected through the 192.168.100 network. Somehow SMTP does not seem to listen on the 192.168.100.200 interface, and mail loops when RELAY is on. Why? I don't know.
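
Whether anything answers on port 25 of a particular interface can be checked directly from another machine on the same network. This is a minimal sketch using only the Python standard library; the function name `smtp_banner` is illustrative, and it tests nothing beyond "does a TCP connection succeed and does the server send a greeting".

```python
import socket

def smtp_banner(host, port=25, timeout=5.0):
    """Connect to host:port and return the first line the server sends.

    Returns None if the connection is refused, times out, or closes
    without sending anything.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            data = conn.recv(512)
    except OSError:
        return None
    if not data:
        return None
    return data.decode("ascii", errors="replace").splitlines()[0]
```

Calling `smtp_banner("192.168.100.200")` and `smtp_banner("172.18.152.129")` from a host that can reach both addresses would show whether the server greets on each interface. A greeting on both would point away from a listening problem and toward name handling or routing.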
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

Here is another thought.

There is only a default gateway defined for the 172.18.152 network, but not for the 192.168.100 network, which is where I want SMTP mail to travel.

Could it be that SMTP ignores the MX record and re-routes internally to the 172 network because there is no default gateway for the 192 network, where the next hop is the interface itself?
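
On the gateway question: in ordinary IP routing, a destination inside a directly connected subnet is reached through that interface with no gateway at all, and the default gateway is consulted only for everything else. The sketch below illustrates just that rule with the addresses from the interface listing; it says nothing about how the SMTP server picks its target host, and `next_hop` is an illustrative name, not a TCP/IP Services function.

```python
import ipaddress

# Interfaces from "ucx sho int"; netmask 255.255.255.0 is written as /24.
INTERFACES = {
    "IE1": ipaddress.ip_interface("192.168.100.200/24"),
    "WE1": ipaddress.ip_interface("172.18.152.129/24"),
}
DEFAULT_GATEWAY = ipaddress.ip_address("172.18.152.1")

def next_hop(destination):
    """Return (interface, next-hop address) for a destination, or None."""
    dest = ipaddress.ip_address(destination)
    # Directly connected subnet: deliver on-link, no gateway involved.
    for name, iface in INTERFACES.items():
        if dest in iface.network:
            return name, dest
    # Otherwise hand the packet to the default gateway via its own subnet.
    for name, iface in INTERFACES.items():
        if DEFAULT_GATEWAY in iface.network:
            return name, DEFAULT_GATEWAY
    return None

print(next_hop("192.168.100.201"))  # ng2lan, on the same subnet as IE1
print(next_hop("172.18.136.65"))    # ng2, outside both local subnets
```

By this rule, traffic to 192.168.100.201 would leave through IE1 regardless of the missing gateway, so the absence of a default gateway on the 192.168.100 network should not by itself push mail onto the 172 network.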

I wonder about the set conf smtp /zone /gateway options, but I'm not clear on it yet.
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

I have spent a lot of time trying to troubleshoot this problem. I reconfigured DNS and played with the name, local MX, BIND MX and other SMTP routing settings, but it made no difference whatsoever. The debug output has a lot of information, but it did not give me answers. I could not change the default domain of the machine because it would not let me, reboot or not, and I could not figure out why.

Tomorrow I'm going to uninstall TCP/IP 5.1 and try 5.0a. I have another couple of machines with the same configuration logic, but running 7.1-2 and 5.0a, and it works fine. I need eco 4 to fix some issue of "broken SMTP links", which requires 7.2, and that was the main reason to upgrade from 7.1-2.

Perhaps a bit off topic, but what I have always liked about OpenVMS is that it gets me the information I need without my having to wonder what the developer of the software wanted to achieve. This TCP/IP product, however, has far too many buttons that don't seem to make any difference.
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

Well, what a nightmare. The same problem in TCP/IP 5.0a, with a small difference: I could send to each other node in the cluster by using its host name, provided smtp options=norelay, but no luck using the MX record - unknown host. The basic problem remains. I re-installed TCP/IP 5.1 eco 5 again.
Markus Waldorf_1
Regular Advisor

Re: What's wrong with SMTP in TCP/IP 5.1 eco 5?

No solution yet. I opened a new thread.