Operating System - OpenVMS

Problems Running a TCPIP Service

 
SOLVED
KNewman
Occasional Visitor


Hi,

This is my first time in this forum. I'm fairly new to VMS, but this problem is a mystery to even the more seasoned people I work with so I'm hoping you can help me with this.

I'm trying to get the Veritas backup client running on OpenVMS 8.3 (HP TCP/IP 5.6-9 ECO2). I've installed it *just fine* on like 3 other boxes, all running 8.3, but this one is just not working for some reason.

Here's what the BPCD service looks like now:

TCPIP> sh service bpcd/fu

Service: BPCD
State: Enabled
Port: 13782 Protocol: TCP Address: 0.0.0.0
Inactivity: 5 User_name: SYSTEM Process: BPCD
Limit: 50 Active: 0 Peak: 1

File: $1$DGA481:[BPCD]BPCD_STARTUP.COM
Flags: Listen

Socket Opts: Keepalive Rcheck Scheck
Receive: 0 Send: 0

Log Opts: Acpt Actv Dactv Conn Error Exit Logi Logo Mdfy Rjct TimO Addr
File: $1$DGA481:[OPENV.NETBACKUP.LOGS]BPCD.LOG

Security
Reject msg: This is a test message

Accept host: 0.0.0.0
Accept netw: 0.0.0.0

But when I run bpcd_axp.exe (the executable associated with this service), a "show client" never connects ... it just hangs there. What's stranger is that when I try telnetting to this server on port 13782, I make a connection, but the second I hit a key it drops the connection.

This telnet behavior is much different from the servers where the service *is* working, in that when I telnet to a server with a good client on port 13782, the connection stays open and I can type whatever gibberish I want into the terminal.

What makes this problem so difficult is that I can't get any sort of log or output from whatever is going on. I've tried enabling, disabling, etc., but a log file is never updated with new information.

Any ideas? Thanks guys!
10 REPLIES
Steven Schweda
Honored Contributor

Re: Problems Running a TCPIP Service

> [...] Veritas backup [...]

About which I know nothing, but, ...

> TCPIP> sh service bpcd/fu

This looks the same on the working systems
(give or take a disk name)?

> But when I run bpcd_axp.exe [...]

Is that the client program which is supposed
to connect with the BPCD server?

> [...] a "show client" never connects ...
> it just hangs there.

And that's a command to bpcd_axp.exe?

> File: $1$DGA481:[OPENV.NETBACKUP.LOGS]BPCD.LOG

Anything in there?
Hoff
Honored Contributor

Re: Problems Running a TCPIP Service

Call Veritas support for assistance; it's their tool, and they're the best path for troubleshooting assistance here.

Given that this is a fresh install, it is likely that some sort of (initial or on-going) product support is in place.

And yes, check the logs and such, and check the IP routing with whoever is supporting the network here. It's common for an environment with a "$1$DGA481:" disk to also have managed switches and firewalls and a fairly complex IP network, and managed switches can cause all manner of network connectivity errors, from subtle to catastrophic.

Also check the system parameter settings and the current system environment against the Veritas documentation, including startup procedures and logical names and such.
KNewman
Occasional Visitor

Re: Problems Running a TCPIP Service

1. Here's what it looks like on a working server (more or less the same but I took out logicals on the one with problems):

$ tcpip show service bpcd/fu

Service: BPCD
State: Enabled
Port: 13782 Protocol: TCP Address: 0.0.0.0
Inactivity: 5 User_name: SYSTEM Process: BPCD
Limit: 50 Active: 0 Peak: 6

File: SYS$SYSDEVICE:[BPCD]BPCD_STARTUP.COM
Flags: Listen

Socket Opts: Keepalive Rcheck Scheck
Receive: 0 Send: 0

Log Opts: None
File: not defined

Security
Reject msg: not defined
Accept host: 0.0.0.0
Accept netw: 0.0.0.0


2. Yes, bpcd_axp.exe is the executable called by the bpcd_startup.com script.

3. Show client is the command that tries a connection to localhost on port 13782. Basically looks at itself.

4. $1$DGA481:[OPENV.NETBACKUP.LOGS]BPCD.LOG doesn't exist, and that's what's confusing about this. I can never seem to get it to spit out a log or ANY piece of helpful information.
Richard W Hunt
Valued Contributor

Re: Problems Running a TCPIP Service

What is your disk farm? HP Storage-Works variant, EMC variant, some other brand of SAN?

Verify that the disk is set up correctly. Verify that the SAN is mapping you correctly with normal rights. Verify that your logging disk (if different from where the product lies) is mapped correctly.

Cluster or non-cluster environment?

If Veritas is anything like Legato, you have a backup server and your Alphas are running backup agents. Verify that your backup server can see your system. I.e., this might be a problem not specific to your failing server but at the other end of the agent connection. Also check for the case where the two ends can see each other but the routing is asymmetric; that has been the show stopper in every quick-dropping connection I've ever seen.
Sr. Systems Janitor
Robert Gezelter
Honored Contributor

Re: Problems Running a TCPIP Service

KNewman,

As Hoff suggested, this would be an appropriate use of product support.

However, BEFORE calling support, I would definitely verify that all other IP-related features work (e.g., telnet, ftp) on the same path. I would also verify that there are no firewalls or other devices that can possibly interfere with the connection.

- Bob Gezelter, http://www.rlgsc.com
KNewman
Occasional Visitor

Re: Problems Running a TCPIP Service

SAN is HP StorageWorks, but I seem to doubt that's the problem. Server is up and running just fine with almost every other program working properly.

Nonclustered environment, too.

I've ruled out firewall because I CAN telnet to the server on that port. It's just that when I do connect, it instantly drops the connection. This is different behavior than if the port wasn't open, in which case I would just get a connect timeout.

I've noticed similar behavior of a TCP/IP service when I create a dummy service with a dummy /FILE (e.g. a hello world script as its /FILE). When testing with the dummy service I get the same behavior: successful connect on the proper port, but instant drop of the connection.

Also, FTP, SSH, telnet all work fine between the same 2 subnets. That being said, there *was* a small issue with someone directly copying a new sysuaf onto the system and screwing up the UAF entries for each service user, but I've fixed that since then.

I'd love to call Veritas support but I don't think we have a contract for this product so I was seeing if anyone else might have an idea.
KNewman
Occasional Visitor

Re: Problems Running a TCPIP Service

I finally found some logs with useful information. One such log, named BPCD_AXP.LOG, shows this:

type BPCD_AXP.LOG
$ Set NoOn
$ VERIFY = F$VERIFY(F$TRNLNM("SYLOGIN_VERIFY"))
%SET-W-NOTSET, error modifying $1$DGA481:
-SET-E-INVDEV, device is invalid for requested operation
%DCL-W-IVVERB, unrecognized command verb - check validity and spelling
\\
%DCL-W-IVVERB, unrecognized command verb - check validity and spelling
\\
%DCL-W-IVVERB, unrecognized command verb - check validity and spelling
\\
%DCL-W-IVVERB, unrecognized command verb - check validity and spelling
\\
(repeats a lot, then ...)
%RMS-W-RTB, 10791 byte record too large for user's buffer
SYSTEM job terminated at 8-SEP-2009 14:44:34.48

Accounting information:
Buffered I/O count: 172 Peak working set size: 3008
Direct I/O count: 70 Peak virtual size: 173072
Page faults: 208 Mounted volumes: 0
Charged CPU time: 0 00:00:00.05 Elapsed time: 0 00:00:00.07


Does that shed any light on this?
Hoff
Honored Contributor
Solution

Re: Problems Running a TCPIP Service

If you just installed it, then you likely have some sort of vendor support.

Your SYLOGIN.COM or LOGIN.COM or other local DCL code looks broken here, or somebody configured a DCL symbol or three that's masking some command that the Veritas code looks to be using.

As a test, nuke the contents of SYLOGIN.COM and LOGIN.COM (an "$ EXIT" or a "$ SET VERIFY" or "$ DEFINE SYLOGIN_VERIFY TRUE" at the top is a starting point) and see how far you get. You might need to do that a couple of times to get a look at the code.
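
A minimal sketch of the SYLOGIN_VERIFY variant of that test. It assumes the site's SYLOGIN.COM still contains the stock F$VERIFY(F$TRNLNM("SYLOGIN_VERIFY")) line (the BPCD_AXP.LOG output posted earlier in the thread suggests it does). Defining the logical in the system table is an assumption made here so that the service's network process can translate it; it is not something stated in the thread. While it is defined, every login on the system gets verification turned on, so keep the window short.

```dcl
$! Assumption: SYLOGIN.COM checks the SYLOGIN_VERIFY logical, and the
$! service process can see logicals in the system table.
$ DEFINE /SYSTEM SYLOGIN_VERIFY TRUE     ! needs SYSNAM privilege
$!
$! ... connect to the service once, then read the process log file ...
$!
$ DEASSIGN /SYSTEM SYLOGIN_VERIFY        ! turn command echoing back off
```

With verification on, the log should show each SYLOGIN.COM command just before the error it produces, which makes the offending lines much easier to spot.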

Whatever the cause, you need to start digging through the site-local DCL code that gets invoked before the Veritas code gets called.

>SAN is HP StorageWorks, but I seem to doubt that's the problem.

Careful with that "doubt". Debugging and troubleshooting involve the avoidance of assumption and of presumption. This is as close to the raw and unvarnished application of the scientific method as most computer folks tend to get.
KNewman
Occasional Visitor

Re: Problems Running a TCPIP Service

Wow, I actually concluded the same thing and fixed this before you posted, but I'll give you credit :)

Apparently the errors in the login.com were causing the service to fail to start up properly. I think it wasn't handling the error output whatsoever, then bombing out as it couldn't parse the input. The key here was finding that log though, and that was the bitch of this.
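
For anyone else hunting for a log like this, a rough sketch of one way to look. The two locations are assumptions: SYS$MANAGER is the usual default directory for the SYSTEM account that the service runs under, and [BPCD] is the directory named in the service's File: entry. The thread does not say where BPCD_AXP.LOG actually turned up.

```dcl
$! List .LOG files modified today, showing the modification date.
$! Both directory specifications are guesses; adjust for the local system.
$ DIRECTORY /MODIFIED /SINCE=TODAY /DATE=MODIFIED SYS$MANAGER:*.LOG
$ DIRECTORY /MODIFIED /SINCE=TODAY /DATE=MODIFIED $1$DGA481:[BPCD...]*.LOG
```

Running this right after a failed connection attempt narrows the list to files the service process has just written.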

Thanks for your help everyone.
Steven Schweda
Honored Contributor

Re: Problems Running a TCPIP Service

> %SET-W-NOTSET, error modifying $1$DGA481:
> -SET-E-INVDEV, device is invalid for requested operation

This looks like a [SY]LOGIN.COM trying to do
a SET TERMINAL command in a non-interactive
job, where the "terminal" is a disk (file).
As "HELP SET TERMINAL Parameter" says:

device-name[:]

Specifies the device name of the terminal. The default is
SYS$COMMAND if that device is a terminal. If the device is not
a terminal, an error message is displayed.

In a non-interactive job (often a batch job),
you get an error like this. The usual cure
is some conditionality on the SET TERMINAL
command, like, say:

[...]
$ IF (F$MODE() .EQS. "INTERACTIVE")
$ THEN
$ IF (F$GETDVI( "TT:", "DEVCLASS") .EQ. 66)
$ THEN
$ SET TERMINAL /INQUIRE /INSERT
[...]

To see the details, stick in some kind of
"SHOW LOGICAL SYS$COMMAND" where you can see
its output.
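
Filling in the elided parts, a complete version of that guard might look like the sketch below. The two diagnostic lines at the end are an addition for illustration, and the value 66 is taken to be the terminal device class (DC$_TERM), as in the fragment above.

```dcl
$! Run terminal setup only for interactive logins on a real terminal.
$ IF F$MODE() .EQS. "INTERACTIVE"
$ THEN
$     IF F$GETDVI("TT:", "DEVCLASS") .EQ. 66   ! 66 = DC$_TERM
$     THEN
$         SET TERMINAL /INQUIRE /INSERT
$     ENDIF
$ ENDIF
$!
$! Temporary diagnostics; remove once the service log looks clean.
$ WRITE SYS$OUTPUT "Mode: ", F$MODE()
$ SHOW LOGICAL SYS$COMMAND
```

In a process started by a TCP/IP service, F$MODE() would be expected to return "NETWORK" rather than "INTERACTIVE", so the SET TERMINAL command is skipped and the %SET-W-NOTSET error should no longer appear in the log.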