
Re: TCP Port becomes unusable

 
SOLVED
Randy W. Suhrbier
Occasional Advisor

TCP Port becomes unusable

OpenVMS Alpha 7.3-2
TCPIP 5.4 ECO5

Problem:
Every few months a process appears to be unable to open its predefined listening port. We reboot to solve the problem; we have not yet tried just restarting TCPIP.

Situation:
We have a weekly data deployment cycle that results in all the vendor's application processes being restarted. The vendor's environment includes a library around TCPIP, part of which supports opening listening ports. If the library is unable to open a listening port it will go into a loop, making up to 24 attempts, once every 5 seconds. After the last attempt fails the process aborts. Each attempt is logged to SYS$OUTPUT. We do not have access to the vendor's code, but debug output appears to indicate the looping only occurs if the return status is SS$_DUPLNAM.
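For readers who want to picture the retry behaviour described above, here is a minimal sketch in Python using BSD-style sockets rather than the OpenVMS QIO interface. It is not the vendor's code, which we cannot see. The name bind_with_retry, its parameters, and the log text are assumptions made for illustration, and EADDRINUSE stands in for SS$_DUPLNAM.

```python
import errno
import socket
import time


def bind_with_retry(port, attempts=24, delay=5.0, host="", sleep=time.sleep):
    """Open a listening socket, retrying only when the port is reported in use.

    Hypothetical helper: 24 attempts, 5 seconds apart, as described in the post.
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen()
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # any other failure is not retried
            # Each failed attempt is logged, as the library does to SYS$OUTPUT.
            print(f"attempt {attempt}/{attempts}: port {port} is in use")
            if attempt < attempts:
                sleep(delay)
    # All attempts failed: the real process aborts at this point.
    raise RuntimeError(f"could not bind port {port} after {attempts} attempts")
```

With the defaults, the last attempt is made 115 seconds after the first (23 waits of 5 seconds), so a port that is still unavailable after the whole loop has been blocked for roughly two minutes or more.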

It is quite common to see a few failed attempts on a restart even though the BGDevice shows the REUSEADR option.

Every 4 months or so a process will fail all 24 attempts, every time we restart it. TCPIP SHOW DEVICE/PORT= shows nothing. The latest occurrence lasted a whole day, during which we left the process down for extended periods of time.

I have looked through the ECO 6 & 7 release notes and haven't noticed anything relevant.

Randy S.
13 REPLIES
Robert Gezelter
Honored Contributor

Re: TCP Port becomes unusable

Randy,

If you are rebooting to resolve the problem, please force a crash next time so that the dump can be examined.

The question is: is this a quirk of IP, or is it some form of hung process connection?

Certainly, while the dump is useful, it would be interesting to attempt just a restart of the IP stack. Please remember that such a restart must be done from:

- a direct connection,
- a DECnet remote terminal,
- a LAT session, or
- a batch job.


In other words, the restart cannot be done from a telnet or ssh session (which would use the IP stack).

- Bob Gezelter, http://www.rlgsc.com
Steven Schweda
Honored Contributor

Re: TCP Port becomes unusable

> [...] looping only occurs if the return
> status is SS$_DUPLNAM.

The status returned from what? (Is this a
process creation problem, or an IP-related
problem?)
Randy W. Suhrbier
Occasional Advisor

Re: TCP Port becomes unusable

IP-related problem. The status is the one returned in the IOSB for the bind.
Jon Pinkley
Honored Contributor

Re: TCP Port becomes unusable

Randy S.

Have you talked to the vendor of the application? Is there a supplied shutdown procedure for their product? What are you doing to restart the vendor's application processes?

Here is what the manual has to say about this status:

http://h71000.www7.hp.com/doc/73final/6529/6529pro_019.html#tcppmch05_15

SS$_DUPLNAM Programming error. The port being bound is already in use. An attempt to bind the socket to an address and port failed.

Jon
it depends
labadie_1
Honored Contributor

Re: TCP Port becomes unusable

Could you post the output of
$ ucx sh dev bgxxx:/fu
for your bg device twice, when
1) it works fine
2) it is hung

It could help us understand what is going on.

Robert Gezelter
Honored Contributor

Re: TCP Port becomes unusable

Randy,

While it is easier to postmortem a system from a crash dump (which would not be a disruption since you are already rebooting), you can use SDA on the running system to display the relevant data structures to see what the status of the relevant BG device is.

In unrelated situations over the years, I have encountered numerous cases where an application component failed to exit when so instructed, and the restart procedure would fail. Looking at the output of SHOW SYSTEM or going into SDA normally identified the culprit in fairly short order. Manually fixing the problem (in most cases, terminating the malfunctioning process) allowed the restart to occur without the need to resort to a system restart.

Long-time user symposia attendees and forum readers know that I loathe rebooting systems unnecessarily.

- Bob Gezelter, http://www.rlgsc.com
Randy W. Suhrbier
Occasional Advisor

Re: TCP Port becomes unusable

Hi,

1. We stop and start the processes per vendor supplied procedures.

2. We have not seen any evidence that the process has not exited.

3. When the problem exists, UCX SHOW DEVICE/PORT= does not show any devices using the known port.

4. Does anyone have direct experience with the REUSEADR option?
Arch_Muthiah
Honored Contributor

Re: TCP Port becomes unusable

Randy,

When we look into help and Jon's response...
> SS$_DUPLNAM Programming error.

You say $ ucx sho device/port=nnn shows nothing: no service and no process assigned to that port. But the documentation continues:
>The port being bound is already in use.
That would not be correct in your case.

I suspect this may be just a programming error.
Or, as in Jon's post, the application or service assigned to that port was not shut down properly. Once a service is started on a specific port, simply disabling the service or doing a stop/id on the process will stop the process, but the result is unpredictable and the port will be a problem again when we want to use it. The vendor program may want to stop the service by specifying the service name, process name, port name, and protocol name. A proper shutdown is necessary in this case.

We have faced this kind of issue, but we were able to see the open socket name, so we disconnected the device socket; restarting the program then solved the issue.

Another thing that puzzles me is that it happens every 4 months! There may also be a limit on the number of instances of the service that can run on the system; I suspect this because you mentioned that the program does not go through the loop every time.

Check DCL SHOW SYSTEM when you don't find any open socket with that port, and check the options from $ ucx sho devic/full to see whether REUSEADR is set for any other similar process.

Archie
Regards
Archie
Randy W. Suhrbier
Occasional Advisor

Re: TCP Port becomes unusable

Hi,

Normally the device looks like:

Device_socket:  bg621    Type: STREAM
                         LOCAL        REMOTE
           Port:         37395        0
           Host:         *            *
           Service:
                                           RECEIVE       SEND
               Queued I/O                        0          0
   Q0LEN    0  Socket buffer bytes               0          0
   QLEN     0  Socket buffer quota          900000     900000
   QLIMIT   4  Total buffer alloc                0          0
   TIMEO    0  Total buffer limit          7200000    7200000
   ERROR    0  Buffer or I/O waits               1          0
   OOBMARK  0  Buffer or I/O drops               0          0
               I/O completed                     0          0
               Bytes transferred                 0          0

   Options:  ACCEPT REUSEADR KEEP LOOP FDPX_CLOSE
   State:    None
   RCV Buff: WAIT
   SND Buff: None

The LOOP option gives me pause since the documentation says it is reserved for VMS usage.

None of the vendor's apps are registered as TCPIP services.

Some simple testing with the QIO sample server and client programs in sys$examples: shows that restarting an app too quickly will result in SS$_DUPLNAM if REUSEADR is not specified. UCX SHOW DEVICE again does not show a matching device. Is there a way to look deeper into UCX?

Randy S.
Volker Halle
Honored Contributor

Re: TCP Port becomes unusable

Randy S.,

the most detailed information on TCPIP sockets you can probably get is from

SDA> tcpip sho dev/debug/full

When researching similar symptoms, I came across a discussion in:

http://forums.devx.com/showthread.php?t=37492

There is a concept of 'lingering' sockets, which stay around 'for some time'...

I don't know if this helps, but I had not heard about that before.

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: TCP Port becomes unusable

My notes say that without REUSEADR, the connection is only fully closed after about 2 minutes.

**
Q>How is a TCP connection terminated ?
--------------------------------------
Either side can send a FIN packet. The one sending the FIN is doing the active
close. After the FIN, no more data can be sent. The other side can however
continue to send data (called half close). E.g. rsh will close the input
channel for the server when all commands are passed. The FIN must be acked.

If the 2nd FIN is not received by the active closer, the line will be broken
after 75 seconds of idleness (most versions).

Q>What is 2MSL ?
----------------
When the "close connection" second FIN is received, an ACK is given and the
line closed. Before closing however, we must wait and see if the ACK arrived
well. Since there is no ACK for an ACK, the only thing that can happen is that
the FIN is sent again and we must ACK again. This continues until both sides
timeout. The timeout is after 2 times the MSL (about 2 minutes).

Only sockets that set the "REUSEADR" option don't bother about the 2MSL. All
known services use this option but many programs don't.
**
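The 2MSL wait in the notes above can be provoked on a loopback connection. The following is a rough sketch in Python with BSD-style sockets, not the OpenVMS QIO interface. The helper names open_listener and restart_succeeds are made up for illustration, and the exact outcome without the option varies between operating systems. The "server" closes a connection first (the active close), goes away, and is restarted at once on the same port.

```python
import socket


def open_listener(port, reuse, host="127.0.0.1"):
    """Open a listening socket; the option, if requested, is set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    sock.listen()
    return sock


def restart_succeeds(reuse):
    """Simulate: server closes a connection, exits, and restarts immediately."""
    first = open_listener(0, reuse)        # port 0: let the OS pick a free port
    port = first.getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = first.accept()
    conn.close()       # server sends the first FIN: it is the active closer
    client.recv(1)     # returns b"" once the server's FIN has arrived
    client.close()     # second FIN; the server side should now enter the 2MSL wait
    first.close()      # the "old" server is gone
    try:
        open_listener(port, reuse).close() # immediate restart on the same port
        return True
    except OSError:
        return False
```

Calling restart_succeeds(True) should return True, because both the old and the new socket carry the option. On systems where the lingering connection blocks the port, restart_succeeds(False) returns False until the wait expires.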

Wim
Wim
Jon Pinkley
Honored Contributor
Solution

Re: TCP Port becomes unusable

This thread from comp.os.vms discusses the same problem.

http://groups.google.com/group/comp.os.vms/browse_thread/thread/fa0e2ceafcae5c16/647cf29c9353d611?lnk=st&q=#647cf29c9353d611

If that link doesn't work, use a search for "Problem with UCX QIOW IO$_SETMODE Options".

It does seem to be due to a programming error, but more likely a programming error in the TCPIP stack than the application.

Here's the meat of the thread,

"The trick is that, with the QIO interface, you must not specify all
parameters in one call because it seems that they are processed in an
unsuitable sequence. The BG: driver processes them starting with p1, next
p2, p3 and so on, whatever was not passed as zero. This means that parameter
p3 which is used to bind the socket to a specific port is processed _before_
parameter p5 which is used to set socket options. Therefore, because the
UCX$_REUSEADDR option isn't yet set at the time parameter p3 is processed,
the binding fails with SS$_DUPLNAM."
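The ordering problem described in the quote can be pictured with a toy model in Python. This is emphatically not the real BG: driver. The class ToySocket, its setmode method, and the keyword names p3_port and p5_reuse are invented for illustration, and only the numeric processing order (p3 before p5) is taken from the quoted post.

```python
SS_NORMAL = "SS$_NORMAL"
SS_DUPLNAM = "SS$_DUPLNAM"


class ToySocket:
    """Toy model of one socket; NOT the real BG: driver."""

    def __init__(self, busy_ports):
        self.busy_ports = set(busy_ports)  # ports still held by lingering connections
        self.reuse = False
        self.port = None

    def setmode(self, p3_port=None, p5_reuse=None):
        """Handle one request, taking parameters in numeric order: p3, then p5."""
        if p3_port is not None:                      # bind step
            if p3_port in self.busy_ports and not self.reuse:
                return SS_DUPLNAM                    # option not set yet
            self.port = p3_port
        if p5_reuse is not None:                     # socket-option step
            self.reuse = p5_reuse
        return SS_NORMAL


def combined(port, busy_ports):
    """One request carrying both the bind and the option."""
    sock = ToySocket(busy_ports)
    return sock.setmode(p3_port=port, p5_reuse=True)


def split(port, busy_ports):
    """Two requests: the option first, then the bind."""
    sock = ToySocket(busy_ports)
    sock.setmode(p5_reuse=True)
    return sock.setmode(p3_port=port)
```

In this model, combined(37395, {37395}) gives SS$_DUPLNAM while split(37395, {37395}) gives SS$_NORMAL, which matches the advice in the quote: set the option in its own call before binding.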

I've also attached the complete response as a text file.
it depends
Randy W. Suhrbier
Occasional Advisor

Re: TCP Port becomes unusable

Thanks for the great information Jon! I have forwarded it to our vendor for comments. Based on the observed behavior I suspect they are combining the QIOs.

I feel that the problem of never being able to reuse the port is a TCPIP bug. Maybe changing the QIOs will allow us to avoid it.

Randy S.