Operating System - OpenVMS

TCPIP connectivity problems

 
Willem Grooters
Honored Contributor


Environment:
VMS 7.3-2, TCPIP 5.4. ECO level unknown (both VMS and TCPIP)

Two issues I think may be related.
In both cases, data is transferred by a program on one node to another program on another node (not clustered) over TCPIP.
The message layout is always the same: 6 bytes containing the data size, then the string containing the data, exactly the size specified. A kind of "variable size with fixed prefix".
Sending a message involves:
* Determine the exact size of the payload in bytes.
* Convert this into a 6-byte ASCII string and send those 6 bytes to the socket.
* Send that number of bytes of data to the socket.
This repeats until all data is sent, after which an "end-of-data" message is sent (typically "000003STP").
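
For illustration, the sending steps above could be sketched as follows. This is only a sketch, not the code in question: the names `send_all` and `send_message` are made up for the example, and the zero-padded decimal prefix is inferred from the "000003STP" end-of-data message.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

#define PREFIX_LEN 6  /* width of the ASCII size prefix */

/* Send exactly len bytes; a single send() may accept fewer than requested. */
static int send_all(int sock, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(sock, buf, len, 0);
        if (n <= 0)
            return -1;              /* error: give up */
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Frame one message: 6-digit zero-padded decimal size, then the payload. */
static int send_message(int sock, const char *data, size_t len)
{
    char prefix[PREFIX_LEN + 1];

    if (len > 999999)
        return -1;                  /* size does not fit in six digits */
    snprintf(prefix, sizeof prefix, "%06lu", (unsigned long)len);
    if (send_all(sock, prefix, PREFIX_LEN) != 0)
        return -1;
    return send_all(sock, data, len);
}
```

With this, `send_message(sock, "STP", 3)` puts the nine bytes `000003STP` on the wire.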
The receiving process is a TCPIP service, and will run a program that will read the socket in this way:
* Read 6 bytes from SYS$NET, convert it to a number
* Read that number of bytes from SYS$NET and process this data
* Continue until the message is "STP"
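
A rough sketch of the reading side, assuming the program already holds a socket descriptor for the connection on SYS$NET. The names `recv_all` and `recv_message` are invented for the example. The point it shows is that TCP is a byte stream: one recv() may return only part of the prefix or of the payload, so each read has to loop until the full count has arrived.

```c
#include <sys/types.h>
#include <sys/socket.h>

#define PREFIX_LEN 6  /* width of the ASCII size prefix */

/* Read exactly len bytes; returns 0 on success, -1 on error or closed connection. */
static int recv_all(int sock, char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = recv(sock, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;                   /* partial read: keep going */
        len -= (size_t)n;
    }
    return 0;
}

/* Receive one framed message into buf (capacity cap).
   Returns the payload length, or -1 on error or malformed prefix. */
static long recv_message(int sock, char *buf, size_t cap)
{
    char prefix[PREFIX_LEN];
    long size = 0;
    int i;

    if (recv_all(sock, prefix, PREFIX_LEN) != 0)
        return -1;
    for (i = 0; i < PREFIX_LEN; i++) {
        if (prefix[i] < '0' || prefix[i] > '9')
            return -1;              /* prefix must be six decimal digits */
        size = size * 10 + (prefix[i] - '0');
    }
    if ((size_t)size > cap)
        return -1;                  /* payload would overflow the buffer */
    if (recv_all(sock, buf, (size_t)size) != 0)
        return -1;
    return size;
}
```

The caller would loop on `recv_message` and stop once the payload is "STP". Rejecting a non-numeric prefix matters here: if framing ever gets out of step, a garbage size would otherwise make the reader wait for bytes that never come.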

Some programs then exit; others wait for the next message (state: HIB) and stay active until an EXIT message is received (or the process is killed).

All I/O over TCPIP is done by socket I/O (send and recv, both in C code).

Issue 1.
A program sends huge amounts of data to another program, which acknowledges receipt once in a while. The connection may stay live and active for hours. In some specific cases, however, reading the size or the data comes to a more or less abrupt stop: one or two bytes are read, but then all transport seems to stall and the read times out. The program retries, but no more data is found, even though the sending program has sent it.
The weirdest thing is that after a restart, when the same record is sent again, the very same thing may happen. Note: MAY - in most cases it will, but in some the data is received and processed - and we found nothing whatsoever that could have caused the problem.

Issue 2.
A program sends a request (< 250 bytes), which is received and processed by the receiving program. The receiver handles the request and writes a message containing the answer to the socket. The answer may be < 40 characters or over 32K in total, but even the smallest is received in chunks of 4-5 bytes at a time.
In some cases, a request is sent but not delivered: either the service isn't even started, or the request is not received by the listener.
In other cases, it is recognized that data is sent to the socket, but when the program reads the stream, nothing is found. The connection still exists, though: the sender hasn't dropped the line.

The weird thing is that connections originating on this node have no trouble at all. It's always a problem with connections that originate OUTSIDE the node.

Any ideas?
Willem Grooters
OpenVMS Developer & System Manager
6 REPLIES
Bojan Nemec
Honored Contributor

Re: TCPIP connectivity problems

Willem,

You have probably already checked the network adapter and switch settings (speed and mode). A few days ago I had a problem where the Alpha-to-switch settings were OK but the settings between two switches were not (one switch set to autonegotiate and the other to fixed).

The second thing: is the socket set to FIONBIO (non-blocking)? If yes, there is a common programming error. When sending to the socket, the programmer doesn't test the status (WOULDBLOCK) and resend the data. This works OK on a single system (or two adjacent nodes) but not when the connection is slow or one of the systems or the network is busy.

Bojan
Willem Grooters
Honored Contributor

Re: TCPIP connectivity problems

Bojan,

Thanks for the quick reply.
On the connection itself: This is to be part of the investigation - but it's handled by another group I have no connection with. All I can do is ask them to look, and wait.

On the "programming error": Agreed when both parties could send at the same time (I have such a set of programs), but that is not the case here. In both cases we use a synchronous protocol, where send and receive never interfere: the next message is not sent before the previous one has been processed (as proven by receiving an ACK from the recipient, or a message from the sender). Also, the first message sent should never block, since the receiving process isn't there yet anyway - and still this may fail sometimes.

Willem
Willem Grooters
OpenVMS Developer & System Manager
Michael Yu_3
Valued Contributor

Re: TCPIP connectivity problems

Hi Willem,

If there are multiple routers between the source and the destination, then the following might explain issue 1.

There are multiple routers between the source and destination. These routers have different MTU sizes. Normally the traffic between the source and destination goes through router A (say, with MTU size of 1500). When router A fails, the traffic goes through router B (say, with MTU size of 536). This would be OK, if the frames sent from the source do not have the flag "Don't Fragment" set. However if this "Don't Fragment" flag is set, the frame will be dropped by router B.

Just a wild guess.

Thanks and regards.

Michael
Wim Van den Wyngaert
Honored Contributor

Re: TCPIP connectivity problems

Willem,

Did you check the TCP counters (ucx sho prot) and the tcptrace output?
Or post it over here.

Wim
John Yu_1
Valued Contributor

Re: TCPIP connectivity problems

Maybe keepalive values need to be tweaked?
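
As a starting point, keepalive can be switched on per socket with the standard SO_KEEPALIVE option. This is only a sketch with a made-up helper name, `enable_keepalive`. It does not change the keepalive timing itself, which is normally a system-level TCP/IP setting.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the TCP stack to probe an idle connection.
   Returns 0 on success, -1 on failure (errno set by setsockopt). */
static int enable_keepalive(int sock)
{
    int on = 1;
    return setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, (char *)&on, sizeof on);
}
```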
Artificial intelligence is rarely a match for natural stupidity.
Willem Grooters
Honored Contributor

Re: TCPIP connectivity problems

The first issue turned out to be something with data in the receiving application. We still have no idea how that could interfere with the connection, but once the record was removed, the problem was gone as well.
This is still under investigation - but low priority.

The second issue turned out to be a problem on "the other side" - some mismatch in that environment.

Anyway, it seems the problems have disappeared.
Willem Grooters
OpenVMS Developer & System Manager