11-26-2003 01:55 AM
TCPIP services do not always react
Two VMS machines, not clustered. Both run VMS 7.1-2; machine A has TCPIP 5.0A ECO1, machine B TCPIP 5.1.
Machine A services a number of external applications. Actually, it's the same command procedure and the same user for each, accessing the same data, but on behalf of different systems (Unix, Windows). A third instance services requests from an application on machine B. These three services are (of course) on different ports, 3011 (S1), 3012 (S2) and 3013 (S3), and have different process names; each has a limit of 15.
The program started behind each service keeps the session open and handles each subsequent request.
S1 has been activated a total of 12 times, S2 the full 15. So I have 12 times a process named S1_
However, S3 can only be invoked 2-3 times; the next invocation does not even produce a logfile. The process (on system B) that tries to access this service gets some error in return. Which one, I cannot tell (at least for now, since the program needs to be altered for that). But given the error and the fact that the service on system A does not produce ANY logging, I conclude that the service isn't even started.
The question is why. Since it apparently DOES work (S1 and S2 DO run, and S3 does for some sessions at least), there must be something that limits the ability to open extra channels. But where?
OpenVMS Developer & System Manager
11-26-2003 02:03 AM
Re: TCPIP services do not always react
Check the MAXPROCESSCNT SYSGEN parameter. Maybe the system has reached that value.
Thanks & regards,
Lokesh
11-26-2003 02:45 AM
Re: TCPIP services do not always react
$ucx sho comm
hope this helps,
Lokesh
11-26-2003 10:20 PM
Re: TCPIP services do not always react
Sorry I can't really help you; I can only encourage you and try to give you a clue. Some services on VMS have a limited number of connections; for example, if you type TCPIP SHOW SERVICE /FULL you can see "limit: nn", where nn is the maximum number of telnet connections to the server. This happened at one of my customers: the value was 3, and the 4th PC could not log in, without any error in any log.
Your trouble sounds like this limitation; look at your service characteristics to find the limit value.
Bye
Antoniov
11-26-2003 11:25 PM
Re: TCPIP services do not always react
MAXPROCESSCNT is big enough. It _could_ be the limit, but when I asked, it had just happened again: MAXPROCESSCNT is over 1200 and the number of processes at that moment was less than 600.
The number of sockets _could_ be a problem, just look at the attachment (most likely happening on NODE3, the requestor). But how to increase it? I dug into the documentation but didn't get a clue...
Antonio:
/LIMIT is not the point. It happens way below that number....
Anyone: I've been told by a colleague it could very well be a matter of buffer exhaustion. But again, I cannot find a clue on how to increase this.
Attached: some info from each node involved. I don't know which node on the cluster causes the problem; I tend to suspect the sender....
OpenVMS Developer & System Manager
11-26-2003 11:35 PM
Re: TCPIP services do not always react
I have just posted a new thread about the difference in UCX SHO COMM output between older and newer versions of TCPIP. In the older version, the maximum, current and peak number of device sockets were displayed, whereas in the newer one they are not.
The number of active device sockets on the system can be found by counting the BG devices on the system. But the question is how to find the maximum number of device sockets in the newer version?
Best regards,
Lokesh
11-27-2003 01:37 AM
Re: TCPIP services do not always react
The service is only enabled on one node of the cluster (NodeB).
When the problem occurs, the sender (NodeC) hangs for some time. If the service on port 3013 is disabled and enabled again on NodeB, the problem is over, for some time. But after a few requests have been sent, the problem turns up again.
I have requested some more info and updated the document, again attached (plain text).
Idea: could it be that the limit exists on NodeC? Since the service on NodeB is not started at all (not even a message!), could it be that the request was never sent?
OpenVMS Developer & System Manager
11-27-2003 04:57 AM
Re: TCPIP services do not always react
As I said before, I'm not sure about the reason for your trouble.
The service on port 3013 is limited to 15 connections; if I understand correctly, you need fewer than 15 connections concurrently. If some connections are not closed properly, it may happen that (after 15 connections) your system hangs because NodeB has exhausted its resources on earlier connections that are still active (though unused). I realize that I am simplifying the problem a lot, but you can check this quickly by setting the service limit to 50 (for example): if the problem then shows up later (because it will still happen), you could investigate why some connections stay alive.
Remember that if you change the service limit, you must stop the service and restart it.
At the moment I have no other ideas.
Antoniov
11-27-2003 06:58 PM
Re: TCPIP services do not always react
LIMIT is not the problem. See the attachment, where I tried to explain in more detail.
But I appreciate your new thread, it gives me a next request for information ;-)
OpenVMS Developer & System Manager
11-27-2003 09:35 PM
Re: TCPIP services do not always react
We set up a test machine with the same application environment. This machine has no problem at all, whereas NodeC hangs time after time. Even while a NodeC request is waiting to be connected, the test machine's request is served! The problem can NOT be replicated there.
It seems to go wrong intermittently. One time the request arrives on NodeB; a request issued just a few moments later will end in failure, but when repeated it _may_ succeed. There's no guarantee it will. We didn't find a pattern. It seems the request never leaves NodeC, since we don't see anything happen on NodeB, where we DO see that the test machine IS served (the number of active services increases).
So we concluded so far:
* The problem is NOT on NodeB, otherwise there would be problems with other systems as well, and the test system has no problems.
* The problem is NOT the NIC on NodeB - for the same reason
* The problem is NOT the NIC on NodeC - for the same reason
What remains: some setting on NodeC.
We're open to suggestions on WHAT to change....
I included ana/sys output from both nodes, and the current SYSGEN parameters on NodeC. BTW: The application uses an RDB database on that machine. For that reason, some parameters will have quite high values.
OpenVMS Developer & System Manager
11-27-2003 11:29 PM
Re: TCPIP services do not always react
then you will have plenty of time to analyse the hang.
Regards
Gerard
11-28-2003 01:12 AM
Re: TCPIP services do not always react
I would suggest this if it were just a test system. But this is a production system running a database. Last resort, perhaps, and only if really unavoidable.
OpenVMS Developer & System Manager
11-28-2003 08:54 PM
Re: TCPIP services do not always react
Here are some clues to analyze.
a) Is TCP/IP installed properly?
TCPIP>sysconfig -s
You must see inet, socket and arp loaded and configured (on all hosts).
b) Do you have sufficient sockets?
TCPIP>sysconfig -q socket
somaxconn must be at least 1024.
HP suggests a high value (even 65536) on servers (NodeA and NodeB). HP also suggests setting pmtu_enabled=0 on servers. Here you can read more details: http://h71000.www7.hp.com/doc/73final/6631/6631pro_contents.html
I reread your attachment; I see that on NodeB out-of-order packets are 0.27% while on NodeC the rate is 2.16%; maybe the trouble is on NodeC?
On NodeC:
TCPIP>SH DEV
Look for the device used for the service request, then
TCPIP>SH DEV
Here you could find some insufficient value.
Can you repeat this on server NodeB, too?
Bye
Antoniov
11-30-2003 09:45 PM
Re: TCPIP services do not always react
My guess is indeed that NodeC causes the problems. However, it's not the services that go wrong. NodeC issues the request, so outgoing traffic seems to be the problem. It may depend on other TCPIP traffic (Telnet sessions...), so I've asked for some more details from the moments when the application seems to hang.
(Alas, I have no direct access to that machine, I have to rely on others....)
OpenVMS Developer & System Manager
12-03-2003 08:04 PM
Re: TCPIP services do not always react
What is "sobacklogdrops" - connections dropped due to time-out?
OpenVMS Developer & System Manager
12-03-2003 09:30 PM
Re: TCPIP services do not always react
In the link I posted above, you can read:
[...]
Network performance can degrade if a client overfills a socket listen queue
with TCP SYN packets, thereby blocking other users from the queue. To
eliminate this problem, increase the value of the sominconn attribute to its
maximum value. If the system continues to drop SYN packets, decrease the
value of the tcp_keepinit attribute to 30 (15 seconds). Monitor the values of
the sobacklog_drops and somaxconn_drops attributes to determine whether the
system is dropping packets. (See Section 2.3.2 for more information about event
counters.)
You can modify the tcp_keepinit attribute without rebooting the system.
[...] 2.3.2
The socket subsystem has three attributes that monitor socket listen queue
events:
• The sobacklog_hiwat attribute counts the maximum number of pending
requests to any server socket.
• The sobacklog_drops attribute counts the number of times the system
dropped a received SYN packet because the number of queued SYN_RCVD
connections for a socket equaled the socket's backlog limit.
• The somaxconn_drops attribute counts the number of times the system
dropped a received SYN packet because the number of queued SYN_RCVD
connections for the socket equaled the upper limit on the backlog length
(somaxconn attribute).
The initial value of these attributes is 0. Use the sysconfig -q socket command
to display the current attribute values. If the values show that the queues are
overflowing, you may need to increase the socket listen queue limit.
The value of the sominconn attribute should equal the value of the somaxconn
attribute. When these two attributes are equal, the value of somaxconn_drops
will have the same value as sobacklog_drops.
However, if the value of the sominconn attribute is 0 (the default), and if one
or more server applications uses an inadequate value for the backlog argument
to its listen system call, the value of sobacklog_drops may increase at a rate
that is faster than the rate at which the somaxconn_drops counter increases. If
this occurs, you may want to increase the value of the sominconn attribute.
H.T
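To make the quoted passage a bit more concrete, here is a minimal sketch in Python of how the three values interact. The clamping rule (sominconn as a floor on the application's backlog argument, somaxconn as a ceiling) is an assumption read from the text above, and the function names are made up for illustration; they are not part of TCPIP Services. The half-second unit for tcp_keepinit follows from the quote's "30 (15 seconds)".

```python
def effective_backlog(requested: int, sominconn: int, somaxconn: int) -> int:
    """Listen queue limit applied to a socket (assumed clamping rule)."""
    # Assumption: sominconn raises a too-small backlog passed to listen(),
    # and somaxconn caps the result system-wide.
    return min(max(requested, sominconn), somaxconn)


def keepinit_seconds(tcp_keepinit: int) -> float:
    """Convert tcp_keepinit (half-second units) to seconds."""
    return tcp_keepinit / 2


if __name__ == "__main__":
    # An application calling listen() with a backlog of 5:
    print(effective_backlog(5, 0, 1024))     # sominconn left at its default of 0
    print(effective_backlog(5, 1024, 1024))  # sominconn raised to match somaxconn
    print(keepinit_seconds(30))
```

With sominconn at 0 the application's small backlog stays in effect, which is the case where sobacklog_drops can grow faster than somaxconn_drops.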
12-10-2003 01:25 AM
Re: TCPIP services do not always react
NodeB is a 4100 with 2 GB memory. I counted over 400 IP sessions.
NodeC is an ES40 with 2 GB memory and 124 IP sessions.
The test machine, functionally equal to NodeC, is some small, old Alpha system.
Whenever NodeC cannot connect (hangs), the very same request is repeatedly sent from this (relatively slow) test machine, and it succeeds time after time. This kind of proves there is something wrong on NodeC.
A sudden thought: could it be that the ES40 is far too fast compared to the 4100?
I have asked for tracing (TCPTRACE) on both nodes to see what traffic occurs. I will come back to this later.
OpenVMS Developer & System Manager
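The "same request from two machines" comparison described above could be scripted roughly as follows. This is only a sketch in Python, not the application used in this thread: the function name probe and its parameters are hypothetical, and it merely opens and closes a TCP connection several times, recording how long each attempt took, whether it succeeded, and which peer address the socket reports.

```python
import socket
import time


def probe(host: str, port: int, attempts: int = 5, timeout: float = 3.0,
          pause: float = 0.0) -> list:
    """Try to open a TCP connection several times and record each outcome."""
    results = []
    for _ in range(attempts):
        started = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                # Record the peer address the socket itself reports.
                outcome = ("ok", conn.getpeername()[0])
        except OSError as exc:  # covers timeouts and refused connections
            outcome = ("failed", repr(exc))
        elapsed = round(time.monotonic() - started, 3)
        results.append((elapsed,) + outcome)
        time.sleep(pause)
    return results
```

Running something like probe("NodeB", 3013) from both the failing node and the test machine at the same moment would give comparable lists of (elapsed seconds, status, detail) tuples; the host name and port here are simply the ones mentioned in the thread.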
12-16-2003 08:31 PM
Re: TCPIP services do not always react
NodeC had a problem with TCPTRACE: it couldn't lock the pages in the working set. After /BUFFERS=50 (half the default) no data could be written.
Could it be a memory assignment problem, too many connections perhaps? That could explain why one request will succeed one time and fail another....
OpenVMS Developer & System Manager
12-16-2003 09:08 PM
Re: TCPIP services do not always react
Purely Personal Opinion
12-21-2003 07:02 PM
Re: TCPIP services do not always react
After consulting HP we found this:
The application on NodeC starts the communication with the right IP address: 10.21.0.12 (we can prove that!). However, the BG device that is then allocated says the remote system is 108.21.0.12. It won't find that machine, so the connection times out.
If we specify the node name, NodeB, all is running fine. Without a problem!
So my first idea was to suspect routing tables containing the wrong information, but on second thought that couldn't be true, since specifying the node name would then show the same problem. So it's not the routing tables....
Final possibility: the module that initiates the connection is erring. It uses the socket interface. Still I don't get it. This module is used so often, in so many applications, that I would expect it to have problems elsewhere. But this is the first (and so far only) place where we've had trouble with it.
OpenVMS Developer & System Manager
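One way to narrow down where an address like 10.21.0.12 turns into 108.21.0.12 is to check the dotted string against its packed 4-byte form just before the connect call. The sketch below is Python and purely illustrative: the module in question is not shown in this thread, and check_address is a hypothetical helper, not part of it. It does not explain the cause; it only shows the kind of round-trip check that would catch a conversion error at that step.

```python
import socket


def check_address(text: str) -> str:
    """Round-trip a dotted-quad string through its packed 4-byte form."""
    packed = socket.inet_aton(text)      # the bytes that go into the socket address
    restored = socket.inet_ntoa(packed)  # what a device listing would display
    if restored != text:
        raise ValueError(f"address changed: {text!r} -> {restored!r}")
    return restored


if __name__ == "__main__":
    print(check_address("10.21.0.12"))
    # The two addresses from the thread differ only in the first byte.
    print(socket.inet_aton("10.21.0.12")[0], socket.inet_aton("108.21.0.12")[0])
```

If the check passes in the application but the BG device still shows a different remote address, the conversion itself is fine and the value is being altered somewhere between that point and the connect.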