RPC & Portmapper

Rob Houghton · ‎08-20-2010

Hi, does anyone have any experience using RPC & PORTMAPPER? We have an application that uses RPC & PORTMAPPER for communication, and under high load we seem to be having queueing issues. This is possibly down to the actual PORTMAPPER process. Also, if the process runs with SYSPRV/BYPASS & OPER or a UIC < maxsysgroup(8), the system grinds to a halt and we experience processes going into MUTEX.

What I'm really after is any additional portmapper configuration that can be set up for high performance and throughput.

I'm new to these forums so please go easy....

Thanks

Rob.

Rob Houghton · ‎08-20-2010

Probably best to post the version of TCPIP we're using along with the version of OpenVMS too.

TCPIP V5.6 ECO 5
OpenVMS V8.3-1H1

labadie_1 · ‎08-20-2010

Hi Rob

>>>processes going into MUTEX.

See
http://h71000.www7.hp.com/wizard/wiz_7120.html

Dig into SDA and check whether it is a BYTLM issue, or TQELM...

Hoff · ‎08-20-2010

The mutex (and potential quota exhaustion) may well be secondary to the error. The goal of the quota mechanisms was to prevent a run-away application from consuming all available (shared) resources.

RPC is a communications mechanism, and (as mentioned elsewhere in the thread) RPC and networking in general can wedge for lack of quotas.

Portmapper is a directory service mechanism for IP, and not usually something that is (in isolation) performance critical.

High loads are absolutely the right conditions for exposing latent bugs in application (and system) code, too. I've seen these bugs stay latent for decades. Application routines run slightly slower than they did under debugging and test conditions, (problematic or missing) synchronization goes off the rails or volatile variables get corrupted, and run-away sequences can easily arise.

If this is your code or something you're supporting, you're going to be debugging.

If this is a product you're using and supported by some entity, you'll want to contact the vendor.

If this is your application code, here is some additional reading for you:

http://h71000.www7.hp.com/wizard/wiz_1661.html

Rob Houghton · ‎08-23-2010

Hi,

Many thanks for your responses. I'm not the developer who wrote the code, just someone who's trying to debug the performance issue. When the initial TCP socket is created there appears to be a QLIMIT, which is set low, to 4. What I would like to know is where this limit is set. I've looked at the RPC programmer's guide and I can't see anything in there. If I monitor this device during the operation I can see us hitting 4 on regular occasions, and I do appear to be getting delays from some clients. What does TCPIP do with the outstanding connections? Are they held in a queue/buffer somewhere? Is it possible to amend the QLIMIT? I presume these attributes can be set when the socket is created.

Device_socket: bg9923    Type: STREAM
                 LOCAL        REMOTE
        Port:     1023             0
        Host:        *             *
     Service:

                                           RECEIVE       SEND
                  Queued I/O                     0          0
  Q0LEN      0    Socket buffer bytes            0          0
  QLEN       0    Socket buffer quota        61440      61440
  QLIMIT     4    Total buffer alloc             0          0
  TIMEO      0    Total buffer limit        491520     491520
  ERROR      0    Buffer or I/O waits            0          0
  OOBMARK    0    Buffer or I/O drops            0          0
                  I/O completed              14996          0
                  Bytes transferred              0          0

  Options:  ACCEPT
  State:    PRIV
  RCV Buff: SEL ASYNC
  SND Buff: ASYNC

Jim_McKinney · ‎08-23-2010

That QLIMIT is also known as the backlog - and the backlog value is controlled by the second parameter of the listen() function that I expect you would find in the source code for the application.

When connections arrive more quickly than they can be processed (perform the three-way handshake and become established), if the listener permits, those new connections will be held in a queue by the stack until the application's listener is ready to accommodate them. Once the listener's backlog queue is full, any subsequent connection requests will be rejected until an already queued request is processed to open up a new queue slot.
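As a rough illustration of the backlog behaviour described above, here is a minimal sketch using the generic BSD socket API from Python rather than the application's RPC code. The helper names `open_listener` and `queue_then_accept` are made up for this example, and the exact queue limit a stack derives from the backlog argument varies between implementations.

```python
import socket


def open_listener(backlog: int, host: str = "127.0.0.1") -> socket.socket:
    """Create a TCP listener; `backlog` is the second argument of listen()."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, 0))      # port 0: let the stack choose a free port
    srv.listen(backlog)      # sizes the queue of not-yet-accepted connections
    return srv


def queue_then_accept(backlog: int, clients: int) -> int:
    """Connect `clients` sockets before any accept(), then drain the queue.

    Keep `clients` <= `backlog`; beyond that the stack may refuse or
    silently drop the extra connection attempts.
    """
    srv = open_listener(backlog)
    srv.settimeout(5)
    addr = srv.getsockname()
    # These connections complete their handshake and wait in the stack's
    # queue, because the server has not called accept() yet.
    pending = [socket.create_connection(addr, timeout=5) for _ in range(clients)]
    accepted = 0
    for _ in range(clients):
        conn, _peer = srv.accept()   # each accept() frees one queue slot
        conn.close()
        accepted += 1
    for c in pending:
        c.close()
    srv.close()
    return accepted
```

Calling `queue_then_accept(4, 3)` queues three connections behind a listener that has not yet accepted anything and then drains them one at a time.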

Rob Houghton · ‎08-23-2010

Thanks Jim - the developer is using svctcp_create & svc_register from the RPC library. He's not using the listen() function; I'm presuming the RPC library implements this. Is there a way to feed this in? If the QLIMIT is the backlog, can someone explain why, when I define TCPIP$SOCKET_TRACE to 1, I receive the following output, which shows a backlog of 2 while the QLIMIT is 4?

17:15:26.73 +socket family: 2, type: 2, proto: 0
17:15:26.73 -socket chan: 0x130, st: 0x1, iosb: 0x1 0
17:15:26.73 *ioctl sock: 0x130, req: 0xc0086914
17:15:26.73 *ioctl sock: 0x130, req: 0xc0206911
17:15:26.73 *ioctl sock: 0x130, req: 0xc0206911
17:15:26.73 *close sock: 0x130, st: 0x1
17:15:26.73 +socket family: 2, type: 2, proto: 17
17:15:26.73 -socket chan: 0x130, st: 0x1, iosb: 0x1 0
17:15:26.73 *bind sock: 0x130, st: 0x1, iosb: 0x1 0
17:15:26.73 *ioctl sock: 0x130, req: 0x8004667e
17:15:26.73 +sendto_64 sock: 0x130, len: 60, flags: 0x0
17:15:26.73 -sendto_64 st: 0x1, iosb: 0x1 60 60
17:15:26.73 +select nfds: 128, timeout.sec: 5, timeout.usec: 0
17:15:26.73 assigning initial select channel
socket channels upon calling select:
read: 0x130
socket channels upon returning from select:
read: 0x130
17:15:26.73 -select st: 0x1, iosb 0x1 1, nfds: 1
17:15:26.73 +recvfrom_64 sock: 0x130, len: 400, flags: 0x0
17:15:26.73 -recvfrom_64 st: 0x1, iosb: 0x1 28 28
17:15:26.73 *close sock: 0x130, st: 0x1
trying to unregister any previous service

17:15:26.73 +socket family: 2, type: 1, proto: 6
17:15:26.74 -socket chan: 0x130, st: 0x1, iosb: 0x1 0
17:15:26.74 *bind sock: 0x130, st: 0x1, iosb: 0x94 48
17:15:26.74 *bind sock: 0x130, st: 0x1, iosb: 0x1 0
17:15:26.74 *getsockname sock: 0x130
17:15:26.74 *listen sock: 0x130, backlog: 2

Hoff · ‎08-23-2010

> I'm not the developer who wrote the code. Just someone who's trying to debug the performance issue. ...

Don't start with a narrow view of the environment when performing application tuning. In the inimitable words of R.J. Squirrel, "that trick never works."

As a developer with source code access, start with DECset PCA or an analogous (and equivalently broad) coverage of the application activity, and find the code that's sitting on the biggest pile of wall-clock, and work from there.

If you don't have the application source code, then you're going to have (more) problems and (more) effort with the performance tuning, and (if it's a commercial package) you'll often want to chat with the vendor's support folks.

As for external tuning and monitoring, you'll also want to utilize tools such as T4 (or MONITOR directly) and watch what (external) activities are involved with the application. (T4 or MONITOR- and SDA-based performance-monitoring activities are nowhere near what DECset PCA or instrumented code can get you, though.) I/O, window turns, memory usage, etc.

This could be application load.

This could be server load.

This could be disk fragmentation.

This could be a network error.

This could actually be an RPC or Portmapper issue.

Or this could be something completely different.

Having tuned code written by myself and by others, it's common to find the performance limits are not where I thought they were lurking, too. While DECset PCA can provide confirmation of a theory, it can also provide performance revelations.

Or grind the box to a halt, force a crash, and analyze the system dump, too. If you're getting (unrelated) application processes in MUTEX states, there's likely a shared resource here that's being depleted. If this is causing MUTEX errors on parts of the application, it could be application bugs or loading or insufficient quotas. That could be how the application works at the current load, or it could be the scale of the application, or it could be insufficient hardware, or it could be indicative of a leak. And it could be a hardware problem.

Hoff · ‎08-23-2010

Map out what the application network traffic is doing here, too. RPC obviously isn't particularly cheap as procedure calling mechanisms go. And it means you (also) need to figure out what's going on with the remote end of the connection; whether the turn-around delays are due to the network speeds and feeds, or due to the (potential lack of) speed on the processing of the remote end of the RPC call.

FWIW...

17:15:26.74 *bind sock: 0x130, st: 0x1, iosb: 0x94 48

"%SYSTEM-F-DUPLNAM, duplicate name"
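For anyone reading the trace who wants to see how 0x94 maps onto a "-F-" message, here is a small sketch that splits an OpenVMS condition value into its fields (severity in bits 0-2, message number in bits 3-15, facility in bits 16-27). The function name `decode_condition` is invented for this example, and turning the message number into a name such as DUPLNAM still requires the system message file, which this sketch does not consult.

```python
# Severity codes held in the low three bits of a condition value.
SEVERITY = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}


def decode_condition(value: int) -> dict:
    """Split a 32-bit OpenVMS condition value into its main fields."""
    return {
        "severity": SEVERITY.get(value & 0x7, "?"),
        "success": bool(value & 1),            # low bit set means success
        "message_number": (value >> 3) & 0x1FFF,
        "facility": (value >> 16) & 0xFFF,     # 0 is the SYSTEM facility
    }


if __name__ == "__main__":
    for status in (0x1, 0x94):
        print(hex(status), decode_condition(status))
```

Applied to the trace, the `st: 0x1` values decode as success, while the `iosb: 0x94` on the first bind decodes as a fatal-severity status in facility 0, consistent with the %SYSTEM-F- prefix.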

If you're getting a combination of excessive RPC calls and sufficiently large numbers of process creations and problems with network connections and remote server sluggishness, you're approaching a mountain of slowness; performance can and usually will tank.

Martin D Platts · ‎08-26-2010

Rob's on holiday now so I'm holding this baby at present.

I think the duplicate name is just because we were previously running the server, so it's trying to unregister the previous one, which takes a fraction of a second to unwind itself from the various data structures in the system.

We are now looking to make a small shrinkwrap demonstrator for the problem and then raise it with HP: a server that registers itself, plus a client and batch job that can make lots of calls to the portmapper to find the server. The aim is to show that once the number of concurrent lookup requests grows beyond about 6-7, processes are forced into this 25s or so wait time along with excessive mutex delays (I understand the need for the mutex to coordinate access to the resource, just not the effect it exhibits/causes).
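A demonstrator along these lines could be sketched roughly as below. This is only an illustration under stated assumptions, not the code that was actually written: it builds a standard ONC RPC version 2 PMAPPROC_GETPORT call with AUTH_NONE credentials, sends it to the portmapper over UDP port 111, and times many concurrent lookups. The names `build_getport_call`, `getport_udp` and `measure` are invented for this sketch.

```python
import socket
import struct
import time
from concurrent.futures import ThreadPoolExecutor

PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT = 100000, 2, 3
PROTO_TCP = 6


def build_getport_call(xid: int, prog: int, vers: int, prot: int = PROTO_TCP) -> bytes:
    """Encode an RPC v2 call to PMAPPROC_GETPORT with AUTH_NONE cred/verf."""
    header = struct.pack(
        ">10I",
        xid, 0, 2,                               # xid, msg_type=CALL, rpcvers=2
        PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT,
        0, 0,                                    # cred: AUTH_NONE, length 0
        0, 0,                                    # verf: AUTH_NONE, length 0
    )
    args = struct.pack(">4I", prog, vers, prot, 0)  # port field unused by GETPORT
    return header + args


def getport_udp(host: str, prog: int, vers: int, timeout: float = 5.0) -> int:
    """Ask the portmapper on `host` which port prog/vers is registered on."""
    xid = int(time.time() * 1000) & 0xFFFFFFFF
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_getport_call(xid, prog, vers), (host, 111))
        reply, _ = s.recvfrom(400)
    # Assumes an empty verifier: xid, REPLY, MSG_ACCEPTED, verf flavor,
    # verf length, accept_stat, port.
    f = struct.unpack(">7I", reply[:28])
    if f[0] != xid or f[1] != 1 or f[2] != 0 or f[5] != 0:
        raise RuntimeError("unexpected portmapper reply")
    return f[6]


def measure(lookup, workers: int, calls_per_worker: int) -> list:
    """Run `lookup` from `workers` threads; return sorted latencies in seconds."""
    def worker(_index):
        out = []
        for _ in range(calls_per_worker):
            t0 = time.perf_counter()
            lookup()
            out.append(time.perf_counter() - t0)
        return out

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(worker, range(workers)))
    return sorted(t for r in results for t in r)
```

A run might look like `measure(lambda: getport_udp("server.example", 0x20000001, 1), workers=8, calls_per_worker=50)`, where the host name and program number are placeholders. Stepping `workers` from 1 upwards and comparing the slowest latencies would show whether the delay appears around the 6-7 concurrent requesters mentioned above.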

Rob Houghton · ‎01-13-2011

RPC wasn't suitable for the amount of load we were generating. Closing thread.
