Operating System - OpenVMS

Re: debugging memory leak in a pthread program

 
SOLVED
Guinaudeau
Frequent Advisor

debugging memory leak in a pthread program

hello,

we developed for our customers a critical application that runs around the clock on OpenVMS 7.3-x and uses POSIX threads (more precisely, several processes in the app are DCE RPC servers).

More precisely, our systems are configured with the SYSGEN parameter MULTITHREAD=1, and the RPC server programs are linked with these settings:

%THREADCP-I-MKT, multiple kernel threads are disabled
%THREADCP-I-UPC, upcalls are disabled

I learned from people on this forum (e.g. thread 865731) about the new V7.3-2 feature:

$ SET PROC/DUMP=NOW/ID=

We currently have trouble with a memory leak in our application, and yesterday the situation was very bad for one very active RPC server in the app.

Normal process values: pagefile usage < 250 Mbytes. Probably due to a memory leak, the process pagefile usage had reached ~ 420 Mbytes and VIRTPEAK ~ 540 Mbytes!

Together with our customer we decided to restart the server process, but to dump it first using the SET PROC/DUMP=NOW command.

I brought the dump to our development site, onto a very similar system (the same VMS patch level is probably mandatory), and copied the executable and the application shareable images into the login directory so that the analyzer could find them.

I tried to look at it with the SDA PTHREAD extension, and I discovered something suspicious:

DBG> pthread vm
lookaside 0 (32 bytes; obj-name) 585866 in use, 1 free
lookaside 1 (256 bytes; hash-bucket) 187 in use, 0 free
lookaside 2 (384 bytes; rwb, mub, cvb) 586318 in use, 0 free
lookaside 3 (4096 bytes; tsd-array) 0 in use, 0 free
lookaside 4 (4288 bytes; mu-meter) 0 in use, 0 free
lookaside 5 (4352 bytes; cv-meter) 0 in use, 0 free
lookaside 6 (8192 bytes; tcb) 0 in use, 0 free

On an RPC server with very little activity, the values are:

Process name: BNASR0 Extended PID: 20200184 Thread data: "vm"
-------------------------------------------------------------------------
lookaside 0 (32 bytes; obj-name) 2323 in use, 1 free
lookaside 1 (256 bytes; hash-bucket) 82 in use, 0 free
lookaside 2 (384 bytes; rwb, mub, cvb) 2338 in use, 0 free
lookaside 3 (4096 bytes; tsd-array) 0 in use, 0 free
lookaside 4 (4288 bytes; mu-meter) 0 in use, 0 free
lookaside 5 (4352 bytes; cv-meter) 0 in use, 0 free
lookaside 6 (8192 bytes; tcb) 0 in use, 0 free

The total size of the lookaside lists 0 and 2 is already
(384*586318 + 32*585866) bytes ~ 244 million bytes, of which list 2 alone is ~ 225 million
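
As a cross-check of that arithmetic, here is a minimal C sketch that multiplies block size by blocks in use and sums the result for lists 0 and 2, using the counts from the busy server's "pthread vm" output above. The type and function names are made up for illustration; they are not part of any pthread or SDA interface.

```c
#include <assert.h>
#include <stddef.h>

/* One lookaside list as reported by "pthread vm": block size and
   number of blocks in use. The struct name is illustrative only. */
typedef struct {
    unsigned long block_bytes;
    unsigned long in_use;
} lookaside_t;

/* Total bytes held by the in-use blocks of n lookaside lists. */
unsigned long long lookaside_total_bytes(const lookaside_t *lists, size_t n)
{
    unsigned long long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += (unsigned long long)lists[i].block_bytes * lists[i].in_use;
    return total;
}

/* Lists 0 and 2 of the busy RPC server, copied from the output above. */
static const lookaside_t busy_server[] = {
    { 32UL,  585866UL },   /* lookaside 0: obj-name      */
    { 384UL, 586318UL },   /* lookaside 2: rwb, mub, cvb */
};
```

Calling lookaside_total_bytes(busy_server, 2) gives 243893824 bytes, roughly 244 million bytes (about 233 Mbytes if 1 Mbyte is taken as 2^20 bytes). List 2 alone accounts for 225146112 bytes of that.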

Where could I find the lookaside lists and examine some buffers, to
possibly detect an error pattern? PTHREAD HELP is of very little help
for this. Could someone provide information about this SDA extension?

With "pthread threads" I have also discovered a series of threads
that look suspicious, although I should check the details with the
developer (maybe they are normal in a very active server at times).

I might try (if I suspect some threads) to examine the stacks of some
of them using SET TASK, but I am not familiar with that.

Thanks for any help

Louis Guinaudeau
24 REPLIES
Travis Craig
Frequent Advisor

Re: debugging memory leak in a pthread program

Louis,

I'm sorry I don't have any answers for your specific questions. We do have a multithreaded RPC server application, though, and I thought I would share a couple of experiences we have had.

First, for one type of problem we had, probably a race condition with inadequate locking between threads, we also set the MULTITHREAD Sysgen parameter to 1 and that helped, so I think you are on a good track there (assuming you have a multiprocessor).

Second, we used to link the executable without the /THREADS=(MULTIPLE_KERNEL_THREADS,UPCALLS) qualifier and we saw problems with the execution of our program. I don't remember what went wrong, but I think it might have hung and the problem might have involved the fact that we were doing significant work in AST routines, too. H-P recommends against doing anything significant in an AST routine if you are using Pthreads and I heartily agree with them, but we have legacy VMS-specific code that we are not going to change at this point. In any case, when we started linking for multiple kernel threads and upcalls, that particular problem cleared up and our application has been able to run.

I'm sorry I don't have anything for you on memory leaks or further analysis of your dump. I tried the new dump feature on our RPC server app and found nothing new for you there.

--Travis Craig
My head is cold.
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

Travis,

thanks for your answer.

outside the scope of my memory leak troubles, i would appreciate sharing experiences with people like you, as far as i understand your experience, because our VMS application (a run-time SCADA + Motif-based man-machine interface) was originally pure VMS "legacy" (?) software, using VMS resources (mailboxes, locks [mostly local, with lock conversions => fast ENQ operations]) for synchronisation between [single-threaded + AST] processes. now the app is a mix of both types of processes: the pure VMS ones and the newer RPC servers.

if you are willing to share some experiences, these two would be helpful, if you have already dealt with them:

- adapting traceback to a PTHREAD process: we had traceback in the original app, but cannot get a reasonable traceback in the RPC servers

- PTHREAD SDA extension: it is difficult to learn this feature on the job; either i am missing the right documentation, or i am not competent enough on the development side (i am a sysadmin and should help with the maintenance of this code)

i may either open a new thread for these issues, or communicate directly by mail.

anyway, thanks for the answer.

louis
Ian Miller.
Honored Contributor

Re: debugging memory leak in a pthread program

I have not seen any documentation for the PTHREAD SDA extension apart from the help text.

You should read the appendices of the pthreads manual
http://h71000.www7.hp.com/doc/73final/6493/6101pro_031.html#vms_appendix
http://h71000.www7.hp.com/doc/73final/6493/6101pro_033.html#debugging_threads_appendix
____________________
Purely Personal Opinion
Galen Tackett
Valued Contributor

Re: debugging memory leak in a pthread program

Ian,

> http://h71000.www7.hp.com/doc/73final/6493/6101pro_033.html#debugging_threads_appendix

When I tried this link the web server mumbles something about "The requested document "http://h71000.www7.hp.com/doc/73final/6493/6101pro%3CBR%3E_033.html" was not found on this server."

The first link worked okay though.

Galen

Ian Miller.
Honored Contributor

Re: debugging memory leak in a pthread program

What can I say but the link works for me (Windows XP, Firefox V1.5 and IE6).
____________________
Purely Personal Opinion
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

Ian :

i know the first document; i looked at some parts of it (trying to implement the traceback, see above), but i should look at it again with a focus on this problem.

i knew about the 2nd one but forgot to look at it; i will.

right now the priorities are elsewhere, so please wait a little.

also, the recent information i should look at: SDA extension sources are available on the licensed source CD-ROM => until now we never used (or thought about) such a thing, but if it helps to make better use of the SDA PTHREAD extension, at least because one can find some structures and links in the process dump ??? i should at least examine this.

Galen :

have a look at the exact URL you get when you click on the web reference. i use IE6 and from time to time i get a "page not found" error too.

i just looked at the page URL and it was:

javascript:openExternal('http://h71000.www7.hp.com/doc/73final/6493/6101pro_031.html')

i just edited it manually into

http://h71000.www7.hp.com/doc/73final/6493/6101pro_031.html

and that's it.

i could probably change my IE6 settings to avoid this, but it is not frequent enough and until now i have not taken the time to find the cause.
Travis Craig
Frequent Advisor

Re: debugging memory leak in a pthread program

Louis,

Yes, I'll look at your reply more closely when I have a few minutes, and I am willing to see what useful information we can exchange if you want to start another thread for it.

--Travis
My head is cold.
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

Travis :

any help is welcome. thanks in advance. i am also taken up by other tasks and can / should be patient about analysing that process dump further ...

another issue is the traceback in case of thread cleanup (most of our cases) or a fatal error in such a process: if you have any experience (and are ready to share it), it would be of interest to us. we miss this feature in our PTHREAD processes, and it should be possible under VMS. The VMS condition frame does exist and is valuable, but does the frame have to be evaluated in a special way because of multi-threading and the peculiarities of the thread stacks?

louis
Galen Tackett
Valued Contributor

Re: debugging memory leak in a pthread program

Louis,

HP has a debugging tool called Visual Threads that might interest you. It's available free of charge if you join the Developer & Solution Partner Program (DSPP), and there's no charge for individual membership.

I have never really tried it out beyond installing it and seeing that it appeared to operate. But it allegedly offers some pretty nice capabilities for dealing with thread-based application debugging.

You can read more about it here: http://h21007.www2.hp.com/dspp/tech/tech_TechSoftwareDetailPage_IDX/1,1703,5077,00.html

You can get to the registration page for DSPP from:
http://h21007.www2.hp.com/dspp/tech/tech_TechSoftwareDetailPage_IDX/1,1703,5062,00.html
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

galen,

DSPP is probably the new brand name for CSA alias ... as a software house, we are HP partners anyway.

i will look at the quoted links about this debugger. i had read the name, but did not really try to go further. i thought i would get some other information (e.g. to analyse this lookaside list) through the SDA PTHREAD extension.

the memory consumption it displays is awful in our opinion; there must be something behind it, if only we could find the code that allocates these blocks.

i will also look at the leak another way, that is: examine the link map. on the VMS side, for example, some DEC C run-time library code such as DECC$GETENV implicitly allocates memory and does not deallocate it (at least for the life of the corresponding thread). that should have limited effects. i can provide link information here, possibly the map listing, if someone has any suggestion about the link, especially about run-time libraries we call that we should look at carefully.

louis
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,


> SDA extension sources available in licensed source CD-ROM


the PTHREAD$DBGSHR listings are not part of the OpenVMS source listings CD (at least not in V7.3-1).

Volker.
Ian Miller.
Honored Contributor

Re: debugging memory leak in a pthread program

On the V7.3-2 source listings CD in the [PTHREAD] directory you will find various files thd*.* which are the sources for pthread$dbgshr

This directory is not present on the V7.3-1 CD.
____________________
Purely Personal Opinion
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

volker :

do you mean that this CD-ROM would not be much help in understanding what information the PTHREAD SDA extension provides?

in practice, i got the lookaside list information (see above) and would like to understand which threads allocate these buffers (0 buffers are free), where they are located, look at their contents, etc.

would this CD-ROM give me enough about the structures and so on to find this? or would i do better to use Visual Threads?

yours

louis

Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

as Ian has indicated, the PTHREAD sources would be on the V7.3-2 (or higher) source listing CDs. With that information, it might be possible to find the lookaside lists in question and look at their contents.

It will still be a major task to go through the source listings to understand those structures and how they are being allocated/deallocated. This will all be pthreads internals...

It may be more efficient to order PTHREAD engineering support from HP than trying to buy the source CDs and do it yourself.

Volker.
Travis Craig
Frequent Advisor

Re: debugging memory leak in a pthread program

Louis,

I have looked more at your reply, but have no further insight or wisdom to offer for it.

I'm still willing to consider any new questions you have and/or share some experience in a new thread when you find time to do that. I also work on SCADA systems that have evolved over the years, mostly to UNIX/Linux and Windows-based systems now, but with a few of our established users still on OpenVMS.

--Travis
My head is cold.
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

from looking at your PTHREAD SDA data, the number of packets in use on lookaside list 0 and the number in use on lookaside list 2 seem to be closely related. The same is true for the other RPC server data you've shown.

This might imply that some piece of code seems to allocate one packet from each list for a certain call/operation and not return them.

Could you correlate this number of non-returned packets with the execution count of some of your threads or with the no. of certain operations in your application ?

If you suspect a certain routine to behave incorrectly, could you try to write a little example program, create a thread calling that routine and execute it lots of times ?

What you can also do in your dump is look around with SDA> EXA addr;100 at various addresses (outside the address range of the activated images as shown by SHOW PROC/IMA), looking for unusual or repeating patterns. If such a large portion of the process address space is used by the same kind of structures, you might spot something.

Volker.
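
A little example program of the kind suggested above could look like the following C sketch. It runs a routine many times from several concurrent threads, so that the growth of the process pagefile usage can be watched from outside while it loops. Everything named here (stress_routine, sample_routine, the thread limit) is hypothetical; sample_routine is only a stand-in and should be replaced by the application routine under suspicion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_STRESS_THREADS 64          /* arbitrary limit for this sketch */

typedef void (*suspect_fn)(void);

typedef struct {
    suspect_fn fn;
    long iterations;
} stress_args_t;

static pthread_mutex_t call_count_lock = PTHREAD_MUTEX_INITIALIZER;
static long call_count = 0;

/* Stand-in for the routine under suspicion: replace with the real one.
   It creates and correctly destroys one heap-allocated mutex per call. */
void sample_routine(void)
{
    pthread_mutex_t *m = malloc(sizeof *m);

    if (m != NULL && pthread_mutex_init(m, NULL) == 0)
        pthread_mutex_destroy(m);
    free(m);

    pthread_mutex_lock(&call_count_lock);
    call_count++;
    pthread_mutex_unlock(&call_count_lock);
}

/* Number of sample_routine calls made so far. */
long stress_call_count(void)
{
    long n;

    pthread_mutex_lock(&call_count_lock);
    n = call_count;
    pthread_mutex_unlock(&call_count_lock);
    return n;
}

static void *stress_thread(void *arg)
{
    const stress_args_t *a = arg;
    long i;

    for (i = 0; i < a->iterations; i++)
        a->fn();
    return NULL;
}

/* Run fn `iterations` times in each of `nthreads` concurrent threads.
   Returns 0 on success, -1 on bad arguments, else a pthread error code. */
int stress_routine(suspect_fn fn, int nthreads, long iterations)
{
    pthread_t tid[MAX_STRESS_THREADS];
    stress_args_t args;
    int started = 0, rc = 0, i;

    if (fn == NULL || nthreads < 1 || nthreads > MAX_STRESS_THREADS)
        return -1;
    args.fn = fn;
    args.iterations = iterations;

    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tid[i], NULL, stress_thread, &args);
        if (rc != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)      /* args must outlive all threads */
        pthread_join(tid[i], NULL);
    return rc;
}
```

A main program would call, for example, stress_routine(sample_routine, 4, 1000000) and the "pthread vm" counts could be compared before and after. If the in-use counts grow in step with the number of calls, the routine is a strong candidate.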
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

here is another SDA> PTHREAD command to try:

SDA> PTHREAD show -a

will print the known object lists. From this info, we can deduce the type of packets in the lookaside 2 queue:

rwb = rwlock objects
mub = mutex objects
cvb = condition variable objects

Once we are able to get the address of one of those queues, the PTHREAD SQUEUE command will be able to follow and format those queues.

Volker.
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

SDA> PTHREAD vm -cf

will list more details about the various lookaside lists (including high water mark and trend).

Volker.
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

after a little bit of SDA> SEARCH, I've finally been able to locate the data structures describing the VM lookaside lists in the PTHREAD$RTL image in memory, but BAD LUCK - they do not contain a queue of 'blocks in use'. The only queue listheads seem to be used for 'free blocks'.

The problem now is how to find those data structures that are 'in use'. There are lots of them in your dump...

Let's see if using any of the other SDA> PTHREAD commands given in my 2 previous replies sheds any more light on this problem.

Volker.
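
Since the lookaside lists keep no queue of blocks in use, one alternative is for the application to count its own live pthread objects. The C sketch below is only an illustration of that idea and not something anyone in this thread did: the tracked_* wrappers are hypothetical names, and the application would have to call them in place of the plain pthread_*_init and pthread_*_destroy routines.

```c
#include <assert.h>
#include <pthread.h>

/* Counters of pthread objects that were initialised but not yet destroyed. */
static pthread_mutex_t tally_lock = PTHREAD_MUTEX_INITIALIZER;
static long live_mutexes = 0;
static long live_conds = 0;

static void tally(long *counter, long delta)
{
    pthread_mutex_lock(&tally_lock);
    *counter += delta;
    pthread_mutex_unlock(&tally_lock);
}

static long read_tally(const long *counter)
{
    long n;

    pthread_mutex_lock(&tally_lock);
    n = *counter;
    pthread_mutex_unlock(&tally_lock);
    return n;
}

/* Hypothetical wrappers: same arguments and return codes as the
   pthread routines they wrap, plus bookkeeping on success. */
int tracked_mutex_init(pthread_mutex_t *m, const pthread_mutexattr_t *attr)
{
    int rc = pthread_mutex_init(m, attr);

    if (rc == 0)
        tally(&live_mutexes, +1);
    return rc;
}

int tracked_mutex_destroy(pthread_mutex_t *m)
{
    int rc = pthread_mutex_destroy(m);

    if (rc == 0)
        tally(&live_mutexes, -1);
    return rc;
}

int tracked_cond_init(pthread_cond_t *c, const pthread_condattr_t *attr)
{
    int rc = pthread_cond_init(c, attr);

    if (rc == 0)
        tally(&live_conds, +1);
    return rc;
}

int tracked_cond_destroy(pthread_cond_t *c)
{
    int rc = pthread_cond_destroy(c);

    if (rc == 0)
        tally(&live_conds, -1);
    return rc;
}

long tracked_live_mutexes(void) { return read_tally(&live_mutexes); }
long tracked_live_conds(void)   { return read_tally(&live_conds); }
```

Logging the two counters periodically would show whether the number of live objects keeps climbing, in the same way the "in use" figures from PTHREAD vm do in the dump.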
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

Visual Threads can be downloaded freely from

http://h21007.www2.hp.com/dspp/tech/tech_TechSoftwareDetailPage_IDX/1,1703,5071,00.html

Easy to install with PCSI. It has Programming Error Rules, which check for StackObjectLeak and ThreadLeak and various other things.

Just give it a try. There are some examples included to get used to it, it's just a couple of mouse-clicks ;-)

NOTE: Visual Threads is not supported with V8.2 (neither Alpha nor I64). V7.3-2 is the last supported version of OpenVMS for running Visual Threads.

Volker.
Volker Halle
Honored Contributor

Re: debugging memory leak in a pthread program

Louis,

we're getting there...

Each pthread object (e.g. condition variable, mutex etc.) occupies a little bit of static memory in a PSECT in the image. It contains a pointer into a somewhat larger structure in allocated memory (P0 space). Those structures are linked in a quadword doubly linked list from listheads in PTHREAD$RTL.

The SDA> PTHREAD xxx -f qualifier lists the addresses of those structures.

With this information, it must be possible to find the objects consuming that much memory in your process dump and link them back to the objects created by the application code.

Visual Threads should also point out very quickly if you terminate threads without prior deallocation of allocated objects.

Volker.
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

volker,

does the moderator of this thread sleep, or ?

sorry, i don't sleep, i am just in other battles at other "fronts" in support tasks ...

many thanks for the help from you and other people in the forum; i will look at your replies very carefully, try things, and reply either this week or, hopefully, next week.

in the meantime, a developer has detected a subtle error (something suspicious, anyway) in one application routine, a routine which handles the RPC binding (as said before, this "memory leaking" process is an RPC server, implicitly using PTHREAD).

we will verify "in the near future", hopefully within the next week, whether this suspect sequence is related to the leak, but it will take some time on our test platform. that is on my task list, but not at high priority, although this silly "bug / feature" currently forces one of our customers to restart the RPC server every week at his site (on the test platform it happens very slowly, and we never restart ...). at other customer sites, they should be happy to restart the same process "only" every two to four weeks.

> If you suspect a certain routine to behave incorrectly, could you try to write a little example program, create a thread calling that routine and execute it lots of times ?

That is certainly a good idea (and we had it too). To verify the hypothesis described above on our test platform, either we apply this idea, or we use a tiny platform exclusively for this test (no other colleague working on it, no other client connections, etc.) to reproduce the problem and wait e.g. one to two weeks. the virtual size of this RPC server grows continuously on our test platform too.

louis
Volker Halle
Honored Contributor
Solution

Re: debugging memory leak in a pthread program

A simple test running the application with VISUAL THREADS has shown a couple of MUTEXes and Condition Variables being created but not released during a specific application function.

The code was dynamically allocating (malloc) mutex and cv structures and initializing them with pthread_*_init, but - when finished - only deallocated these structures with 'free'.

There were missing calls to pthread_mutex_destroy and pthread_cond_destroy to also deallocate the pthread internal data structures.

Volker.
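
The corrected sequence can be sketched in C as follows. The structure and function names (request_sync_t, request_sync_create, request_sync_destroy) are invented for this illustration and are not taken from the application; only the pthread calls are real. The point is the order in the destroy routine: pthread_cond_destroy and pthread_mutex_destroy first, free last.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical heap-allocated block holding a mutex and a condition
   variable, similar in spirit to what the application allocated. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
} request_sync_t;

/* Allocate and initialise the block; returns NULL on any failure. */
request_sync_t *request_sync_create(void)
{
    request_sync_t *rs = malloc(sizeof *rs);

    if (rs == NULL)
        return NULL;
    if (pthread_mutex_init(&rs->lock, NULL) != 0) {
        free(rs);
        return NULL;
    }
    if (pthread_cond_init(&rs->ready, NULL) != 0) {
        pthread_mutex_destroy(&rs->lock);
        free(rs);
        return NULL;
    }
    return rs;
}

/* Destroy the pthread objects BEFORE freeing the memory that holds them.
   Calling only free() here is the leak described above: the library's
   internal structures for the mutex and cv would never be released.
   Returns 0 on success, else the first pthread error code seen. */
int request_sync_destroy(request_sync_t *rs)
{
    int rc_cond, rc_mutex;

    if (rs == NULL)
        return 0;
    rc_cond  = pthread_cond_destroy(&rs->ready);
    rc_mutex = pthread_mutex_destroy(&rs->lock);
    free(rs);
    return rc_cond != 0 ? rc_cond : rc_mutex;
}
```

The mutex must be unlocked and no thread may be waiting on the condition variable when request_sync_destroy is called; otherwise the destroy calls can fail (typically with EBUSY) or the behaviour is undefined.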
Guinaudeau
Frequent Advisor

Re: debugging memory leak in a pthread program

A late thank-you for your help ...

Volker has already posted the conclusion. A silly programming error, indeed. The incorrect sequence (routine) is used very frequently in the production system.

In particular, he put me on the right track with the SDA PTHREAD options for surveying the allocated internal buffers.

Visual Threads was also very useful for locating where the remaining pthread objects were allocated. I used "visualthreads" and not "vttrace"; I should try the second one sometime, since "visualthreads" probably needs many more resources (memory, PGFLQUOTA).

Travis, thanks for the offer of help. We still have some troubles with the RPC servers in our application.

Just now we have solved a first issue (thanks also to Volker): a TRACEBACK facility in PTHREAD processes. It did not work previously because of a conflict with the exception handler established by the PTHREAD environment. A fatal thread error is now traced with meaningful call frame information before termination.

Louis