Weird memory problem

Peter Hug · ‎11-07-2006

I frequently use GlancePlus C.03.72.00 to monitor memory usage on an HP-UX 11i machine.

There are two memory values glance reports, RSS and VSS. What I find under HP-UX is that even though my application deallocates memory, this fact is not reported by glance.

Attached is a small C++ app eatram which can allocate blocks of memory (in multiples of 1MB) from the local heap. If I run this program on an HP-UX 11i and allocate say 500MB of RAM, glance correctly shows RSS and VSS as 500MB. If I then release RAM to 0MB without shutting down the utility, glance reports something like RSS/VSS=300MB/500MB.

While I think Glance may not correctly report eatram's memory usage, what really puzzles me is that when I exit eatram, glance reports that available memory on the machine has grown by the amount of memory it still thought was allocated to eatram.

As far as I can see the C++ code releases memory correctly. Am I experiencing a problem with Glance or is there something wrong with my code?

Thanks for any feedback in advance
Pete

Steven E. Protter · ‎11-07-2006

Shalom Pete,

Glance is up to version 4.5. You might want to upgrade and try a different version.

You could also use tusc and hang a tusc trace on the application to ensure it's really releasing the memory. Memory leaks are easy to cause and hard to find and fix.

Always good to run a tool check when you get strange results.

:-)

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

A. Clay Stephenson · ‎11-07-2006

This is normal behavior. Whether it is done using the free() function or embedded within a C++ destructor, the memory is not returned to the system but rather is placed back on the process's heap for possible subsequent reuse by the current process. It is possible to actually return memory to the OS via the sbrk() system call using a negative increase parameter, but sbrk() can only be used if ALL memory allocation is done via sbrk(). This means that new(), malloc(), calloc(), ... can never be used -- including library functions which might make use of the standard memory allocation tools.
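
A minimal sketch of that reuse, assuming a typical malloc() implementation. The names ReuseResult and allocate_free_allocate are made up for illustration, and nothing guarantees that the second request lands on the block just freed; it is only what many allocators happen to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Addresses of the two blocks, kept as integers so that no freed
// pointer value is ever compared or dereferenced.
struct ReuseResult {
    std::uintptr_t first;   // address of the first block (0 if malloc failed)
    std::uintptr_t second;  // address of the block allocated after the free
    bool reused() const { return first != 0 && first == second; }
};

// Allocate a block, free it, then allocate the same size again.
// free() hands the block back to the process heap, not to the OS,
// so the allocator is able to serve the second request from it.
ReuseResult allocate_free_allocate(std::size_t bytes) {
    assert(bytes > 0);
    ReuseResult r = {0, 0};

    void* a = std::malloc(bytes);
    if (a == nullptr) return r;
    r.first = reinterpret_cast<std::uintptr_t>(a);  // record before freeing
    std::free(a);

    void* b = std::malloc(bytes);
    if (b == nullptr) return r;
    r.second = reinterpret_cast<std::uintptr_t>(b);
    std::free(b);
    return r;
}
```

Calling allocate_free_allocate(4096) and printing reused() shows whether the allocator on a given machine recycled the block. Either answer is legal; the point is only that the process size does not have to shrink in between.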

If it ain't broke, I can fix that.

Peter Hug · ‎11-07-2006

Hi Clay,

Does this mean that if another process needs more memory the OS can reduce the size of my process's heap automatically?

Pete

Bill Hassell · ‎11-07-2006

> Does this mean that if another process needs more memory the OS can reduce the size of my process's heap automatically?

Nope. The memory remains reserved until the program terminates. Heap or local memory as well as shared memory is never adjusted by the OS. However, HP-UX is a virtual memory system which means that 5 programs that malloc 500megs each will run in 250megs of RAM as long as you have plenty of swap space. (the programs will swap in and out mercilessly)

Bill Hassell, sysadmin

Peter Hug · ‎11-07-2006

The main reason I wrote the eatram utility was because I felt that glance incorrectly reported memory usage of my daemon.

What I typically experience is something along these lines:

Glance reports that my daemon is using RSS/VSS 160MB/220MB. So while my daemon is idle I run eatram and allocate loads of memory from the local heap. Surprisingly, glance shows that my daemon's memory usage decreases to 38MB/220MB.

The reduced figure typically coincides with what I experience when my daemon is idle under Windows (as a service).

So it appears that somehow that RSS figure can come down if another (hungry) process requires more memory.

How could that be explained if what you stated, Bill, is correct?

Steven E. Protter · ‎11-07-2006

Shalom again,

The behavior of your program with respect to a smaller heap depends totally on how it was coded.

Memory can be returned to the heap without releasing it to the OS or changing the heap size.

SEP


Peter Hug · ‎11-07-2006

I'm simply using the global new and delete operators which I would imagine are implemented in libstdc++.

Dennis Handly · ‎11-07-2006

>Peter: So it appears that somehow that RSS figure can come down if another (hungry) process requires more memory.

This is correct. RSS measures how much of the process is in memory. (R is for resident.)

>Steven: Memory leaks are easy to cause and hard to find and fix.

gdb has some commands to do leak detection:
info leaks
info heap
You can download gdb from:
http://www.hp.com/go/wdb

>Clay: It is possible to actually return memory to the OS via the sbrk() system call using a negative increase parameter but sbrk() can only be used if ALL memory allocation is done via sbrk(). This means that new(), malloc(), calloc(), ... can never be used --

This is not quite true. The 11.23 mallopt(3) has M_REL_LAST_FBLK so that a free may release the last block with sbrk.

Peter Hug · ‎11-09-2006

I'm just really lost here...

My daemon runs threads which are activated by some external triggers. Each thread is quite memory hungry. However, I can limit the number of concurrent threads of execution and currently I have set these to 3.

Under Windows and Linux RH I can see that reported memory usage increases/decreases according to the number of threads running.

Also, I have used leak detectors under Windows and there are no problems whatsoever. I haven't tried wdb because I compile my code with gcc and therefore use the gnu gdb debugger which hasn't got these features. Will my code be compatible with wdb?

Somehow, under HP-UX the RSS memory seems to constantly increase. This seems to be consistent with the eatram program I attached to the first message in this thread.

What really concerns me is that glance shows both Mem and Swap as really high and it seems to simply not release memory when I think it should.

A. Clay Stephenson · ‎11-09-2006

When you compare Linux and especially Windows to HP-UX it's an apples to oranges comparison. Note that sbrk() applies to a process, not a thread; however, for threaded applications, the behavior of malloc() and its cousins has been modified to use memory "arenas". Man malloc and look for the threads references and the associated environment variables.

One of the other big gotchas in UNIX memory allocation is realloc(). Most times realloc() can expand a block of memory in place so that pointers which reference the block are still valid; however, a careful reading of the man page will indicate that this behavior is not guaranteed. Sometimes (and depending upon the flavor of UNIX), realloc is not able to expand the block in place and must allocate an entire new larger chunk and copy the old chunk's data to the new. If your code references locations in the old block then the behavior is unstable. Often, because the old chunk was placed back on the malloc heap and hasn't been reused (yet), your code continues to work until another seemingly harmless allocation reuses the old chunk. I once fixed this very problem in a MAJOR database vendor's source code --- and the excuse was well, it doesn't do that on brand X UNIX --- and my response was "go read the man page, again".
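
One way to stay out of that trap, sketched under the assumption that the code only needs to remember positions inside the buffer: keep offsets rather than raw pointers, and rebuild the pointer from the current base address after every realloc(). Buffer, grow and demo_offset_survives_growth are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// A growable byte buffer. Positions inside it are tracked as offsets,
// because a raw pointer into the old block is invalid if realloc()
// has to move the data.
struct Buffer {
    char* data;
    std::size_t size;
};

// Returns false (and leaves buf untouched) if realloc() fails.
bool grow(Buffer& buf, std::size_t new_size) {
    assert(new_size >= buf.size);
    // Assign to a temporary first: on failure realloc() returns a null
    // pointer and the old block is still valid and still owned by us.
    void* p = std::realloc(buf.data, new_size);
    if (p == nullptr) return false;
    buf.data = static_cast<char*>(p);
    buf.size = new_size;
    return true;
}

// Remember where "world" starts, grow the buffer, then look it up again.
bool demo_offset_survives_growth() {
    Buffer buf = { static_cast<char*>(std::malloc(16)), 16 };
    if (buf.data == nullptr) return false;
    std::strcpy(buf.data, "hello world");   // 12 bytes including terminator
    const std::size_t word_offset = 6;      // offset of "world", not a pointer

    if (!grow(buf, 1 << 20)) {              // large enough that a move is likely
        std::free(buf.data);
        return false;
    }

    // Recompute the pointer from the (possibly new) base address.
    const char* word = buf.data + word_offset;
    const bool ok = std::strcmp(word, "world") == 0;
    std::free(buf.data);
    return ok;
}
```

Had the code kept `char* word = buf.data + 6` from before the call to grow(), it would work whenever the block happened to be extended in place and fail unpredictably otherwise, which is the brand X behavior described above.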

If it ain't broke, I can fix that.

Dennis Handly · ‎11-09-2006

>I'm just really lost here...

It's pretty simple. Once you malloc memory, you have it forever. But you can reuse it or it can be swapped out if whole pages aren't used.

>Will my code be compatible with wdb?

If you are looking for leaks, yes.

>glance shows both Mem and Swap as really high and it seems to simply not release memory when I think it should.

You can't release it until you terminate the process. Or use mmap to allocate temporary memory.
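
A rough sketch of the mmap route, assuming a system that offers MAP_ANONYMOUS (some platforms spell it MAP_ANON). The helper names map_scratch, unmap_scratch and demo_scratch_region are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Map `bytes` of private, zero-filled memory straight from the OS,
// bypassing the malloc heap. Returns nullptr on failure.
void* map_scratch(std::size_t bytes) {
    assert(bytes > 0);
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Hand the whole region back to the OS. Returns true on success.
bool unmap_scratch(void* p, std::size_t bytes) {
    return munmap(p, bytes) == 0;
}

// Use an 8 MB scratch region and release it again.
bool demo_scratch_region() {
    const std::size_t bytes = 8u * 1024u * 1024u;
    void* p = map_scratch(bytes);
    if (p == nullptr) return false;
    std::memset(p, 0xAB, bytes);     // touch the pages so they become resident
    return unmap_scratch(p, bytes);  // the address range is invalid after this
}
```

Unlike a block given back with free(), a region released with munmap() leaves the process's address space, so it should no longer be counted against the process afterwards.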

Why are you worried about this issue? Are you short on swap space?

Note: high memory is good. High CPU and I/O for swapping is bad. So if you don't have the latter, why worry?

Peter Hug · ‎11-13-2006

>Why are you worried about this issue? Are you short on swap space?

No, the reason is that I am experiencing something under HP-UX that never happens under Linux RH 9 or Windows: I get a segmentation fault when memory usage on the system is really high.

This is the main reason I coded the eatram program so I could artificially consume memory on the machine which ultimately caused my application to core.

The stack trace of the offending thread I get from the core dump doesn't make any sense though:

#0 0xc23d9504 in std::__default_alloc_template::allocate(unsigned long) (__n=20)
at /opt/apal3/GCC/gcc-bin-3.3.2/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/bits/stl_alloc.h:417
#1 0x002144f4 in std::_Rb_tree<:key>, D3::KeyLessPredicate, std::allocator<:key> >::_M_create_node(D3::Key* const&) ()
#2 0x002144c0 in std::_Rb_tree<:key>, D3::KeyLessPredicate, std::allocator<:key> >::_M_insert(std::_Rb_tree_node_base*, std::_Rb_tree_node_base*, D3::Key* const&) ()
#3 0x002143a4 in std::_Rb_tree<:key>, D3::KeyLessPredicate, std::allocator<:key> >::insert_equal(D3::Key* const&) ()
#4 0x002de6b8 in D3::Relation::AddChild(D3::Entity*) ()
#5 0x0027a620 in D3::InstanceKey::NotifyRelationsAfterUpdate() ()
#6 0x00279e34 in D3::InstanceKey::On_AfterUpdate() ()
#7 0x002680c8 in D3::Entity::On_AfterPopulatingObject(bool) ()
#8 0x002a7664 in D3::OTLDatabase::PopulateObject(D3::MetaEntity*, D3::OTLRecord&, bool) ()
#9 0x002a6c2c in D3::OTLDatabase::LoadObjects(D3::MetaEntity*, std::list<:entity> >*, std::string const&, bool) ()
#10 0x00297758 in D3::OTLDatabase::LoadObjects(D3::MetaEntity*, std::string const&, bool) ()
#11 0x00357188 in D3::AP3Shipment::DeepLoadObject(D3::Database*, std::string const&, std::string const&, D3::APAL3Activity*) ()
#12 0x00357b9c in D3::AP3Shipment::MakeShipmentResident(D3::Database*, std::string const&, std::string const&, D3::APAL3Activity*) ()
#13 0x000a2fe0 in apal3::TMSHandler_ImportShipment(void*) ()
#14 0x0038a40c in APAL3ActivityHandler::Process(D3::Database*) ()
#15 0x0038a278 in APAL3ActivityHandler::ProcessHandler(void*) ()
#16 0x000a08f0 in apal3::ActivityThread::Run() ()
#17 0x000b74d4 in apal3::Thread::operator()() ()
#18 0x000ba4d4 in boost::detail::function::void_function_obj_invoker0<:thread>::invoke(boost::detail::function::any_pointer)

I don't trust that trace because I find it hard to believe that the standard library would core.

BTW, a fact is that Frame #6 traps the std::bad_alloc& exception. Frames #4 and 5 don't trap any exceptions and the lower frames are all from the standard C++ library.

Another reason for me to look at memory issues was the fact that in production, the software runs on many machines completely trouble free for months on end. However, it occasionally crashes on the test machine. The only significant difference between these machines is that the test server is utilised more heavily and has only 4GB of RAM while the production machines have 6GB.

I'm none the wiser.

Pete

Dennis Handly · ‎11-13-2006

>I get a segmentation fault when memory usage on the system is really high.

Ah, your real problem. You need more swap space.

>This is the main reason I coded the eatram program so I could artificially consume memory

(You could also use the ulimit -d to limit memory, when you start the process.)

>The stack trace of the offending thread I get from the core dump doesn't make any sense though:
#0 0xc23d9504 in std::__default_alloc_template::allocate(unsigned long) (__n=20) at .../gcc-bin-3.3.2/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/bits/stl_alloc.h:417
#1 0x002144f4 in std::_Rb_tree<:key>::_M_create_node
#2 0x002144c0 in std::_Rb_tree<:key>::_M_insert
#3 0x002143a4 in std::_Rb_tree<:key>::insert_equal
#4 0x002de6b8 in D3::Relation::AddChild(D3::Entity*)
#5 0x0027a620 in D3::InstanceKey::NotifyRelationsAfterUpdate
#6 0x00279e34 in D3::InstanceKey::On_AfterUpdate

>I don't trust that trace because I find it hard to believe that the standard library would core.
>BTW, a fact is that Frame #6 traps the std::bad_alloc& exception.

Why not? :-(
aC++ did on IPF when throwing bad_alloc until we implemented an emergency pool. Can you look at stl_alloc.h line 417?

>it occasionally crashes on the test machine. The only significant difference between these machines is that the test server is utilised more heavily and has only 4GB of RAM while

This looks like lack of swap space and sloppy programming that doesn't check the return from malloc.
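
For reference, a small sketch of what checking looks like in both styles. It assumes nothing about the daemon's real code; try_malloc and try_new are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// C style: malloc() reports failure by returning a null pointer, which
// must be tested before the block is used. Returns true on success.
bool try_malloc(std::size_t bytes) {
    assert(bytes > 0);
    void* p = std::malloc(bytes);
    if (p == nullptr) return false;          // out of memory or swap
    static_cast<volatile char*>(p)[0] = 0;   // touch the block so it is really used
    std::free(p);
    return true;
}

// C++ style: plain operator new never returns null; it throws
// std::bad_alloc, so the failure has to be caught somewhere up the stack.
bool try_new(std::size_t bytes) {
    assert(bytes > 0);
    try {
        char* p = new char[bytes];
        static_cast<volatile char*>(p)[0] = 0;
        delete[] p;
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

Code that dereferences the result of malloc() without the null test will fault at some unrelated-looking place once swap runs out, which is the kind of failure that only shows up on the busier machine.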

Peter Hug · ‎11-14-2006

I'm really surprised Dennis, I was somehow relying 100% on the standard C++ library.

Below is a printout of the method

std::__default_alloc_template::allocate(size_t __n)

which gdb reported as causing a segmentation fault on line 417 (I have marked line 417 with the appended comment "// Line 417"). Sorry but this code is way beyond me, I'm only a C++ programmer, not a C++ standard library implementor.

static void* allocate(size_t __n)
{
  void* __ret = 0;

  // If there is a race through here, assume answer from getenv
  // will resolve in same direction. Inspired by techniques
  // to efficiently support threading found in basic_string.h.

  if (_S_force_new == 0)
  {
    if (getenv("GLIBCPP_FORCE_NEW"))
      __atomic_add(&_S_force_new, 1);
    else
      __atomic_add(&_S_force_new, -1);
  }

  if ((__n > (size_t) _MAX_BYTES) || (_S_force_new > 0))
  {
    __ret = __new_alloc::allocate(__n);
  }
  else
  {
    _Obj* volatile* __my_free_list = _S_free_list + _S_freelist_index(__n);

    // Acquire the lock here with a constructor call. This
    // ensures that it is released in exit or during stack
    // unwinding.

    _Lock __lock_instance;

    _Obj* __restrict__ __result = *__my_free_list;

    if (__builtin_expect(__result == 0, 0))
    {
      __ret = _S_refill(_S_round_up(__n));
    }
    else
    {
      *__my_free_list = __result -> _M_free_list_link; // Line 417
      __ret = __result;
    }

    if (__builtin_expect(__ret == 0, 0))
      __throw_bad_alloc();
  }

  return __ret;
}

Dennis Handly · ‎11-14-2006

>I'm really surprised Dennis, I was somehow relying 100% on the standard C++ library.

Did you pay anything for it? ;-)
(But it appears you may have corrupted the heap.)

For libCsup we had to implement a massive effort to fix several libs because of a very important customer.

>gdb reported as causing a segmentation fault on line 417 (I have marked line 417 with the appended comment "// Line 417"). Sorry but this code is way beyond me, I'm only a C++ programmer, not a C++ standard library implementor.

(I play one in real life. :-)
Ok, it is lots fancier than libCsup, which just calls malloc.

It appears that there may be several issues with the code:
_Obj* volatile* __my_free_list = _S_free_list + _S_freelist_index(__n);

I'm assuming _S_freelist_index() just does an indexing operation and possibly rounds up. I.e. doesn't need to be in the lock.

// Acquire the lock here with a constructor call
_Lock __lock_instance;
_Obj* __restrict__ __result = *__my_free_list;

This load is protected, so any other functions fiddling with it better do the same. Line 417 may be the only other place?

if (__builtin_expect(__result == 0, 0))
__ret = _S_refill(_S_round_up(__n));

I assume this checks to see if the pool has something in it and if not, gets more.

*__my_free_list = __result -> _M_free_list_link; // Line 417

If this aborts, the free list has been blasted. Perhaps this is exactly the same as what happened to me on Linux recently. With aC++'s STL, there is slop in a vector. But on Linux there isn't. And if you go beyond what you have allocated, it may blast the _M_free_list_link.

So if you have blasted the _M_free_list_link in the 10th free element, when you go to get more space for the 10th time, you will put the bad value into __my_free_list and you will abort on the next time.
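
The delayed failure can be reproduced in a toy model. The sketch below is not the libstdc++ allocator; ToyPool, Outcome and simulate_overrun are invented for illustration, and links are stored as integers so that the bad value is never dereferenced. It assumes the free blocks sit next to each other in memory, as they do in a freshly carved chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::size_t kBlockSize = 16;  // bytes per block, at least sizeof(std::uintptr_t)
const std::size_t kBlocks = 8;      // blocks in the toy arena

// A tiny fixed-size pool whose free list is threaded through the free
// blocks themselves: the first bytes of each free block hold the
// address of the next free block (0 marks the end of the list).
struct ToyPool {
    char arena[kBlockSize * kBlocks];
    std::uintptr_t head;  // address of the first free block, 0 when empty

    ToyPool() : head(0) {
        for (std::size_t i = 0; i < kBlocks; ++i) {
            std::uintptr_t next = 0;
            if (i + 1 < kBlocks)
                next = reinterpret_cast<std::uintptr_t>(arena + (i + 1) * kBlockSize);
            std::memcpy(arena + i * kBlockSize, &next, sizeof next);
        }
        head = reinterpret_cast<std::uintptr_t>(arena);
    }

    // True if v is 0 or the address of a block boundary inside the arena.
    bool plausible(std::uintptr_t v) const {
        if (v == 0) return true;
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(arena);
        return v >= base && v < base + sizeof arena && (v - base) % kBlockSize == 0;
    }

    // Pop the first free block: the equivalent of head = head->link.
    char* allocate() {
        if (head == 0) return nullptr;
        assert(plausible(head));  // a real allocator would simply fault here
        char* result = reinterpret_cast<char*>(head);
        std::memcpy(&head, result, sizeof head);
        return result;
    }
};

struct Outcome {
    int good_allocations;     // allocate() calls that still worked after the overrun
    std::uintptr_t bad_head;  // implausible value that ended up in the list head
};

Outcome simulate_overrun() {
    ToyPool pool;
    char* mine = pool.allocate();  // block 0; the head now refers to block 1

    // The bug: write one link's worth of bytes past the end of our block.
    // The extra bytes land on the link stored in free block 1.
    const std::uintptr_t garbage = 0x7;
    std::memset(mine, 'x', kBlockSize);
    std::memcpy(mine + kBlockSize, &garbage, sizeof garbage);

    Outcome out = {0, 0};
    while (pool.head != 0 && pool.plausible(pool.head)) {
        pool.allocate();  // succeeds, but loads the smashed link into head
        ++out.good_allocations;
    }
    out.bad_head = pool.head;  // following this link is where the crash would be
    return out;
}
```

In this model the overrun itself goes unnoticed and one more allocation succeeds; only the allocation after that would follow the bad link. That gap is why the faulting thread tends to be unrelated to the code that did the damage.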

Can you print the values in the function:
info locals
What is the value of __result, __my_free_list

Peter Hug · ‎11-15-2006

> Did you pay anything for it? ;-)

True and I do feel guilty :-$

> (But it appears you may have corrupted the heap.)

That is my biggest worry. As a result of this thread I have requested the client install wdb on the development server. Then again, if I wrote to incorrect memory locations I would expect the code to crash under other OSes too, particularly since it runs for a very long time on any machine.

> // Acquire the lock here with a constructor call
> _Lock __lock_instance;
> _Obj* __restrict__ __result = *__my_free_list;
>
> This load is protected, so any other functions fiddling
> with it better do the same. Line 417 may be the only other place?

No, I did find it in other places. I have the stl_alloc.h enclosed with this post.

> *__my_free_list = __result -> _M_free_list_link; // Line 417
>
> If this aborts, the free list has been blasted.

Yikes, that sounds very scary.

> Can you print the values in the function:
> info locals
> What is the value of __result, __my_free_list

Now here is one more clue. I have three separate core dumps. The interesting thing is that all three core dumps crashed in the exact same place (stl_alloc.h line 417). What is surprising though is that backtrace showed that the program was in all three cases doing totally different, unrelated things. Even more surprising was that when I ran the debugger commands on all three cores the result was precisely the same:

(gdb) info locals
__my_free_list = (__default_alloc_template::_Obj * volatile*) 0x680bde08
__lock_instance = {}
__ret = (void *) 0x9
__ret = (void *) 0x9

(gdb) print __result
$3 = (void *) 0x7

(gdb) print __my_free_list
$4 = (__default_alloc_template::_Obj * volatile*) 0x680bde08

(I also can't understand why __ret appears twice in the locals list).

Hey thanks heaps for your feedback here Dennis - I really don't know where else to turn to!

Dennis Handly · ‎11-15-2006

>As a result of this thread I have requested the client install wdb on the development server. Then again, if I wrote to incorrect memory locations I would expect the code to crash under other OS too particularly since it runs for a very long time on any machine.

We were not able to get wdb to detect it with "heap check" because of the slop in HP's STL container. Also, you'll have problems because the g++ STL allocator is allocating a big block, (so gdb won't know about the small blocks) and it is the control info in the block that is getting blasted.

You may be able to either workaround it or detect it by using:
export GLIBCPP_FORCE_NEW=

This always goes through the __new_alloc::allocate path, which I assume eventually calls malloc?

Setting GLIBCPP_FORCE_NEW essentially turns off the g++ STL small block allocator.

To use your current STL and gdb, you must know what the data structure is and be able to set the watch point on the right place. And you must be able to duplicate it in a short enough time so you won't get bored in the debugger. :-)

Trying to keep enough info around in your core file would require you to instrument the code to leave some traces.

>Now here is one more clue. I have three separate core dumps. The interesting thing is that all three core dumps crashed in the exact same place (stl_alloc.h line 417). What is surprising though is that backtrace showed that the program was in all three cases doing totally different, unrelated things.

This would be expected with threading. All thread cases were probably allocating a 20 byte item?

>Even more surprising was that when I did the debugger commands on all three cores the result was precisely the same:
__my_free_list = 0x680bde08
(gdb) print __result
$3 = (void *) 0x7

This is good, it means you only have to check for 7.

>(I also can't understand why __ret appears twice in the locals list).

Right, __result should have been there.

>Hey thanks heaps for your feedback here Dennis - I really don't know where else to turn to!

You're just lucky we already debugged something like this and it took a while.

I'll attach a modified version of stl_alloc.h, just compile everything with -DTRACE_CORRUPTION.

Dennis Handly · ‎11-16-2006

Note: I didn't compile or verify the changes in stl_alloc.h I gave you. There are a few places where I need to skip 0 values since they are not a sign of corruption, or we can't tell:

430: if ((unsigned long)__temp < 0x10000L) {
New:
430: if (__temp && (unsigned long)__temp < 0x10000L) {

469: if ((unsigned long)__temp < 0x10000L) {
New:
469: if (__temp && (unsigned long)__temp < 0x10000L) {
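
Pulled out as a standalone helper, the test in those lines amounts to the sketch below. The 0x10000 threshold is taken from the lines above; whether it is a sensible cutoff on another platform is an assumption, and looks_corrupt and check_link are illustrative names that do not appear in stl_alloc.h.

```cpp
#include <cassert>
#include <cstdint>

// Heuristic: a free-list link that is non-null but numerically tiny
// cannot be a real heap address, so it is treated as a sign of
// corruption. Null is skipped because it only marks the end of a list.
inline bool looks_corrupt(const void* link) {
    const std::uintptr_t value = reinterpret_cast<std::uintptr_t>(link);
    return value != 0 && value < 0x10000u;
}

// Debug-build guard: stop with a message at the point where the bad
// link is first seen instead of faulting one allocation later.
inline void check_link(const void* link) {
    assert(!looks_corrupt(link));
}
```

With the values from the core dumps, a link of 0x7 is flagged while a null link and an ordinary address are not.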

Peter Hug · ‎11-19-2006

That's fantastic Dennis. I'll give it a go as soon as I can spare some time. The problem is that I'll need to rebuild everything and unfortunately on the test machine that can take considerable time.

I'll post back the results soon.

Thanks heaps for your support - where can I send the beer?

Dennis Handly · ‎11-20-2006

>The problem is that I'll need to rebuild everything and unfortunately on the test machine that can take considerable time.

If you are lucky, all you need to do is rebuild the lib. If not, you'll have to rebuild every module that uses those allocate/deallocate member functions. I suppose you could use nm -pxNA to scan the objects to see how pervasive the functions are. (If inlined, that won't work.)

Peter Hug · ‎11-29-2006

We have done all of what you suggested. But now have another problem: as soon as we start the app, it starts writing massive amounts of messages like:

1164805853:243022000:9987394:13105:1:2:33:4:0x680c80d8:0x0:0x1:0x20:0x0:0x0:

to a file called tr.x.00001 (where x=ProcID)

Eventually the app crashes and the last messages in the above file read:

Pthread internal error: message: write() failed, file: /ux/core/libs/threadslibs/src/common/pthreads/trace.c, line: 774
Error msg: No space left on device

From these messages I take it that pthread is printing these messages. Does this sound familiar to you? How could I suppress these?

Dennis Handly · ‎11-29-2006

>as soon as we start the app, it starts writing massive amounts of messages

I can't see how those messages would come from U_STACK_TRACE and the fprintfs I suggested.

Pthread internal error: message: write() failed, file: .../pthreads/trace.c, line: 774
>From these messages I take it that pthread is printing these messages. Does this sound familiar to you? How could I suppress these?

Yes, libpthread*.
This is completely unexpected. Are you using libpthread_tr?

Peter Hug · ‎11-30-2006

We are using boost::threads.

Dennis Handly · ‎11-30-2006

>We are using boost::threads.

Are you linking with -lpthread_tr?
