
Re: Rare heap corruption

 
SOLVED
blackwater
Regular Advisor

Rare heap corruption

We get core dumps from time to time on one of our customer's servers. They seem to be caused by heap corruption. So far we have not managed to reproduce the problem on our own servers. What approaches (besides analyzing core files) can be used on a customer server to localize the problem?

 

I thought that librtc could be the answer and tested the program with these settings:

>more rtcconfig
set heap-check off
set heap-check free on
set heap-check leaks off
set heap-check bounds on
abort_on_bounds=1
abort_on_bad_free=1

but using librtc results in a significant performance decrease, and I am not sure I can recommend installing librtc on a real customer server.
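For readers unfamiliar with what bounds and bad-free checks do, here is a minimal sketch of the general guard-byte technique. It is an illustration only and not how librtc is implemented; the class GuardedHeap, the GUARD pattern, and the toy bump allocator are all hypothetical names made up for this example. It also hints at where the overhead comes from: every block carries extra bytes, and every free pays for a verification pass.

```python
# Toy model of guard-byte bounds checking (NOT librtc's implementation).
GUARD = b"\xde\xad\xbe\xef"  # hypothetical guard pattern placed around each block


class GuardedHeap:
    def __init__(self, size):
        self.mem = bytearray(size)  # simulated heap memory
        self.top = 0                # next free offset (simple bump allocator)
        self.blocks = {}            # user offset -> user size

    def malloc(self, n):
        start = self.top
        user = start + len(GUARD)
        end = user + n + len(GUARD)
        if end > len(self.mem):
            raise MemoryError("simulated heap exhausted")
        self.mem[start:user] = GUARD    # header guard
        self.mem[user + n:end] = GUARD  # footer guard
        self.blocks[user] = n
        self.top = end
        return user

    def check(self, user):
        """Return True if both guards around the block are intact."""
        n = self.blocks[user]
        head = bytes(self.mem[user - len(GUARD):user])
        tail = bytes(self.mem[user + n:user + n + len(GUARD)])
        return head == GUARD and tail == GUARD

    def free(self, user):
        if user not in self.blocks:
            # comparable in spirit to aborting on a bad free
            raise ValueError("bad free: block was never allocated or already freed")
        if not self.check(user):
            # comparable in spirit to aborting on a bounds violation
            raise RuntimeError("bounds corrupted: guard bytes overwritten")
        del self.blocks[user]
```

Note that the overrun is detected when the block is checked or freed, not when the stray write happens, which is one reason the point of detection can be far from the faulty code.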

 

What is HP's recommendation for localizing problems in this situation?

4 REPLIES
Dennis Handly
Acclaimed Contributor
Solution

Re: Rare heap corruption

>It seems to be caused by a heap corruption.

 

I would hope you have established that it is more than just "seems" and can be more definite.  If it isn't heap corruption, then using librtc won't help.

 

>What approaches (besides analyzing corefiles) can be used on a customer server to localize the problem?

 

Have you learned anything from the corefiles?

 

>I thought that it could be librtc and tested the program with these settings:

>more rtcconfig
>set heap-check off

 

Any reason you have off here?

Make sure "show heap-check" shows it is on, though this file may not use the same syntax as gdb's.


>using librtc results in significant performance decrease and I am not sure I can recommend to install librtc on a real customer server.  What is HP's recommendation for localizing problems in this situation?

 

librtc is the only automated way of checking.  Really nothing else to recommend except to find patterns in the aborts you have.

nitinjain
Occasional Advisor

Re: Seldom heap corruption

There will definitely be a performance impact.

The best you can do is run the application in librtc's batch mode. You can use the nudge feature to get a memory report at any point in time just by sending a signal to the process.

You can obtain reports multiple times during the same application run.
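As a general illustration of how a "report on signal" mechanism can work, here is a minimal sketch. It is not librtc's nudge implementation, and the choice of SIGUSR1 is an assumption made for this example; the thread does not say which signal librtc expects. The names request_report, write_report, run, and the nudge_at parameter are hypothetical. The handler only sets a flag, and the main loop writes the report at a safe point, so the same run can produce several reports.

```python
import os
import signal

report_requested = False  # set by the signal handler, polled by the main loop


def request_report(signum, frame):
    """Signal handler: only record the request; do the real work elsewhere."""
    global report_requested
    report_requested = True


def write_report(stats, out):
    """Append a snapshot of the counters to the report list."""
    out.append("allocations=%d frees=%d" % (stats["allocs"], stats["frees"]))


def run(iterations, stats, out, nudge_at=None):
    """Simulated workload. nudge_at is a hypothetical hook that makes the
    process signal itself; in practice the signal would come from outside,
    for example from `kill -USR1 <pid>` in a shell."""
    global report_requested
    signal.signal(signal.SIGUSR1, request_report)  # assumed signal choice
    for i in range(iterations):
        stats["allocs"] += 1  # stand-in for real allocation activity
        stats["frees"] += 1
        if nudge_at is not None and i == nudge_at:
            os.kill(os.getpid(), signal.SIGUSR1)
        if report_requested:
            report_requested = False
            write_report(stats, out)
```

Because the process keeps running after each report, an operator can nudge it repeatedly and compare the snapshots over time.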

 

It is possible that corruption happens more often than the crashes do; such cases of corruption can still be found.

 

Even if the application crashes before you can obtain a memory report, you can still obtain the corruption report from the core file, because the process that dumped core had librtc preloaded.

 

You need not explicitly turn individual rtc parameters on or off; you can just use "set heap-check on". By default it enables the required parameters.

 

But if the corruption itself happens rarely, it will be difficult to track down.

blackwater
Regular Advisor

Re: Rare heap corruption

>>I thought that it could be librtc and tested the program with these settings:

>>more rtcconfig
>>set heap-check off

 

>Any reason you have off here?

 

Dennis, yes, there is a reason. I am interested in librtc reports about memory corruption, not in memory leaks or memory consumption. With "set heap-check off" I get only *.mem files and no *.heap files.

 

 

blackwater
Regular Advisor

Re: Seldom heap corruption

nitinjain,

 

> you can use nudge feature to get the memory reports at any point of time by just sending a signal to the process

Could you please explain how to get a memory report by sending a signal when the process runs with LD_PRELOAD=librtc.so?
I could not find it in the HP WDB 6.1 documentation.