Re: Weird stack problem.

 
Paresh Borkar
Occasional Advisor

Weird stack problem.

Hi,

We are trying to port our application (an application server) from Sun Solaris to HP-UX 11 (32-bit). The application is written in C++; we build it with aCC on HP-UX and with g++ on Solaris.
We are facing a weird problem with the call stack. We have a function that takes a 'const char *' as its only argument. This function (say SymbolLookup) looks up the argument (a symbol name) in a dynamically loaded shared library. It is called by another function, which passes it a 'static const char *' symbol name.
The problem is that, at times, the program dies with a SIGBUS. When we load the core file, GDB reports "Program terminated with signal 10, Bus error". The core shows an empty string (at address 0x0) being passed to SymbolLookup(...), whereas the stack frame one level up shows the correct symbol (a static const char * at global scope) being passed.
When we print register 26 ($r26), we see the correct value. According to the gdb manual, register $r26 holds the first parameter if the function takes fewer than four integer parameters.

The other weird problem is that the 'this' pointer at times turns out to be NULL, even though in the earlier stack frame for the same class the 'this' pointer is valid.

The memory usage of our process is around 100 MB (as shown by the top command).

Can someone please shed some light on this weird behavior?

Thanks,
Paresh.
4 REPLIES
A. Clay Stephenson
Acclaimed Contributor

Re: Weird stack problem.

Hi:

Without much to go on, here is my guess:

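#include <stdio.h>

char *mystring(const char *p)
{
    char s[128];   /* auto (stack) storage: gone once the function returns */

    (void) sprintf(s, "Mystring + %s", p);
    return(s);     /* deliberate bug for illustration: dangling pointer */
} /* mystring */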

Notice that this function returns a pointer to the auto-declared array s. Once the function exits, that pointer is invalid, yet it may still be used elsewhere. If s were instead declared static char s[128]; within the function, the pointer would remain valid after exit. If this is going on, it can play havoc with the stack. Check very carefully for this type of declaration in functions that return pointers.
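For comparison, here is a minimal sketch of two ways to make the returned pointer valid. The names mystring_static and mystring_buf are made up for illustration, and snprintf is used in place of sprintf only to bound the write. Note that the static version shares one buffer between all callers, which matters in a threaded program.

```c
#include <stddef.h>
#include <stdio.h>

/* Static storage outlives the call, so the returned pointer stays valid.
   Each call overwrites the previous result, and the function is not safe
   to call from several threads at once. */
const char *mystring_static(const char *p)
{
    static char s[128];

    (void) snprintf(s, sizeof s, "Mystring + %s", p);
    return s;
}

/* Alternative: the caller owns the buffer, so nothing is shared. */
char *mystring_buf(char *out, size_t out_size, const char *p)
{
    (void) snprintf(out, out_size, "Mystring + %s", p);
    return out;
}
```

The caller-owned variant is usually the easier one to reason about when several threads build strings at the same time.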

Regards, Clay
If it ain't broke, I can fix that.
Paresh Borkar
Occasional Advisor

Re: Weird stack problem.

Hi,

No, there are no places where we return pointers to auto variables from functions. If there were, the application would have crashed on NT and Solaris too. The application (release build) runs fine on both NT and Solaris. That is why we are so surprised by this weird behavior, which appears on HP-UX 11 (32-bit) only.

thanks,
Paresh.
A. Clay Stephenson
Acclaimed Contributor

Re: Weird stack problem.

Hi again:

Actually, I have seen that very kind of problem appear only when porting code or upgrading the OS, in cases where the original code was working by accident. There is one run-time patch which might help: PHSS_22543. You should also search the patch database for aCC and see if anything else might apply. I assume that you have used the debugger to do a stack trace.


If it ain't broke, I can fix that.
Paresh Borkar
Occasional Advisor

Re: Weird stack problem.

We have finally found the culprit: the pthread stack size. On HP-UX the default thread stack is 64 KB, whereas on Solaris it is 1 MB. Our function calls were exceeding this default stack limit.
As a result, we were seeing weird behavior in various places. A detailed look at the porting guide (from Solaris to HP-UX) revealed this difference in the stack sizes.
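For anyone hitting the same thing, a minimal sketch of requesting a larger stack through the standard POSIX thread attributes follows. The names worker and run_with_stack_size are made up for illustration and are not from our application; the 256 KB buffer just stands in for deep call chains that would not fit in a 64 KB stack.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread entry point. It touches a 256 KB local buffer,
   which needs far more stack than a 64 KB default would provide. */
static void *worker(void *arg)
{
    volatile char buf[256 * 1024];

    (void) arg;
    buf[0] = 1;
    buf[sizeof buf - 1] = 1;
    return NULL;
}

/* Create a thread with an explicit stack size and wait for it.
   Returns 0 on success, or the pthread error number on failure. */
int run_with_stack_size(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, NULL);

    (void) pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

Calling run_with_stack_size(1024 * 1024) asks for the 1 MB that Solaris gives by default. Setting the size explicitly on every platform avoids depending on whatever default the OS happens to use.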

- Paresh.