Operating System - HP-UX

Regarding core dump analysis

 
vind123
Regular Advisor

Regarding core dump analysis

We are using HP-UX 10.20. One of our application daemons is dumping core, and I don't have any log file for the daemon. How do I find the error from the core dump? I have read about gdb, but on the net I can only find material about debugging a running program. How do I use gdb to analyse a core dump? Please post any good web links.
11 REPLIES
Alex Glennie
Honored Contributor

Re: Regarding core dump analysis

Jaime Bolanos Rojas.
Honored Contributor

Re: Regarding core dump analysis

Vind123,

Try using the file and strings commands to get some information about the dump. For more info, take a look at this thread:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1045698

Regards,

Jaime.
Work hard when the need comes out.
A. Clay Stephenson
Acclaimed Contributor

Re: Regarding core dump analysis

Meaning no disrespect and not trying to be flippant, if you have to ask, you are probably the wrong person to be trying to analyze a stack trace. Debugging someone else's code, especially when it was not compiled with -g (which includes debugger data in the object files), is tedious at best. Your best approach will be to contact the developer and send him your core file.
If this is a "homegrown" application, then compile the application using -g, let it crash, and then use the debugger to examine the source file. In this case, the stack trace can point you to the exact source line where the problem lies.
If it ain't broke, I can fix that.
vind123
Regular Advisor

Re: Regarding core dump analysis

Thanks a lot for the info...
I tried the steps below. How do I locate the code and the reason for the core dump?

1.
>gdb -c core
HP gdb 1.1
Copyright 1986 - 1999 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 1.1 (based on GDB 4.17-hpwdb-980821)
Wildebeest is free software, covered by the GNU General Public License, and
you are welcome to change it and/or distribute copies of it under certain
conditions. Type "show copying" to see the conditions. There is
absolutely no warranty for Wildebeest. Type "show warranty" for details.
Wildebeest was built for PA-RISC 1.1 or 2.0 (narrow), HP-UX 10.20.

Reading symbols from xcom...done.
Core was generated by `xcom'.
Program terminated with signal 11, Segmentation fault.

warning: The shared libraries were not privately mapped; setting a
breakpoint in a shared library will not work until you rerun the program.

Unable to find dynamic library list.

#0 0x20202034 in ?? ()
(gdb) bt
#0 0x20202034 in ?? ()
Cannot access memory at address 0x2000.
(gdb) p/x $iir
$1 = 0x43ffff80
(gdb) p/x $ior
$2 = 0xca961d5f
(gdb) quit

2.
>adb
0x43ffff80=i
LDB 8128(sr3,r31),r31



3.
>file core
core: core file from 'xcom' - received SIGSEGV

>strings core | grep -i fatal
FATAL ERROR: running out of SVC control resources
FATAL: cannot get shared memory (SVC_RSC) with key = %ld
FATAL: cannot attach to shared memory (SESSION) with key = %d
FATAL: cannot get semaphore with key = %d
fatal
fatal
fatal
fatal
fatal
fatal
***Fatal: bad switch value
***Fatal: bad switch value
***Fatal: bad switch value
***Fatal: bad switch value


vind123
Regular Advisor

Re: Regarding core dump analysis

Can anyone help me out to locate the error?
bhupesh m
Frequent Advisor

Re: Regarding core dump analysis

Core dump analysis is a rather difficult task, as far as I know. HP support people know these things. Maybe somebody here can help you.

Bill Hassell
Honored Contributor

Re: Regarding core dump analysis

Do you have the source code for the daemon? If not, knowing that the program had a segmentation violation (SIGSEGV) is about all the information you have concerning the error. The program did something wrong internally and it crashed. It can't be fixed without rewriting the program. Contact the programmer.


Bill Hassell, sysadmin
A. Clay Stephenson
Acclaimed Contributor

Re: Regarding core dump analysis

As I tried to tell you earlier, analyzing a stack trace without the corresponding source code is very, very difficult and at the very least requires some understanding of the underlying code:

Cannot access memory at address 0x2000.

0x2000 is certainly a bogus address but that's really all you can know. Knowing that does almost nothing to help you fix the problem. You have to send your core file to the developer and even then it's not trivial to diagnose. A stack trace becomes easy to analyze when the debugger information is included in the object files. Then a stack trace will point you to the exact line in the exact source file. The "gotcha" even in this case is that the -g compiler option (to include debugger data) also disables the optimizer. I've seen cases where non-optimized code (with -g) could never be induced to crash while the optimized code would crash every time.

The suggestion to use the strings command on the core file is all but useless. All that will do is dump out all the string constants (including possible error messages) in the executable but it does absolutely nothing to identify which error message (if any) was actually output.
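To illustrate why the strings output says nothing about which message was actually printed, here is a minimal Python sketch of roughly what strings does: it scans raw bytes for runs of printable characters. The helper name printable_runs and the sample bytes are made up for illustration; the real command has more options and slightly different rules about which characters count.

```python
import re

def printable_runs(data, min_len=4):
    """Return every run of at least min_len printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]

# Stand-in for a slice of a core image: string constants sit between binary
# bytes whether or not the program ever printed them.
sample = (
    b"\x00\x01FATAL: cannot get semaphore with key = %d\x00"
    b"\x7f\x02***Fatal: bad switch value\x00ab\x00"
)

found = printable_runs(sample)
for text in found:
    print(text)
```

Both messages show up simply because they are stored in the image, which is why a long list of "FATAL" lines from a core file is not evidence that any of them was emitted.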

As a Plan B, try running the application under tusc. That will show the arguments and result of every system call and can be very helpful in diagnosing a problem. However, even in this case, you aren't going to be able to fix this because a code change will be needed. The answer is the same: contact the developer.
If it ain't broke, I can fix that.
vind123
Regular Advisor

Re: Regarding core dump analysis

I am the developer and I have the source code. I got the core dump from the production machine, and I ran gdb on the xcom executable with the core file on a test machine; the output is below. When I ran the command I saw "no debugging symbols found". What does it mean? Is there any way to locate the code where it failed from the info below?

./gdb xcom core
HP gdb 3.0.01 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 10.20.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 3.0.01 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..(no debugging symbols found)...

warning: exec file is newer than core file.
Core was generated by `xcom'.
Program terminated with signal 11, Segmentation fault.

warning: Unable to find __dld_flags symbol in object file.

Error while reading dynamic library list.

#0 0x20202034 in ?? ()
(gdb) frame 0
#0 0x20202034 in ?? ()
(gdb) bt
#0 0x20202034 in ?? ()
warning: Attempting to unwind past bad PC 0x20202034
#1 0x20202034 in ?? ()
#2 0x20202034 in ?? ()
Cannot access memory at address 0x442c0a20
(gdb) frame 1
#1 0x20202034 in ?? ()
(gdb) frmae 2
Undefined command: "frmae". Try "help".
(gdb) frame 2
#2 0x20202034 in ?? ()
(gdb) backtrace full
#0 0x20202034 in ?? ()
No symbol table info available.
#1 0x20202034 in ?? ()
No symbol table info available.
#2 0x20202034 in ?? ()
No symbol table info available.
Cannot access memory at address 0x442c0a20
(gdb) where
#0 0x20202034 in ?? ()
#1 0x20202034 in ?? ()
#2 0x20202034 in ?? ()
Cannot access memory at address 0x442c0a20
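
One observation on the trace above, offered as a hint rather than a diagnosis: every byte of the reported PC, 0x20202034, is a printable ASCII code. PA-RISC is big-endian, so the bytes can be read in the order written. A PC that decodes to text is often a sign that a saved return address was overwritten by string data (for example, a buffer overrun on the stack), but this trace alone cannot confirm that. The Python below is only used as a calculator, and the helper name address_as_ascii is hypothetical.

```python
def address_as_ascii(addr, width=4):
    """Render an address as big-endian bytes; non-printable bytes are shown as '.'."""
    raw = addr.to_bytes(width, byteorder="big")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# The PC reported in frames #0 to #2 of the backtrace.
pc_text = address_as_ascii(0x20202034)
# The address gdb could not read while unwinding.
bad_addr_text = address_as_ascii(0x442c0a20)

print(repr(pc_text))        # three spaces followed by '4'
print(repr(bad_addr_text))  # 'D', ',', a newline byte shown as '.', then a space
```

If the hypothesis holds, the places to look in the source would be fixed-size character buffers that are filled with space-padded or formatted text.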
A. Clay Stephenson
Acclaimed Contributor

Re: Regarding core dump analysis

In that case, recompile and relink the application using the -g option, and if there is a "strip" command in your makefile, comment it out as well. Then run the newly compiled executable and allow it to crash.

Next, start gdb (or whatever debugger you are running) and add -d sourcedir arguments to tell it where the source code directories are. The stack trace will then pinpoint your source code line.
If it ain't broke, I can fix that.
vind123
Regular Advisor

Re: Regarding core dump analysis

I don't know under what conditions it crashed, so after recompiling I will not be able to reproduce the core dump. If I compile with the -g option and run gdb with the core dump I already have, will that work? If not, is there any other debugger or way to find the problem?
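
On the question of reusing the existing core: a core file is generally only meaningful alongside the exact executable that produced it, because the addresses in the core refer to that binary's code layout. A rebuild with -g will usually lay the code out differently, especially since, as noted earlier in the thread, -g also disables the optimizer. The "exec file is newer than core file" warning in the gdb session above points at this kind of mismatch. Below is a minimal Python sketch of a comparable timestamp check, using stand-in files; the function name and file names are assumptions for illustration, and gdb's own check may differ in detail.

```python
import os
import tempfile

def exec_newer_than_core(exec_path, core_path):
    """Return True if the executable was modified after the core file was written."""
    return os.path.getmtime(exec_path) > os.path.getmtime(core_path)

# Demo with stand-in files (hypothetical, not the real xcom binary or core).
with tempfile.TemporaryDirectory() as tmp:
    core_path = os.path.join(tmp, "core")
    exec_path = os.path.join(tmp, "xcom")
    for path in (core_path, exec_path):
        with open(path, "wb") as f:
            f.write(b"\0")
    os.utime(core_path, (1000000, 1000000))  # core written first
    os.utime(exec_path, (2000000, 2000000))  # executable rebuilt later
    rebuilt_after_crash = exec_newer_than_core(exec_path, core_path)
    swapped = exec_newer_than_core(core_path, exec_path)

print(rebuilt_after_crash)
```

A True result is only a warning sign, not proof of a mismatch; the safer route is to keep the exact production binary next to each core file it produces.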