Operating System - HP-UX

Re: Howto investigate core files?

 
Ralph Grothe
Honored Contributor

Howto investigate core files?

Hello developers,

I need a little help from you hacker wizards who are familiar with debugger sessions on core dumps under HP-UX.

On one of our customers' web servers (Apache), many of the spawned child processes die suddenly from SIGBUS on a fairly regular basis.

In the webserver's error log there appear hundreds of lines similar to this:


# grep -i bus\ error error_log|tail -1
[Mon Sep 9 08:30:10 2002] [notice] child pid 18050 exit signal Bus error (10),
possible coredump in /app/adis/apache


A core file is indeed dumped and can be found in the named directory.
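
To get a feel for how often this happens, the notices can be tallied per day. The following is only a sketch: the function name bus_errors_per_day is made up for illustration, and it assumes every entry is on one line and starts with a timestamp laid out like the one above.

```shell
#!/bin/sh
# Count "Bus error" child exits per day in an Apache error log read from stdin.
# Assumption: each entry starts with a timestamp such as [Mon Sep 9 08:30:10 2002]
bus_errors_per_day() {
    grep -i 'exit signal Bus error' |
    awk '{
        # Fields: $1="[Mon"  $2="Sep"  $3="9"  $4="08:30:10"  $5="2002]"
        year = $5
        sub(/\]/, "", year)
        count[$2 " " $3 " " year]++
    }
    END { for (d in count) print count[d], d }' |
    sort -rn
}

# Usage (from the directory holding the log):
#   bus_errors_per_day < error_log
```

Days with the most crashes come first, which may help to match the failures against load peaks or particular client activity.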

SIGBUS sounds to me like a severe memory violation by the respective process, owing to some sloppy coding practice.

I have to admit, however, that I have no insight at all into what the moribund child processes are doing that leads to the SIGBUSes (the whole application was designed, coded, implemented and deployed by a third-party company which meanwhile seems to be out of business, so no support from the developers, as usual).
I can only suspect that when a client is served by one of the webserver's forked children (n.b. the protocol is HTTPS, SSL/TLS on port 443), a connection to a central database server (an Oracle instance) is opened so that the client can send SQL queries and receive the corresponding result sets.

What also looks strange to me is the sheer number of sockets that are in FIN_WAIT_2 state.
Is this normal for a sound webserver?
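
To put a number on this, the sockets can be counted per TCP state. This is a rough sketch only: tcp_state_counts is an illustrative name, and it assumes that "netstat -an" prints the state as the last column of each tcp line.

```shell
#!/bin/sh
# Tally TCP sockets by state from `netstat -an` output read on stdin.
# Assumption: lines for TCP sockets start with "tcp" and end with the state.
tcp_state_counts() {
    awk '$1 ~ /^tcp/ { count[$NF]++ }
         END { for (s in count) print count[s], s }' |
    sort -rn
}

# Usage:
#   netstat -an | tcp_state_counts
```

Running it a few times over the day shows whether the FIN_WAIT_2 count keeps growing or levels off.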

I have now been tasked with finding the causes of the SIGBUS core dumps, although I have no developer's background at all (I only do high-level scripting in Perl to aid my system administration).
That's why I have no idea how to examine a core file with a debugger.

Would I need to install a whole ANSI-C development suite, or will the standard HP-UX debugger (e.g. adb) suffice for this task?
How should I proceed, what to look for in the core file?
Are there other possible tools (e.g. strace) to trace a moribund httpd child?

Regards
Ralph
Madness, thy name is system administration
Bill McNAMARA_1
Honored Contributor

Re: Howto investigate core files?

Hi Ralph!

FIN_WAIT_2 state.
Is this normal for a sound webserver?
>> No.

For core dump analysis, you can use
file core
and
what core,

for more serious debugging:
get wdb (wildebeest debugger) from www.hp.com/go/developer.

The FIN_WAIT_2 state can be forcibly removed...
a thorough explanation can be found here:

ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_ndd.txt


Later,
Bill


It works for me (tm)
Olav Baadsvik
Esteemed Contributor

Re: Howto investigate core files?


Hi,

If you have wdb installed you may get
the stack-trace of the failing program
by starting wdb like this:

wdb program core

where program is the executable that failed.

When wdb has started you go to the
view pulldown and select "Call Stack"

This will give you the call-stack at the time
of failure which is very often a good starting
point in the process of finding the cause
of the problem

Olav
Ralph Grothe
Honored Contributor

Re: Howto investigate core files?

Bill,

many thanks for the valuable ndd HOWTO.

Is it safe to reduce the
"tcp_keepalive_detached_interval" ?

Should this parameter first be tried before setting the "kludgy timer"
"tcp_fin_wait_2" ?

Can one safely set network device parameters through ndd on a running kernel without impact on currently open sockets?

Would you mind having a look at the attachment?
It is the output of "file" and "what" on the core file.

I will see if I can download and install the recommended debugger...




Madness, thy name is system administration
Paula J Frazer-Campbell
Honored Contributor
Solution

Re: Howto investigate core files?

Hi Ralph

Download and install wdb:-

http://hp.com/go/wdb

fire up /opt/langtools/bin/gdb -c
"bt" will give a stack trace:-

(gdb) bt
#0 0xc01ecb88 in ??
#1 0xc01ecb68 in ?? ()
#2 0xc01ecb68 in ?? ()

or use "where"

There are two stored registers that will tell you the address being accessed and the instruction the process was executing when it failed. In gdb, or at the "(gdb)" command prompt in wdb, try:

(gdb) p /x $ior

This prints the "Interruption Offset Register" (IOR), which holds the address the
program was trying to access when it failed.
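
One common reason for a bus error on PA-RISC is a misaligned data access, so it can be worth checking whether the printed address is a multiple of the access size. A minimal sketch, assuming a 4-byte word access by default; the function name is_aligned and the addresses in the usage lines are hypothetical, not values from this core file.

```shell
#!/bin/sh
# Succeed (exit status 0) if ADDRESS is a multiple of SIZE, fail otherwise.
# Usage: is_aligned ADDRESS [SIZE]
#   ADDRESS may be decimal or hex (0x...), e.g. the value printed for $ior.
#   SIZE defaults to 4, i.e. one 32-bit word (an assumption for illustration).
is_aligned() {
    addr=$1
    size=${2:-4}
    [ $(( $addr % $size )) -eq 0 ]
}

# Usage (hypothetical addresses):
#   is_aligned 0x7f7a1b40 && echo "word aligned"
#   is_aligned 0x7f7a1b42 || echo "not word aligned"
```

A misaligned address would point towards a bad pointer or a badly packed structure in the code; an aligned one means the cause has to be sought elsewhere.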

(gdb) p /x $iir

This is the "Interruption Instruction Register" (IIR), which shows the machine instruction that caused the failure. To decode it, start a separate "adb" session (adb should already be installed; note that it has no prompt and that "$" prefixes each command, e.g. $q = quit), enter the value from the above command and follow it with "=i". For example,

(gdb) p /x $iir
$2 = 0xfe01280

$ adb
0xfe01280=i
LDW 254,0(r31)
This is a load word (LDW) instruction being executed. See the PA-RISC instruction set reference.
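
With hundreds of core files around, the same three commands (bt, p /x $ior, p /x $iir) can also be run without an interactive session. The following is a sketch under assumptions: it presumes that this gdb build accepts the usual GNU gdb options -batch and -x, and the function names and the httpd path in the usage line are placeholders.

```shell
#!/bin/sh
# Emit a gdb command file that prints the stack trace and the two
# PA-RISC interruption registers discussed above.
gdb_core_commands() {
    cat <<'EOF'
bt
p /x $ior
p /x $iir
EOF
}

# Run gdb non-interactively on an executable and one of its core files.
# Assumption: the gdb in use supports the GNU options -batch and -x.
# Usage: core_report EXECUTABLE COREFILE
core_report() {
    exe=$1
    core=$2
    cmds=${TMPDIR:-/tmp}/gdbcmds.$$
    gdb_core_commands > "$cmds"
    "${GDB:-/opt/langtools/bin/gdb}" -batch -x "$cmds" "$exe" "$core"
    status=$?
    rm -f "$cmds"
    return $status
}

# Usage (placeholder path to the web server binary):
#   core_report /path/to/httpd /app/adis/apache/core
```

Comparing the reports of several cores shows quickly whether the children always die at the same place.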

HTH

Paula
If you can spell SysAdmin then you is one - anon
Ralph Grothe
Honored Contributor

Re: Howto investigate core files?

Paula,

thank you for the short wdb tutorial.

Unfortunately, although I've repeatedly been trying since yesterday, I haven't been able to download the wdb binaries from any of the mentioned HP URLs.
As far as I can tell, these sites are unreachable for clients from Germany.
I only succeeded in downloading the HP WDB Quick Start Guide pdf file.
Do you know of any HP download URLs that serve European/German clients at a reasonable bandwidth?
This is really frustrating and annoying.
I encounter this poor HTTP service with all HP webservers (including this forum) since I joined the ITRC (some two years ago), and I constantly keep complaining about it whenever I receive feedback calls or mails from HP, but nothing really improved so far.

I discovered on one of our leased servers a preinstalled bundle of ansic and langtools.
Among these I also found wdb.
Unfortunately I have an OS version mismatch between this wdb (on HP-UX 11.00) and the origin of the core file (HP-UX 10.20).
Do you think I can nevertheless investigate a 10.20 core file with an 11.00 wdb?
Madness, thy name is system administration
lassehab
Advisor

Re: Howto investigate core files?

Hi Ralph

You can use gdb like this:

$gdb -xdb -tui executable

This starts the debugger with the HP-UX TUI (Terminal User Interface). On a default terminal of 24 by 80 characters, the screen is divided into two windows: a source window at the top and a command window at the bottom.

It's very easy to see where your program hits the SIGBUS.

After:

$b main
$run
a right angle bracket ( > ) points to the current location.
Then step with
$n ( for next )

and follow the " > " to where it stops, so you can see which instruction is invalid.

You can see the assembler view with

$la asm

the source view with

$la src

and the registers view with

$la regs

You can split your window in Source/Disassembly by :
$la split

You can also reach this window from the source window with the XDB command
$td

N.B.:

"$" is the gdb prompt here

Hope this helps

Regards
I love Java OS