
Bus error(ELF-64 core file received SIGBUS)

 
SOLVED
TeresaY
Advisor

Bus error(ELF-64 core file received SIGBUS)

The application got a bus error.

Context: The current application on our production server was compiled on 10.20 but runs on an 11.11 system.

The application needs to be modified to accommodate some requirements. The newly modified application compiled successfully and runs well on the backup server. I then copied the binaries and libs onto the production server, along with the gcc 4.3.3 lib folder (the production server doesn't have gcc 4.3.3 installed due to insufficient disk space).

I got a Bus error with core dumped when running the modified program on production server:
core: ELF-64 core file - PA-RISC 2.0 from 'rdprogram' - received SIGBUS
(The app is written in C++ and compiled using gcc4.3.3 on 11.11)

Here are the internal attributes of the program (from chatr rdprogram):

rdprogram:
    64-bit ELF executable
    shared library dynamic path search:
        LD_LIBRARY_PATH    enabled  first
        SHLIB_PATH         enabled  second
        embedded path      enabled  third  /users/system/dev/build:/usr/local/newrd/lib:/opt/hp-gcc64-4.3.3/lib/gcc/hppa64-hp-hpux11.11/4.3.3:/usr/ccs/lib/pa20_64:/opt/langtools/lib/pa20_64:/opt/hp-gcc64-4.3.3/lib/gcc/hppa64-hp-hpux11.11/4.3.3/../../..:/lib/pa20_64:/usr/lib/pa20_64
    shared library list:
        libstdc++.sl.6
        libm.2
        libgcc_s.sl
        libc.2
    shared library binding:
        deferred
    global hash table disabled
    global hash table size 1103
    shared library mapped private disabled
    runtime checks disabled
    shared library segment merging disabled
    shared vtable support disabled
    explicit unloading disabled
    linkage table protection disabled
    segments:
        index  type  address           flags   size
        6      text  4000000000000000  z---c-  D (default)
        7      data  8000000100000000  ---m--  D (default)
    executable from stack: D (default)
    static branch prediction disabled
    kernel assisted branch prediction enabled
    lazy swap allocation for dynamic segments disabled
    nulptr dereferences trap enabled



Any ideas or thoughts would be appreciated.

Thanks,


9 REPLIES
Dennis Handly
Acclaimed Contributor

Re: Bus error(ELF-64 core file received SIGBUS)

You need to use a debugger to see where and why it aborts. A simple gdb stack trace (bt) would be helpful.
$ gdb rdprogram core
(gdb) bt
(gdb) info reg
(gdb) disas $pc-4*20 $pc+4*4
(gdb) q

Has this application been running in 64 bit mode for some time?
David Johns
Advisor

Re: Bus error(ELF-64 core file received SIGBUS)

Hi Teresa:

My bus errors have always been due to unaligned data access. Since you are going from 32 to 64 bits, I'd guess this could be the case here as well.
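To make "unaligned data access" concrete, here is a minimal C++ sketch. The helper names `is_aligned` and `load_long` are hypothetical and are not taken from rdprogram. It assumes a C++11 compiler; older compilers may need the GCC extension `__alignof__` in place of `alignof`. On strict-alignment CPUs such as PA-RISC, dereferencing a `long*` whose address is not a multiple of the type's alignment can raise SIGBUS, while copying the bytes with `memcpy` works at any address.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p is suitably aligned for an object of type T.
template <typename T>
bool is_aligned(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: read a long from a byte buffer at any address.
// Writing *(const long *)p instead could fault with SIGBUS when p is
// misaligned; memcpy copies byte by byte and has no alignment requirement.
long load_long(const unsigned char *p) {
    long value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

A check such as `is_aligned<long>(ptr)` just before a suspicious cast is one way to confirm whether a given pointer is the culprit.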

The compiler debugging option, -g, tends to both help and hinder in pinpointing the problem. If you can back trace a core dump with or without the option, it should give enough information to proceed.

Look for ILP32 vs. LP64 issues, particularly in structures and structure padding.
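As a rough illustration of the ILP32 vs. LP64 point, the sketch below uses a made-up `Record` structure (hypothetical, not from rdprogram) whose size and member offsets change between a 32-bit and a 64-bit build, because `long` and pointers grow from 4 to 8 bytes and the padding after `tag` grows with them. The helpers `print_layout` and `layout_is` are likewise only illustrative names.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical record, for illustration only.
struct Record {
    char tag;    // 1 byte; the compiler pads after it so that count is aligned
    long count;  // 4 bytes under ILP32, 8 bytes under LP64
    void *next;  // 4 bytes under ILP32, 8 bytes under LP64
};

// Print the layout so that 32-bit and 64-bit builds can be compared.
void print_layout() {
    std::printf("sizeof(long)=%lu sizeof(void*)=%lu\n",
                static_cast<unsigned long>(sizeof(long)),
                static_cast<unsigned long>(sizeof(void *)));
    std::printf("Record: size=%lu count@%lu next@%lu\n",
                static_cast<unsigned long>(sizeof(Record)),
                static_cast<unsigned long>(offsetof(Record, count)),
                static_cast<unsigned long>(offsetof(Record, next)));
}

// True if this build lays Record out with the given size and offsets.
bool layout_is(std::size_t size, std::size_t count_off, std::size_t next_off) {
    return sizeof(Record) == size &&
           offsetof(Record, count) == count_off &&
           offsetof(Record, next) == next_off;
}
```

On a typical ILP32 build one would expect `layout_is(12, 4, 8)` to hold, and on a typical LP64 build `layout_is(24, 8, 16)`. Code that reads or writes such a structure as raw bytes (files, shared memory, network buffers) or that hard-codes either set of numbers is a likely place for trouble after a move to 64 bits.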

Cheers,
-Dave
TeresaY
Advisor

Re: Bus error(ELF-64 core file received SIGBUS)

Hi Dennis,

The application on our production server was compiled as 32-bit and copied directly from an old server (10.20) years ago. We recently got a backup server, and I rebuilt the development environment there, then modified and compiled the program on it. The program runs fine on the backup server.

Here is what I got by using gdb:

% /opt/langtools/bin/gdb rdprogram core
HP gdb 3.3 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 3.3 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
Detected 64-bit executable.
Invoking /opt/langtools/bin/gdb64.
Use "run" to continue execution.
(gdb) bt
No stack.
(gdb) info reg
The program has no registers now.
(gdb) disas $pc-4*20 $pc+4*4
No registers.
(gdb) q

If I type run, it seems to get the SIGBUS error right away:

..
Detected 64-bit executable.
Invoking /opt/langtools/bin/gdb64.
Use "run" to continue execution.
(gdb) run
Starting program: /usr/resrch/newrd/bin/rdprogram

Program received signal SIGBUS, Bus error.
0xc00000000001561c in ?? ()
(gdb)

If I run the gdb debugger on the backup server, the result is different: I get lots of warnings, but it proceeds.



Hi David,

I assume that if I can run the program on my backup server, I should be able to run it on the production server. The two servers are almost the same, except that the backup server has the development environment (lots of GNU stuff: GCC 4.3.3, autoconf, automake, gmake, bison, flex, aclocal, etc.).

The options that I pass to configure are:
-disable-shared --host=hppa2.0w-hp-hpux11.11

I tried to compile it as 32-bit, but it wouldn't let me.


Thanks,
Teresa


Dennis Handly
Acclaimed Contributor
Solution

Re: Bus error(ELF-64 core file received SIGBUS)

>HP gdb 3.3 for PA-RISC 1.1 or 2.0 (narrow)

This is obsolete; please download the latest:
http://www.hp.com/go/wdb

>% /opt/langtools/bin/gdb rdprogram core

Do you have a core file?

> 0xc00000000001561c in ??

This is likely aborting in dld.sl. What patch do you have for the linker/dld?
TeresaY
Advisor

Re: Bus error(ELF-64 core file received SIGBUS)

Hi Dennis,

I forgot another difference between the backup server and the production server: I patched the linker on the backup server, while the one on the production server is very old and might have come with the box:

% ls -l dld.sl
-r-xr-xr-x 1 bin bin 236632 Nov 14 2000 dld.sl
% pwd
/usr/lib/pa20_64

The backup server was patched with PHSS_33033 first and later with PHSS_39749:
# ls -l dld.sl
-r-xr-xr-x 1 bin bin 466520 May 14 2009 dld.sl


Is it safe to patch the linker on the production server? There are data acquisition programs and analysis programs running on the production server.


Thanks,
Teresa
Dennis Handly
Acclaimed Contributor

Re: Bus error(ELF-64 core file received SIGBUS)

>Is it safe to patch the linker on the production server?

It should be. You can always remove it if there are problems.
TeresaY
Advisor

Re: Bus error(ELF-64 core file received SIGBUS)

Hi Dennis,

I will patch the linker after I do a backup (in case I need to rollback).

Another question, unrelated to this problem: when I do a level 0 fbackup, can I also back up the kernels (/stand)?


Thanks,
Teresa
Dennis Handly
Acclaimed Contributor

Re: Bus error(ELF-64 core file received SIGBUS)

>I will patch the linker after I do a backup (in case I need to rollback).

Normally you can just use swremove on the linker patch. The recommended backup is in case things fail catastrophically.

>when I do a level 0 fbackup, can I also back up the kernels (/stand)?

Not really. You can't really restore them.
For that, you need Ignite.
TeresaY
Advisor

Re: Bus error(ELF-64 core file received SIGBUS)

Thanks a lot!

Teresa