Signal 10, SIGBUS

 
Michael Elleby III_1
Trusted Contributor

Signal 10, SIGBUS

I have an application running on an L2000 that fails with a SIGBUS error. Below is the stack trace I got by running gdb against the core file. The same application was also failing with SIGSEGV earlier, which I was able to resolve by increasing maxssiz (thanks to A. Clay Stephenson).

Can anyone point me in the right direction as to what is wrong?

Here it is:

GNU gdb 5.1.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "hppa2.0n-hp-hpux11.00"...
Core was generated by `mmsmain'.
Program terminated with signal 10, Bus error.

warning: The shared libraries were not privately mapped; setting a
breakpoint in a shared library will not work until you rerun the program.


warning: Can't find file mmsmain referenced in dld_list.
Reading symbols from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0...done.
Reading symbols from /opt/app/oracle/OraHome1/lib/libwtc8.sl...done.
Reading symbols from /usr/lib/libcl.2...done.
Reading symbols from /usr/lib/libisamstub.1...done.
Reading symbols from /usr/lib/librt.2...done.
Reading symbols from /usr/lib/libpthread.1...done.
Reading symbols from /usr/lib/libnss_dns.1...done.
Reading symbols from /usr/lib/libm.2...done.
Reading symbols from /usr/lib/libxcurses.1...done.
Reading symbols from /usr/lib/libc.2...done.
Reading symbols from /usr/lib/libdld.2...done.
#0 0xc36aacac in ltstidi () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
(gdb) bt
#0 0xc36aacac in ltstidi () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#1 0xc321b668 in kpufhndl () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#2 0xc31d88bc in OCIHandleFree () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#3 0xc31b6410 in sqlclo () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#4 0xc319a590 in sqlclst () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#5 0xc318c610 in sqlcac () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#6 0xc318a12c in $00000070 () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#7 0xc3184d80 in sqlcmex () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#8 0xc3185428 in sqlcxt () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0
#9 0x00143cf0 in altdbs (node_name={len = 8, arr = "RICHMOND", '\000' })
at /home/prjlib/PM/tmp/mms.c:15828
#10 0x00144608 in con1dbs () at /home/prjlib/PM/tmp/mms.c:16035
#11 0x00026704 in extdbs_format_asnstd (node_name={len = 7, arr = "CONCORD", '\000' })
at /home/prjlib/PM/tmp/man100_modul.c:16857
#12 0x005a0a4c in posprc_pulatn (node_name={len = 7, arr = "CONCORD", '\000' })
at /home/prjlib/PM/tmp/pulatn.c:9306
#13 0x005a0558 in extdbs_format_pulatn_88 (node_name={len = 7, arr = "CONCORD", '\000' })
at /home/prjlib/PM/tmp/pulatn.c:9177
#14 0x005046e0 in procx_format_pulatn_88 (the_line=0x40183658, status=70, proc_id=1)
at /home/prjlib/PM/tmp/mmscore4.c:22505
#15 0x001463cc in $00000078 () at /home/prjlib/PM/tmp/mms.c:16649
#16 0x001d6410 in mmsprcdrv (prjerror=128) at /home/prjlib/PM/tmp/mms.c:40030
#17 0x001d8168 in mmsprcsel (prjerror=131) at /home/prjlib/PM/tmp/mms.c:40726
#18 0x001cc7d0 in mmsctldrv (prjerror=38) at /home/prjlib/PM/tmp/mms.c:36311
#19 0x00006234 in man002_auftrags_bearb () at /home/prjlib/PM/mmsmain/src/man000_fct.c:379
#20 0x00005f68 in main (argc=1, argv=0x6c770ef4) at /home/prjlib/PM/mmsmain/src/mmsmain.c:532
(gdb)

Mike

Knowledge Is Power
12 REPLIES
Clemens van Everdingen
Honored Contributor

Re: Signal 10, SIGBUS

Hi,

I checked around a bit to see what this means and found the following comments from different sources.

A SIGBUS ("bus error") signal is an indication from the operating system that the process attempted a memory access that the hardware could not complete. (The signal number is defined in the file /usr/include/sys/signal.h.)

# define _SIGBUS 10 /* bus error */

Generally bus errors occur for one of a few reasons:

1) Your application tried to access memory which isn't physically there
2) Your application attempted to access misaligned data
3) Your application attempted to access memory which is corrupt.

Now errors 1) and 2) can occur either because your application is referencing a bad pointer or because a library that your application is utilizing is using a bad pointer. Your best bet is to take a look at the core file generated and try to grab a stack trace from it.
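
To illustrate reason 2), here is a minimal C sketch, not taken from the application in question. The helper names is_aligned and load_u32 are made up for this example. The first checks whether an address meets a given alignment; the second reads a 32-bit value through memcpy, which is a common way to avoid a misaligned load on hardware that traps on one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if p is aligned to 'align' bytes, 0 otherwise.
 * 'align' must be a non-zero power of two. (Hypothetical helper.) */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Read a 32-bit value from any address. memcpy copies byte by byte
 * as far as the language is concerned, so no misaligned word load
 * is required. (Hypothetical helper.) */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Casting an odd address to int * and dereferencing it is the kind of access that can raise SIGBUS; load_u32(addr) reads the same bytes without that risk.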

As a rule of thumb, a Bus Error is not sent by a process or daemon -- these are usually sent by the kernel to a process when it attempts to violate one of the 3 above rules.
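
If the application source can be rebuilt, one way to see which of these cases applies is to catch SIGBUS with sigaction and SA_SIGINFO and look at si_code. The sketch below relies only on the POSIX names BUS_ADRALN, BUS_ADRERR and BUS_OBJERR. The function names are made up for this example, and whether this particular HP-UX release fills in si_code for every fault is an assumption that should be checked on the machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Map the POSIX si_code values for SIGBUS to a short description. */
static const char *bus_code_name(int code)
{
    switch (code) {
    case BUS_ADRALN: return "BUS_ADRALN: invalid address alignment";
    case BUS_ADRERR: return "BUS_ADRERR: nonexistent physical address";
    case BUS_OBJERR: return "BUS_OBJERR: object-specific hardware error";
    default:         return "SIGBUS: unrecognized si_code";
    }
}

/* Handler: report the reason on stderr, then abort so that a core
 * file can still be produced. Only write() and abort() are called. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
    const char *msg = bus_code_name(info->si_code);
    size_t len = 0;

    (void)sig;
    (void)ctx;
    while (msg[len] != '\0')
        len++;
    (void)write(STDERR_FILENO, msg, len);
    (void)write(STDERR_FILENO, "\n", 1);
    abort();
}

/* Install the handler; returns 0 on success, -1 on failure
 * (the return value of sigaction). */
int install_sigbus_reporter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

Calling install_sigbus_reporter() once near the top of main() would be enough. The siginfo_t structure also carries si_addr, the faulting address, which can be compared against the addresses gdb shows.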

HtH
C.
The computer is a great invention, there are as many mistakes as ever, but they are nobody's fault !
Michael Elleby III_1
Trusted Contributor

Re: Signal 10, SIGBUS

Clemens-

I had already researched this information. I posted the stack trace because I wanted to know exactly what is causing this SIGBUS. In other words, does the stack trace tell me whether the problem is a bad memory access or misaligned data?

Thanx for your help though..

Mike-
Knowledge Is Power
John Palmer
Honored Contributor

Re: Signal 10, SIGBUS

Michael,

I believe that the innermost entry on the stack trace (frame #0, the last call made):-

#0 0xc36aacac in ltstidi () from /opt/app/oracle/OraHome1/lib//libclntsh.sl.8.0

is most significant. This is an Oracle library routine. Has this program ever worked?

It could be an Oracle bug but it could also be an Oracle environment issue.

Regards,
John
harry d brown jr
Honored Contributor

Re: Signal 10, SIGBUS

Michael,

Your program is trying to reference an illegal memory address, hence the SIGBUS error. It's either a bad pointer or a bad array operation (out of bounds).
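
As an illustration only: the node_name={len = ..., arr = ...} arguments in frames #9 to #13 look like the len/arr structure that Pro*C generates for a VARCHAR. Below is a minimal sketch of a bounds-checked copy into such a structure. The struct name, the capacity of 16 and the function set_node_name are all assumptions made for the example; the real declaration in mms.c is not shown in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODE_NAME_MAX 16  /* hypothetical capacity, not from the trace */

/* Same shape as the value gdb prints for node_name. */
struct node_name {
    unsigned short len;
    unsigned char  arr[NODE_NAME_MAX];
};

/* Copy src into dst without writing past arr.
 * Returns 0 on success, -1 if an argument is NULL or src is too
 * long; dst is left unchanged on failure. */
int set_node_name(struct node_name *dst, const char *src)
{
    size_t n;

    if (dst == NULL || src == NULL)
        return -1;
    n = strlen(src);
    if (n > NODE_NAME_MAX)
        return -1;
    assert(n <= sizeof dst->arr);
    memset(dst->arr, 0, sizeof dst->arr);
    memcpy(dst->arr, src, n);
    dst->len = (unsigned short)n;
    return 0;
}
```

An unchecked strcpy into arr with a longer name would overwrite whatever follows the structure, which is one way a pointer or handle used later can end up corrupt.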

live free or die
harry
Live Free or Die
S.K. Chan
Honored Contributor

Re: Signal 10, SIGBUS

From your trace I don't think one can tell exactly what is causing it. You're running the trace against the core file, right? Doing that points you to the programs/libraries that have a problem with memory management, but you need to go a step further and run your debugger against those programs. That's the only way you can find out, for instance, whether there is a memory leak in that code. These look like the source files you need to look at ..
- /home/prjlib/PM/tmp/mms.c
- /home/prjlib/PM/tmp/man100_modul.c
- /home/prjlib/PM/tmp/pulatn.c:9306
and so on ...

No expert in debugging but it's just some ideas I have.
Michael Elleby III_1
Trusted Contributor

Re: Signal 10, SIGBUS

Harry-

So do I now go back to the DBA and inform him that he needs to review his application code to see where it is attempting to access this pointer or array?

Thanx,

Mike-
Knowledge Is Power
Michael Elleby III_1
Trusted Contributor

Re: Signal 10, SIGBUS

S.K.-

Are you speaking of running DDE against the programs/libraries?

Thanx.

Mike-
Knowledge Is Power
S.K. Chan
Honored Contributor

Re: Signal 10, SIGBUS

Yes. The reason I suggest that is that I remember you can actually run gdb on the programs/libraries to detect "leaks". I don't have the details; I just happen to recall the process for debugging such stuff ..
1- First run it on the core file and get the list of "potentially problematic" code files.
2- Then run it on those files individually.

How to do that .. you probably know better than I do :)
harry d brown jr
Honored Contributor

Re: Signal 10, SIGBUS

Michael,

Without a doubt I would throw this "puppy" back at the DBAs. I just finished reading Clay's response to your other thread.

Did you end up bumping maxssiz and maxssiz_64bit to 32 MB?

Because I have sloppy programmers who think memory is unlimited, I usually have to bump maxdsiz to 268435456 (256 MB) and maxdsiz_64bit to 1073741824 (1 GB), and sometimes to 2 GB. Ignorant sloppy programming, and to think I used to write code to drive ATMs with a 32K limit!!!!
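
To check what a process actually gets, the POSIX getrlimit call reports the stack and data limits it is running under. This is a minimal sketch: the helper names are made up for the example, and the idea that RLIMIT_STACK and RLIMIT_DATA track maxssiz and maxdsiz on this system is an assumption to verify.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Convert a byte count to whole mebibytes. */
static unsigned long long bytes_to_mib(unsigned long long bytes)
{
    return bytes / (1024ULL * 1024ULL);
}

/* Print the soft and hard limit for one resource.
 * Returns 0 on success, -1 if getrlimit fails. */
int print_limit(const char *label, int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("%s soft: unlimited\n", label);
    else
        printf("%s soft: %llu MiB\n", label,
               bytes_to_mib((unsigned long long)rl.rlim_cur));
    if (rl.rlim_max == RLIM_INFINITY)
        printf("%s hard: unlimited\n", label);
    else
        printf("%s hard: %llu MiB\n", label,
               bytes_to_mib((unsigned long long)rl.rlim_max));
    return 0;
}
```

Calling print_limit("stack", RLIMIT_STACK) and print_limit("data", RLIMIT_DATA) from the failing program, or from a small test program started the same way, shows whether the new kernel values have taken effect for that process.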

live free or die
harry
Live Free or Die
John Palmer
Honored Contributor

Re: Signal 10, SIGBUS

Michael,

Running the program under tusc to trace the system calls might give you some extra clues.

If you haven't got tusc then you can download it from:
http://hpux.cs.utah.edu/hppd/hpux/Sysadmin/tusc-7.3/

Regards,
John
Michael Elleby III_1
Trusted Contributor

Re: Signal 10, SIGBUS

S.K. - Firstly, I meant to give you 4 pts for your answer, although DDE is not installed on this server, and I would have to jump through hoops to get that done...

Harry - I did bump both (maxssiz and maxssiz_64bit) up to 32 MB this past Saturday, but received the error again on Monday. I have not touched maxdsiz or maxdsiz_64bit as of yet, as their values are 268435456 and 1073741824 respectively, the same as you indicate..

John - I will take a look at tusc..

Any other pointers (no pun intended)?

Thanx..

Mike-
Knowledge Is Power
Michael Elleby III_1
Trusted Contributor

Re: Signal 10, SIGBUS

John, attached are the results of running tusc against the executable.

Let me know if you find anything.

Mike Elleby
Knowledge Is Power