Bus Error

 
James Harrington
Frequent Advisor

Bus Error

Some background:
1. This is a new rp2400 (A400) computer.
2. We have installed HP-UX11.11 and many patches.
3. We have installed HP-UX Microfocus Cobol HP version B.13.45 MF version V4.1 revision 40.
4. We have installed Oracle 9.2.0.1.0

We are linking some programs provided to us by a third party. They are linked using the cobol compiler. The routines being linked are 32-bit C programs. We are linking with the 32-bit Oracle libraries.

When we run the program we get:

Bus error(coredump)

If we do "file core" we get:

received SIGBUS

When we use tusc we get:

getrlimit64(RLIMIT_NOFILE, 0x77ff0a40) ................... = 0
brk(0x40040008) .......................................... = 0
sigaction(SIGPIPE, 0x77ff0a68, NULL) ..................... = 0
Received signal 10, SIGBUS, in user mode, [SIG_DFL], partial siginfo
Siginfo: si_code: I_NONEXIST, faulting address: 0x1, si_errno: 0
PC: 0xc004d85b, instruction: 0x489f0000
exit(10) [implicit] ...................................... WIFSIGNALED(SIGBUS)|W
COREDUMP

The third party is a bit stuck, as they have not met this before. They do have similar installations.

Any ideas?
Adam J Markiewicz
Trusted Contributor

Re: Bus Error

Hi

Check the core with a debugger (gdb). The frame at the top of the stack trace should give you more info.

The interesting thing to check would be the contents of the sigaction structure that was passed as an argument to this sigaction() call.

All I can say is that SIGBUS is quite typical for an access to address 0x1.
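
To illustrate why an address of 0x1 is suspicious: PA-RISC, like many RISC architectures, expects a multi-byte load or store to use an address that is a multiple of the access size. A minimal sketch of that check in Python, assuming a 4-byte word access (the access width is an assumption; the trace only shows the faulting address):

```python
def is_aligned(address: int, width: int) -> bool:
    """Return True if address is a multiple of width (a power of two)."""
    if width <= 0 or width & (width - 1):
        raise ValueError("width must be a positive power of two")
    return address & (width - 1) == 0


FAULTING_ADDRESS = 0x1  # taken from the tusc output in the first post
WORD_WIDTH = 4          # assumed access size in bytes, not shown in the trace

print(is_aligned(FAULTING_ADDRESS, WORD_WIDTH))  # prints False
```

An address of 0x1 can never be word-aligned, so a word load from it would be expected to fault regardless of what memory is mapped there.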

Good luck
Adam
I do everything perfectly, except from my mistakes
Adam J Markiewicz
Trusted Contributor

Re: Bus Error

Correction

If this is a tusc trace, it means that the sigaction() call finished successfully and the real problem comes after it. Just check the core file.

Too much work with stack traces... sorry for the confusion.

Good luck
Adam
I do everything perfectly, except from my mistakes
John Bolene
Honored Contributor

Re: Bus Error

Ah, the famous bus error.

What this really means is that you have attempted to access a memory location that was not assigned to you.

You may have addressed out of the stack or used an invalid address somewhere.

Why it is called bus error is beyond me.

Maybe because the bus could not resolve the address you wanted.
It is always a good day when you are launching rockets! http://tripolioklahoma.org, Mostly Missiles http://mostlymissiles.com
Michael Steele_2
Honored Contributor

Re: Bus Error

James Harrington
Frequent Advisor

Re: Bus Error

Adam,

I have installed gdb but have no idea what to do with it. I tried gdb program, but it returned:

..(no debugging symbols found)...

I tried gdb core and it returned:

.."core": not in executable format: File format not recognized

Michael,

getconf KERNEL_BITS
64
Michael Steele_2
Honored Contributor

Re: Bus Error

I don't think a 64 bit O/S and a 32 bit application will work together. Better check on this.
Support Fatherhood - Stop Family Law
Olav Baadsvik
Esteemed Contributor

Re: Bus Error


Hello,

There is no problem running a 32-bit application on a 64-bit hp-ux.

What I would check is the following:

. kernel-parameters nproc and maxdsiz

. swapspace.

I would also check if the software from the third party is compiled on hp-ux 11.xx as
mixing objects compiled on 10.20 and 11.x is not supported and can give any kind of
problems.

Olav
Adam J Markiewicz
Trusted Contributor

Re: Bus Error

Hi

No debug info... That's bad news, however it is not a surprise for a production version...

You could do this:

gdb -core core

But I'm afraid it'll only be able to give you assembly...

However, what I suspect about this 0x1 is that it is one byte past some dynamically allocated pointer. If the result of malloc() was not checked for NULL, but just used further on, you could expect something like this.

So you could try to increase maxdsiz, so malloc() will always return a pointer to valid memory.


Good luck
Adam
I do everything perfectly, except from my mistakes
James Harrington
Frequent Advisor

Re: Bus Error

Adam,

gdb -core core returned:

(no debugging symbols found)...(no debugging symbols found)...
#0 0xc004d858 in pthread_mutex_init+0x1c () from /usr/lib/libpthread.1

Any help?

MAXDSIZ is already at 1gb. Could it need more?

Olav,

nproc is 4100
maxdsiz 1073741824

I agree with your comments regarding "mixing objects compiled on 10.20 and 11.x" for Cobol, however these library routines were written in C.


Adam J Markiewicz
Trusted Contributor

Re: Bus Error

Hi, James

You managed further than I expected. Congratulations! ;)

Okay.

Let's try something more:

Inside gdb try entering:
bt
That stands for 'backtrace': the trace of all functions currently on the call stack. If you're lucky, you'll find which function called pthread_mutex_init(). You could then give that to your third party; it should tell them a lot more! However, this function is used to initialise a mutex structure (from your point of view it doesn't matter what this structure is), so I'm more convinced that this is just the initialisation of something newly allocated. And for some reason the memory allocation returned NULL, of course. Or the result was later misinterpreted as NULL.

I don't know what your program is doing, but I guess that 1GB should be more than enough. But maybe the problem here is swap space? Is your Oracle running on the same host? Is it heavily occupied?

I'm also not sure what you are trying to do:
Do you have the third party's object files and the Oracle libraries, and are you trying to link them together with the cobol linker?

Theoretically there can also be some problems with internal interpretation.
It can be due to:
1. Incompatible compile options between Oracle and your object files (should be checked)
2. Misinterpretation introduced by the linker (I have no experience with cobol development)

I'm wondering whether it happens right at the initialisation of the program, or whether it works for some time and then dumps core?


Good luck
Adam
I do everything perfectly, except from my mistakes
Mike Stroyan
Honored Contributor

Re: Bus Error

| gdb -core core returned:
|
| (no debugging symbols found)...(no debugging symbols found)...
| #0 0xc004d858 in pthread_mutex_init+0x1c () from /usr/lib/libpthread.1

That is interesting. That routine is using its second parameter as the address to load from in the instruction that causes the SIGBUS. If you use the 'bt' command in gdb it will show you a stack trace with the name of the function that passed the bad parameter.
There is a fairly common problem with linking libpthread with the wrong libraries,
or linking libc in front of libpthread. You can use the 'info share' command in gdb to see what shared libraries have been loaded and in what order.
James Harrington
Frequent Advisor

Re: Bus Error

Adam,

Thanks for your help. My specialist subject is Cobol on MPE/iX so this is all a bit strange to me!

I tried bt:
(gdb) bt
#0 0xc004d858 in pthread_mutex_init+0x1c () from /usr/lib/libpthread.1
#1 0xc2519c24 in libc_init+0x94 () from /usr/lib/libc.2
#2 0xc2519388 in __libc_init+0xf0 () from /usr/lib/libc.2
#3 0xc05a339c in hp__pre_init_libc+0x2c () from /usr/lib/libcma.2

I have sent this info to the third party.

Your questions:
Swap space? 640 MB RAM, 1280 MB swap.

Is your Oracle running on the same host? Yes.

Is it heavily occupied? No, I am the only user. It is a brand new machine.

Do you have the third party's object files and Oracle libraries and try to link them together with cobol linker? Yes. We have tried using "ld" to link and the program is ok (but we need to use cob to include some file handling routines).

I'm wondering if it is just at the initialisation of the program or it works for some time and then coredumps? I believe it is at initialisation. The program has a "trace" option, but displays nothing when I turn it on.

Thanks,
James.
Adam J Markiewicz
Trusted Contributor
Solution

Re: Bus Error

There we go! :)

libcma and libpthread are designed to do more or less the same job, but in different ways, so they CANNOT BE MIXED FOR ANY REASON.

They have a lot of functions with the same names; one of them is pthread_mutex_init().

You should decide on one of them.
If Oracle is supposed to work with one and your third party's code with the other, you have some trouble, I'm afraid. However, converting from one to the other (unless very unusual features are used) should be an easy job, if it is needed at all.
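
One way to spot this mix is to list the libraries that appear in a backtrace. A rough sketch in Python, assuming frame lines in the format gdb printed earlier in this thread; the names `libraries_in_backtrace` and `mixed_thread_libraries` are made up for illustration and are not part of gdb or HP-UX:

```python
import re

# Frame lines as printed by gdb in this thread:
#   "#N 0xADDR in func+0xOFF () from /path/to/lib"
FRAME_RE = re.compile(
    r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+\(\)\s+from\s+(\S+)$"
)

# Thread libraries that, per the reply above, must not share a process.
CONFLICTING = {"libcma", "libpthread"}


def libraries_in_backtrace(text):
    """Return library base names (e.g. 'libc') in frame order, no duplicates."""
    libs = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a frame line
        # "/usr/lib/libc.2" -> "libc.2" -> "libc"
        name = match.group(3).rsplit("/", 1)[-1].split(".", 1)[0]
        if name not in libs:
            libs.append(name)
    return libs


def mixed_thread_libraries(text):
    """Return the conflicting thread libraries found in the backtrace, sorted."""
    return sorted(CONFLICTING.intersection(libraries_in_backtrace(text)))


# The backtrace posted earlier in this thread.
BACKTRACE = """\
#0 0xc004d858 in pthread_mutex_init+0x1c () from /usr/lib/libpthread.1
#1 0xc2519c24 in libc_init+0x94 () from /usr/lib/libc.2
#2 0xc2519388 in __libc_init+0xf0 () from /usr/lib/libc.2
#3 0xc05a339c in hp__pre_init_libc+0x2c () from /usr/lib/libcma.2
"""

print(libraries_in_backtrace(BACKTRACE))   # ['libpthread', 'libc', 'libcma']
print(mixed_thread_libraries(BACKTRACE))   # ['libcma', 'libpthread']
```

A backtrace only shows libraries that happen to be on the stack at the time of the crash, so an empty result does not prove the link is clean; checking the link line itself is still needed.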

Good luck
Adam
I do everything perfectly, except from my mistakes
James Harrington
Frequent Advisor

Re: Bus Error

Adam,

Thanks. The link does not include "lpthread" explicitly, but does have lcma as the last library.

I have removed lcma. The link works and the program now loads (at last, no BUS!).

I will pass this info to the third party.

Thanks for your help. One day I might understand it!

James.