Core dump with BUS_ADRALN error

 
SOLVED
K!rn Kumr
Frequent Advisor

Hi, I am working on an HP-UX B.11.23 U ia64 platform and am trying to port 32-bit code to 64-bit. The code compiles and runs, but it dumps core when control passes, via a function call, from my file into another file that belongs to a third-party library. While debugging I see that the value of the variable rqst is not available, and I am not able to get a concrete picture of the reason for the crash. Since the application crashes when a function in the third-party library is called, is it possible to say that the problem is in that code and mine is clean?


HP gdb 5.7 for HP Itanium (32 or 64 bit) and target HP-UX 11.2x.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.7 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
Core was generated by `4telback'.
Program terminated with signal 10, Bus error.
BUS_ADRALN - Invalid address alignment. Please refer to the following link that helps in handling unaligned data: http://docs.hp.com/en/7730/newhelp0610/pragmas.htm#pragma-pack-ex3

warning: Load module /hubd/environ/hubdaec1/HLLIB/lib/libhublog.sl has been stripped.
Debugging information is not available.

#0 get_tcp_service (service=0x400000000000b990 "TSC720",
ip_addr=0x9fffffffffffe158, port=0x9fffffffffffe150) at min_tcp_serv.c:146
146 min_tcp_serv.c: No such file or directory.
in min_tcp_serv.c
(gdb) bt
#0 get_tcp_service (service=0x400000000000b990 "TSC720",
ip_addr=0x9fffffffffffe158, port=0x9fffffffffffe150) at min_tcp_serv.c:146
#1 0x400000000001f6f0:0 in connect_ipc (service=0x400000000000b990 "TSC720",
qos=0x0, timeout=60, connection_id=0x9fffffffffffe7d8) at conn_ipc.c:335
#2 0x40000000000261f0:0 in ti_start_line_test (line=0x9fffffffffffe860,
timeout=50, session=0x9fffffffffffe7d8) at ti_asynch.c:153
#3 0x40000000000251c0:0 in ti_line_test (line=0x9fffffffffffe860,
progress_update=0x9fffffffef6a6460, check_for_cancel=0, timeout=50,
result=0x9fffffffffffe900) at ti_synch.c:241
#4 0x4000000000023970:0 in line_test (rqst=)
at /hubd/home/id817019/4TEL/src/svc.c:377
#5 0xc0000000045c8d30:0 in _tmsvcdsp () at tmsvcdsp.c:475
#6 0xc00000000461b8e0:0 in _tmrunserver () at tmrunsvr.c:1882
#7 0xc0000000045c7300:0 in _tmstartserver () at tmstrtsrvr.c:141
warning:
ERROR: Use the "objectdir" command to specify the search
path for objectfile BS-47bc.o.
If NOT specified will behave as a non -g compiled binary.

#8 0x400000000000c8c0:0 in main+0x60 ()
53 REPLIES
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>I am trying to port 32 bit code to 64 bit.

Have you read all of the documentation about porting and alignment issues? Have you compiled with +w64bit +wlint?

>And during debugging I see that the value of a variable rqst is not available.

Have you compiled with -g and NO optimization options? Why is rqst important? Where is the last frame of your code?

>I am not able to get a concrete picture of the reason for crash. As the application is crashing when a function in third party library is called, is it possible to say that there is a problem with that code and mine is clean.

Of course not. An alignment trap is a user problem. Typically when porting from 32 to 64 bit, if you pass an address of an int to a function that wants a long*, you will get this trap 50% of the time.
It could also be due to using pragma pack on structs.
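
A minimal sketch of the int versus long* case, with hypothetical names that do not come from the code in this thread. It assumes an LP64 build, where an int is 4 bytes with 4-byte alignment and a long is 8 bytes with 8-byte alignment. The address of an int is then only guaranteed to be a multiple of 4, and only every second multiple of 4 is also a multiple of 8, which is where the "50% of the time" comes from. The helper checks alignment arithmetically instead of performing the faulting access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (uintptr_t)(align - 1)) == 0;
}

/*
 * Counts how many of the n consecutive 4-byte slots starting at base
 * would also be acceptable as 8-byte-aligned addresses.
 */
size_t count_long_aligned_int_slots(uintptr_t base, size_t n)
{
    size_t i, ok = 0;
    for (i = 0; i < n; i++) {
        if (is_aligned(base + 4 * i, 8))
            ok++;
    }
    return ok;
}

/* Hypothetical callee that really wants a long. */
void set_port(long *out, long value)
{
    *out = value;   /* 8-byte store on LP64: needs an 8-byte-aligned address */
}

/* Safe caller: hand the callee a real long, then narrow explicitly. */
int get_port_as_int(long value)
{
    long tmp = 0;
    set_port(&tmp, value);
    return (int)tmp;
}
```

The unsafe variant would be `int port; set_port((long *)&port, v);`, which both overruns the int and may trap, depending on where the int happens to land.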
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

No, I have not compiled my code using +w64bit +wlint, but I have used the +DD64 flag.

Yes, I have compiled it with the -g (debug) flag, but along with more flags: +p +O2 +Ofltacc=relaxed +Onolimit +DSmontecito +FPD -Wl,+pi,1M -Wl,+pd,1M -Wl,+mergeseg -Wl,+s +w1.

I have checked the data types of the arguments I pass against the data types of the arguments that the function expects.

This is frame0

#0 get_tcp_service (service=0x400000000000b990 "TSC720",
ip_addr=0x9fffffffffffe158, port=0x9fffffffffffe150) at min_tcp_serv.c:146
146 min_tcp_serv.c: No such file or directory.
in min_tcp_serv.c

And this is the last frame.

#8 0x400000000000c8c0:0 in main+0x60 ()

rqst is an argument of a function in my code whose value I am not able to find while debugging. It is accessed later in the code, but the crash does not happen at the place where it is accessed.

There is no pragma pack used in the code either.

Below is the frame that belongs to my code

#4 0x4000000000023970:0 in line_test (rqst=)
at /hubd/home/id817019/4TEL/src/svc.c:377

But line 377 is just a debug statement that gets printed to the log.

After this line I have a call to a function in the 3rd party library.
Laurent Menase
Honored Contributor

Re: Core dump with BUS_ADRALN error

The trace says min_tcp_serv.c:146, so look at line 146 of min_tcp_serv.c.

What is the line you can see?
How are the buffers on that line allocated?
What are the types of those buffers?
If you examine the values of those parameters, what do you find?

K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

Except this frame

#4 0x4000000000023970:0 in line_test (rqst=)
at /hubd/home/id817019/4TEL/src/svc.c:377

all the rest belong to 3rd party libraries over which I don't have any control.
Laurent Menase
Honored Contributor

Re: Core dump with BUS_ADRALN error

So you need to contact your library vendor for help debugging this, or debug at the assembly level to identify where the buffer comes from: is it a global, or a parameter that you may not have aligned properly?
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>I have not compiled my code using +w64bit +wlint

Don't even think about porting until you have done this. What version of aCC6 do you have?

>but with more flags like +p +O2 +Ofltacc=relaxed +Onolimit +DSmontecito +FPD -Wl,+pi,1M -Wl,+pd,1M -Wl,+mergeseg -Wl,+s +w1.

Don't use +O2 until you are done, it prevents debugging. +w1 is obsolete, replace by just +w on IPF.

>I have checked the datatypes of the arguments with the datatypes of the arguments that the function is expected

No structs being passed?
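
A minimal sketch of why structs matter in a 32-to-64-bit port. The structure below is hypothetical and is not the real type behind the `line` argument in the stack trace. Any member that is a long or a pointer doubles in size under LP64, so padding appears and later members move. If the caller and the library disagree about the layout (for example because of stale headers or a packing pragma), a field can end up at an address the callee does not expect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure used only to illustrate layout changes. */
struct line_request {
    int   id;        /* 4 bytes in both ILP32 and LP64 */
    long  timeout;   /* 4 bytes in ILP32, 8 bytes in LP64 */
    char *name;      /* 4 bytes in ILP32, 8 bytes in LP64 */
};

/* Number of padding bytes the compiler inserted between id and timeout. */
size_t padding_after_id(void)
{
    assert(offsetof(struct line_request, timeout) >= sizeof(int));
    return offsetof(struct line_request, timeout) - sizeof(int);
}

/* Prints the layout so a 32-bit and a 64-bit build can be compared. */
void print_layout(void)
{
    printf("sizeof=%lu timeout@%lu name@%lu\n",
           (unsigned long)sizeof(struct line_request),
           (unsigned long)offsetof(struct line_request, timeout),
           (unsigned long)offsetof(struct line_request, name));
}
```

Calling `print_layout()` from a +DD32 build and a +DD64 build of the same source is a quick way to see which offsets moved.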

>This is frame 0, And this is the last frame.

(This was in your initial message.)

>rqst is an argument for a functions in my code for which I am not able to find the value while debugging.

Remove that +O2.

>Below is the frame that belongs to my code
#4 0x4000000000023970:0 in line_test (rqst=) svc.c:377
>But the line 377 is just a debug statement which is getting printed in the log.
>After this line I have a function call of 3rd party library.

Do you pass rqst to that ti_line_test?

>Laurent: what is the line you can see?

All good questions. If you don't have the source you will have to debug in assembly mode:
(gdb) frame 0
(gdb) disas $pc-16*8 $pc+16*4
(gdb) info reg
Laurent Menase
Honored Contributor

Re: Core dump with BUS_ADRALN error

(gdb) frame 0
(gdb) disas $pc-16*8 $pc+16*4
(gdb) info reg

exactly what I meant!
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

I have a doubt. I have gone through many of the docs on the HP site. All of them say to use +DD64 for 64-bit compilation on the Itanium platform. I have no idea about +w64bit and +wlint.

I am using A.06.15 version of aCC.

Yeah I am passing pointers to structures.

No I do not pass rqst to the function.

Can you please help by suggesting some site or some info on how to debug in assembly mode.

Thanks in advance
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

Yes, I have learnt that +w64bit and +wlint enable warning messages for porting to 64-bit. Thanks for that.
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

But I think it's equivalent to the +M2 option, which I have already been using.
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>All of them say to use +DD64 for 64bit compilation on Integrity platform. I have no idea of +w64bit and +wlint.

You must have +DD64. The +w* options enable porting and other warnings.
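
As a hedged illustration of the kind of construct that 64-bit porting warnings are generally meant to flag (this sketch does not claim which exact diagnostic aCC emits), here is a 64-bit value being squeezed into a 32-bit type. The function names are hypothetical; the sample address in the usage note is the gr40 value from the register dump in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Mimics ILP32-era code that kept an address in a 32-bit integer. */
uint32_t store_in_32_bits(uint64_t address)
{
    return (uint32_t)address;   /* upper 32 bits are discarded */
}

/* Returns 1 if the value survives a round trip through 32 bits. */
int fits_in_32_bits(uint64_t address)
{
    return (uint64_t)store_in_32_bits(address) == address;
}

/* Narrowing that refuses to lose information silently. */
uint32_t narrow_checked(uint64_t value)
{
    assert(fits_in_32_bits(value));
    return (uint32_t)value;
}
```

For example, `store_in_32_bits(0x60000000000ca974ULL)` keeps only `0x000ca974`, which is no longer a usable address in a 64-bit process.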

>I am passing pointers to structures.

Do they contain pointers or longs?

>Can you please help by suggesting some site or some info on how to debug in assembly mode.

You would have to learn the Itanium instruction set.
Just post the output.

>I have learned that +w64bit and +wlint are for enabling the warning messages during porting to 64bit

They should help. Have you looked at the 64-bit porting guide:
http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801/?ciid=2308852bcbe02110852bcbe02110275d6e10RCRD

>I think its equivalent to +M2 option which I have already been using.

Better to use the new option that says what it does, not the obsolete PA one.

Should we assume that your third party library has been ported to 64 bit for more than a decade and can't possibly have any problems?
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

Okay. Yeah the 3rd party has recently ported to 64 bit on the same platform.
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>the 3rd party has recently ported to 64 bit

Then you could ask them if this is a bug they know about in their code or should you be looking at your code, and ask for any hints.
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

Yes Dennis, I think I need to inform them, as my application crashes only when their code is hit. I am passing arguments this way:

Suppose func(emp *e1) is the function prototype in the 3rd party library.

I create a variable emp e2; and call the function by passing the address of the variable I have created: func(&e2).

I am also passing function pointers as arguments, e.g. func(&e2, &fun, NULL). I pass NULL where it expects a function pointer whose return type is int.

I feel that the arguments I am passing hold good values. Let me know if there is any discrepancy in this.
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>I think I need to inform them as my application crashes only when their code is hit.

That doesn't prove it is their problem.

>I am passing arguments this way
>func(emp *e1) is suppose the prototype of the 3rd party library.

Instead of func, mention names that are in the stack trace.

>I am even passing function pointers as arguments too. eg. func(&e2,&fun,NULL). I am passing NULL for which it is expecting a function pointer

I assume the interface allows NULL?
Do you know if fun is being called, before it aborts?
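
A small sketch of both questions, using hypothetical names rather than the real third-party interface. It shows a callee that explicitly treats a NULL callback as "no callback", and a callback that records that it ran, which is one simple way to tell whether fun was ever reached before the abort.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns int, as described in the post. */
typedef int (*progress_fn)(int percent);

/* Hypothetical library entry point that treats a NULL callback as "none". */
int run_steps(int steps, progress_fn progress)
{
    int i, calls = 0;
    assert(steps >= 0);
    for (i = 1; i <= steps; i++) {
        if (progress != NULL) {          /* the guard that makes NULL legal */
            progress(i * 100 / steps);
            calls++;
        }
    }
    return calls;
}

/* Example callback that records the last percentage it saw. */
int last_percent = -1;

int record_progress(int percent)
{
    last_percent = percent;
    return 0;
}
```

If the real interface has no such guard, passing NULL is not safe, so the vendor documentation or header comments are the place to confirm it.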

>I feel like I am passing the arguments which hold good values. Let me know if there is any discrepancy in it.

Nothing obvious but I would need more details.

Where is the gdb dump I asked for?
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

#3 0x40000000000251c0:0 in ti_line_test (line=0x9fffffffffffe860,
progress_update=0x9fffffffef6a6460, check_for_cancel=0, timeout=50,
result=0x9fffffffffffe900) at ti_synch.c:241

The above is the frame I was trying to explain. The function is ti_line_test, which is called from frame 4.

I hope you are asking for the core dump. I am not able to attach it here.
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

The function which I am passing as a pointer is not being called before the application aborts.
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

I am sorry, I understand you now. Below is the assembly that you asked for.

(gdb) frame 0
#0 get_tcp_service (service=0x400000000000d120 "TSC720",
ip_addr=0x9fffffffffffe1d8, port=0x9fffffffffffe1d0) at min_tcp_serv.c:146
146 min_tcp_serv.c: No such file or directory.
in min_tcp_serv.c
(gdb) disas $pc-16*8 $pc+16*4
Dump of assembler code from 0x4000000000034dc0:0 to 0x4000000000034e80:0:
;;; File: min_tcp_serv.c
;;; Line: 138
0x4000000000034dc0:0 : cmp.eq p6=r0,r41
0x4000000000034dc0:1 : nop.m 0x0
0x4000000000034dc0:2 : (p6) br.cond.dptk.few get_tcp_service+384;;
;;; Line: 142
0x4000000000034dd0:0 : mov r47=r41
0x4000000000034dd0:1 : mov r9=24;;
0x4000000000034dd0:2 : add r9=r9,r1
0x4000000000034de0:0 : nop.m 0x0
0x4000000000034de0:1 : mov r14=r1;;
0x4000000000034de0:2 : nop.i 0x0
0x4000000000034df0:0 : ld8.acq r10=[r9]
0x4000000000034df0:1 : adds r9=8,r9;;
0x4000000000034df0:2 : mov b6=r10
0x4000000000034e00:0 : ld8 r1=[r9]
0x4000000000034e00:1 : nop.m 0x0
0x4000000000034e00:2 : br.call.dptk.few b0=b6;;
0x4000000000034e10:0 : mov r1=r44
0x4000000000034e10:1 : mov r38=r8;;
;;; Line: 143
0x4000000000034e10:2 : cmp.eq p6=r0,r38
0x4000000000034e20:0 : nop.m 0x0
0x4000000000034e20:1 : nop.m 0x0


=================================================

(gdb) info reg
pr0: 0x1
pr1: 0x1
pr2: 0
pr3: 0
pr4: 0
pr5: 0
pr6: 0
pr7: 0x1
pr8: 0x1
pr9: 0
pr10: 0
pr11: 0
pr12: 0x1
pr13: 0
pr14: 0
pr15: 0x1
pr16: 0
pr17: 0
pr18: 0
pr19: 0
pr20: 0
pr21: 0
pr22: 0
pr23: 0
pr24: 0
pr25: 0
pr26: 0
pr27: 0
pr28: 0
pr29: 0
pr30: 0
pr31: 0
pr32: 0
pr33: 0
pr34: 0
pr35: 0
pr36: 0
pr37: 0
pr38: 0
pr39: 0
pr40: 0
pr41: 0
pr42: 0
pr43: 0
pr44: 0
pr45: 0
pr46: 0
pr47: 0
pr48: 0
pr49: 0
pr50: 0
pr51: 0
pr52: 0
pr53: 0
pr54: 0
pr55: 0
pr56: 0
pr57: 0
pr58: 0
pr59: 0
pr60: 0
pr61: 0
pr62: 0
pr63: 0
gr0: 0
gr1: 0x6000000000000c88
gr2: 0xc0000000001a4e60
gr3: 0
gr4: 0
gr5: 0xc000000000000408
gr6: 0xc00000000002bdc0
gr7: 0x9fffffffef7f8f38
gr8: 0x60000000000c9f48
gr9: 0
gr10: 0x1
gr11: 0x100
gr12: 0x9fffffffffffe1c0
gr13: 0x9fffffffef7dd430
gr14: 0x6
gr15: 0x9fffffffef7dc060
gr16: 0xc000000000000008
gr17: 0x1f
gr18: 0
gr19: 0x9fffffffef7ff8d0
gr20: 0x9fffffffffffe000
gr21: 0
gr22: 0x9fffffff7f7e8880
gr23: 0x38
gr24: 0
gr25: 0xc000000000047810
gr26: 0
gr27: 0
gr28: 0
gr29: 0
gr30: 0
gr31: 0x710
gr32: 0x400000000000d120
gr33: 0x9fffffffffffe1d8
gr34: 0x9fffffffffffe1d0
gr35: 0x400000000000d120
gr36: 0x9fffffffffffe1d8
gr37: 0x9fffffffffffe1d0
gr38: 0x60000000000c9f18
gr39: 0
gr40: 0x60000000000ca974
gr41: 0x9fffffffffffe8e7
gr42: 0
gr43: 0
gr44: 0x6000000000000c88
gr45: 0xc000000000000915
gr46: 0x40000000000330e0
gr47: 0x9fffffffffffe8e7
gr48: 0xc000000000000791
br0: 0x4000000000034e10
br1: 0
br2: 0
br3: 0
br4: 0
br5: 0
br6: 0xc0000000000ba2a0
br7: 0xe00000012c0006c0
rsc: 0x1f
bsp: 0x9fffffffef7ff6d8
bspst: 0x9fffffffef7ff6d8
rnat: 0
ccv: 0
unat: 0
fpsr: 0x9a04d8a70437f
pfs: 0xc000000000000791
(sor:0, sol:15, sof:17)
lc: 0
ec: 0
ip: 0x4000000000034e40:1
cfm: 0x791
(sor:0, sol:15, sof:17)
psr: 0x5355535045
(gdb)

0x4000000000034e20:2 : (p6) br.cond.dptk.few get_tcp_service+352;;
;;; Line: 145
0x4000000000034e30:0 : adds r8=24,r38;;
0x4000000000034e30:1 : ld8 r8=[r8]
0x4000000000034e30:2 : nop.i 0x0;;
0x4000000000034e40:0 : ld8 r40=[r8];;
;;; Line: 146
0x4000000000034e40:1 : ld8 r8=[r40]
0x4000000000034e40:2 : nop.i 0x0;;
0x4000000000034e50:0 : st8 [r36]=r8
;;; Line: 147
0x4000000000034e50:1 : mov r42=1
0x4000000000034e50:2 : br.cond.dptk.few get_tcp_service+368;;
;;; Line: 151
0x4000000000034e60:0 : mov r43=2
;;; Line: 152
0x4000000000034e60:1 : mov r42=0
0x4000000000034e60:2 : nop.i 0x0;;
0x4000000000034e70:0 : nop.m 0x0
0x4000000000034e70:1 : nop.m 0x0
0x4000000000034e70:2 : br.cond.dptk.few get_tcp_service+400;;
End of assembler dump.
(gdb)
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>The above is the frame I was trying to explain.

Should I be asking why the third party lib is compiled with debug info? Would they provide the source file of the place where it aborts?

Can you print the values of:
p *line
p *progress_update
p *result

Also use ptype to see if the types/fields match what you have:
ptype line

>I hope you are asking for the core dump.

Of course not, that would be useless without a packcore.
I asked for:
(gdb) set redirect-file gdb.out
(gdb) set redirect on
(gdb) bt
(gdb) frame 0
(gdb) disas $pc-16*12 $pc+16*4
(gdb) info reg
(gdb) set redirect off

Then attach gdb.out.
Laurent Menase
Honored Contributor

Re: Core dump with BUS_ADRALN error

Also include a disas get_tcp_service to get the full disassembly of the function, so we can see where that r40 comes from.
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

A part of the dump was missing above. Here is the complete dump you asked for:

(gdb) frame 0
#0 get_tcp_service (service=0x400000000000d120 "TSC720",
ip_addr=0x9fffffffffffe1d8, port=0x9fffffffffffe1d0) at min_tcp_serv.c:146
146 in min_tcp_serv.c
(gdb) disas $pc-16*8 $pc+16*4
Dump of assembler code from 0x4000000000034dc0:0 to 0x4000000000034e80:0:
;;; File: min_tcp_serv.c
;;; Line: 138
0x4000000000034dc0:0 : cmp.eq p6=r0,r41
0x4000000000034dc0:1 : nop.m 0x0
0x4000000000034dc0:2 : (p6) br.cond.dptk.few get_tcp_service+384;;
;;; Line: 142
0x4000000000034dd0:0 : mov r47=r41
0x4000000000034dd0:1 : mov r9=24;;
0x4000000000034dd0:2 : add r9=r9,r1
0x4000000000034de0:0 : nop.m 0x0
0x4000000000034de0:1 : mov r14=r1;;
0x4000000000034de0:2 : nop.i 0x0
0x4000000000034df0:0 : ld8.acq r10=[r9]
0x4000000000034df0:1 : adds r9=8,r9;;
0x4000000000034df0:2 : mov b6=r10
0x4000000000034e00:0 : ld8 r1=[r9]
0x4000000000034e00:1 : nop.m 0x0
0x4000000000034e00:2 : br.call.dptk.few b0=b6;;
0x4000000000034e10:0 : mov r1=r44
0x4000000000034e10:1 : mov r38=r8;;
;;; Line: 143
0x4000000000034e10:2 : cmp.eq p6=r0,r38
0x4000000000034e20:0 : nop.m 0x0
0x4000000000034e20:1 : nop.m 0x0
0x4000000000034e20:2 : (p6) br.cond.dptk.few get_tcp_service+352;;
;;; Line: 145
0x4000000000034e30:0 : adds r8=24,r38;;
0x4000000000034e30:1 : ld8 r8=[r8]
0x4000000000034e30:2 : nop.i 0x0;;
0x4000000000034e40:0 : ld8 r40=[r8];;
;;; Line: 146
0x4000000000034e40:1 : ld8 r8=[r40]
0x4000000000034e40:2 : nop.i 0x0;;
0x4000000000034e50:0 : st8 [r36]=r8
;;; Line: 147
0x4000000000034e50:1 : mov r42=1
0x4000000000034e50:2 : br.cond.dptk.few get_tcp_service+368;;
;;; Line: 151
0x4000000000034e60:0 : mov r43=2
;;; Line: 152
0x4000000000034e60:1 : mov r42=0
0x4000000000034e60:2 : nop.i 0x0;;
0x4000000000034e70:0 : nop.m 0x0
0x4000000000034e70:1 : nop.m 0x0
0x4000000000034e70:2 : br.cond.dptk.few get_tcp_service+400;;
End of assembler dump.
(gdb)
Dennis Handly
Acclaimed Contributor

Re: Core dump with BUS_ADRALN error

>Laurent: put also a disass get_tcp_service to get the full function disass output

I was hoping not to have to do that. :-(

>Below is the assembly that you have asked for

Basically it is aborting here:
0x4000000000034e00:2 : br.call.dptk.few b0=b6;;
0x4000000000034e10:0 : mov r1=r44
0x4000000000034e10:1 : mov r38=r8;;
;;; Line: 143
0x4000000000034e10:2 : cmp.eq p6=r0,r38
0x4000000000034e20:2 : (p6) br.cond.dptk.few GTS+0x160;;
;;; Line: 145
0x4000000000034e30:0 : adds r8=24,r38;;
0x4000000000034e30:1 : ld8 r8=[r8]
0x4000000000034e40:0 : ld8 r40=[r8];;
;;; Line: 146
0x4000000000034e40:1 : ld8 r8=[r40] <<<

Basically it called some libc function that returns a pointer. The result wasn't NULL, so it loads a pointer-to-pointer field 24 bytes into the returned struct and dereferences it. Then it gets an alignment trap when it uses the resulting pointer to load an 8-byte pointer/long from an address with only 4-byte alignment:
gr38: 0x60000000000c9f18 function return
gr40: 0x60000000000ca974 misaligned pointer
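
To make the arithmetic concrete, the sketch below checks the two register values for 8-byte alignment. Read as C, the instructions for lines 145 and 146 look roughly like `*ip_addr = *(long *)ret->field[0]`, where `ret` is the returned pointer and `field` is a placeholder name for the pointer-to-pointer member at offset 24. The source of line 146 is not shown in this thread, so that reading and the helper names are assumptions. The `load_u32` helper shows memcpy as one common way to read a value from an address that may not be suitably aligned; it illustrates the general technique, not the library's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Distance of addr past the previous 8-byte boundary; 0 means aligned. */
unsigned misalignment8(uint64_t addr)
{
    return (unsigned)(addr & 7u);
}

/*
 * Reads a 32-bit value from a byte pointer without assuming alignment.
 * memcpy lets the compiler choose an access sequence that is legal for
 * any address.
 */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Stores value at buf+offset bytewise and reads it back with load_u32. */
uint32_t roundtrip(uint32_t value, size_t offset)
{
    unsigned char buf[16] = {0};
    assert(offset <= sizeof buf - sizeof value);
    memcpy(buf + offset, &value, sizeof value);
    return load_u32(buf + offset);
}
```

With the values from the dump, `misalignment8(0x60000000000c9f18ULL)` is 0 and `misalignment8(0x60000000000ca974ULL)` is 4, which matches the "only 4 byte alignment" diagnosis.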

I think you can get the function it called with:
x /i *(void**)($r44+24)
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

I am attaching the gdb.out with the results of all the commands asked for
K!rn Kumr
Frequent Advisor

Re: Core dump with BUS_ADRALN error

(gdb) x /i *(void**)($r44+24)
0xc0000000001a54a0:0 : alloc r33=ar.pfs,0,9,4,0