08-01-2013 06:44 AM - edited 08-01-2013 06:45 AM
Why some POSIX threads do not have their stacktrace starting from __pthread_bound_body?
I have been analyzing a core file of a multithreaded C++ program.
As a first step I ran 'thread apply all bt' in gdb (gdb 6.1, HP-UX 11.31).
Most of the threads have a stack trace that starts from __pthread_bound_body().
This is an example:
#52 0xc000000021f98800:0 in selfcare::operations_execute (ctx=@0x60000002f364a140, from=<not available>, to={_M_current = 0x6000000751306090}, snmp_on=false) at has_source/request_handler.cpp:1002
#53 0xc000000021fa1d80:0 in selfcare::request_handler::process_request (this=0x60000002f364a000, cmd=0x60000007b06ec200, parameters=@0x6000000727499fe8) at has_source/request_handler.cpp:1609
#54 0xc000000021fa8920:0 in selfcare::request_handler::execute (this=0x60000002f364a000) at has_source/request_handler.cpp:2042
#55 0xc000000018b0c460:0 in threads::thread_proc (thr_ptr=0x60000002f364a000) at has_common_source/source/cpp/threads.cpp:225
#56 0xc000000000117b20:0 in __pthread_bound_body () at /ux/core/libs/threadslibs/src/common/pthreads/pthread.c:4875
However, there are a few threads whose stack traces start from other functions.
These are examples:
1) Frame #25 is the last frame in this thread:
#23 0xc000000021f98800:0 in selfcare::operations_execute (ctx=@0x60000002f3ac8140, from=<not available>, to={_M_current = 0x6000000710d13490}, snmp_on=false) at has_source/request_handler.cpp:1002
#24 0xc000000021fa1d80:0 in selfcare::request_handler::process_request (this=0x60000002f3ac8000, cmd=0x600000072a2fea00, parameters=@0x60000002eb5b48c8) at has_source/request_handler.cpp:1609
#25 0xc000000021fa8920:0 in selfcare::request_handler::execute (this=0x60000002f3ac8000) at has_source/request_handler.cpp:2042
2) Frame #29 is the last frame in this thread:
#24 0xc000000021f94030:0 in selfcare::operation_execute (ctx=@0x60000002f5d6a140, o=@0x6000000870219d80) at has_source/request_handler.cpp:781
#25 0xc000000021f98800:0 in selfcare::operations_execute (ctx=@0x60000002f5d6a140, from=<not available>, to={_M_current = 0x6000000870219e10}, snmp_on=false) at has_source/request_handler.cpp:1002
#26 0xc000000021f99680:0 in selfcare::execute_operations (root_operation_name=<not available>, ec=@0x60000002f5d6a140, snmp_on=false) at has_source/request_handler.cpp:1085
#27 0xc000000021dd03e0:0 in selfcare::operation_context_wrapper::exec_operation (this=0x9fffffff39cc35a8, name=@0x9fffffff39cc2a48, args=<not available>, ptr_os=<not available>, ptr_psf=0x60000002f5d6eb30) at has_source/operation_context_wrapper.cpp:330
#28 0xc000000021dca0a0:0 in selfcare::operation_context_wrapper::exec_operation (this=0x9fffffff39cc35a8, name=<not available>, args=@0x9fffffff39cc2df8, ptr_os=0x9fffffff39cc3350, ptr_psf=0x9fffffff39cc2cb0) at has_source/operation_context_wrapper.cpp:366
#29 0xc000000023873190:0 in serviceguide::operations::call_credit_subs_brt_get::operator() (this=<not available>, ctx=@0x9fffffff39cc35a8) at sc_source/get_brt_cc_volumes.cpp:290
All POSIX threads in the program are created in the same way.
I don't understand why some threads do not have their stack trace starting from __pthread_bound_body.
Does a missing __pthread_bound_body() frame mean that the stack or data is corrupted?
If it is indeed corruption, how can this particular corruption be found or detected at runtime?
Or is it just that gdb can't show the stack traces properly?
- Tags:
- gdb
08-01-2013 08:17 AM
Re: Why some POSIX threads do not have their stacktrace starting from __pthread_bound_body?
For example #1, where frame #25 is the last frame in the thread:
(gdb) f 25
#25 0xc000000021fa8920:0 in selfcare::request_handler::execute (
this=0x60000002f3ac8000) at has_source/request_handler.cpp:2042
2042 in has_source/request_handler.cpp
(gdb) info frame
Stack level 25, frame at 0x9fffffff5731a390:
ip = 0xc000000021fa8920:0 in selfcare::request_handler::execute()
(has_source/request_handler.cpp:2042); saved ip 0x0
caller of frame at 0x9fffffff57315560
source language c++.
Size of frame is 93, Size of locals is 88, Size of rotating is 0.
NAT collections saved at 0x9fffffff56f1c1f8.
Arglist at 0x9fffffff56f1c0e8, args: this=0x60000002f3ac8000
Locals at 0x9fffffff56f1c0e8, Previous frame's sp is 0x9fffffff5731a390
So for frame #25:
1) Stack level 25, frame at 0x9fffffff5731a390
2) Previous frame's sp is 0x9fffffff5731a390
It means that "this frame" == "Previous frame" and there are no more frames above. Am I right? But where is the frame for __pthread_bound_body()?
08-01-2013 09:43 PM - edited 08-01-2013 10:34 PM
Re: Why some POSIX threads do not have their stacktrace starting from __pthread_bound_body?
>If it is indeed corruption how can this particular corruption be found or detected at runtime?
Well if you have corruption, it could be one thread stack destroying another by jumping over the guard pages.
With pthread_attr_setguardsize you can make them bigger.
Have you checked the sizes of your stacks?
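As a rough sketch of the guard-size suggestion, assuming your threads are created through pthread_create with an attribute object: the helper names configure_stack_attr, start_guarded_thread and echo_worker below are made up for illustration, and the sizes you pass are something you would have to pick for your own program.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialize attr with explicit stack and guard sizes.
 * Returns 0 on success, otherwise the pthread error code. */
int configure_stack_attr(pthread_attr_t *attr, size_t stack_bytes,
                         size_t guard_bytes)
{
    int rc;

    assert(attr != NULL);
    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(attr, stack_bytes);
    if (rc == 0)
        rc = pthread_attr_setguardsize(attr, guard_bytes);
    if (rc != 0)
        pthread_attr_destroy(attr);   /* do not leak a half-configured attr */
    return rc;
}

/* Hypothetical worker used only to exercise the helper: returns its argument. */
void *echo_worker(void *arg)
{
    return arg;
}

/* Hypothetical helper: start fn(arg) on a thread with the requested
 * stack and guard sizes. Returns 0 on success, otherwise the error code. */
int start_guarded_thread(pthread_t *tid, void *(*fn)(void *), void *arg,
                         size_t stack_bytes, size_t guard_bytes)
{
    pthread_attr_t attr;
    int rc = configure_stack_attr(&attr, stack_bytes, guard_bytes);

    if (rc != 0)
        return rc;
    rc = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);      /* the attr is no longer needed after create */
    return rc;
}
```

A larger guard area only makes it more likely that a runaway stack faults instead of silently landing in a neighbouring thread's stack; a single large local array can still jump over it.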
The following will give you the sizes of all of your memory segments:
# elfdump -o -S core-file
Type      Offset            Vaddr             FSize             Memsz
CoreStck  0000000001228bd0  9fffffffbf7ff000  0000000000001000  0000000000001000  RSE stack
CoreStck  0000000001229bd0  9ffffffffffd0000  0000000000030000  0000000000030000  user stack
Thread stacks would be MMF regions; unfortunately shlib data shows up as the same type, so look for regions with the right sizes.
Or use gdb's "info shared" and exclude those ranges.
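To illustrate matching an address from "info frame" against the Vaddr/Memsz columns, here is a minimal sketch. The struct and function names are hypothetical, and the table holds only the two sample CoreStck rows shown above; you would fill it in from your own elfdump output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one row of the elfdump segment listing. */
struct segment {
    uint64_t vaddr;      /* Vaddr column */
    uint64_t memsz;      /* Memsz column */
    const char *label;   /* free-form description */
};

/* The two sample rows from the listing above; not the full segment table. */
static const struct segment sample_segments[] = {
    { 0x9fffffffbf7ff000ULL, 0x1000ULL,  "RSE stack"  },
    { 0x9ffffffffffd0000ULL, 0x30000ULL, "user stack" },
};

/* Returns the index of the segment that contains addr, or -1 if none does. */
int find_segment(const struct segment *segs, size_t count, uint64_t addr)
{
    size_t i;

    for (i = 0; i < count; i++) {
        /* Compare the offset instead of vaddr + memsz so the check cannot
         * overflow for segments near the top of the address space. */
        if (addr >= segs[i].vaddr && addr - segs[i].vaddr < segs[i].memsz)
            return (int)i;
    }
    return -1;
}
```

With only these two rows, the frame address 0x9fffffff5731a390 from the question matches neither, which is consistent with it living in a separate thread stack region rather than the main stacks.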
>Or it just gdb can't show stacktraces properly?
Can you repeatedly duplicate the problem?
Can you call U_STACK_TRACE in some of those unusual threads?
gdb should be calling libunwind and that should get it right. U_STACK_TRACE would prove it one way or the other.
>saved ip 0x0
This is why it stopped.
You need to disassemble the first few instructions at the start of this function and figure out where it saved the return PC. If it is in a register, find where in the RSE stack that register was put.
What does "info reg" show for that frame?
>1) Stack level 25, frame at 0x9fffffff5731a390
>2) Previous frame's sp is 0x9fffffff5731a390
>It means that "this frame" == "Previous frame"
Unfortunately gdb has some weird idea how frames should be described and I need to draw a picture each time. The compiler only deals with SP and BSP. And another dedicated register if alloc frames.
RSE stack:
0x9fffffff56f1c0e8 arg list/locals == BSP
0x9fffffff56f1c1f8 NAT reg
V V
guard page
^ ^
User stack:
0x9fffffff57315560 caller of frame This doesn't make sense, should be after!
0x9fffffff5731a390 frame at
0x9fffffff5731a390 prev frame's SP
gdb thinks the frame starts on the high address.