02-07-2007 12:30 AM
Re: Core Dump with multi-threaded application...
I am analyzing the core file on my test machine, where the application works fine, so it surely has all the shlib files.
>>You could look at stack traces for the other threads to see if they are fiddling with NIS_PWDIR_
Forgive my ignorance, but how do I do that?
>>to see if there is a pattern for that $ nm -pxv HpuxAgent
Nothing like a pattern of zeros.
>>(gdb) x /128x 0x400014b0-64*4
Attached is the dump of raw memory...
This application works fine on most of our test machines but dumps core on the customer machine. Is there something specific to that machine I should look for?
When I run the same application, both NIS_PWDIR_ and SHADOW are assigned their default values:
string Context::SHADOW = "shadow";
string Context::NIS_PWDIR_ = "/etc/";
This is not conditional code, so both variables should always be initialized. So how did they get zeroed on the customer machine?
Is there anything else I should look at?
As I mentioned earlier, when we had a similar problem before, we had to rebuild the ICU lib files because they were built with an older gcc.
But this time all the components of the application are built with the same gcc 3.4.2.
Any pointers/info is much appreciated.
02-07-2007 07:23 PM
Re: Core Dump with multi-threaded application...
>I am analyzing the core file on my test machine where it works fine. So its for sure has all the shlib files.
That's not what I mean. You must have an EXACT copy of every shlib your customer has. Including libc, etc. It is easier for your customer to just give you what they have with packcore, rather than play guessing games. That may be why your stack traces aren't complete.
>>You could look at stack traces for the other threads to see if they are fiddling with NIS_PWDIR_
>forgive my ignorance, but how do I do that?
To dump all threads in gdb:
(gdb) info thread
(gdb) thread apply all bt
Then look at the functions in each trace to see where they all are. See if there is a pattern the next time it happens. See if any function is accessing any variables near in memory to NIS_PWDIR_.
>>to see if there is a pattern for that $ nm -pxv HpuxAgent
This gives the mangled names of variables in memory order.
>nothing like pattern of zero
>Attached is the dump of raw memory...
There are a bunch of variables that are 0. If they are strings, they are also bad:
0x40001404 0x00000000 0x00000000 0x00000000
0x40001410 <_ZN7Context8DEV_PTS_E>: 0x00000000 0x0 0x0 0x0
0x40001420 <_ZN7Context5PROC_E>: 0x00000000 0x0 0x0 0x0
0x40001430 <_ZN7Context9USR_SBIN_E>: 0x00000000 0x0 0x0 0x0
0x40001440 <_ZN7Context8VAR_RUN_E>: 0x00000000 0x0 0x0 0x0
0x40001450 <_ZN7Context9CRON_DENYE>: 0x00000000 0x0 0x0 0x0
0x40001460 <_ZN7Context8CHPWDAGEE>: 0x00000000 0x0 0x0 0x0
0x40001470 <_ZN7Context11SYSLOG_CONFE>: 0x00000000 0x0 0x0 0x0
0x40001480 <_ZN7Context4GREPE>: 0x00000000 0x0 0x0 0x0
0x40001490 <_ZN7Context7USERMODE>: 0x00000000 0x0 0x0 0x0
0x400014a0 <_ZN7Context22NISMAP_NETGROUP_BYUSERE>: 0x00000000 0x0 0x0 0x0
Then your variable and 2 more words of 0:
0x400014b0 <_ZN7Context10NIS_PWDIR_E>: 0x00000000 0x0 0x0 0xffff0112
(If there are other variables in this area, they aren't listed if static or they don't start on the first address of each line. You need to match up with the nm(1) output.)
>Is there something specific to that machine I should look for?
Nothing is obvious, you'll see it when you find it. ;-) (How many CPUs?)
>So how come they got zeroed on the customer machine.
A bad pointer? Something wasn't locking a critical resource like a pointer?
>As I mentioned earlier, when we had similar problem earlier, we had to rebuilt ICU lib files as they were built with older gcc.
A mismatch in layouts is a possible cause.
What does "info shared" show?
>Any pointers/info is much appreciated.
There isn't much I can suggest except to try to build debugging tools/tracing into your product. Or see if you can get the customer to repeat it. It seems you were able to do that from your original message??
As I said, linking with -z may catch the error sooner.
You are asking questions that are getting into real time and money.
02-12-2007 10:15 PM
Re: Core Dump with multi-threaded application...
I had some luck reproducing the issue on one of my HP-UX boxes.
I started debugging and kept watching the static global variables. They were zero from the very beginning of the program, in the main function.
Later I did the same thing on the box where the application works fine, and there all the static global variables have proper non-zero values.
All these static variables are initialized with hard-coded values in the global declaration area (above all function definitions)...
02-13-2007 10:04 AM
Re: Core Dump with multi-threaded application...
These variables should be zero at the very start of execution, since they are runtime initialized. But at some point before main, they should be initialized to their proper values.
For PA32, aC++ used to initialize them on a call to _main inside of main. I think g++ does it sooner.
02-13-2007 09:31 PM
Re: Core Dump with multi-threaded application...
I have written a small sample program that simulates my situation, and it dumps core in the same way.
Attached is the sample program.
Here are the commands used to build and link:
g++ -g -pthread -c test.cpp
g++ -o TestStaticString test.o -pthread -L /var/home/HPUXAGENT/lib -L/usr/local/lib -lpthread -lcrypt -lxnet -lnsl -lsec -lstdc++ -Xlinker +b -Xlinker ../lib:. -Xlinker +s -z
Here is the ldd output of the sample exe on the build machine:
$ ldd TestStaticString
/usr/lib/libc.2 => /usr/lib/libc.2
/usr/lib/libdld.2 => /usr/lib/libdld.2
/usr/lib/libc.2 => /usr/lib/libc.2
/var/home/HPUXAGENT/lib/libgcc_s.sl => /var/home/HPUXAGENT/lib/libgcc_s.sl
/usr/lib/libc.2 => /usr/lib/libc.2
/usr/lib/libm.2 => /usr/lib/libm.2
/usr/local/lib/libstdc++.sl.6 => /usr/local/lib/libstdc++.sl.6
/usr/lib/libc.2 => /usr/lib/libc.2
/scratch/njs/pkgbuild/3.3.1/hpux-11/gcc-3.4.2-b/gcc/libgcc_s.sl => /usr/local/lib/libgcc_s.sl
/usr/lib/libc.2 => /usr/lib/libc.2
/usr/lib/libm.2 => /usr/lib/libm.2
/usr/lib/libsec.2 => /usr/lib/libsec.2
/usr/lib/libm.2 => /usr/lib/libm.2
/usr/lib/libnsl.1 => /usr/lib/libnsl.1
/usr/lib/libxti.2 => /usr/lib/libxti.2
/usr/lib/libxnet.2 => /usr/lib/libxnet.2
/usr/lib/libxti.2 => /usr/lib/libxti.2
/usr/lib/libpthread.1 => /usr/lib/libpthread.1
And here is the ldd output of the same sample exe on the test machine, where it dumps core:
bash-2.04# ldd TestStaticString
=>
/usr/lib/libc.2 => ../lib/libc.2
/usr/lib/libdld.2 => ../lib/libdld.2
/usr/lib/libc.2 => ../lib/libc.2
/var/home/HPUXAGENT/lib/libgcc_s.sl => /home/sanjay/agent/lib/libgcc_s.sl
/usr/lib/libc.2 => ../lib/libc.2
/usr/lib/libm.2 => ../lib/libm.2
/usr/local/lib/libstdc++.sl.6 => /home/sanjay/agent/lib/libstdc++.sl.6
/usr/lib/libc.2 => ../lib/libc.2
/scratch/njs/pkgbuild/3.3.1/hpux-11/gcc-3.4.2-b/gcc/libgcc_s.sl => /home/sanjay/agent/lib/libgcc_s.sl
/usr/lib/libm.2 => ../lib/libm.2
/usr/lib/libsec.2 => ../lib/libsec.2
/usr/lib/libm.2 => ../lib/libm.2
/usr/lib/libnsl.1 => ../lib/libnsl.1
/usr/lib/libxti.2 => ../lib/libxti.2
/usr/lib/libxnet.2 => ../lib/libxnet.2
/usr/lib/libxti.2 => ../lib/libxti.2
/usr/lib/libpthread.1 => ../lib/libpthread.1
bash-2.04# ./TestStaticString
Main started
initialized string variable
Calling getfullname
Segmentation fault (core dumped)
bash-2.04#
And the dump analysis:
bash-2.04# gdb TestStaticString core
HP gdb 5.2 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00
and target hppa1.1-hp-hpux11.00.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.2 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: core file may not match specified executable file.
Core was generated by `TestStaticStr'.
Program terminated with signal 11, Segmentation fault.
SEGV_UNKNOWN - Unknown Error
#0 _ZNSsC1ERKSs ()
at /scratch/njs/pkgbuild/3.3.1/hpux-11/gcc-3.4.2-b/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/bits/basic_string.h:182
182 /scratch/njs/pkgbuild/3.3.1/hpux-11/gcc-3.4.2-b/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/bits/basic_string.h: No such file or directory.
in /scratch/njs/pkgbuild/3.3.1/hpux-11/gcc-3.4.2-b/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/bits/basic_string.h
(gdb) where
#0 _ZNSsC1ERKSs ()
at /scratch/njs/pkgbuild/3.3.1/hpux-11/gcc-3.4.2-b/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/bits/basic_string.h:182
#1 0x3b84 in _ZStplIcSt11char_traitsIcESaIcEESbIT_T0_T1_ERKS6_S8_ ()
at /usr/local/include/c++/3.4.2/bits/basic_string.h:1984
(gdb) x/4x *(void**)($sp-0x12c)
0x40001078 <_ZN7static14nameE>: 0x00000000 0x00000000 0xffff0106 0x00000100
02-13-2007 10:31 PM
Re: Core Dump with multi-threaded application...
Then set up your watchpoint.
>And here is the ldd output of same sample exe on the test machine where it gives core dump.
Why don't you point to where it is different? libstdc++.sl.6 & libgcc_s.sl?
(gdb) x/4x *(void**)($sp-0x12c)
0x40001078 <_ZN7static14nameE>: 0x00000000 0x00000000 0xffff0106 0x00000100
So, just put a watchpoint here.
You can also do that nm -pxv to see what variables are around yours.
If you are correct that these string variables are never runtime initialized, you have found a g++ problem. In my version 3.3.2, I have no problems.
02-14-2007 01:20 AM
Re: Core Dump with multi-threaded application...
bash-2.04# gdb TestStaticString
HP gdb 5.2 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00
and target hppa1.1-hp-hpux11.00.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.2 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
(gdb) watch static1::name
warning: can't do that without a running program; try "break main", "run" first
(gdb) b main
Breakpoint 1 at 0x36f4: file test.cpp, line 29 from /test/TestStaticString.
(gdb) r
Starting program: /test/TestStaticString
Breakpoint 1, main () at test.cpp:29
warning: Source file is more recent than executable TestStaticString.
29 static1* s = new static1();
(gdb) watch static1::name
Hardware watchpoint 2: static1::name
(gdb) p static1::name
$1 = {static npos = 4294967295,
_M_dataplus = {
(gdb) p static1::surname
$2 = {static npos = 4294967295,
_M_dataplus = {
02-14-2007 01:27 AM
Re: Core Dump with multi-threaded application...
(gdb) b main
Breakpoint 1 at 0x36f4: file test.cpp, line 29 from /test/TestStaticString.
(gdb) r
Starting program: /test/TestStaticString
warning: Load module /usr/local/lib/libstdc++.sl.6 has been stripped
Breakpoint 1, main () at test.cpp:29
29 static1* s = new static1();
(gdb) p static1::name
$1 = {static npos =
_M_dataplus = {
_M_p = 0x40004aec "sanjay"}}
(gdb) p static1::surname
$2 = {static npos =
_M_dataplus = {
_M_p = 0x40004b14 "sutar"}}
(gdb)
So it is certain that the static variables are not initialized.
>>If you are correct that these string variable are never runtime initialized, you have found a g++ problem
Then I wonder how this works on some other machines.
>>Why don't you point to where it is different? libstdc++.sl.6 & libgcc_s.sl?
I tried hard to link with my versions of both files, but every time the linker links with libstdc++ and libgcc from /usr/local/lib :(
02-14-2007 03:52 PM
Re: Core Dump with multi-threaded application...
I am going to post this on the languages forum...
02-14-2007 04:39 PM
Re: Core Dump with multi-threaded application...
>I tried hard to link with my version of both files , but every time the linker links with libstdc++ and libgcc from /usr/local/lib :(
(By point, I meant add a comment saying "look here <<<".)
You can use chatr(1) to see what the paths are. If you embed a path, it can point to somewhere different. And you can also use SHLIB_PATH to change it.
>I am going to post this on the languages forum...
I talked with our expert here and he didn't recall any static init errors lately. I sent him a pointer to this thread.