

 
laxmia
Occasional Advisor

Getting core dumps daily at 00.03 to 00.13 on zeus process

Hi,

 

We are getting core dumps daily in production on the zeus process, always between 00.03 and 00.13. Afterwards we don't see any core dumps from the process.

 

We are unable to reproduce the issue on the test machines.

 

 

The gdb output is pointing to the .so files. Please find the gdb report for the core:

 

gdb /system_x/bin/eq -c copy_core28-00-20-1
HP gdb 6.1 for HP Itanium (32 or 64 bit) and target HP-UX 11iv2 and 11iv3.
Copyright 1986 - 2009 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 6.1 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: Load module /system_x/bin/eq has been stripped.
Debugging information is not available.
 
(no debugging symbols found)...
Core was generated by `eq'.
Program terminated with signal 11, Segmentation fault.
SEGV_MAPERR - Address not mapped to object
(no debugging symbols found)...#0  0xc0000000003b4510:0 in strcmp+0x50 ()
   from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc0000000003b4510:0 in strcmp+0x50 () from /usr/lib/hpux64/libc.so.1

 

Also, here is the kimModelUpdate.out output from the period when the cores occur:

 

 

vi kimModelUpdate.out
"kimModelUpdate.out" 21 lines, 1052 characters
/system_x/bin/kimModelUpdate[29]: 2764 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 2767 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 2771 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3105 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3108 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3115 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3188 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3196 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3211 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3389 Memory fault(coredump)
Waiting for KIM service announcement
/system_x/bin/kimModelUpdate[29]: 3411 Memory fault(coredump

 

I have attached a copy of the script.

 

Dennis Handly
Acclaimed Contributor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

>The gdb output is pointing to the .so files.  Please find the gdb report for the core

 

I don't see a complete stack trace, just the strcmp call.

You are probably passing NULL or a bad pointer to it.

 

>warning: Load module /system_x/bin/eq has been stripped.

 

You can't strip if you want to debug.

 

 

laxmia
Occasional Advisor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

Hi,

 

Thanks for the response.

 

The gdb stack for most of the cores is as below:

 

hhtst010:/homes/home1/sysXadm/zeus_core_eq> gdb /system_x/bin/eq copy_core29-00-20-1
HP gdb 6.1 for HP Itanium (32 or 64 bit) and target HP-UX 11iv2 and 11iv3.
Copyright 1986 - 2009 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 6.1 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: Load module /system_x/bin/eq has been stripped.
Debugging information is not available.

(no debugging symbols found)...
Core was generated by `eq'.
Program terminated with signal 11, Segmentation fault.
SEGV_MAPERR - Address not mapped to object
(no debugging symbols found)...
warning: Some of the libraries in the core file are different from the libraries on this computer.  It might be possible to proceed with your debugging process successfully.  However, if you run into problems you must use packcore command or use the versions of the libraries used by the core.  The mismatches are:

  /usr/lib/hpux64/dld.so in the core file is different from
  /usr/lib/hpux64/dld.so used by gdb

  /usr/lib/hpux64/libc.so.1 in the core file is different from
  /usr/lib/hpux64/libc.so.1 used by gdb

  /usr/lib/hpux64/libCsup.so.1 in the core file is different from
  /usr/lib/hpux64/libCsup.so.1 used by gdb

  /usr/lib/hpux64/libxti.so.1 in the core file is different from
  /usr/lib/hpux64/libxti.so.1 used by gdb

  /usr/lib/hpux64/libdl.so.1 in the core file is different from
  /usr/lib/hpux64/libdl.so.1 used by gdb


warning: No unwind information found.
 Skipping this library /usr/lib/hpux64/libcl.so.1.

#0  0xc0000000003b4510:0 in memset ()
   from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc0000000003b4510:0 in memset () from /usr/lib/hpux64/libc.so.1
warning: Attempting to unwind past bad PC 0xc0000000003b4510
#1  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#2  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#3  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#4  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#5  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#6  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#7  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#8  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#9  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#10 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#11 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#12 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#13 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#14 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#15 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#16 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#17 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#18 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#19 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#20 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#21 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#22 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
---Type <return> to continue, or q <return> to quit---
--------------------------------------


hhtst010:/homes/home1/sysXadm/zeus_core_eq> gdb /system_x/bin/eq copy_core_autorev
HP gdb 6.1 for HP Itanium (32 or 64 bit) and target HP-UX 11iv2 and 11iv3.
Copyright 1986 - 2009 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 6.1 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: Load module /system_x/bin/eq has been stripped.
Debugging information is not available.

(no debugging symbols found)...
Core was generated by `eq'.
Program terminated with signal 11, Segmentation fault.
SEGV_MAPERR - Address not mapped to object
(no debugging symbols found)...
warning: Some of the libraries in the core file are different from the libraries on this computer.  It might be possible to proceed with your debugging process successfully.  However, if you run into problems you must use packcore command or use the versions of the libraries used by the core.  The mismatches are:

  /usr/lib/hpux64/dld.so in the core file is different from
  /usr/lib/hpux64/dld.so used by gdb

  /usr/lib/hpux64/libc.so.1 in the core file is different from
  /usr/lib/hpux64/libc.so.1 used by gdb

  /usr/lib/hpux64/libCsup.so.1 in the core file is different from
  /usr/lib/hpux64/libCsup.so.1 used by gdb

  /usr/lib/hpux64/libxti.so.1 in the core file is different from
  /usr/lib/hpux64/libxti.so.1 used by gdb

  /usr/lib/hpux64/libdl.so.1 in the core file is different from
  /usr/lib/hpux64/libdl.so.1 used by gdb


warning: No unwind information found.
 Skipping this library /usr/lib/hpux64/libcl.so.1.

#0  0xc0000000003b4510:0 in memset ()
   from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc0000000003b4510:0 in memset () from /usr/lib/hpux64/libc.so.1
warning: Attempting to unwind past bad PC 0xc0000000003b4510
#1  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#2  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#3  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#4  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#5  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#6  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#7  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#8  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#9  0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#10 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#11 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#12 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#13 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#14 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#15 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#16 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#17 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#18 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#19 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#20 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#21 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
#22 0xc00000000050bae0:0 in abort+0x120 () from /usr/lib/hpux64/libc.so.1
---Type <return> to continue, or q <return> to quit---

 

 

Also, please let me know how to unstrip the binary so that it can be debugged.

 

Thanks in advance,

Laxmi

Dennis Handly
Acclaimed Contributor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

>The gdb stack for most of the cores as  below

 

These are useless.  In general you must do your corefile analysis on the SAME system that created them.

 

>warning: Some of the libraries in the core file are different from the libraries on this computer.  It might be possible to proceed with your debugging process successfully.  However, if you run into problems you must use packcore command or use the versions of the libraries used by the core.

 

As mentioned above, if you want to do the analysis elsewhere, you must use packcore/unpackcore to copy all of the load modules.

 

>let me know how to unstrip

 

You must rebuild your application without using the -s option, which strips the symbol table.

And you must not run the strip(1) command on the result.  If you want access to the program's variables, you need to compile with -g.
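
As a rough illustration of the difference, the two kinds of build could be wrapped like this. It is only a sketch: the helper names and the demo.c/demo file names are hypothetical, and it assumes a cc driver that accepts -g and -s (the HP-UX compilers and gcc both do).

```shell
#!/bin/sh
# Hypothetical helpers; CC defaults to cc but can be overridden.

# Build with debug information and an intact symbol table, so that
# gdb can show function names, source lines and variables.
build_debuggable() {
    # $1 = source file, $2 = output executable
    ${CC:-cc} -g -o "$2" "$1"
}

# Build the way a stripped binary is typically produced:
# -s asks the linker to drop the symbol table.
build_stripped() {
    # $1 = source file, $2 = output executable
    ${CC:-cc} -s -o "$2" "$1"
}

# Example (not run here):
#   build_debuggable demo.c demo
#   gdb demo core
```

A binary built the second way is what makes gdb print "has been stripped. Debugging information is not available."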

laxmia
Occasional Advisor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

Hi Dennis,

 

Thanks for the response.

 

The core occurs on the production machine, and we do not have gdb installed there, so we need to move the core to a test box to run gdb on it. I'm not sure whether packcore will work on the production box (as gdb is not installed), but I will try to get the core using the packcore command.

 

Also, the libE binaries we use (basically the 'eq' binary, which is the one dumping core) were compiled and provided to us by HP, and we use them in our application. So I do not have the source code to recompile without the -s option and with the -g option.

 

So is there any other way to debug the issue? Anyway, let me try the packcore option.

 

Thanks

Laxmi

Dennis Handly
Acclaimed Contributor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

>The core occurs in production machine & we do not have the gdb installed on production machine.

 

Then you'll need to do a whole bunch of mickey-mouse to copy the load modules you need:

1) On your test box, run gdb on the executable and core.

Use the "info shared" command to get a list of shlibs.

 

From the gdb messages, perhaps you only need these files?

  /usr/lib/hpux64/dld.so
  /usr/lib/hpux64/libc.so.1
  /usr/lib/hpux64/libCsup.so.1
  /usr/lib/hpux64/libxti.so.1
  /usr/lib/hpux64/libdl.so.1

 

2) tar up these files using a relative path.

3) untar these files to a different directory tree, do NOT overwrite existing files.

4) export GDB_SHLIB_ROOT to that directory tree.

5) Then run gdb.
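
Steps 2 to 5 might look roughly like this. It is only a sketch: the helper names, the archive name and the target directory are made up for illustration, and the library list is simply the one from the gdb warning.

```shell
#!/bin/sh
# Step 2 (production box): archive the libraries with paths relative
# to the root, so they can be unpacked under any directory.
#   $1 = root to read from (normally /)
#   $2 = archive to create (absolute path, because we cd first)
#   remaining arguments = library paths relative to $1
pack_shlibs() {
    src_root=$1; archive=$2; shift 2
    (cd "$src_root" && tar cf "$archive" "$@")
}

# Step 3 (test box): unpack into a separate tree.
# Refuses to unpack into / so the installed libraries are never overwritten.
#   $1 = archive (absolute path), $2 = new directory tree
unpack_shlibs() {
    archive=$1; dest=$2
    if [ "$dest" = "/" ]; then
        echo "refusing to unpack over /" >&2
        return 1
    fi
    mkdir -p "$dest" && (cd "$dest" && tar xf "$archive")
}

# Example (not run here; names are placeholders):
#   pack_shlibs / /tmp/shlibs.tar usr/lib/hpux64/dld.so \
#       usr/lib/hpux64/libc.so.1 usr/lib/hpux64/libCsup.so.1 \
#       usr/lib/hpux64/libxti.so.1 usr/lib/hpux64/libdl.so.1
#   unpack_shlibs /tmp/shlibs.tar "$HOME/prod_root"
#   GDB_SHLIB_ROOT=$HOME/prod_root; export GDB_SHLIB_ROOT    # step 4
#   gdb /system_x/bin/eq copy_core29-00-20-1                 # step 5
```

Because the paths in the archive are relative, the libraries end up under the new tree with the same usr/lib/hpux64 layout, which is where gdb looks once GDB_SHLIB_ROOT points at that tree.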

 

(Installing gdb on your production box is much simpler.  You should be able to just copy the gdb executable there.)

http://h21007.www2.hp.com/portal/download/files/unprot/wdb/wdb_6.2/Corefiles-updated.pdf

 

>Not sure whether packcore will work on Production box (as gdb not installed)

 

packcore is a gdb command.

 

>we used the libE binaries was compiled & provided by HP to us.

 

You need to go back to HP and get a non-stripped version.

 

>is there any other way to debug the issue?

 

Not unless you want to debug at the assembly level.

laxmia
Occasional Advisor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

Hi Dennis,

 

Please find the packcore listing from production:

 

-r-xr-xr-x   1 sysXadm    sysXadm      85680 Feb 15  2007 libuca.so.1
-r-xr-xr-x   1 sysXadm    sysXadm      84248 Feb 15  2007 libunalign.so.1
-r-xr-xr-x   1 sysXadm    sysXadm      69336 Feb 15  2007 libcl.so.1
-r-xr-xr-x   1 sysXadm    sysXadm    1510856 Jul 15  2009 libnsl.so.1
-r-xr-xr-x   1 sysXadm    sysXadm     913304 Nov 30  2009 librtc.so
-r-xr-xr-x   1 sysXadm    sysXadm     714608 Dec  6  2010 libunwind.so.1
-r-xr-xr-x   1 sysXadm    sysXadm    2996984 Dec  7  2010 libIO77.so.1
-rwxr-xr-x   1 sysXadm    sysXadm    1634400 Jul 24  2011 eq
-r-xr-xr-x   1 sysXadm    sysXadm    4901296 Dec  5  2012 libc.so.1
-r-xr-xr-x   1 sysXadm    sysXadm     298552 Mar 26  2013 libxti.so.1
-r-xr-xr-x   1 sysXadm    sysXadm    1191976 Jun  5  2013 dld.so
-r-xr-xr-x   1 sysXadm    sysXadm      78976 Jun  5  2013 libdl.so.1
-r-xr-xr-x   1 sysXadm    sysXadm     852240 Jun  5  2013 libCsup.so.1
-rw-------   1 sysXadm    sysXadm    13033128 Sep 16 12:55 core
-rwxr-xr-x   1 sysXadm    sysXadm          2 Sep 17 08:08 progname.txt
-rw-r--r--   1 sysXadm    sysXadm    15349760 Sep 17 08:08 modules.tar

 

We have gdb installed on production. I have asked my production support to run gdb on the core and provide the report. He may get it to me today.

 

laxmia
Occasional Advisor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

This is the latest packcore:

hhtst010:/homes/home1/sysXadm/packcore> r ls
ls -lrt
total 85398
-r-xr-xr-x   1 sysXadm    sysXadm      85680 Feb 15  2007 libuca.so.1
-r-xr-xr-x   1 sysXadm    sysXadm      84248 Feb 15  2007 libunalign.so.1
-r-xr-xr-x   1 sysXadm    sysXadm      69336 Feb 15  2007 libcl.so.1
-r-xr-xr-x   1 sysXadm    sysXadm    1510856 Jul 15  2009 libnsl.so.1
-r-xr-xr-x   1 sysXadm    sysXadm     913304 Nov 30  2009 librtc.so
-r-xr-xr-x   1 sysXadm    sysXadm     714608 Dec  6  2010 libunwind.so.1
-r-xr-xr-x   1 sysXadm    sysXadm    2996984 Dec  7  2010 libIO77.so.1
-rwxr-xr-x   1 sysXadm    sysXadm    1634400 Jul 24  2011 eq
-r-xr-xr-x   1 sysXadm    sysXadm    4901296 Dec  5  2012 libc.so.1
-r-xr-xr-x   1 sysXadm    sysXadm     298552 Mar 26  2013 libxti.so.1
-r-xr-xr-x   1 sysXadm    sysXadm    1191976 Jun  5  2013 dld.so
-r-xr-xr-x   1 sysXadm    sysXadm      78976 Jun  5  2013 libdl.so.1
-r-xr-xr-x   1 sysXadm    sysXadm     852240 Jun  5  2013 libCsup.so.1
-rw-------   1 sysXadm    sysXadm    13033128 Sep 18 00:03 core
-rwxr-xr-x   1 sysXadm    sysXadm          2 Sep 18 07:37 progname.txt
-rw-r--r--   1 sysXadm    sysXadm    15349760 Sep 18 07:37 modules.tar

Dennis Handly
Acclaimed Contributor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

>We have gdb installed on production, I have asked my production support to run gdb on the core & provide the report.

 

While they were at it, you should have asked for a stack trace.  :-)

(If it doesn't work there, packcore won't help.)

 

>This is latest packcore

 

Once you have one, you can use unpackcore (first time) or getcore to start debugging.
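
Like packcore, unpackcore and getcore are typed at the (gdb) prompt, not at the shell prompt. One way to drive them from a script is sketched below. The helper name and the command-file name are hypothetical, and the argument forms (unpackcore given the archive name, getcore given the unpacked directory name) are an assumption based on the WDB documentation, so check "help unpackcore" and "help getcore" in your gdb first.

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that opens a packcore
# bundle and prints the stack trace.
#   $1 = "unpackcore" (first time, from the archive) or
#        "getcore" (afterwards, from the already unpacked directory)
#   $2 = archive or directory name (assumed argument form)
#   $3 = command file to create
write_packcore_cmds() {
    case $1 in
        unpackcore|getcore) ;;
        *) echo "usage: write_packcore_cmds unpackcore|getcore name cmdfile" >&2
           return 1 ;;
    esac
    printf '%s %s\nbt\n' "$1" "$2" > "$3"
}

# Example (not run here):
#   write_packcore_cmds getcore packcore /tmp/getcore.gdb
#   gdb -x /tmp/getcore.gdb
```

The same two lines can of course be typed by hand after starting gdb with no arguments.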

laxmia
Occasional Advisor

Re: Getting core dumps daily at 00.03 to 00.13 on zeus process

Hi Dennis,

 

Thanks for the response.

 

Please find the stack trace from the production environment:

 

gdb /system_x/bin/eq copy_core20140919001412
HP gdb 6.1 for HP Itanium (32 or 64 bit) and target HP-UX 11iv2 and 11iv3.
Copyright 1986 - 2009 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 6.1 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: Load module /system_x/bin/eq has been stripped.
Debugging information is not available.

(no debugging symbols found)...
Core was generated by `eq'.
Program terminated with signal 11, Segmentation fault.
SEGV_MAPERR - Address not mapped to object
(no debugging symbols found)...#0  0xc0000000003b46f0:0 in strcmp+0x50 ()
   from /usr/lib/hpux64/libc.so.1
(gdb) bt
#0  0xc0000000003b46f0:0 in strcmp+0x50 () from /usr/lib/hpux64/libc.so.1
#1  0xc00000000050e1c0:0 in do_getconfig+0x3e0 ()
   from /usr/lib/hpux64/libc.so.1
#2  0xc00000000050eea0:0 in __nsw_getconfig+0x150 ()
   from /usr/lib/hpux64/libc.so.1
#3  0xc000000000510350:0 in _nss_db_state_constr+0x200 ()
   from /usr/lib/hpux64/libc.so.1
#4  0xc000000000510c70:0 in nss_search+0xc0 () from /usr/lib/hpux64/libc.so.1
#5  0xc0000000005b10d0:0 in __getpwuid_r+0x100 ()
   from /usr/lib/hpux64/libc.so.1
#6  0xc0000000005b1af0:0 in getpwuid+0xf0 () from /usr/lib/hpux64/libc.so.1
#7  0x400000000006a940:0 in <unknown_procedure> + 0x170 ()
(gdb) q

------------------

 

Also, I could not run unpackcore or getcore (maybe I am executing the wrong command); I am getting the errors below:

 

 

-rwxr-x---   1 al278g     gadm       28395520 Sep 22 01:09 packcore.tar
drwxr-x---   2 al278g     gadm            96 Sep 22 01:11 packcore
hhtst010:/homes/home1/al278g> unpack packcore
unpack: packcore.z: cannot open
hhtst010:/homes/home1/al278g> unpackcore
ksh: unpackcore:  not found
hhtst010:/homes/home1/al278g> getcore  packcore
ksh: getcore:  not found
hhtst010:/homes/home1/al278g>