Re: program dumps core

 
Clemens van Everdingen
Honored Contributor

program dumps core

Hi all,

Now I am in need of some help:

We start a program and it dumps core.
# file core
core: core file from 'dmscraper' - received SIGABRT

# what core
core:
HP DCE/9000 1.5 PHSS_27258-59 Module: libcma.1 (Export) Date: Jun 12 2002 12:26:42
PATCH-PHCO_25640 for 10.20; for 10.30, 11.x compatibility libc.1_ID@@/main/r10dav/libc_dav/libc_dav_cpe//1
/ux/core/libs/libc/shared_pa1/libc.1_ID
Nov 30 2001 06:35:42
SMART_BIND
92453-07 dld dld dld.sl B.11.33 020617

Also found: cma_dump.log

Messages in this file:

%Internal DCE Threads problem (version CMA BL10+), terminating execution.
% Reason: cma__open_general: unexpected fstat error
The current thread is 1 (address 0x7b014e78)
DECthreads scheduling database is locked.

Does anybody have a clue why this is happening, and how to troubleshoot it further?

Could it be a patching issue, for example one addressed by PHSS_28386?

I could not relate this patch directly to the issue.

Kind regards for any help,
Clemens
The computer is a great invention, there are as many mistakes as ever, but they are nobody's fault !
22 REPLIES
Bruno Vidal
Respected Contributor

Re: program dumps core

Hi,
What I can suggest is using tusc to find out when this signal is raised (that is, around which syscall). But you're right, also try installing the latest patches for the libraries this process uses (libc, DCE). It really does look like a DCE problem, or a locking problem (do you have any messages in dmesg?).

Cheers.
Robert-Jan Goossens
Honored Contributor

Re: program dumps core

Clemens,

Hmmmm, not sure this is going to help you; it's an oldie.

Date: 5/9/97
Document description: Internal DCE Threads problem - cma__ts_open fd is too large
Document id: W3592711

Robert-Jan.

Clemens van Everdingen
Honored Contributor

Re: program dumps core

Hi,

No messages in dmesg ! Too bad :)
This is our production system, so patching is a bit of a problem unless I can point to a specific issue that the patch solves.

Clemens
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Robert,

Found that one as well, but was not able to relate this one to the current problem.

Clemens
Alex Glennie
Honored Contributor

Re: program dumps core

Hi, could you tell me what the O/S is here? I'm a bit confused, as there's mention of patches for 11.11 and also of DCE 1.5, which is 10.20.

I've seen that error before, and at 11.00 it was cured by a patch: s700_800 11.00 HP DCE/9000 1.7 Runtime cumulative patch PHSS_27962

the defect being:

7. JAGad27398 : CMA dumps core as it doesn't check for condition EINPROGRESS.

So getting the latest appropriate dce patch installed may well be a good idea along with libC.

Other than that, a stack trace (gdb ./<program> ./core) may help those more familiar with DCE/threads programming, something I've not yet got my head around.
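
For anyone unfamiliar with the EINPROGRESS condition named in that defect text, here is a minimal C sketch of the general pattern. It is not the CMA code, which isn't shown in this thread, and the names connect_state, classify_connect and start_connect are made up for illustration. The assumption is a non-blocking socket, where connect() returning -1 with errno set to EINPROGRESS means the connection is still being set up, not that it failed.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical names, for illustration only. */
enum connect_state { CONNECT_DONE, CONNECT_PENDING, CONNECT_FAILED };

/* Interpret the return value and errno of a connect() call. */
static enum connect_state classify_connect(int rc, int err)
{
    if (rc == 0)
        return CONNECT_DONE;        /* connected immediately */
    if (err == EINPROGRESS)
        return CONNECT_PENDING;     /* non-blocking connect still under way */
    return CONNECT_FAILED;          /* a real error such as ECONNREFUSED */
}

/* Start a connect on fd and report which of the three states we are in. */
static enum connect_state start_connect(int fd, const struct sockaddr *addr,
                                        socklen_t len)
{
    int rc = connect(fd, addr, len);
    return classify_connect(rc, rc == -1 ? errno : 0);
}
```

A caller that gets CONNECT_PENDING would normally wait for the socket to become writable (select() or poll()) and then read SO_ERROR with getsockopt() to learn the final result. Code that treats every -1 as fatal skips that step, which may be the kind of omission the defect text refers to.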
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Alex,

The os is: B.11.11 U

Clemens
Massimo Bianchi
Honored Contributor

Re: program dumps core

Hi,
what MC/SG release do you have ?

I saw some defects related also to the combination of MC/SG and DCE kernel thread.

They should be addressed by the patch you mentioned, but I would like to know your release.

They also mentioned some tests run against the cluster using nmap. Did you use that?

HTH,
Massimo
Massimo Bianchi
Honored Contributor

Re: program dumps core

Hi,
for example

PHSS_28849: HANG
After an upgrade of ServiceGuard cluster from
version A.10.06 or earlier to A.11.13, cmrunnode
command may fail.

If a port-scanning utility such as the Linux
application "nmap" is executed against a node
running ServiceGuard, cmcld on the node may hang
and unexpectedly fail.


HTH,
Massimo
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Massimo,

Sorry it's not related to MC/SG anyway, I posted this by accident in this group.

By the way we are using MC/SG A.11.14

Thanks,
Clemens
Alex Glennie
Honored Contributor

Re: program dumps core

Clemens,

I've not found any info pertinent to the earlier named defect at 11.11, but DCE patches don't always list all the fixes they cover in the patch text. Since this is a live production box, my advice would be that you

a) get the stack trace out and either raise a call with HP support or post outputs here.

b) if patches, libc & DCE and DCE client patches can be installed it may be a good starting point.

One other question: has this only just started occurring, and are any other systems or O/S versions affected?
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Alex,

I would have tried gdb earlier, but gdb is not on this system, or on any other system within this environment.
Could someone at HP analyse the core file with gdb? Or do we have to do this on this system?
If so, I will log a call with HP as well.

As to whether the problem just started: no, but I have only just picked it up, since no one else has been able to solve it and I only recently started working at this company.

Patches might help, but as already stated they are a bit of an issue right now.

Thx,
Clemens
Alex Glennie
Honored Contributor

Re: program dumps core

HP support should be able to help with this BUT to analyse the core they'll need a system with the same patch level and access to the binary/application dumping core : it would be quicker/easier if you can install gdb locally and provide them with the stack.

Apologies for not being able to help further at this point.
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Alex,

I already thought so !:(

Is gdb on the application or OS cd's ?
I thought also there is a difference between 32 bits and 64 bits !

I need 64 bits.

Clemens
Alex Glennie
Honored Contributor

Re: program dumps core


http://h21007.www2.hp.com/dspp/tech/tech_TechSoftwareDetailPage_IDX/1,1703,1662,00.html

Will have the latest version ... I believe the apps CDs have slightly older versions ... no need to worry about 32/64 bit ... it will work.
Massimo Bianchi
Honored Contributor

Re: program dumps core

Hi,
here you can find the gdb:

http://hpux.connect.org.uk/hppd/hpux/Gnu/gdb-5.3/


HTH,
Massimo

Paula J Frazer-Campbell
Honored Contributor

Re: program dumps core

 
If you can spell SysAdmin then you is one - anon
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Paula,

Thanks for the very detailed example !

Now I am in desperate need of a slot to install wdb on our production server, and try to find out more.

I will probably not be allowed to do this during production hours.

Too bad but I will have to wait for a maintenance slot. :(

Kind regards,
Clemens
H.Merijn Brand (procura
Honored Contributor

Re: program dumps core

Depending on how your program was compiled some versions of gdb will help, but some won't: bad luck.

I have installed several versions of the truth, and still cannot tell which version works best on which core dump. Advice: get them all. You don't need reboots for installing wdb/gdb/adb/dde/dbx/xdb

a5:/u/usr/merijn/tmp/itrc 109 > path -al [axgw]db\* dbx ddd dde
2336 100755 -rwx 1 merijn 21156572 16 Dec 2002 23:08 /pro/local/bin/gdb -> ../../../usr/local/pa20_32/bin/gdb
30 100755 -rwx 1 merijn 20864492 16 Dec 2002 23:38 /pro/local/bin/gdb64 -> ../../../usr/local/pa20_64/bin/gdb
70110 100755 -rwx 1 merijn 1612176 27 Apr 1999 12:28 /pro/local/bin/gdb4
201 100555 -r-x 1 bin 196608 10 Nov 1998 03:52 /usr/bin/adb
202 100555 -r-x 1 bin 442368 7 Nov 1997 09:00 /usr/bin/adb64
13716 100555 -r-x 1 bin 4136960 14 Jun 2002 00:58 /opt/langtools/bin/gdb -> ./gdb32
13716 100555 -r-x 1 bin 4136960 14 Jun 2002 00:58 /opt/langtools/bin/gdb32
13717 100555 -r-x 1 bin 4227072 14 Jun 2002 00:58 /opt/langtools/bin/gdb64
13902 100555 -r-x 1 bin 3702784 14 Jun 2002 00:59 /opt/langtools/bin/wdb
30 100755 -rwx 1 merijn 20864492 16 Dec 2002 23:38 /usr/local/pa20_64/bin/gdb
13186 100555 -r-x 1 bin 1003 14 Jun 2002 00:57 /opt/langtools/bin/dde
a5:/u/usr/merijn/tmp/itrc 110 >

l1:/pro/pu/lep/4gl 104 > path -al [axgw]db\* dbx ddd dde
168262 100777 -rwx 1 probev 209 30 Jul 1996 17:45 /pro/pu/local/bin/adbgvi
2243 100755 -rwx 1 merijn 20864492 16 Dec 2002 23:38 /pro/local/bin/gdb64 -> ../../../usr/local/pa20_64/bin/gdb
22951 100755 -rwx 1 merijn 1612176 27 Apr 1999 12:28 /pro/local/bin/gdb4
1189 100755 -rwx 1 merijn 21156572 16 Dec 2002 23:08 /pro/local/bin/gdb -> ../../../usr/local/pa20_32/bin/gdb
99038 100777 -rwx 1 merijn 109 31 Aug 2000 14:15 /pro/local/dbman/bin/xdbish
200 100555 -r-x 1 bin 196608 10 Nov 1998 03:52 /usr/bin/adb
201 100555 -r-x 1 bin 442368 7 Nov 1997 09:00 /usr/bin/adb64
3080 100555 -r-x 1 bin 4136960 14 Jun 2002 00:58 /opt/langtools/bin/gdb -> ./gdb32
3080 100555 -r-x 1 bin 4136960 14 Jun 2002 00:58 /opt/langtools/bin/gdb32
3081 100555 -r-x 1 bin 4227072 14 Jun 2002 00:58 /opt/langtools/bin/gdb64
3626 100555 -r-x 1 bin 3702784 14 Jun 2002 00:59 /opt/langtools/bin/wdb
2243 100755 -rwx 1 merijn 20864492 16 Dec 2002 23:38 /wrk/pa20_64/bin/gdb
1116 100555 -r-x 1 bin 1003 14 Jun 2002 00:57 /opt/langtools/bin/dde
l1:/pro/pu/lep/4gl 105 >

FWIW, gdb-5.3 is available from the HP-UX porting center, and in my gcc ports on https://www.beepz.com/personal/merijn/ or http://www.cmve.net/~merijn/

Enjoy, have FUN! H.Merijn
Jean-Louis Phelix
Honored Contributor

Re: program dumps core

Hi,

You currently seem to have DCE patch 27258. I had a look at its most recent successor (PHSS_28386) and guess what ...

JAGae52205 : A threaded application that links to libcma sometimes exits unexpectedly after the system call accept() has been made.

The accept() call in cma in turn calls fstat() to obtain socket information. Sometimes, fstat() returns an error number, which is not handled in the code, due to which the problem occurs.

Perhaps you could try it ...

Regards.
It works for me (© Bill McNAMARA ...)
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Jean,

Indeed, as you may have seen, I had already reached a similar conclusion before even asking you guys.

But there is still no explicit answer as to whether this would really solve the problem, although, as you also stated, it might be the solution.

So for now there is no possibility of installing this patch on production.
Too bad I have no comparable test environment.
Building one would be too complex for now.

Thx anyway,
Clemens
Jean-Louis Phelix
Honored Contributor

Re: program dumps core

Clemens,

I'm very sorry for not having read your post correctly ... I think that this could solve the core dump (message "unexpected fstat error" from cma_dump.log). My hope was simply that once the fstat error is trapped, it will give you the reason why it doesn't work, which could help you more than reading a core to solve the problem.

I'm not sure that "The accept() call in cma in turn calls fstat() to obtain socket information. Sometimes, fstat() returns an error number, which is not handled in the code ..." was visible outside of HP.

No points, please.

Good luck.
Clemens van Everdingen
Honored Contributor

Re: program dumps core

Jean,

No problemo.
Maybe I will plan a maintenance window some weekend to install the patch anyway, hoping there will be more info available afterwards.

Thanks anyway for the help so far.

Clemens