Sorin Ifrim_1
New Member

swagentd stops with core dump

Hi,

We have a problem on an rp5450 machine running HP-UX 11.11. Each time we invoke swinstall, swagentd dies, a core dump is generated, and a cma_dump.log file is created.

Output of # what core:

core:
$Revision: SD-UX/B.11.20.00.06, CONTROLLER, UDL_11i_Mar02OEUR_IC3, Optimized, Build Env UDL_SDVBE_DAV/BE10.20_IC12, Built Dec 17 2001 07:05:23$
HP DCE/9000 1.5 PHSS_20608 Module: libcma.1 (Export) Date: Dec 8 1999 18:41:23
HP DCE/9000 1.5 PHSS_19739-40 Module: libdce.1 (U.S./Canada only) Date: Sep 4 1999 07:37:55
rec_seq.c 8.2 (Berkeley) 9/7/93
$RCSfile: environment.c,v $ $Revision: /main/HPDCE02/2 $ (OSF) $Date: 1994/12/05 19:53 UTC $
libXOM 1.9 (BULL S.A) 7/1/92
PATCH-PHCO_20098 for 10.20; for 10.30, 11.x compatibility libc.1_ID@@/main/r10dav/libc_dav/libc_dav_cpe/9
/ux/core/libs/libc/shared_pa1/libc.1_ID
Oct 8 1999 10:39:52
ObAM Version 4.2.16 (build date: Tue Sep 12 08:15:51 MDT 2000)
unknownTag product SysV.4
Dialog Manager IDMuser.h Version 3.4a
X11R6 Motif 1.2
HP-UX libm shared PA1.0 C math library 970220 (133940) UX 10.20
SMART_BIND
92453-07 dld dld dld.sl B.11.18 000922

The cma_dump.log file starts with:

%Internal DCE Threads problem (version CMA BL10+), terminating execution.
% Reason: cma__io_available: unexpected select error
The current thread is 2 (address 0x40096a78)
DECthreads scheduling database is locked.

Thanks.

BRs
Sorin
Eric Antunes
Honored Contributor

Re: swagentd stops with core dump

Hi Sorin,

I've found this thread, hope this will help you:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=117449

Best Regards,

Eric Antunes
Each and every day is a good day to learn.
Sorin Ifrim_1
New Member

Re: swagentd stops with core dump

Hi Eric,

Thanks for this.
Actually, I had already searched the forums for similar problems, including the thread you mentioned.
Unfortunately, three of the links there are not accessible, and the fourth only describes a 10.20 problem with no answer.

BRs
Sorin
Robert-Jan Goossens
Honored Contributor

Re: swagentd stops with core dump

Hi Sorin,

Did you check your installed patches for corruption?

# swlist -l fileset -a state
or run
# check_patches
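
A minimal sketch of how the swlist output could be filtered, in case it helps. It assumes the usual layout of `swlist -l fileset -a state` (comment lines starting with `#`, then one "fileset state" pair per line); the function name is made up for illustration.

```shell
# Sketch: print every fileset whose state is anything other than "configured".
# Assumption: input is `swlist -l fileset -a state` output, where header and
# product lines start with '#' and fileset lines end with the state.
non_configured_filesets() {
    awk '!/^#/ && NF >= 2 && $NF != "configured" { print $1, $NF }'
}

# On the affected host (only works once swlist itself runs):
#   swlist -l fileset -a state | non_configured_filesets
```

An empty result would mean every fileset is in the configured state.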

Best regards,
Robert-Jan
Cheryl Griffin
Honored Contributor

Re: swagentd stops with core dump

The fact that swinstall is dumping and DCE is involved indicates a potential problem with name resolution. DCE (rpcd) services are used to determine who the machine is.

Check the following:
# nslookup ip_addr (returns hostname)
# nslookup hostname (returns ip_addr)
# netstat -in
# ifconfig lan0

Make sure the network config looks correct.
Restart swagentd. See if you can swlist to check for corrupt patches as well.
# swlist -l fileset -a state |egrep -e "inst|tran|corr"
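
To make the forward-lookup check a bit more mechanical, here is a rough sketch that counts how many addresses nslookup reports for a host name. It assumes the answer section begins at a "Name:" line and is followed by "Address:" or "Addresses:" lines; the exact output format differs between nslookup versions, and the function name is hypothetical.

```shell
# Sketch: count the addresses in the answer section of nslookup output.
# Assumption: the name server's own "Address:" line comes before the
# "Name:" line and must not be counted.
count_resolved_addresses() {
    awk '
        /^Name:/ { in_answer = 1; next }
        in_answer && /^Address(es)?:/ {
            sub(/^Address(es)?:[ \t]*/, "")   # drop the label
            sub(/[ \t]+$/, "")                # drop trailing blanks
            n += split($0, parts, /[ \t,]+/)  # one entry per address
        }
        END { print n + 0 }
    '
}

# On the affected host:
#   nslookup hostname | count_resolved_addresses
```

Anything other than 1 for a single-homed machine is worth a closer look at DNS and /etc/hosts.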
"Downtime is a Crime."
Kent Ostby
Honored Contributor

Re: swagentd stops with core dump

Sorin --

We have seen this a few times. Patch PHKL_25869 fixed it.

PHKL_27727 is the latest version of that patch.

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Dietmar Konermann
Honored Contributor

Re: swagentd stops with core dump

Sorin,

I'm quite sure that you're hitting this problem:

PHKL_27727:
( SR:8606261225 CR:JAGae25547 )
gettimeofday(2) causes unexpected abort.
Calling gettimeofday may return a timeval structure with
a negative microsecond field.
PHKL_25869:
( SR:8606225194 CR:JAGad94281 )
Calling gettimeofday() from user space may return a timeval
structure with a negative microsecond field.
ServiceGuard will routinely call gettimeofday(), and if it
receives one of these invalid timeval structures, it will
reset the machine unexpectedly.

The ServiceGuard aborts described here showed the same DCE Threads problem that you see with swagentd.

So I would strongly recommend installing PHKL_27727.

Best regards...
Dietmar.
"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)
Robert-Jan Goossens
Honored Contributor

Re: swagentd stops with core dump

Dietmar,

I don't want to sound funny, but how can you swinstall a patch when there is a problem with swinstall? (Single-user mode?)

---
Each time when invoking swinstall, the swagentd dies
---

Best regards,
Robert-Jan
Sorin Ifrim_1
New Member

Re: swagentd stops with core dump

Hi,


Thanks, everybody, for the answers. Actually, we have two of the problems that were listed here:
- a misconfiguration in the DNS (nslookup hostname returns 2 IP addresses)
- the gettimeofday() problem. We had noticed that already, as the system time was jumping back and forth by as much as 4192 s.
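
For anyone wanting to confirm the clock jumps, a small sketch follows. It only analyses a list of timestamps (seconds since the epoch, one per line); how the samples are collected is left open and is an assumption here, as is the function name.

```shell
# Sketch: report every sample that is earlier than the one before it.
# Input: one epoch-seconds value per line, in the order it was sampled.
report_backward_jumps() {
    awk '
        NR > 1 && $1 + 0 < prev {
            printf "line %d: clock went back %d s\n", NR, prev - $1
        }
        { prev = $1 + 0 }
    '
}
```

No output means the samples never went backwards.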

Still the issue of how to apply the patch remains.

BRs
Sorin

Robert-Jan Goossens
Honored Contributor

Re: swagentd stops with core dump

Is it possible to correct the DNS problem?

Could you check whether the IPs are the same in the /etc/hosts file and in DNS?

Once you have corrected the IP problem, restart the swagentd daemon.

# swagentd -k
# swagentd -r

Try to install the patch now.
Dietmar Konermann
Honored Contributor

Re: swagentd stops with core dump

Robert-Jan,

>I don't want to sound funny, but how can you
>swinstall a patch when there is a problem
>with swinstall ? (single user mode ?)

Hey! Good question. :) I missed the point that swagentd *always* cores. Then it gets somewhat more complicated, indeed.

You would need to patch the kernel manually, which means in general:

- Identify the object file(s) that are patched by the patch, clock.o in this case.
- Extract the 64-bit version (for rp5450) from the patch depot file using tar(1) and gunzip(1).
- Patch the object into the related kernel library (/usr/conf/lib/libclock-pdk.a) using ar(1).
- Build a new kernel and reboot.
- Now install the patch properly using swinstall... to get the IPD (Installed Product Database) updated.
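
The steps above could be laid out roughly as follows. This is a dry-run sketch only: it prints the commands rather than running them. The depot file name and the path of clock.o inside the depot are placeholders that have to be looked up on the real system, and the function name is made up. The kernel rebuild is left as a comment because the exact commands are not given here.

```shell
# Dry-run sketch: print the manual patch steps for review, execute nothing.
#   $1 = patch depot file (a tar archive)                  -- placeholder
#   $2 = path of the 64-bit clock.o inside the depot       -- placeholder,
#        find it with the "tar tvf" line printed first
#   $3 = kernel library to update (/usr/conf/lib/libclock-pdk.a per the post)
plan_manual_kernel_patch() {
    depot=$1
    member=$2
    lib=$3
    obj=$(basename "$member")
    echo "tar tvf $depot | grep $obj"
    echo "tar xvf $depot $member"
    # Only needed if the depot stores the object gzip-compressed.
    echo "gunzip < $member > ./$obj"
    # Keep a copy of the original library before touching it.
    echo "cp -p $lib $lib.orig"
    echo "ar r $lib $obj"
    echo "# rebuild the kernel, reboot, then swinstall the patch to update the IPD"
}
```

Reviewing the printed plan before running anything by hand seems prudent, since a broken kernel library means an unbootable new kernel.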

Best regards...
Dietmar.
"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)