
Process hangs in sched_yield() in HP-UX11.31

 
kumasudh
Advisor


Hello Gurus,

I am facing a strange problem on the HP-UX 11.31 platform. My process hangs in the sched_yield() libc call. Here is the backtrace of the process:

(gdb) attach 22999
Attaching to program: /opt/OC/bin/smsaudit, process 22999
0x60000000c0423b50:0 in sched_yield+0x30 () from /usr/lib/hpux32/libc.so.1
(gdb) bt
#0 0x60000000c0423b50:0 in sched_yield+0x30 () from /usr/lib/hpux32/libc.so.1
#1 0x60000000ca3ff4b0:0 in skgesig_get_ownership () at skgesig.c:407
#2 0x60000000ca3ff5e0:0 in skgesig_sigactionHandler () at skgesig.c:726
#3 <signal handler called>
#4 0x60000000ca416790:1 in sskgds_getprocstart () at sskgds.c:509
#5 0x60000000ca416a10:0 in sskgds_getcall () at sskgds.c:580
#6 0x60000000ca3fc7a0:0 in skgdsgframe () at skgds.c:240
#7 0x60000000c9bc0b60:0 in kgdsdst () at kgds.c:526
#8 0x60000000c9bc4fb0:0 in kgdsdstsg () at kgds.c:902
#9 0x60000000caa65640:0 in dbgc_dmp () at dbgc.c:3961
#10 0x60000000ca87fb30:0 in dbgexExecuteDiagDmp () at dbgex.c:2648
#11 0x60000000ca80d320:0 in dbgeNoInvocationMode () at dbge.c:775
#12 0x60000000cb045b10:0 in $cold_dbgeBeginInvoke+0x2f0 ()
from /uorabin/app/oracle/client32/product/11/lib/libclntsh.so.10.1
#13 0x60000000ca80f130:0 in dbgePostErrorDirect () at dbge.c:1152
#14 0x60000000cab1c900:0 in kpeDbgSignalHandler () at kpedbg.c:1042
#15 0x60000000ca3ff710:0 in skgesig_sigactionHandler () at skgesig.c:740
#16 <signal handler called>
#17 0x60000000ca416790:1 in sskgds_getprocstart () at sskgds.c:509
#18 0x60000000ca416a10:0 in sskgds_getcall () at sskgds.c:580
#19 0x60000000ca3fc7a0:0 in skgdsgframe () at skgds.c:240
#20 0x60000000c9bc5ec0:0 in kgds_skip_frames () at kgds.c:1365
#21 0x60000000c9bc5160:0 in kgdsdsts_extra () at kgds.c:2206
---Type <return> to continue, or q <return> to quit---
#22 0x60000000c9bc5020:0 in kgdsdsts () at kgds.c:965
#23 0x60000000ca86a160:0 in dbgemdGetCallStackWFlag () at dbgemd.c:350
#24 0x60000000ca869fb0:0 in dbgemdGetCallStack () at dbgemd.c:277
#25 0x60000000ca86a200:0 in dbgemdFillCompFunNames () at dbgemd.c:380
#26 0x60000000ca86afd0:0 in dbgemdFillIncCtx () at dbgemd.c:691
#27 0x60000000ca8773c0:0 in dbgexPopulateIncCtx () at dbgex.c:1551
#28 0x60000000ca8764e0:0 in dbgexProcessError () at dbgex.c:1022
#29 0x60000000ca80e360:0 in dbgeExecuteForError () at dbge.c:1238
#30 0x60000000ca80f1d0:0 in dbgePostErrorDirect () at dbge.c:1153
#31 0x60000000cab1c900:0 in kpeDbgSignalHandler () at kpedbg.c:1042
#32 0x60000000ca3ff710:0 in skgesig_sigactionHandler () at skgesig.c:740
#33 <signal handler called>
#34 0x453e820:1 in conversion::removeFieldNamesAndFillDico ()
at build/OCSMPcore/code/Audit/conversion.C:2591
#35 0x453bb50:0 in sms_audit::writeRowToASCIIFile ()
at build/OCSMPcore/code/Audit/sms_audit.C:4072
#36 0x45424b0:0 in sms_audit::writeSMStableToFile ()
at build/OCSMPcore/code/Audit/sms_audit.C:3198
#37 0x452e480:0 in sms_audit::auditTB ()
at build/OCSMPcore/code/Audit/sms_audit.C:1528
#38 0x452bd10:0 in sms_audit::auditNS ()
at build/OCSMPcore/code/Audit/sms_audit.C:1021
#39 0x4529ef0:0 in sms_audit::auditDB ()
at build/OCSMPcore/code/Audit/sms_audit.C:837
#40 0x44e7e10:0 in sms_audit::auditAll ()
at build/OCSMPcore/code/Audit/sms_audit.C:647
#41 0x42985f0:0 in main () at build/OCSMPcore/code/Audit/Main.C:1355


The HP-UX version is

HPUX11i-VSE-OE B.11.31.0903 HP-UX Virtual Server Operating Environment

and the libc patches are

#swlist -l patch |grep -i libc
# COMPLIBS.LIBCPS-IA32 B.11.31 COMPLIBS
# COMPLIBS.LIBCPS-IA64 B.11.31 COMPLIBS
# COMPLIBS.LIBCPS-INC B.11.31 COMPLIBS
# COMPLIBS.LIBCPS-IS32 B.11.31 COMPLIBS
# COMPLIBS.LIBCPS-IS64 B.11.31 COMPLIBS
# COMPLIBS.LIBCPS-PS32 B.11.31 COMPLIBS
# COMPLIBS.LIBCPS-PS64 B.11.31 COMPLIBS
# PHCO_37128 1.0 libc manpage cumulative patch
# PHCO_38044 1.0 libc cumulative header file patch
# PHCO_38048 1.0 libc cumulative patch
# PHSS_37959 1.0 LIBCL patch

What can I check in order to fix this problem?

Best Regards
Sudhir
11 REPLIES
kumasudh
Advisor

Re: Process hangs in sched_yield() in HP-UX11.31

Hello Gurus,

One more point to add: the process doesn't hang for the root user. It hangs only for a non-root user, even though this user has execute privileges for the program.

Best Regards
Sudhir
kumasudh
Advisor

Re: Process hangs in sched_yield() in HP-UX11.31

One more point to be made here: the process sometimes hangs in sched_yield() and sometimes in gettimeofday(). Here is the bt for the second case.

(gdb) bt
#0 0x60000000c04220d0:0 in gettimeofday+0x30 () from /usr/lib/hpux32/libc.so.1
#1 0x60000000cc24ed40:0 in sltrgcs () at sltrg.c:168
#2 0x60000000ca3ff4c0:0 in skgesig_get_ownership () at skgesig.c:409
#3 0x60000000ca3ff5e0:0 in skgesig_sigactionHandler () at skgesig.c:726
#4 <signal handler called>
#5 0x60000000ca416790:1 in sskgds_getprocstart () at sskgds.c:509
#6 0x60000000ca416a10:0 in sskgds_getcall () at sskgds.c:580
#7 0x60000000ca3fc7a0:0 in skgdsgframe () at skgds.c:240
#8 0x60000000c9bc0b60:0 in kgdsdst () at kgds.c:526
#9 0x60000000c9bc4fb0:0 in kgdsdstsg () at kgds.c:902
#10 0x60000000caa65640:0 in dbgc_dmp () at dbgc.c:3961
#11 0x60000000ca87fb30:0 in dbgexExecuteDiagDmp () at dbgex.c:2648
#12 0x60000000ca80d320:0 in dbgeNoInvocationMode () at dbge.c:775
#13 0x60000000cb045b10:0 in $cold_dbgeBeginInvoke+0x2f0 ()
from /uorabin/app/oracle/client32/product/11/lib/libclntsh.so.10.1
#14 0x60000000ca80f130:0 in dbgePostErrorDirect () at dbge.c:1152
#15 0x60000000cab1c900:0 in kpeDbgSignalHandler () at kpedbg.c:1042
#16 0x60000000ca3ff710:0 in skgesig_sigactionHandler () at skgesig.c:740
#17 <signal handler called>
#18 0x60000000ca416790:1 in sskgds_getprocstart () at sskgds.c:509
#19 0x60000000ca416a10:0 in sskgds_getcall () at sskgds.c:580
#20 0x60000000ca3fc7a0:0 in skgdsgframe () at skgds.c:240
#21 0x60000000c9bc5ec0:0 in kgds_skip_frames () at kgds.c:1365
---Type <return> to continue, or q <return> to quit---
#22 0x60000000c9bc5160:0 in kgdsdsts_extra () at kgds.c:2206
#23 0x60000000c9bc5020:0 in kgdsdsts () at kgds.c:965
#24 0x60000000ca86a160:0 in dbgemdGetCallStackWFlag () at dbgemd.c:350
#25 0x60000000ca869fb0:0 in dbgemdGetCallStack () at dbgemd.c:277
#26 0x60000000ca86a200:0 in dbgemdFillCompFunNames () at dbgemd.c:380
#27 0x60000000ca86afd0:0 in dbgemdFillIncCtx () at dbgemd.c:691
#28 0x60000000ca8773c0:0 in dbgexPopulateIncCtx () at dbgex.c:1551
#29 0x60000000ca8764e0:0 in dbgexProcessError () at dbgex.c:1022
#30 0x60000000ca80e360:0 in dbgeExecuteForError () at dbge.c:1238
#31 0x60000000ca80f1d0:0 in dbgePostErrorDirect () at dbge.c:1153
#32 0x60000000cab1c900:0 in kpeDbgSignalHandler () at kpedbg.c:1042
#33 0x60000000ca3ff710:0 in skgesig_sigactionHandler () at skgesig.c:740
#34 <signal handler called>
#35 0x453e820:1 in conversion::removeFieldNamesAndFillDico ()
at build/OCSMPcore/code/Audit/conversion.C:2067
#36 0x453bb50:0 in sms_audit::writeRowToASCIIFile ()
at /opt/aCC/include_std/rw/stdmutex.h:387
#37 0x45424b0:0 in smsCx::getNextSMSrow () at /opt/aCC/include_std/string:993
#38 0x452e480:0 in sms_audit::auditTB ()
at build/OCSMPcore/code/Audit/sms_audit.C:1702
#39 0x452bd10:0 in sms_audit::auditNS ()
at build/OCSMPcore/code/Audit/sms_audit.C:952
#40 0x4529ef0:0 in sms_audit::auditDB ()
---Type <return> to continue, or q <return> to quit---
at build/OCSMPcore/code/Audit/sms_audit.C:917
#41 0x44e7e10:0 in sms_audit::auditAll ()
at build/OCSMPcore/code/Audit/sms_audit.C:0
#42 0x42985f0:0 in main () at build/OCSMPcore/code/Audit/Main.C:1363


Best Regards
Sudhir
Dennis Handly
Acclaimed Contributor

Re: Process hangs in sched_yield() in HP-UX11.31

Why do you think it is hanging? Whose application is this?

>#3 <signal handler called>

Do you know what signal you got?
kumasudh
Advisor

Re: Process hangs in sched_yield() in HP-UX11.31

Hi Dennis,

Thanks a lot for your reply. I say the process is hanging because it normally finishes its task within a few seconds. It is basically a CLI that returns successfully within a few seconds of being run. I don't see which signal was sent, as it is not evident in the stack backtrace.

Best Regards
Sudhir
Dennis Handly
Acclaimed Contributor

Re: Process hangs in sched_yield() in HP-UX11.31

>hanging because it normally finishes its task within a few seconds. It is basically a CLI

Was this a corefile or could you step and run in the debugger?

>I don't see which signal has been sent as it is not evident in the stack backtrace.

You would have to track the machine code to see where the signal # is stored. If not stored in the handler, you would have to extract it from the sigcontext.
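
As a point of reference for where the signal number lives: a handler installed with SA_SIGINFO receives it twice, once as the first argument and once as si_signo in the siginfo_t. The following is a minimal sketch in portable POSIX C, not the Oracle client's handler; record_signal, install_recorder and the recorded_* accessors are hypothetical names introduced here.

```c
#define _XOPEN_SOURCE 700   /* request POSIX sigaction/siginfo_t declarations */
#include <signal.h>
#include <string.h>

/* Values captured by the handler (hypothetical names for this sketch). */
static volatile sig_atomic_t last_signo = 0;    /* first handler argument */
static volatile sig_atomic_t last_si_signo = 0; /* siginfo_t.si_signo     */
static volatile sig_atomic_t last_si_code = 0;  /* siginfo_t.si_code      */

/* Three-argument handler form, selected by SA_SIGINFO. */
static void record_signal(int signo, siginfo_t *info, void *context)
{
    (void)context;  /* points at the saved context of the interrupted code */
    last_signo = signo;
    last_si_signo = info->si_signo;
    last_si_code = info->si_code;
}

/* Install record_signal for one signal; returns 0 on success, -1 on error. */
int install_recorder(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

int recorded_signo(void)    { return (int)last_signo; }
int recorded_si_signo(void) { return (int)last_si_signo; }
int recorded_si_code(void)  { return (int)last_si_code; }
```

Calling install_recorder(SIGUSR1) and then raise(SIGUSR1) leaves SIGUSR1 in both recorded_signo() and recorded_si_signo(). In a debugger, those same two handler arguments are what you would look for in a signal frame.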
kumasudh
Advisor

Re: Process hangs in sched_yield() in HP-UX11.31

Hi Dennis,

I ran the CLI, then attached the debugger to it, and that is how I got the bt.

I don't know how to extract the signal from the signal context :(

Best Regards
Sudhir
rick jones
Honored Contributor

Re: Process hangs in sched_yield() in HP-UX11.31

Are you certain it is *hanging* in sched_yield() and not simply calling it over and over again? It could simply be chance that the process happens to be in sched_yield() at the moment you stop it to get the backtrace. One way to test that hypothesis would be to take a tusc system call trace of the process.
there is no rest for the wicked yet the virtuous have no pillows
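
To illustrate the difference between hanging in sched_yield() and calling it repeatedly, here is what a spin-wait that yields between checks of a flag can look like. It is a sketch under assumptions: yield_until_set and its max_yields bound are hypothetical, and the thread does not show what skgesig_get_ownership actually does.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations for sched_yield */
#include <sched.h>
#include <signal.h>

/*
 * Poll *flag, yielding the CPU between checks, for at most max_yields
 * iterations. Returns the number of sched_yield() calls that were made.
 */
long yield_until_set(const volatile sig_atomic_t *flag, long max_yields)
{
    long yields = 0;
    while (!*flag && yields < max_yields) {
        sched_yield();  /* returns quickly; the loop simply calls it again */
        yields++;
    }
    return yields;
}
```

If nothing ever sets the flag, an unbounded version of this loop never exits, yet the process is not blocked: a backtrace taken at a random moment will most likely show sched_yield() on top, and a system call trace would show a steady stream of sched_yield() calls.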
Dennis Handly
Acclaimed Contributor

Re: Process hangs in sched_yield() in HP-UX11.31

>I don't have knowledge how to extract the signal from signal context :( .

If you are lucky this might work:
(gdb) p $r32
$11 = 11
(gdb) p *(int*)($r33+8)
$12 = 11

>rick: not simply calling it over and over again?

That's what I suspected. The whole purpose of sched_yield is to "hang". ;-)
Dennis Handly
Acclaimed Contributor

Re: Process hangs in sched_yield() in HP-UX11.31

Oops, I noticed you have several signal frames:
(gdb) frame 2
(gdb) p $r32
(gdb) p *(int*)($r33+8)
(gdb) frame 15
(gdb) p $r32
(gdb) p *(int*)($r33+8)
(gdb) frame 32
(gdb) p $r32
(gdb) p *(int*)($r33+8)
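
Several signal frames in one backtrace mean that a handler was itself interrupted by a signal before it returned. The sketch below reproduces that stacking with SIGUSR1 and SA_NODEFER so that it can be run safely. Whether the library in the backtrace installs its handler this way is an assumption, and nesting_handler and nested_signal_depth are hypothetical names.

```c
#define _XOPEN_SOURCE 700   /* request POSIX sigaction declarations */
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t depth = 0;        /* handlers currently active */
static volatile sig_atomic_t max_depth = 0;    /* deepest nesting observed  */
static volatile sig_atomic_t wanted_depth = 0; /* nesting level to reach    */

static void nesting_handler(int signo)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    if (depth < wanted_depth)
        raise(signo);  /* with SA_NODEFER this re-enters the handler at once */
    depth--;
}

/*
 * Deliver SIGUSR1 and let the handler re-raise it until `levels` handler
 * activations are stacked. Returns the deepest nesting seen, or -1 if
 * sigaction() fails.
 */
int nested_signal_depth(int levels)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = nesting_handler;
    sa.sa_flags = SA_NODEFER;  /* do not block the signal inside its handler */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    depth = 0;
    max_depth = 0;
    wanted_depth = levels;
    raise(SIGUSR1);

    sigaction(SIGUSR1, &old, NULL);  /* restore the previous disposition */
    return (int)max_depth;
}
```

nested_signal_depth(3) stacks three handler activations on top of each other, which is the same shape as the three skgesig_sigactionHandler frames in the backtrace.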
kumasudh
Advisor

Re: Process hangs in sched_yield() in HP-UX11.31

Hello Dennis,

I am able to see the signal.

Breakpoint 1, conversion::removeFieldNamesAndFillDico ()
at build/OCSMPcore/code/Audit/conversion.C:1930
1930 build/OCSMPcore/code/Audit/conversion.C: No such file or directory.
in build/OCSMPcore/code/Audit/conversion.C
(gdb) conti
Continuing.

Program received signal SIGSEGV, Segmentation fault
si_code: 1 - SEGV_MAPERR - Address not mapped to object.
0x453e820:1 in conversion::removeFieldNamesAndFillDico ()
at build/OCSMPcore/code/Audit/conversion.C:2591
2591 in build/OCSMPcore/code/Audit/conversion.C
(gdb) bt
#0 0x453e820:1 in conversion::removeFieldNamesAndFillDico ()
at build/OCSMPcore/code/Audit/conversion.C:2591
#1 0x453bb50:0 in sms_audit::writeRowToASCIIFile ()
at build/OCSMPcore/code/Audit/sms_audit.C:4072
#2 0x45424b0:0 in sms_audit::writeSMStableToFile ()
at build/OCSMPcore/code/Audit/sms_audit.C:3198
#3 0x452e480:0 in sms_audit::auditTB ()
at build/OCSMPcore/code/Audit/sms_audit.C:1528
#4 0x452bd10:0 in sms_audit::auditNS ()
at build/OCSMPcore/code/Audit/sms_audit.C:1021
#5 0x4529ef0:0 in sms_audit::auditDB ()
at build/OCSMPcore/code/Audit/sms_audit.C:837
#6 0x44e7e10:0 in sms_audit::auditAll ()
at build/OCSMPcore/code/Audit/sms_audit.C:647
#7 0x42985f0:0 in main () at build/OCSMPcore/code/Audit/Main.C:1355

Regards
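
For reference, the si_code that gdb prints for SIGSEGV can be decoded with the constants from <signal.h>. A small sketch; segv_code_name is a hypothetical helper, not part of any library.

```c
#define _XOPEN_SOURCE 700   /* request the SEGV_* si_code constants */
#include <signal.h>

/* Map a SIGSEGV si_code to its symbolic name. */
const char *segv_code_name(int si_code)
{
    switch (si_code) {
    case SEGV_MAPERR: return "SEGV_MAPERR"; /* address not mapped to object */
    case SEGV_ACCERR: return "SEGV_ACCERR"; /* invalid permissions for mapped object */
    default:          return "unknown";
    }
}
```

SEGV_MAPERR, as reported here, means the faulting address was not mapped at all, which typically points to a null or otherwise invalid pointer rather than a permissions problem.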
Dennis Handly
Acclaimed Contributor

Re: Process hangs in sched_yield() in HP-UX11.31

>I am able to see the signal.
#0 0x453e820:1 conversion::removeFieldNamesAndFillDico

Are you able to figure out the cause?