
SIGABRT after recvfrom system call on HP 11.00

 
ak sunil
New Member


Hi,
I am facing a strange problem on an HP-UX 11.00 system. It affects only that specific system; on other HP systems the application works fine. I have tried applying many patches, but it still does not work properly.

The problem description is as follows:

I am running a third-party application on HP-UX 11.00, but on one particular machine it receives SIGABRT after it is done with the recvfrom system call. I have not been able to find out what the problem is.

Here is a partial tusc log:
--------------------------------
fstat(8, 0x7ebf1e28) ..................................... = 0
close(8) ................................................. = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
ksleep(PTH_CONDVAR_OBJECT, 0x400d7f40, 0x400d7f48, 0x7e797214) = -ETIMEDOUT
clock_gettime(CLOCK_REALTIME, 0x7e797214) ................ = 0
clock_gettime(CLOCK_REALTIME, 0x7e797290) ................ = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
creat("/SMS/vsm1.2/ThirdParty/iona/o2k/NxO2kDomain/iona_services.node_daemon.bnmds01.ior", 420) = 8
clock_gettime(CLOCK_REALTIME, 0x7ebf0a48) ................ = 0
utssys(0x7ebf0ad0, 2048, 5) .............................. = 0
getuid() ................................................. = 190 (190)
open("/var/spool/pwgr/status", O_RDONLY, 0) .............. = 9
mmap(NULL, 532, PROT_READ, MAP_SHARED|MAP_VARIABLE|MAP_FILE|MAP_ADDR32, 9, NULL) = 0xc00c6000
close(9) ................................................. = 0
socket(AF_UNIX, SOCK_DGRAM, 0) ........................... = 9
getpid() ................................................. = 6554 (6533)
unlink("/var/spool/sockets/pwgr/client6554") ............. ERR#2 ENOENT
bind(9, 0x7ebdaaa0, 37) .................................. = 0
fcntl(9, F_SETFD, 1) ..................................... = 0
time(NULL) ............................................... = 1086383911
poll(0x7ebf2a30, 1, 0) ................................... = 1
sendto(9, "\0\0\00 \0\0\001\0\0\0\0\0\0\001".., 48, 0, 0x7ebdaa40, 0x19) = 48
nanosleep(0x7e89f318, NULL) .............................. = 0
poll(0x7ebf2a30, 1, 1000) ................................ = 1
recvfrom(9, "\0\0\0K \0\0\0\0\0\0\0\0\0\0\0; ".., 2064, 0, 0x7ebf2bf8, 0x7ebf2bf4) = 75
getgid() ................................................. = 700 (700)
getpid() ................................................. = 6554 (6533)
poll(0x7ebf31f0, 1, 0) ................................... = 1
sendto(9, "\0\0\00 \0\0\001\0\0\001\0\0\003".., 48, 0, 0x7ebdaa40, 0x19) = 48
poll(0x7ebf31f0, 1, 1000) ................................ = 1
recvfrom(9, "\0\002c5\0\0\001\0\0\0\0\0\002b5".., 2064, 0, 0x7ebf33b8, 0x7ebf33b4) = 709
nanosleep(0x7e89f318, NULL) .............................. = 0
sigprocmask(SIG_UNBLOCK, NULL, 0x7ebf2468) ............... = 0
nanosleep(0x7e89f318, NULL) .............................. = 0
sigaction(SIGABRT, NULL, 0x7ebf2488) ..................... = 0
read(17, 0x737ebb98, 8192) ............................... = 0
close(1) ................................................. = 0
mprotect(0x73867000, 12288, PROT_READ|PROT_WRITE) ........ = 0
mprotect(0x7386a000, 4096, PROT_READ|PROT_WRITE) ......... = 0
_lwp_mutex_unlock(0x7eb42168) ............................ = 0
_lwp_mutex_unlock(0x7eb42168) ............................ = 0
sigaltstack(NULL, 0x737eb300) ............................ = 0
sigaltstack(0x737eb310, NULL) ............................ = 0
munmap(0x73c74000, 528384) ............................... = 0
lwp_detached_exit(0x39ce4, 0x737eb1b8, 16) ............... = 0
sigprocmask(SIG_BLOCK, 0x7ebf2468, NULL) ................. = 0
sigaction(SIGABRT, 0x7ebf2488, NULL) ..................... = 0
sigprocmask(SIG_UNBLOCK, 0x7ebf2468, NULL) ............... = 0
getpid() ................................................. = 6554 (6533)
Received signal 6, SIGABRT, in kill(), [SIG_DFL], no siginfo
kill(6, SIGABRT) ......................................... [entry]
exit(6) [implicit (kill failure)] ........................ WIFSIGNALED(SIGABRT)
select(22, 0x7e9361d4, 0x7e9362d4, 0x7e9363d4, 0x7e9364d4) = 1
read(19, 0x7386cad8, 8192) ............................... = 0
recv(21, 0x402a7678, 16384, 0) ........................... = 0
Received signal 18, SIGCLD, in mprotect(), [SIG_DFL], no siginfo
close(21) ................................................ = 0
mprotect(0x738e8000, 12288, PROT_READ|PROT_WRITE) ........ = 0
mprotect(0x738eb000, 4096, PROT_READ|PROT_WRITE) ......... = 0
_lwp_mutex_unlock(0x7eb42168) ............................ = 0
_lwp_mutex_unlock(0x7eb42168) ............................ = 0
sigaltstack(NULL, 0x7386c300) ............................ = 0
sigaltstack(0x7386c310, NULL) ............................ = 0
munmap(0x737eb000, 528384) ............................... = 0
lwp_detached_exit(0x39bfc, 0x7386c1b8, 16) ............... = 0
--------------------------------


I have also attached the swlist log.

Any ideas what is going wrong with that application?

Thanks in advance,
AK.
Dietmar Konermann
Honored Contributor

Re: SIGABRT after recvfrom system call on HP 11.00

Well...

Received signal 6, SIGABRT, in kill(), [SIG_DFL], no siginfo
kill(6, SIGABRT) ......................................... [entry]

So you receive SIGABRT while being in kill()... this usually means that the process kills itself. Exactly that happens when an assertion (see assert(3X)) fails. Either you find some hints in the application's logs, or the resulting core dump needs to be analyzed.
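To illustrate the self-kill pattern, here is a minimal sketch in Python rather than the application's own language. It assumes a POSIX system with Python 3. The child deliberately calls os.abort(), which wraps the C library's abort(), the same routine a failed C assert() ends up calling. The helper name exit_description is made up for this example. The parent then sees the same kind of status that tusc reports as WIFSIGNALED(SIGABRT).

```python
import signal
import subprocess
import sys


def exit_description(returncode):
    """Describe a subprocess return code (hypothetical helper for this sketch)."""
    # On POSIX, subprocess reports death by signal N as return code -N.
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


# The child aborts itself; nothing outside the process sends the signal.
child = subprocess.run(
    [sys.executable, "-c", "import os; os.abort()"],
    stderr=subprocess.DEVNULL,
)
print(exit_description(child.returncode))
```

The point is only that the signal originates inside the process, so the interesting question is which check in the application decided to abort, not who sent the signal.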

Best regards...
Dietmar.
"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)
ak sunil
New Member

Re: SIGABRT after recvfrom system call on HP 11.00

Hi Dietmar,
Thanks for the hint. I will see if I can find any information in the core file.

I suspect something is wrong with the recvfrom system call. Here is the log for the correct behaviour:
----------------------
creat("/home/kumvinod/vsm20_hp/ThirdParty/iona/o2k/NxO2kDomain/iona_services.node_daemon.hpk360.ior", 420) = 9
clock_gettime(CLOCK_REALTIME, 0x70fe0aa8) ................................ = 0
utssys(0x70fe0b30, 2048, 5) .............................................. = 0
getuid() ................................................................. = 1613 (1613)
open("/var/spool/pwgr/status", O_RDONLY, 0) .............................. = 10
nanosleep(0x70c92318, NULL) .............................................. = 0
mmap(NULL, 532, PROT_READ, MAP_SHARED|MAP_VARIABLE|MAP_FILE|MAP_ADDR32, 10, NULL) = 0xc0033000
close(10) ................................................................ = 0
socket(AF_UNIX, SOCK_DGRAM, 0) ........................................... = 10
getpid() ................................................................. = 12160 (12141)
unlink("/var/spool/sockets/pwgr/client12160") ............................ ERR#2 ENOENT
bind(10, 0x70fcaaa0, 38) ................................................. = 0
fcntl(10, F_SETFD, 1) .................................................... = 0
time(NULL) ............................................................... = 1086945866
poll(0x70fe2a90, 1, 0) ................................................... = 1
sendto(10, "\0\0\00 \0\0\001\0\0\0\0\0\0\001".., 48, 0, 0x70fcaa40, 0x19) = 48
poll(0x70fe2a90, 1, 1000) ................................................ = 1
recvfrom(10, "\0\0\0W \0\0\0\0\0\0\0\0\0\0\0G ".., 2064, 0, 0x70fe2c58, 0x70fe2c54) = 87
getgid() ................................................................. = 350 (350)
getpid() ................................................................. = 12160 (12141)
poll(0x70fe3250, 1, 0) ................................................... = 1
sendto(10, "\0\0\00 \0\0\001\0\0\001\0\0\003".., 48, 0, 0x70fcaa40, 0x19) = 48
poll(0x70fe3250, 1, 1000) ................................................ = 1
recvfrom(10, "\0\001) \0\0\001\0\0\0\0\0\00119".., 2064, 0, 0x70fe3418, 0x70fe3414) = 297
clock_gettime(CLOCK_REALTIME, 0x70fe1f68) ................................ = 0
send(6, "G I O P 0102\0\0\0\0\0ce\0\0\0V ".., 218, 0) .................... = 218
select(23, 0x70d291d4, 0x70d292d4, 0x70d293d4, 0x70d294d4) ............... = 1
recv(22, "G I O P 0102\0\0\0\0\0ce\0\0\0V ".., 16384, 0) ................. = 218
kwakeup(PTH_CONDVAR_OBJECT, 0x400d4150, WAKEUP_ONE, 0x70d29b98) .......... = 0
ksleep(PTH_CONDVAR_OBJECT, 0x400d4150, 0x400d4158, NULL) ................. = 0
clock_gettime(CLOCK_REALTIME, 0x70de0388) ................................ = 0
clock_gettime(CLOCK_REALTIME, 0x70de0e88) ................................ = 0
clock_gettime(CLOCK_REALTIME, 0x70de0288) ................................ = 0
clock_gettime(CLOCK_REALTIME, 0x70de0d88) ................................ = 0
send(22, "G I O P 0102\001\0\0\011\0\0\0V ".., 29, 0) .................... = 29

----------------------


If you compare the last recvfrom system call in the two logs, you will see that it gets 709 bytes instead of 297 bytes. Can you give me a hint as to why this happens?
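For comparing the two traces mechanically, a small script along these lines might help. It is only a sketch: the regular expression is an assumption about the tusc line layout seen in the excerpts above (name, arguments, optional dots, "=", result), and the helper name return_values is invented for the example. The sample lines are copied from the two logs.

```python
import re

# Assumed layout of a completed tusc line: name(args) ..... = result
TUSC_LINE = re.compile(r"^(?P<call>\w+)\((?P<args>.*)\)\s*\.*\s*=\s*(?P<result>\S+)")


def return_values(trace, call):
    """Collect the integer results of every `call` line in a tusc trace."""
    values = []
    for line in trace.splitlines():
        match = TUSC_LINE.match(line.strip())
        if not match or match.group("call") != call:
            continue
        result = match.group("result")
        # Skip symbolic results such as ERR#2 or -ETIMEDOUT.
        if result.lstrip("-").isdigit():
            values.append(int(result))
    return values


failing = r'''
recvfrom(9, "\0\0\0K \0\0\0\0\0\0\0\0\0\0\0; ".., 2064, 0, 0x7ebf2bf8, 0x7ebf2bf4) = 75
recvfrom(9, "\0\002c5\0\0\001\0\0\0\0\0\002b5".., 2064, 0, 0x7ebf33b8, 0x7ebf33b4) = 709
'''
working = r'''
recvfrom(10, "\0\0\0W \0\0\0\0\0\0\0\0\0\0\0G ".., 2064, 0, 0x70fe2c58, 0x70fe2c54) = 87
recvfrom(10, "\0\001) \0\0\001\0\0\0\0\0\00119".., 2064, 0, 0x70fe3418, 0x70fe3414) = 297
'''

print("failing:", return_values(failing, "recvfrom"))
print("working:", return_values(working, "recvfrom"))
```

Note that both replies differ between the machines (75 vs 87 bytes and 709 vs 297 bytes), not only the last one.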

Thanks,
AK.
Dietmar Konermann
Honored Contributor

Re: SIGABRT after recvfrom system call on HP 11.00

Hard to tell without internal knowledge of how the application communicates. Essentially, recvfrom() is called with the len parameter set to 2064... so it's perfectly legal for it to return 709 bytes if the message is 709 bytes long.
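A quick way to convince yourself of this is the following sketch, written in Python for brevity and assuming a POSIX system. It uses an AF_UNIX datagram socket pair as a stand-in for the application's socket; the function name datagram_sizes is invented for the example. Each recvfrom() with a 2064-byte buffer returns exactly one datagram, whatever size the sender chose.

```python
import socket


def datagram_sizes(sizes, bufsize=2064):
    """Send one datagram per entry in `sizes`; return the lengths recvfrom() reports."""
    receiver, sender = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    received = []
    try:
        for size in sizes:
            sender.send(b"x" * size)
            # bufsize is only an upper bound; the datagram's own length is returned.
            data, _address = receiver.recvfrom(bufsize)
            received.append(len(data))
    finally:
        receiver.close()
        sender.close()
    return received


print(datagram_sizes([709, 297]))
```

So the return value by itself only tells you how large the reply was, which is decided by whoever sent it.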

Maybe the contents of this message cause the abort... however, there is no chance to tell why without internal knowledge.

BTW, with tusc's '-r all' option you should at least be able to see the received bytes completely.