pw_wait ERR#11 EAGAIN on oracle parallel slave process

Mariusz Mróz
Occasional Advisor

Hi,

When I use tusc 7.9e (or 7.9) to trace the system calls of Oracle processes (tusc -p ), I get the error ERR#11 EAGAIN on the pw_wait() call.

Example:

tail /tmp/tusc_8732.out
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] gettimeofday(0x9fffffffffffcea0, NULL) ......................................................... = 0
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN
[8732] pw_wait(0x9fffffffffffce70) .................................................................... ERR#11 EAGAIN

Regards,
Mariusz
6 REPLIES
Duncan Edmonstone
Honored Contributor

Re: pw_wait ERR#11 EAGAIN on oracle parallel slave process

...and?

You seem to have missed something out of your post here - why is this an issue?

pw_wait is part of the postwait subsystem, which is a lightweight synchronization mechanism... EAGAIN is a perfectly valid errno from pw_wait for a bunch of reasons... why do you think this is a problem?

HTH

Duncan

Mariusz Mróz
Occasional Advisor

Re: pw_wait ERR#11 EAGAIN on oracle parallel slave process

Hi,

Let me clarify my question: the issue is why some Oracle processes get ERR#11 from the pw_wait() call while others do not.

Example 1:
tusc -p 16231
( Attached to process 16231 ("ora_p060_THDM") [64-bit] )
[16231] write(18, "W A I T # 1 : n a m = ' d i ".., 119) .......................................... = 119
[16231] write(18, "\n", 1) ............................................................................ = 1
[16231] pw_post(2609706318533982420) .................................................................. = 0
[16231] write(18, "W A I T # 1 : n a m = ' P X ".., 117) .......................................... = 117
[16231] write(18, "\n", 1) ............................................................................ = 1
[16231] pw_wait(0x9fffffffffff1400) ................................................................... = 0
[16231] write(18, "W A I T # 1 : n a m = ' P X ".., 119) .......................................... = 119
[16231] write(18, "\n", 1) ............................................................................ = 1
[16231] write(20, "\0\0\01d\0\0\0\09ffffffffc\vfe\0".., 48) ........................................... = 48
[16231] read(20, "9ffffffffc\n90d8\0\0\0\0\0\0\0\0".., 3076) .......................................... = 28
[16231] write(18, "W A I T # 1 : n a m = ' d i ".., 119) .......................................... = 119
[16231] write(18, "\n", 1) ............................................................................ = 1
[16231] pw_post(2609706318533982420) .................................................................. = 0

Here pw_wait returns 0, which is OK.

Example 2:

$tusc -p 23610
( Attached to process 23610 ("ora_p010_THDM") [64-bit] )
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... [sleeping]
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] getrusage(RUSAGE_SELF, 0x9fffffffffff1650) .................................................... = 0
[23610] getrusage(RUSAGE_SELF, 0x9fffffffffff1660) .................................................... = 0
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] getrusage(RUSAGE_SELF, 0x9fffffffffff1650) .................................................... = 0
[23610] getrusage(RUSAGE_SELF, 0x9fffffffffff1660) .................................................... = 0
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] gettimeofday(0x9fffffffffff0380, NULL) ........................................................ = 0
[23610] write(18, "* * * 2 0 0 8 - 1 0 - 2 4 1 ".., 27) ........................................... = 27
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN
[23610] write(18, "W A I T # 1 : n a m = ' P X ".., 126) .......................................... = 126
[23610] write(18, "\n", 1) ............................................................................ = 1
[23610] getrusage(RUSAGE_SELF, 0x9fffffffffff1650) .................................................... = 0
[23610] getrusage(RUSAGE_SELF, 0x9fffffffffff1660) .................................................... = 0
[23610] pw_wait(0x9fffffffffff18c0) ................................................................... ERR#11 EAGAIN

Here pw_wait returns ERR#11. The question is why; I would expect 0.

When I trace the processes' system calls with tusc, pw_wait returning ERR#11 does not look right. The other calls seem OK.
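
To compare the two processes more systematically, the outcomes per system call can be tallied from saved tusc output. This is only a minimal sketch: it assumes the line format seen in the traces above (`[pid] call(args) .... = N`, `ERR#N NAME`, or `[sleeping]`), and the name `tally_tusc` is hypothetical, not part of tusc.

```python
import re
from collections import Counter

# Assumed line shapes, taken from the traces in this thread:
#   [23610] pw_wait(0x9fffffffffff18c0) ........ ERR#11 EAGAIN
#   [16231] pw_wait(0x9fffffffffff1400) ........ = 0
#   [23610] pw_wait(0x9fffffffffff18c0) ........ [sleeping]
LINE_RE = re.compile(
    r"^\[(?P<pid>\d+)\]\s+(?P<call>\w+)\(.*\)\s+\.{2,}\s+"
    r"(?:=\s*(?P<ret>-?\d+)"
    r"|ERR#(?P<errno>\d+)\s+(?P<ename>\w+)"
    r"|\[(?P<state>\w+)\])\s*$"
)

def tally_tusc(lines):
    """Count (pid, syscall, outcome) triples in an iterable of tusc lines.

    The outcome is the errno name (e.g. "EAGAIN"), a bracketed state
    (e.g. "[sleeping]"), or "ok" for any plain "= N" return.
    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        if m.group("ename"):
            outcome = m.group("ename")
        elif m.group("state"):
            outcome = "[" + m.group("state") + "]"
        else:
            outcome = "ok"
        counts[(m.group("pid"), m.group("call"), outcome)] += 1
    return counts
```

For example, `tally_tusc(open("/tmp/tusc_8732.out"))` would give one counter entry per combination of PID, call, and outcome, which makes it easy to see which slaves mostly time out in pw_wait and which mostly get posted.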

Regards,
Mariusz Mroz
Mariusz Mróz
Occasional Advisor

Re: pw_wait ERR#11 EAGAIN on oracle parallel slave process

Hi,

By the way:

I am posting on the forum because I get "WARNING oracle process running out of OS kernel I/O resources" in Oracle trace files (mainly from Oracle parallel slave processes) on Oracle 10.2.0.4 Ent. Server, IA64, HP-UX 11.23.
I patched my DB with p6687381 (Note 6687381.8, "Bug 6687381 - WARNING oracle process running out of OS kernel I/O resources"), but the warnings are still produced. The Oracle development team couldn't find the reason why the messages are produced.
My async device configuration:

#1.
$kcmodule -a | grep -i async

asyncdsk static best

#2.
$ls /dev/asyn*
/dev/async /dev/asyncdsk

#3.
$ll /dev/async
crw-rw---- 1 oracle dba 101 0x000007 Sep 5 10:03 /dev/async

#4.
$ kcmodule -v asyncdsk
Module asyncdsk [4868CE13]
Description Asynchronous Disk I/O Driver
State static (best state)
State at Next Boot static (best state)
Capable static unused
Depends On interface HPUX_11_23:1.0

#5.
pg /etc/privgroup
dba MLOCK CHOWN RTSCHED RTPRIO

My server:
$model
ia64 hp server rx8640
$uname -a
HP-UX myhost B.11.23 U ia64

Regards,
Mariusz Mroz
Dennis Handly
Acclaimed Contributor

Re: pw_wait ERR#11 EAGAIN on oracle parallel slave process

>Oracle Devel. Team couldn't find reason why messages are produced.

I would suggest you have them pursue this with HP, rather than you do it.
Mariusz Mróz
Occasional Advisor

Re: pw_wait ERR#11 EAGAIN on oracle parallel slave process

Hi Dennis,
>I would suggest you have them pursue this with HP, rather than you do it.

I plan to open a new call with HP, but in the last call I was told to use minor number 0x0, the only one supported by HP. That does not satisfy me (e.g. I need delay on /dev/async; all my data files are on raw devices).

BTW, as a guide to setting up the async device correctly on HP-UX, the Oracle development team pointed me to the Sybase ASE 15 docs: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.dc35823_1500/html/uconfig/BBCBEAGF.htm
:)

In the next maintenance window of my system I'll change the minor number from 0x07 to 0x04.
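
For reference, the planned change from 0x07 to 0x04 can be expressed in terms of which minor-number bits it flips. The sketch below is plain bit arithmetic and deliberately says nothing about what each bit means to the async driver; the helper name `changed_bits` is hypothetical.

```python
def changed_bits(old, new):
    """Return the bit positions (0 = least significant) that differ."""
    diff = old ^ new  # XOR leaves a 1 wherever the two values disagree
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

# Values from this thread: current minor 0x07, planned minor 0x04.
print(changed_bits(0x07, 0x04))  # [0, 1]
print(format(0x07, "03b"), "->", format(0x04, "03b"))  # 111 -> 100
```

So the change clears the two low bits and leaves bit 2 set, whereas the 0x0 recommended by HP would clear all three.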
Mariusz Mróz
Occasional Advisor

Re: pw_wait ERR#11 EAGAIN on oracle parallel slave process

1.
From the man page description of pw_wait():


[EAGAIN] pw_wait() was called with a timeout that expired
with no post(s) pending.

So this is a normal situation for the pw_wait system call.
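
The behaviour quoted from the man page can be illustrated by analogy. The sketch below does not call pw_wait or pw_post (those are HP-UX specific); it uses Python's `threading.Semaphore` as a stand-in, and the class `PostWait` and its methods are hypothetical names chosen only to mirror the post/wait idea.

```python
import errno
import threading

class PostWait:
    """Toy stand-in for a post/wait pair; not the HP-UX implementation."""

    def __init__(self):
        # Counts posts that have been made but not yet consumed.
        self._posts = threading.Semaphore(0)

    def post(self):
        """Record one pending post, waking a waiter if there is one."""
        self._posts.release()

    def wait(self, timeout):
        """Return 0 if a post was consumed, or errno.EAGAIN if the
        timeout (in seconds) expired with no post pending."""
        if self._posts.acquire(timeout=timeout):
            return 0
        return errno.EAGAIN

pw = PostWait()
print(pw.wait(0.05) == errno.EAGAIN)  # nobody posted: timeout expires
pw.post()
print(pw.wait(0.05))                  # a post was pending: returns 0
```

In this picture, an idle parallel slave that wakes up on a timer with nothing to do would keep reporting EAGAIN, while a busy one that is posted before its timeout would keep returning 0.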

2.
Regarding the /dev/async problem with Oracle PX processes:

What helped me was patch 8412426: WARNING COULD NOT LOWER THE ASYNCH I/O LIMIT TO 912 FOR SQL DIRECT I/O.

Details:
Patch 8412426 : applied on Wed Sep 09 11:34:25 CEST 2009
Unique Patch ID: 11623362
Created on 6 Sep 2009, 23:16:35 hrs PST8PDT
Bugs fixed:
8412426
Files Touched:
/ksfd.o --> ORACLE_HOME/lib/libserver10.a