Operating System - HP-UX

Re: "false" Oracle corruption messages

 
A. Clay Stephenson
Acclaimed Contributor


Hi:

This does look very strange:

The lseek is failing on fdes 0; EBADF indicates that fdes is not open
lseek(0, 33832960, SEEK_SET) ..... ERR#9 EBADF
...
...
open("/usr/lib/nls/msg/C/strerror.cat", O_LARGEFILE, 0177777) .......... = 0

This open returns 0 as a file descriptor; open returns the lowest available file descriptor.
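
The mechanism can be reproduced in a few lines. The following is a minimal Python sketch, not taken from the trace: it uses an arbitrary descriptor on /dev/null instead of descriptor 0 so that it doesn't disturb standard input, and the function name closed_descriptor_demo is made up for illustration.

```python
import errno
import os


def closed_descriptor_demo():
    """Return (errno name from lseek on a closed fd, original fd, fd from the next open)."""
    # open() hands out the lowest descriptor number that is currently free.
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)  # that number is now free again

    try:
        # Same offset as in the trace; the descriptor is no longer open.
        os.lseek(fd, 33832960, os.SEEK_SET)
        err = None
    except OSError as exc:
        err = errno.errorcode[exc.errno]

    # The next open() reuses the lowest free number, i.e. the one just closed
    # (assuming no other thread opened a file in between).
    reopened = os.open(os.devnull, os.O_RDONLY)
    os.close(reopened)
    return err, fd, reopened


if __name__ == "__main__":
    print(closed_descriptor_demo())
```

In the trace the same thing happens with descriptor 0: the lseek fails because 0 has been closed, and the later open of strerror.cat is handed 0 because it is the lowest free number.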

You need to do some more digging to find where close(0) is called before the lseek that fails.
This is looking more and more like a software bug, but I would definitely set timeslice to 10, because your current setting can cause all sorts of very strange behavior. If changing timeslice doesn't fix this, it's probably time to call Oracle support.

If it ain't broke, I can fix that.
Carlos Fernandez Riera
Honored Contributor

Re: "false" Oracle corruption messages


I recall a similar message on 8.1.6 (more or less). There is a patch from Oracle. See Oracle's alert.log on the database server.

sar -v reports that the inode table is at its maximum value. If you are using HFS filesystems, you need to raise the ninode kernel parameter.



unsupported
Dirk Fieremans
Advisor

Re: "false" Oracle corruption messages

All filesystems are VxFS.

I checked with Oracle and they recommend to "upgrade to 8.1.7.2 and then apply patch for BUG 1247796."
This patch is related to ASYNCH_IO, which we're not using, however. Upgrading will be very difficult, since the product is only supported on Oracle 8.1.5 (which is no longer supported by Oracle), so as usual I'm stuck between a rock and a hard place!

I'll try to convince our Change Control Board to give me some downtime to change the timeslice value.

many thanks,
Dirk
Ruediger Noack
Valued Contributor

Re: "false" Oracle corruption messages

Hi Dirk,

I'm interested in your lock.sh script. Would you please post this script as an attachment?

Thanks a lot
Ruediger
Dirk Fieremans
Advisor

Re: "false" Oracle corruption messages

I had to do some preparatory work before it would run on my 64-bit HP-UX 11.00:
#cp /stand/vmunix /stand/vmunix.orig
#q4pxdb /stand/vmunix

In the script, replace the line /usr/contrib/bin/q4 /stand/vmunix /dev/mem with /usr/contrib/Q4/bin/q4 /stand/vmunix /dev/kmem

#sh /tmp/lock.sh
#cat /tmp/outputfile

regards,
Dirk
Dirk Fieremans
Advisor

Re: "false" Oracle corruption messages

Info from HP support:
The lseek() tries to seek on file descriptor 0 of the process. The lseek() fails because that file descriptor isn't open anymore (in other words, it was closed earlier with a close() call). How do we know this? Because the later open() returns file descriptor 0, and open() always hands out the lowest available file descriptor.

The result of this EBADF is that Oracle sends the iwserver process a message saying that it got EBADF.
The message is passed via the file "/opt/oracle/product/8.1.5/rdbms/mesg/oraus.msb".
The iwserver then communicates this further to the iwclient process on the PC.

That leaves the following open questions:

1/ When is file descriptor 0 of the "oracleVANPROD" process closed?

2/ Why is the file descriptor closed?

3/ Why doesn't the Oracle process notice that the file descriptor is closed, and why does it persist in doing an lseek()?

Running a continuous tusc on the Oracle process may reveal when it's closed and by whom.
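
Once a long tusc log has been captured, one way to do that digging is to filter it for the calls that touch descriptor 0. A rough Python sketch, assuming each trace line has the shape of the two lines quoted earlier in this thread (the call, a run of dots, then either "= value" or "ERR#n NAME"). The close(0) line in the sample and the name find_fd_events are hypothetical, and a real log may need a looser pattern.

```python
import re

# Assumed line shape, modelled on the trace lines quoted in this thread:
#   name(args) ..... = value     or     name(args) ..... ERR#9 EBADF
CALL = re.compile(r'^(?P<name>\w+)\((?P<args>.*)\)\s*\.+\s*(?P<result>.*)$')


def find_fd_events(lines, fd=0):
    """Return (line number, event) pairs for calls that close, fail on, or re-open fd."""
    events = []
    for number, line in enumerate(lines, start=1):
        match = CALL.match(line.strip())
        if match is None:
            continue  # not a system-call line in the assumed format
        name = match.group("name")
        first_arg = match.group("args").split(",")[0].strip()
        result = match.group("result").strip()

        if name == "close" and first_arg == str(fd):
            events.append((number, "close"))
        elif first_arg == str(fd) and "EBADF" in result:
            events.append((number, "EBADF"))
        elif name == "open" and result == f"= {fd}":
            events.append((number, "open"))
    return events


if __name__ == "__main__":
    sample = [
        'close(0) ..... = 0',  # hypothetical line, format assumed
        'lseek(0, 33832960, SEEK_SET) ..... ERR#9 EBADF',
        'open("/usr/lib/nls/msg/C/strerror.cat", O_LARGEFILE, 0177777) .......... = 0',
    ]
    for number, event in find_fd_events(sample):
        print(number, event)
```

The line number of the last "close" event before the first "EBADF" event points at the part of the trace to read in full, which should show what the process was doing when it closed the descriptor.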

Dirk
Dirk Fieremans
Advisor

Re: "false" Oracle corruption messages

We found the culprit: apparently there's a bug in Oracle's UTL_HTTP package in pre-8.1.7 versions.
We implemented a workaround for this package, and that solved the "false" corruptions.

Dirk