Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Wodisch
Honored Contributor

Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi all,

I've got a strange behaviour in terms of the "commit" statements on a multi-cpu SPARC-based Oracle8.1.6 server:
- with 1 cpu "commit" is pretty fast
- with 2 (or more) even faster cpus it takes up to 10 times as long

My tracing/testing identified the access of the current online-redo-logfile as the point of lost time.

Does anybody know any cure?

TIA,
Wodisch
(back again, and happy with the new features of the ITRC)
22 REPLIES
Elmar P. Kolkman
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Could it be that a lot of transactions are run simultaneously? With multiple CPUs they really can run simultaneously, with this result, whereas with 1 CPU they run more or less sequentially. At least no other session is using the redo log while the commit is running.
Every problem has at least one solution. Only some solutions are harder to find.
Massimo Bianchi
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi,
I need some data:

- redo log: are they on a FS or on raw devices? Think about putting them on raw devices; it can be done online.

- redo log size: how many are there and how big are they? Usually, for performance, many small ones are best, around 40/80MB each. At one of my customers, they have 90*80MB redo logs; it's a 2TB DB.

- redo log switch rate: checking alert_<SID>.log, how many minutes pass between switches? Best would be one every 15/20 minutes... change it using the suggestions above.

- log buffer: how many MB? A larger one, up to 5MB, can help in such a situation of high concurrency. It really depends on the typical use of the DB.

Massimo
Steven E. Protter
Exalted Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

At install time Oracle compiles its binaries based on how the system is set up at the time. If it detects two CPUs it may use different compile options.

Do you still have the install logs?

Based on what's going on, my guess is that it was installed when there was one CPU and none of the SMP features were turned on.

On my dual processor HP boxes, I've seen horrendous performance when one of the cpus went down, much worse than can be explained merely by the cpu taking the permanent retirement plan.

Welcome back.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Steven E. Protter
Exalted Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

What I was trying to tell you in my vague and fuzzy style was to relink Oracle with all CPUs enabled and see if it helps. Too early in the morning for me.

SEP
Jean-Luc Oudart
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

This may help

http://www.ixora.com.au/tips/use_raw_log_files.htm

Rgds,
Jean-Luc
fiat lux
T G Manikandan
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi Wodisch,

Welcome back! Nice to see you back here!!
Hein van den Heuvel
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Wodisch,

You seem to have identified the problem already judging by the subject line:

> Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

It's those SPARCs! Come on over to HPUX and live happily ever after?
:-)
Slightly more seriously... why not ask SUN for support? Are they not helping you?

How is your archiving set up? You are not logging and archiving to the same disks are you?


Massimo writes....

> - redo log size: how many and how big they are ? Usually, for performance, best is many of little size, around 40/80Mb each. On one of my customer, they have 90*80M redo log, it's a 2Tb DB

Please. That just has to be silly / wrong.
Just 2 or 3 reasonably sized logfiles will do fine. Any more is pointless. The data should/will be in the archive logs! You indicate this yourself...

> - redo log switch rate: checking from alter_sid, every how many minutes does a switch occur? best should be one every 15/20 minutes... change with suggestions above

This I agree with... largely. Make the log files big enough to hold 10 - 60 minutes of production redo. With < 100MB and _any_ load worth worrying about, you are going to see checkpoints every minute or faster. All the dirty blocks will have to be flushed out. The odds of re-dirtying a block and saving an I/O are dramatically reduced.
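The sizing rule above is simple arithmetic, and a minimal sketch may make it concrete. The redo rate used here (50 MB per minute) is a made-up figure for illustration only; nobody in this thread has posted a measured rate, and the function names are hypothetical.

```python
def minutes_per_switch(log_size_mb: float, redo_mb_per_min: float) -> float:
    """How long one online redo log lasts before a log switch, in minutes."""
    return log_size_mb / redo_mb_per_min


def redo_log_size_mb(redo_mb_per_min: float, target_minutes: float) -> float:
    """Log file size (MB) needed so that a switch happens about every target_minutes."""
    return redo_mb_per_min * target_minutes


# ASSUMPTION: hypothetical workload generating 50 MB of redo per minute.
rate = 50.0

# An 80 MB log at that rate switches well under every two minutes.
print(f"80 MB log lasts {minutes_per_switch(80, rate):.1f} min")

# Sizes needed to land inside the 10 - 60 minute window mentioned above.
for minutes in (10, 20, 60):
    print(f"{minutes:>2} min between switches -> {redo_log_size_mb(rate, minutes):.0f} MB per log")
```

With a real system, the rate would come from the redo size figures in a statspack report rather than from a guess.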

Make those redo logs large and put them on alternating disk/controllers. This will ensure that the archive read IOs do not interfere with the linear redo log writes.
(btw... I recently ran into a 60GB yes, GB, redolog file. That seemed over the top, but hey this was for a benchmark. All is fair game in love and war no?)

Actually, I kinda agree with Oracle's line of thinking that redo-size-based or time-based checkpointing is not so important. What really, really counts is the time it would take to recover should you have a problem.
This is a business problem, not a technical problem. The technical solution they offer is the new(ish) MTTR = Mean Time To Recover parameter. Using this, Oracle will no longer do the harsh, identifiable checkpoints, but settle for a (likely) slower constant buffer flush algorithm, making sure it will not have too much to recover should it ever need to.

Oooops, I'm straying a bit far from the original question.

So what kind of speeds and feeds were we talking about in the first place mb/sec? gb/redo file?...

Cheers,
Hein.


Massimo Bianchi
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

> 40/80Mb each. On one of my customer, they have 90*80M redo log, it's a 2Tb DB

I really meant that. It is a very heavy OLTP system, for telecommunication/billing/customer support purposes.
Really, really stressed.

They need many redo logs for two main reasons:
- very fast/online recovery
- they generate many archive logs, and speed is a must. Many small redo logs achieve better performance than bigger ones, and are faster to archive, even on a 2Gb FC direct connection. This was the decision of their DBA....

Massimo
Volker Borowski
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi Wodisch,

which patch level are you on?
How did you measure the bottleneck, and especially, what values did you get?

I have never seen this behavior on Solaris, but I have no more 8.1.6 systems to support. It sounds like the type of problem that could be fixed by a patch.

Best regards
Volker
Wodisch
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi guys,

thanks for your responses so far, but the problem is:
- it is 8.1.6
- it runs on a SPARC
- the whole DB resides in "ufs" logging filesystem (on "MD" mirrors)
- Oracle *was* installed with all cpus active (and even relinked later)
- the size and number of redologs is not the problem, as *they* (the cpus, or whoever else) slow down when accessing the *one and only current* redo-logfile
- a slower single-cpu SPARC is much faster (not for anything else, only for the "commit" statements) in total
- the customer cannot/will not change the patch level, as some other software depends on what is already there...

so, to ask in other words: does anybody know any problem of multi-cpu SPARCs running oracle getting into some
- race-condition
- deadly embrace (well, not deadly, but dog-slow)
- file I/O concurrency
- filesystem concurrency

what I cannot pinpoint is whether the delay is caused by oracle (internally) or by the filesystem (Solaris, then). From the application's point of view only the SQL statement "commit" is slow, and the Oracle performance tools tell me that most of the time is spent on "syncing" the data to the current redolog...

any hints?

TIA,
Wodisch
Volker Borowski
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hmm,

which Solaris version is it 7 or 8 ?

Is Solaris Async-IO enabled ?
If yes, there have been quite a few op-sys patches in Solaris for this feature (are you up to date ?).
If not, consider setting up
Oracle's LGWR_IO_SLAVES to write the redo info in parallel (Oracle's own implementation of application-async-io). DO NOT USE BOTH!

Logwriter writes if log_buffer is one third full or if a commit is issued. How big is your log_buffer ? How many commits do you get per minute ?

Metalink has some stuff for tuning redo_log_sync, mostly suggesting to reduce the number of commits on the application side *smile*.

What do you compare ? A single commit on both platforms ? A bunch of 1000 in parallel ?

I guess, this one will be hard to track down ....
Good hunting
Volker
Jean-Luc Oudart
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Wodisch,

I have seen some entries on Metalink with the same type of problem (SPARC), where a one-cpu SPARC is better than a multiple-cpu box !

Check doc id 134172.999 on Metalink
This was first reported as a bug, but then it was not a bug ?!

Not sure this is exactly your problem, but it is close to it.
For other docs, search on 929709 (the bug number, but not a bug ?!)

Rgds,
Jean-Luc
fiat lux
Stan_17
Valued Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi Wodisch,

what are the top 5 wait events seen in the statspack reports ?

Is too much time spent waiting on log file sync, log file parallel write, log buffer space ?

how many txns are you committing / sec (should be seen in statspack)

Is your FS using direct io, and what's the value set for disk_asynch_io

-
Stan
Wodisch
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi again,

it is Solaris8, and I never use async-IO ;-)
The redologs are 100MB each, and logswitches happen only every couple of minutes, that does not seem to be the problem.
Reducing the amount of commits is NOT an option ;-)
The most simple way to reproduce the bottleneck was to do some inserts and a few selects then commit. Repeated a 1000 times.
So actually no parallelism inside the program itself...
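For readers who want to reproduce this kind of test, here is a minimal sketch of the loop described above: a few inserts, a select, then a timed commit, repeated 1000 times in a single session. It is not the script used on the customer system. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the table name commit_test and the function name are invented for the example, and against Oracle one would swap in an Oracle driver connection and adjust the SQL.

```python
import sqlite3
import time


def run_commit_benchmark(conn, iterations=1000, inserts_per_txn=5):
    """Run small insert/select/commit transactions; return commit latencies in seconds."""
    cur = conn.cursor()
    # ASSUMPTION: commit_test is a throwaway table made up for this sketch.
    cur.execute("CREATE TABLE IF NOT EXISTS commit_test (id INTEGER, payload TEXT)")
    conn.commit()

    latencies = []
    for i in range(iterations):
        for j in range(inserts_per_txn):
            cur.execute(
                "INSERT INTO commit_test VALUES (?, ?)",
                (i * inserts_per_txn + j, "x" * 100),
            )
        cur.execute("SELECT COUNT(*) FROM commit_test")
        cur.fetchone()

        # Time only the commit, since that is the statement under suspicion.
        start = time.perf_counter()
        conn.commit()
        latencies.append(time.perf_counter() - start)
    return latencies


latencies = run_commit_benchmark(sqlite3.connect(":memory:"))
print(f"commits: {len(latencies)}")
print(f"avg commit: {1000 * sum(latencies) / len(latencies):.3f} ms")
print(f"max commit: {1000 * max(latencies):.3f} ms")
```

Because everything runs serially in one session, any slowdown measured this way cannot come from sessions competing with each other for the redo log.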

The statspack's snapshot output shows:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
log file sync                                       1,002        2,468   41.29
log file parallel write                             1,004        2,456   41.09
control file parallel write                           116        1,051   17.58
file open                                              32            1     .02
db file sequential read                                 4            1     .02
-------------------------------------------------------------

The values for a much slower single-cpu system are 1/10th of these for the first two events!
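To put the snapshot figures into per-commit terms, the totals can be divided by the wait counts. The short sketch below does only that arithmetic on the numbers posted above, taking the wait time column as centiseconds (cs) as the header states; the function name is made up for the example. It works out to roughly 25 ms per log file sync.

```python
# (event, waits, total wait time in centiseconds), copied from the snapshot above
events = [
    ("log file sync", 1002, 2468),
    ("log file parallel write", 1004, 2456),
    ("control file parallel write", 116, 1051),
    ("file open", 32, 1),
    ("db file sequential read", 4, 1),
]

total_cs = sum(cs for _, _, cs in events)


def avg_wait_ms(waits, time_cs):
    """Average time per wait in milliseconds (1 cs = 10 ms)."""
    return time_cs * 10.0 / waits


for name, waits, cs in events:
    share = 100.0 * cs / total_cs
    print(f"{name:<30} {avg_wait_ms(waits, cs):8.2f} ms/wait {share:7.2f} % of wait time")
```

The percentages recomputed this way match the "% Total Wt Time" column, which suggests the five listed events account for essentially all the wait time in the snapshot.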

Oh, and the disks on the multi-cpu system are much faster, too, as are the controllers...

I am pretty lost on that. My guess is that Solaris8 is doing "spinlocks" on the multi-cpu system and wasting a hell-of-a-lot-of-time on that.
We do experience that behaviour on two pretty different systems, 2-cpu-system, and a 4-cpu-system, both with different hardware - so it looks pretty much like a software issue.

Any other ideas?

Thanks a lot!
Stan_17
Valued Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

comments interspersed.

it is Solaris8, and I never use async-IO ;-)
>
yes, solaris by default uses threaded async i/o to simulate kernelized async i/o. though it's not as efficient as kaio, it's still okay as opposed to nothing :)

The redologs are 100MB each, and logswitches happen only every couple of minutes, that does not seem to be the problem.
>
yes, thats not your problem. i don't see any wait events alluding to checkpointing.

Reducing the amount of commits is NOT an option ;-)
>
absolutely not. many suggest this option, but realistically it will not be a solution for most third-party vendor applications ;) you know what i mean.

The most simple way to reproduce the bottleneck was to do some inserts and a few selects then commit. Repeated a 1000 times.
So actually no parallelism inside the program itself...

The statspack's snapshot output shows:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
log file sync                                       1,002        2,468   41.29
log file parallel write                             1,004        2,456   41.09
control file parallel write                           116        1,051   17.58
file open                                              32            1     .02
db file sequential read                                 4            1     .02
-------------------------------------------------------------

>
seems like your fs is not coping with lgwr. do you multiplex redologs? if yes, then try not multiplexing. since you are mirroring the logs, you should be okay. the other thing is, use direct i/o on the fs that has the redologs on it. That will improve the lgwr performance.

The values for a much slower single cpu system is 1/10th for the first two events!

>
any chance the tests were carried out a bit differently there than on the multiple-cpu systems, e.g. the tests were done via a pl/sql procedure? This can make a lot of difference.

Oh, and the disks on the multi-cpu system are much faster, too, as are the controllers...

I am pretty lost on that. My guess is that Solaris8 is doing "spinlocks" on the multi-cpu system and wasting a hell-of-a-lot-of-time on that.

>
If that's the case you should see a lot of cpu cycles being used. do you see them? i don't think so.

We do experience that behaviour on two pretty different systems, 2-cpu-system, and a 4-cpu-system, both with different hardware - so it looks pretty much like a software issue.

Any other ideas?

Thanks a lot!
Alzhy
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Interesting... it seems you are comparing two different boxes - one with a single CPU and the other with multiple... How sure are you that their OS rev and configuration are the same? On the multiple-CPU system, you can test by dropping it down to 1 CPU via the "psradm" command.
.
Also, if you're using cooked UFS for your datafiles -- make sure you are using at least Solaris 8 07/01 (cat /etc/release) and at least a -13 Kernel Rev.
Hakuna Matata.
Wodisch
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Thanks so far,

we are still working on that :-(
To clarify some points:
- only single-member redologs are used
- the problem happens on different HW and even on Oracle8.1.7

Regards,
Wodisch
Stan_17
Valued Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi Wodisch,

I'm very much interested in the outcome of the issue and how you fixed it. Keep us updated.

-
Thanks,
Stan
Wodisch
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Stan,

if possible I'll do that.
But it may be one of those "unresolved" questions of mine :-(

Regards,
Wodisch
Volker Borowski
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

... stumbled upon this one by accident.
Quite recent (Oct 2003 !)
http://www.sun.com/solutions/blueprints/1003/817-3835.pdf

Esp. Page 19 says how to analyze if you have a spin/block problem, although there is no solution given.

Hope this helps
Volker
Wodisch
Honored Contributor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi again,

I'll be looking into that, Volker - thanks!
Currently we've "tuned around" that problem, and as the application is now about ten times faster than before, the customer does not care about the *real* problem (no answer about it, yet, not from Oracle, nor from Sun)...
Well, I'll keep you up-to-date, guys :-)

Regards,
Wodisch

PS: I love the "preview" button ;-)
zhuchao
Advisor

Re: Oracle8.1.6 on SPARC 2-4 CPU takes ages on COMMITs

Hi,
Throughout the discussion, I cannot tell whether you are hitting slow commits in the real application or just in your own benchmark script with simple DML and one commit per DML.
AIO on Solaris with raw devices/QuickIO is pretty good and stable. I do not know why you would not use AIO on Solaris. Go with raw devices for the redo logs.

Can you attach your full statspack report for your slow commit application benchmark?

You said you have solved the problem, and I think most guys here want to know how you did it. Did you rewrite the application to reduce the commit rate, or something else?

My suggestion is to check whether it is possible to bind LGWR to a physical processor; maybe it gets a little faster.

www.happyit.net