Operating System - Tru64 Unix

edi_4
Advisor

performance issue

Hi! We are running Tru64 V5.1B PK4.
We have a performance problem: CPU idle is 0%, and there is also high I/O on dsk5. We are running Oracle. How do I determine what the bottleneck is: memory, storage...?
Storage is HSH80 - disks are RAID1+0.
Thank you!
15 REPLIES 15
edi_4
Advisor

Re: performance issue

Here are the statistics: vmstat, collect...
Venkatesh BL
Honored Contributor

Re: performance issue

Can you post 'disklabel dsk5' output? Also,

# cd /etc/fdmns
# showfdmn *
Ivan Ferreira
Honored Contributor

Re: performance issue

From the output of vmstat, you will see that there are no page-ins and page-outs, so this is not a memory problem. Since paging does not occur, the high I/O activity on the disk is not swap related.

You can see that most of the CPU is consumed by user applications (us) and not by the system (sy). This means that your database or applications are stressing the system.
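
A rough sketch of that reasoning as a shell function. The function name, the 5% idle cut-off, and the order of the checks are illustrative assumptions, not Tru64 tuning guidance. The arguments are the us, sy, id, pin and pout values from a single vmstat sample, given as plain integers.

```shell
#!/bin/sh
# classify_vmstat US SY ID PIN POUT
# Prints a one-word guess at the bottleneck for one vmstat sample.
# Thresholds are assumptions chosen for illustration only.
classify_vmstat() {
    us=$1; sy=$2; id=$3; pin=$4; pout=$5
    if [ "$pin" -gt 0 ] && [ "$pout" -gt 0 ]; then
        echo "memory"          # paging in both directions
    elif [ "$id" -le 5 ] && [ "$us" -ge "$sy" ]; then
        echo "user-cpu"        # CPU saturated, mostly in user mode
    elif [ "$id" -le 5 ]; then
        echo "system-cpu"      # CPU saturated, mostly in the kernel
    else
        echo "not-cpu-bound"
    fi
}

# Figures in the spirit of this thread: about 85% user time, 0% idle, no paging.
classify_vmstat 85 15 0 0 0    # prints: user-cpu
```

A "user-cpu" result points at the database or application rather than at memory or swap, which is the conclusion drawn here.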

I think that you need to optimize your database or applications (indexes, queries, etc.).
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Hein van den Heuvel
Honored Contributor

Re: performance issue


The CPUs are simply overloaded with the Oracle work.

I see some worries in system space: lowish free mem, high system call rate, possibly not using direct-io?

But since user time is 85% or more, that's where you should focus.

Why is Oracle burning the time it burns? Maybe it is legit, and there is simply too much work. On the other hand, maybe Oracle is being told to do silly things: Application tuning and DB tuning.

The way to make progress on this is to run Oracle statspack and drill down on the top few cpu burning queries and/or top "Buffer Gets" (probably the same ones :-).
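
As a minimal sketch of the snapshot step, the helper below only prints the SQL*Plus commands for one Statspack snapshot. The function name is hypothetical, and it assumes Statspack is already installed and that you connect as its owner (PERFSTAT by default).

```shell
#!/bin/sh
# statspack_snap_sql: emit the SQL*Plus input for a single Statspack snapshot.
# Assumption: Statspack is installed and the session connects as its owner.
statspack_snap_sql() {
    cat <<'EOF'
EXECUTE statspack.snap;
EXIT
EOF
}
```

One way to use it is to pipe the output into SQL*Plus before and after a busy period, for example `statspack_snap_sql | sqlplus -s perfstat`, and then run `@?/rdbms/admin/spreport.sql` in an interactive session to produce the report between the two snapshot IDs.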

I'd recommend digging into Oracle (and/or getting consultancy help to do so). You could also throw more hardware at the problem... if this is a 4-CPU capable box.

Good luck,

Hein
HvdH Performance Consulting.



edi_4
Advisor

Re: performance issue

The file domains are also nearly full. Three AdvFS filesets were created on dsk5, and they hold the files of two Oracle databases. I am going to move them to /raid7.

I cannot add more CPUs, since it is a DS25. Because Oracle is memory intensive, I could add more memory. Will more memory improve performance?
Venkatesh BL
Honored Contributor

Re: performance issue

Have you enabled direct I/O for Oracle? Have you followed the Oracle tuning recommendations?

From the file, I see that baza2_domain is the domain in question. Which cluster node is the CFS server for this domain? You could see some performance implications if the node that is initiating the disk activity is not the CFS server (particularly if direct I/O is not enabled).
edi_4
Advisor

Re: performance issue

Thank you! File system is served by another member

# cfsmgr /raid5

Domain or filesystem name = /raid5
Server Name = mravljaclu2
Server Status : OK

Can I do # cfsmgr -a server=member1 /raid5 online?
Venkatesh BL
Honored Contributor

Re: performance issue

yes, you can.
edi_4
Advisor

Re: performance issue

Thank you - how can I find out if direct I/O is enabled?
Han Pilmeyer
Esteemed Contributor

Re: performance issue

cfsstat directio
Hein van den Heuvel
Honored Contributor

Re: performance issue

User mode time is what hurts your system.
DirectIO, while great, is NOT going to save user mode time.
Whether a disk is served or not does NOT impact user mode time.
Whether a disk is near full or heavily fragmented, that's not good, but it is currently not hurting you because you still manage to burn all this usermode time.

[Hmmm I suppose oracle could be spinning during latch contention waiting for IO]

The only system things which influence user mode time are memory cache effectiveness and translation buffer effectiveness.
Scheduling is one component to that, but you do not have excessive context switches.

GH memory for the SGA can help translation buffer effectiveness, avoiding user mode processor cycle stalls.

Using SPIKE on the Oracle image might increase the instruction cache effectiveness.

But all of that is a drop in the bucket if the user mode calls are ineffective.

All the talking here is just trying to avoid the 'real work', which is getting an understanding of what Oracle is doing based on application requests.

Once you have provided a clear argument showing the application and Oracle queries are tuned, then you can think further about processor binding, GH buffers, direct IO, Spike and so on to get the last ounces of performance out of the box. But you may not need to go there at all!

Good luck,
Hein.
edi_4
Advisor

Re: performance issue

Thank you! I agree - it seems to be Oracle related... but I cannot tune the applications.
I switched the server with cfsmgr -a, so file domain baza2_domain is now served from member1; direct I/O is enabled; and I also decreased vm_page_target from 1024 to 768 and increased ubc_max from 25 to 35%. There is no progress...

Do you think adding more memory will help? There is 3GB now, and I can upgrade to a maximum of 16GB.

edi_4
Advisor

Re: performance issue

The only thing (CPU-busy related) I can find in monitor is a high intr scall value. When the syscall rate is more than 50000, CPU idle time becomes smaller and smaller. At 80000 - 100000 syscalls, CPU idle is 0%....
Han Pilmeyer
Esteemed Contributor

Re: performance issue

I think what Hein is saying, although I have seen him state it more clearly at times ;-), is that you need to determine whether Oracle is working the way it should. That is, make sure that the queries it runs are doing the right thing. If that's the case, you probably need more CPUs rather than memory. But if you do make the system faster, it won't be long before that dsk5 becomes the next bottleneck.

Also... remember we don't have a lot of information to work with.
Hein van den Heuvel
Honored Contributor

Re: performance issue

Thanks Han! :-)


>> Thank you! I agree - it seems to be Oracle related... but I cannot tune the applications.

Well, then you may not be able to solve this problem.
Find someone else that can!
You may have to fix 'management' first, before you can fix the technical problem.

Statspack data is likely the best step towards understanding what can be done.

The high syscall rate is likely to be caused by having full statistics enabled. So after collecting a few relevant statspack snaps, you can try changing that.
Either switch it off with (ALTER SYSTEM SET timed_statistics = FALSE) or at least change statistics_level from TYPICAL to BASIC.

And you may want to install a 'timedev'
(mknod /dev/timedev c 15 0
chmod 664 /dev/timedev
... restart Oracle)
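
Before restarting Oracle it may be worth confirming that the device node really exists as a character device. A small sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# check_chardev PATH: report whether PATH exists and is a character device.
check_chardev() {
    if [ -c "$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

check_chardev /dev/timedev
```

If this reports "missing", the mknod step did not take effect, or the node was created under a different path.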

But again, this will mostly reduce system time some, and can not really be expected to make a big dent in user time.

Cheers,
Hein.