
System response very slow

Ninad_1
Honored Contributor

System response very slow

I am facing a problem of very slow response from the system. It is an ES40 server running 4.0F, in a cluster with another ES40, which is currently down.
System config is 4 CPUs, 6 GB RAM.

I have attached the output of
1) kdbx -k /vmunix
2) uptime
3) ps aux
4) vmstat -P
5) vmstat
6) iostat
7) swapon -s

commands in a single text file to give you some idea of the load. On the console, the System Information utility shows the CPU as 100% busy,
but cpustat shows most of the time spent in the wait state.
Can anybody please advise what the problem might be?

Thanks in advance

Ninad
11 REPLIES
Ninad_1
Honored Contributor

Re: System response very slow

Sorry

Here is the attachment
Michael Schulte zur Sur
Honored Contributor

Re: System response very slow

Hi,

in my experience, a lot of wait time means the system is waiting for I/O to complete, for example when you are doing a backup to tape.
I noticed that there is a lot of I/O on rz17. Find out what is causing it.

greetings,

Michael
Nicolas Dumeige
Esteemed Contributor
Solution

Re: System response very slow

You have Oracle connections that take a long time to finish their job.

uptime shows the load average increasing, but it is still acceptable. You have a lot of wait I/O and little system time. vmstat shows a run queue of 8 processes on average and, at the same time, a large idle time. The w column seems huge, but I'm not familiar with the Tru64 output.

--> On your box the I/O subsystem is the bottleneck. The next thing to do is query the Oracle system views to get the picture from within the RDBMS. I guess you'll see a lot of db file scattered read (dbf access) and db file sequential read (index access) in v$session_wait


set linesize 200
set pagesize 200
set time off
set timing off

COL SID FORMAT 99999
COL SEQ# FORMAT 9999999
COL EVENT FORMAT A28 TRUNC
COL P1TEXT FORMAT A20 TRUNC
COL P1 FORMAT 999999999999
COL P1RAW NOPRINT
COL P2TEXT FORMAT A20 TRUNC
COL P2 FORMAT 999999999999
COL P2RAW NOPRINT
COL P3TEXT FORMAT A20 TRUNC
COL P3 FORMAT 999999999999
COL P3RAW NOPRINT
COL WAIT_TIME FORMAT 9999999 NOPRINT
COL SECONDS_IN_WAIT FORMAT 9999999
COL STATE FORMAT A10 NOPRINT

select * from v$session_wait
/


select EVENT,count(*)
from v$session_wait
group by EVENT
order by count(*) desc
/


All different, all Unix
Ninad_1
Honored Contributor

Re: System response very slow

Thank you Michael and Nicolas for your replies.
Michael - Yes, there is a high queue depth on the rz17 disk, which is actually a RAID set on an RA8000 connected via a fibre SAN switch. This disk is under LSM and holds the database, so the queue depth can't be helped. What can I do to solve the problem with the current configuration, and what should I do if nothing is possible in the current setup? Please suggest a solution for the existing configuration.

We have a RAID 5 set of 6 disks in the RA8000 with an HSG80 controller, and a partition of that RAID set, unit number D2, is used for the database.


Nicolas - Thanks for the SQL query. I ran it, but it shows
SQL*Net message from client with count=321

I am attaching the output of the query for your expert comments.


Thanks

Ninad
Nicolas Dumeige
Esteemed Contributor

Re: System response very slow

SQL*Net message from client: this event indicates that a server process is waiting for work from the client process.
You can ignore it most of the time, but it can also be caused by a network issue or a heavy client-side application (or a low-budget PC).

As for the rdbms ipc message, this is a background process idle event.
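
To make that concrete, here is a minimal Python sketch that drops idle events from a per-event session count before ranking what is left. The idle-event set only contains the two events named in this thread, and the sample counts (apart from the 321 reported above) are hypothetical values for illustration, not data from the attached output.

```python
# Idle wait events named in this thread. Real systems have more idle events,
# so treat this set as an illustrative assumption and extend it as needed.
IDLE_EVENTS = {
    "SQL*Net message from client",
    "rdbms ipc message",
}


def non_idle_waits(event_counts):
    """Return (event, count) pairs for non-idle events, busiest first."""
    busy = [(e, n) for e, n in event_counts.items() if e not in IDLE_EVENTS]
    return sorted(busy, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample: only the 321 figure comes from the thread.
    sample = {
        "SQL*Net message from client": 321,
        "rdbms ipc message": 4,
        "db file sequential read": 7,
        "db file scattered read": 3,
    }
    for event, count in non_idle_waits(sample):
        print(f"{count:5d}  {event}")
```

If the list that comes back is empty or very short, the sessions are mostly waiting on their clients, and the I/O load has to be explained some other way.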
All different, all Unix
Michael Schulte zur Sur
Honored Contributor

Re: System response very slow

Hi,

did anything change before the system became slow?
What Oracle version do you have?
Run an Oracle statistics report such as utlbstat/utlestat to see where Oracle spends its time and what it reads and writes.

greetings,

Michael
Hein van den Heuvel
Honored Contributor

Re: System response very slow


Don't speculate... just run a STATSPACK and get a view of what Oracle is doing.

What Oracle version?

Judging by the system call rate, you have Oracle timed_statistics enabled. What level?
Are you using that data or just burning CPU? If you are not using it, just disable it.
If you are, be sure to run at BASIC level, not TYPICAL.

SQL> show parameter statistics

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
statistics_level                     string      TYPICAL
timed_os_statistics                  integer     0
timed_statistics                     boolean     TRUE
SQL> alter system set statistics_level = please_tell_me_the_names;
alter system set statistics_level = please_tell_me_the_names
*
ERROR at line 1:
ORA-00096: invalid value PLEASE_TELL_ME_THE_NAMES for parameter
statistics_level, must be from among ALL, TYPICAL, BASIC


SQL> alter system set statistics_level = BASIC;

System altered.
Joris Denayer
Respected Contributor

Re: System response very slow

Ninad,

You can also run collect to get a global view of the system performance.
It gives more detailed information on I/O, network and CPU load.

Joris
To err is human, but to really foul things up requires a computer
Keith Moodie
Occasional Advisor

Re: System response very slow

Are your tables/files badly fragmented?

Your I/O demand could be higher than necessary if the database tables need maintenance, or if the files that the tables reside in are themselves fragmented ('file fragmentation').

If the file system on your RAID set is AdvFS, you can use showfile to see how fragmented a file is.

NB re 'file fragmentation':
It probably doesn't matter if a large table is stored in 10 or even a hundred large fragments.
And most of the time it won't matter much if the table is stored in a few large fragments with a small percentage of it in small fragments.

The bad news is that, as far as I can tell, the best time to turn defrag on is before you put any data on the disk.

Re: System response very slow

Ninad,

I assume you are using LSM with DRD services for your Oracle database. Also, I see that you have set the SGA size to 822MB and your OS parameter vm-swap-eager to 1 (as per Oracle recommendations).

Because of this swap setting, the system is doing excessive paging. You can control this by setting vm-swap-eager to 0. That would affect you only if you had a severe shortage of memory, which is not apparent on your system.

Check the memory utilization of the various processes using the "ps avgx" command and send the output. Based on that, we can make a decision on tuning the SGA size of the database. Find out where 90% of the 6GB is really being used.

If it is possible to create another LUN on the RA8000, you could consider striping your volumes across the two LUNs. But that is a long shot for improving I/O.

Also check your vm attribute settings. Some of them could be wrong as well.

Regards
Baalki
Hein van den Heuvel
Honored Contributor

Re: System response very slow

Baalki wrote
"SGA size to be 822MB and your OS parameter vm-swap-eager to 1 (as per oracle recommendations). Due to the above swap setting, the system is doing excessive paging. You can control this by vm-swap-eager to 0."

Why do you say that, and where do you see that?
The term "eager swap" refers to the reservation policy. It does not mean that the system becomes eager to swap.

Look at the swapon -s output:
"Total swap allocation:
Allocated space: 1741946 pages (13608MB)
Reserved space: 705425 pages ( 40%)
In-use space: 8055 pages ( 0%)"

Hardly a page was swapped out (0% in use), and why would it be, as there is still memory free.
All the allocated memory is reserved, though, ready to be swapped out if memory pressure arises (which it has not).
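
As a quick sanity check on those figures, here is a small Python sketch that converts the page counts to megabytes and percentages. The 8 KB page size is an assumption; it is the value that reproduces the 13608MB printed for 1741946 pages. Truncating the results, rather than rounding, is likewise a guess that happens to match the quoted output.

```python
PAGE_KB = 8  # assumption: 8 KB pages, consistent with 1741946 pages = 13608MB


def pages_to_mb(pages):
    """Convert a page count to whole megabytes (truncated)."""
    return pages * PAGE_KB // 1024


def percent_of(part, total):
    """Integer percentage of total (truncated)."""
    return part * 100 // total


if __name__ == "__main__":
    # Page counts copied from the swapon -s output quoted above.
    allocated, reserved, in_use = 1741946, 705425, 8055
    print("allocated:", pages_to_mb(allocated), "MB")
    print("reserved: ", percent_of(reserved, allocated), "%")
    print("in use:   ", percent_of(in_use, allocated), "%",
          f"({pages_to_mb(in_use)} MB)")
```

The in-use share works out to well under 1% of the allocated swap, which is why it prints as 0%.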

Now there is a lot of page-in happening. Perhaps a bunch of image activations?
Ninad, does the Oracle (and "runform" application) code live on the same disk (RZ17) as the database data? For clarity you may want to separate those! Is this a 'come and go and come back' style application?

Baalki, I assume you picked that 822 based on the VSZ from ps? Good, keen eyes!
Ninad, it is a wild shot in the dark, but have you considered or tried giving Oracle much more memory to play with? If the SGA is, say, 750 MB, part of it pool and part buffers, then try giving it 50% more pool and 100% more buffers. Judging by the memory figures you shared, you have it available! That might save lots of I/O and CPU (Oracle parsing)
" free pages = 75929 (600MB)
active pages = 280794
inactive pages = 159297 (1200MB)
wired pages = 83662
ubc pages = 171334 (1300MB)"
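
The rounded megabyte figures in that listing can be reproduced with a short Python sketch, again assuming 8 KB pages. Treating free, inactive and UBC pages together as the headroom available for a larger SGA is one reading of the argument above, not a figure reported by the system.

```python
PAGE_KB = 8  # assumption: 8 KB pages

# Page counts copied from the memory listing quoted above.
PAGES = {
    "free": 75929,
    "active": 280794,
    "inactive": 159297,
    "wired": 83662,
    "ubc": 171334,
}


def to_mb(pages):
    """Convert a page count to megabytes."""
    return pages * PAGE_KB / 1024


def total_mb(pages):
    """Sum of all listed categories, in megabytes."""
    return sum(to_mb(n) for n in pages.values())


def headroom_mb(pages):
    """Free + inactive + UBC: an assumed estimate of memory that could be given to Oracle."""
    return sum(to_mb(pages[k]) for k in ("free", "inactive", "ubc"))


if __name__ == "__main__":
    for name, count in PAGES.items():
        print(f"{name:9s}{to_mb(count):8.0f} MB")
    print(f"{'total':9s}{total_mb(PAGES):8.0f} MB")
    print(f"{'headroom':9s}{headroom_mb(PAGES):8.0f} MB")
```

The five categories add up to roughly the 6 GB of RAM in the box, and the three that could plausibly shrink come to about 3 GB.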

Make sure ubc_min is set to 5% or less so that the system is allowed to trim the UBC back if need be.

Give it a try!
(We'll need better data to give better advice!)

Regards,
Hein.