TruCluster

poor trucluster performance after 5 days

Frequent Advisor

Hi Admins,
I am running TruCluster V5.1A, pk1, on 2 x ES40 machines with 10GB memory on each node, a Memory Channel interconnect,
an HSG80 controller (8.7), and two F.C adapters on each node.
We are running Oracle OPS version 8.1.3.7.0.
My problem is that performance becomes very slow about 5 days after we restart the database,
so every 5 days we should restart the database.
Please advise.
The attached files are the sysconfigtab files.
10 REPLIES
Honored Contributor

Re: poor trucluster performance after 5 days

Your patch kit version is very outdated. Try updating your patch kit version.

Is the first swap device a disk in the SAN? Change the swap device to use the local disks only.

You can also try setting vm-swap-eager=0.

Verify, using cfsmgr, that the ownership of the filesystems is distributed according to the I/O patterns.

Disable advfsd startup in /sbin/rc3.d.

Verify the configuration of the shared memory in oracle.

What RAID type are you using?

Verify the status and the window size of the Memory Channel interconnect as described in the TruCluster hardware configuration guide.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Honored Contributor

Re: poor trucluster performance after 5 days


Was the performance ever good before?

What changed when it became worse?

Do you have fifo_do_adaptive=0 in the vfs subset of /etc/sysconfigtab?
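
One way to answer this on each node is to read the attribute straight out of the stanza file. The sketch below is a minimal awk-based lookup, not a supported tool: the function name stanza_attr is made up for illustration, and it assumes the usual sysconfigtab layout of a "subsystem:" header line followed by indented "attribute = value" lines. It only reads the file, so it shows the configured value, which is not necessarily the value the running kernel is using.

```shell
#!/bin/sh
# stanza_attr FILE SUBSYSTEM ATTRIBUTE
# Hypothetical helper: prints the value of ATTRIBUTE inside the SUBSYSTEM
# stanza of a sysconfigtab-style file, or nothing if it is not set there.
stanza_attr() {
    awk -v sub_name="$2" -v attr="$3" '
        # A stanza header looks like "vfs:" at the start of a line.
        /^[A-Za-z0-9_-]+:[ \t]*$/ {
            name = $0
            sub(/:[ \t]*$/, "", name)
            in_stanza = (name == sub_name)
            next
        }
        # Inside the wanted stanza, split "attr = value" at the first "=".
        in_stanza {
            line = $0
            sub(/^[ \t]+/, "", line)
            n = index(line, "=")
            if (n == 0) next
            key = substr(line, 1, n - 1)
            val = substr(line, n + 1)
            gsub(/[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == attr) print val
        }
    ' "$1"
}
```

For example, stanza_attr /etc/sysconfigtab vfs fifo_do_adaptive should print 0 if the setting is present as described. On a live Tru64 node, sysconfig -q vfs should report the value currently in effect; check the sysconfig man page for your version before relying on it.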

___ Johan.

_JB_
Frequent Advisor

Re: poor trucluster performance after 5 days

Hi Johan Brusche,
many thanks.
Performance is good for 4 days; after that it becomes very slow.
Yes, I have fifo...=0.
Thanks.
Honored Contributor

Re: poor trucluster performance after 5 days


Jousif,

To investigate performance issues, you can use the "collect" utility. It collects resource utilisation and stores it in files so that you can use these files to generate reports (or graphs via collgui).

To start it do the following:

# /usr/sbin/rcmgr set COLLECT_AUTORUN 1
# /sbin/init.d/collect start

The resulting performance data will be stored in the directory /var/adm/collect.dated on a per day basis.
The GUI to examine this data can be installed from the 'Layered Prod'-CD#2 via setld.

__ Johan./

_JB_
Frequent Advisor

Re: poor trucluster performance after 5 days

Johan Brusche,
I need to know why performance becomes slow
after 5 days.
Respected Contributor

Re: poor trucluster performance after 5 days

Jousif,

That's why Johan suggests that you run collect.
Then you can at least compare what happens with the system after 5 days, in terms of system resources.

Joris
To err is human, but to really foul things up requires a computer
Frequent Advisor

Re: poor trucluster performance after 5 days

Jousif,
Collect will be the first step in pinning down the performance degradation, but you should also consider upgrading to the latest version.
Also, take your logs into consideration and check them for any messages.
By the way, what environment is this? Are these two servers (the cluster) used as an NFS server? Using UDP, TCP, etc.?

David
Honored Contributor

Re: poor trucluster performance after 5 days

Hmmm, fine answers so far, but instead of focussing on the operating system, maybe we should focus on the database?
The question reads "every 5 days we should restart the database".
Jousif, do you really mean 'just' the database or the whole system?
How do the Oracle stats change over the days?
Did you try other commands like Oracle's "ALTER SYSTEM FLUSH SHARED_POOL", or a similar command for whatever database you are using?

The Oracle version you are using is 'wicked old', and the OPS found in that version was an early version which needed some TLC (Tender Love and Care); you really needed to know how to use it.
You really should work with Oracle on how to get to a recent RAC version: 9.2.0.6?


When you say performance is slow, how do you observe that? Is overall system performance slow (like a simple cat /etc/hosts command or an ls /etc command), or are the database and/or application tasks 'slow'? We cannot read your mind. Please help us help you!

When performance is slow, what database or system observations did you make which seemed 'odd'?
Collect, as suggested, is possibly the best way to go, but for now I would settle for a simple vmstat 10 10 while performance is bad (whatever 'bad' means!?). Is CPU time usage excessive? Is the system running out of memory? Are processes building up?
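
To make the 'good day' and 'bad day' readings comparable, the same vmstat run can be appended to a log file with a timestamp each time. This is only a minimal sketch, assuming a POSIX shell; the function name log_snapshot and the log path in the usage line are placeholders, not anything taken from the system under discussion.

```shell
#!/bin/sh
# log_snapshot LOGFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND and appends its output to LOGFILE,
# preceded by a header line with the current date and the command line.
log_snapshot() {
    logfile=$1
    shift
    {
        printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
        printf '\n'
    } >> "$logfile" 2>&1
}
```

Running log_snapshot /var/tmp/vmstat.log vmstat 10 10 once shortly after a database restart and again when things are slow leaves both samples in one file, so the CPU, memory and paging columns can be read side by side.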

There is at least one thing that strikes me as odd in the sysconfigtab: where is SSM_THRESHOLD? Also, swap is set to eager, which is OKish, but many prefer lazy.
Please be sure to check with Oracle what the recommended (required!) sysconfigtab settings are for your system. The questions you raise are 90% sure to be in the Oracle camp, and 10% about Tru64.

Hope this helps some,
Hein.

Frequent Advisor

Re: poor trucluster performance after 5 days

Hi Hein van den,
many thanks.
After 4 days the lock manager consumes about 50%
of the CPU time on the cluster nodes.
vmstat, top, ... don't show anything that is not OK.
Please note that one node is the billing server. When billing begins its jobs, like call collection ...,
they say the server is very slow. I checked /var/../messages, top, vmstat, and I/O;
all OK.
Please advise.
Honored Contributor

Re: poor trucluster performance after 5 days


If I read your last reply correctly, then it re-confirms that it is not a Tru64 OS problem but an Oracle issue: the OS stats look OK, Oracle 'misbehaves'.
Now this issue may or may not be caused by some Tru64 setting, but IMHO you should really pursue this with Oracle first and foremost.
Of course the Oracle folks will want you to move to 9i, like 9.2.0.6, and to a more recent Tru64 version... but they would be right! You are investing a lot in this system: 2*10GB, memory channels, fibre infrastructure. HP invested years of Tru64 engineering, giving you V5.1B pk X. Oracle invested a lot (hundreds of man-years) to get Oracle Parallel processing from a 'nice try' in V8 to production quality in 9i and beyond. Why not use those investments? Why worry about problems which may long since have been fixed?

Regards,
Hein van den Heuvel