
trucluster performance issue.

Frequent Advisor

trucluster performance issue.

Hello.

I built a two-node cluster for a customer:
5.1B PK5, 2 x A4100, HSZ50 (ACS 5.7-2), 4 GB memory, 2 x 600 MHz CPUs. Oracle 9.2 single instance with
a CAA service. Everything works fine, but when one of the nodes is shut down, performance improves by about 20%.
For example (client-server workload):
loading data from the database and building the ABB map takes 75 sec when one system is down and 90 sec
when both systems are up. (I tried it on both
systems, with the same results.)
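
For reference, the two timings can be compared in either direction. This is only a small sketch of the arithmetic; `pct_change` is a made-up helper name, not a system command.

```shell
# pct_change BASE NEW: percentage change from BASE to NEW, one decimal place.
pct_change() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) * 100 / base }'
}

pct_change 75 90   # both members up, relative to one member down: prints 20.0
pct_change 90 75   # one member down, relative to both members up: prints -16.7
```

So the run takes 20% longer with both members up, or about 17% less time with one member down, depending on which run is taken as the baseline.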

Any ideas? Is this normal?

Thanks, Shalom.
10 REPLIES
Honored Contributor

Re: trucluster performance issue.


Do your CAA scripts relocate the cfs services?

You would want the devices with the DB files, and perhaps $ORACLE_HOME as well, to be served by the local system.

man cfsmgr (Cluster File System)

I would suggest that a lot of activity is outside direct I/O.
You would want to verify that Oracle uses direct I/O.

I have some cfs relocation scripts floating around if you like, but they are relatively easy to come up with.


Hein.

Honored Contributor

Re: trucluster performance issue.

Here is a silly script I wrote years ago to make sure cfs was serving from a cluster member named 'sapdb1'.
Adapt to use the node which is currently serving your DB.

Hein.

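#!/bin/perl
#
# Print the cfsmgr commands needed to relocate CFS file systems to 'sapdb1'.
foreach (`cfsmgr -c`) {
    # A name containing ":/" (host:/path) is not relocated: skip its 4 lines.
    $i = 4 if (/^ Dom.*:\//);
    next if ($i-- > 0);

    # Remember the mount point.
    if (/On = (.*$)/) {
        $on = $1;
    }
    # The server name line decides whether a relocation is needed.
    if (/Name = (\w*)/) {
        $na = $1;
        next if ($on =~ /member/);   # leave member-specific mounts alone
        next if ($na =~ /sapdb1/);   # already served by the target
        print "cfsmgr -h $na -r -a SERVER=sapdb1 $on\n";
    }
}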
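
If you would rather dry-run the same filter against saved output, with the target member passed as an argument, here is a rough sh/awk sketch of the script above. Assumptions: the input has the line shapes the Perl regexes expect ("On = ...", "Name = ..."), `cfs_relocate_cmds` is a made-up function name, and the server name is compared exactly rather than as a substring. It only prints the commands; nothing is relocated.

```shell
# cfs_relocate_cmds TARGET: read cfsmgr-style output on stdin and print one
# relocation command per file system that TARGET does not already serve.
cfs_relocate_cmds() {
    awk -v target="$1" '
        # A name containing ":/" (host:/path) is skipped with its 4-line record.
        /^ Dom.*:\//  { skip = 4 }
        skip > 0      { skip--; next }

        # Remember the mount point.
        match($0, /On = /) { on = substr($0, RSTART + RLENGTH) }

        # The server name line decides whether a relocation is needed.
        match($0, /Name = [A-Za-z0-9_]*/) {
            name = substr($0, RSTART + 7, RLENGTH - 7)
            if (on ~ /member/) next      # leave member-specific mounts alone
            if (name == target) next     # already served by the target
            printf "cfsmgr -h %s -r -a SERVER=%s %s\n", name, target, on
        }
    '
}

# Usage (same cfsmgr call as the Perl script):
#   cfsmgr -c | cfs_relocate_cmds sapdb1
```

Review the printed commands before running any of them.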
Honored Contributor

Re: trucluster performance issue.

Hi,

just for the record. Do you know that the hsz50 is no longer supported under 5.1B?
It may work but as HP says "In case of trouble you are on your own."

greetings,

Michael
Frequent Advisor

Re: trucluster performance issue.

Hello,
Thanks for your quick replies.

1. I checked cfsmgr -v, and the disks are
on the system with the service.

2. Michael - Are you sure HSZ50 is not
supported on 5.1B ?

Shalom.
Esteemed Contributor

Re: trucluster performance issue.

HSZ50 storage? I'd love to see how you have this connected. This could very well be where the problem is. dsk1 is used for root1_domain. dsk1 is also marked as "served disk" (see drdmgr command). For TruCluster to work well, the disks must be marked "Direct Access IO". At first I believed that the HSZ50 may not be capable of being a "Direct Access IO" disk, but all the other disks are correct.

If a disk is marked "Served", then just making sure that you are on the CFS server (cfsmgr command) may not be enough. You may also have to relocate the disk to another system (drdmgr command).

Does it matter which system you shut down to see the performance impact?

The test you're seeing this performance problem with, does that use the network at all?
Honored Contributor

Re: trucluster performance issue.

Hi,

We have a 2-member DS20E cluster with an HSZ50; that's why I know. Here is the copy from the release notes for 5.1B.
2.3.1.2 Array Controllers
Support for the following storage array controllers is retired in this release
of the operating system:
• HSZ10
• HSZ20
• HSZ40
• HSZ50

Michael
Honored Contributor

Re: trucluster performance issue.

Han,

this is the output from drdmgr of our HSZ50 cluster.
View of Data from member fra01d as of 2005-11-16:12:42:45

Device Name: dsk101
Device Type: Direct Access IO Disk

greetings,

Michael
Esteemed Contributor

Re: trucluster performance issue.

Michael,

Yes, I saw that in the attached sys_check. But there was one device which was "Served", i.e. dsk1, lun 0. Perhaps this is because of the way the command console LUN works on the HSZ50. Or perhaps it is some other problem. But it could be a potential cause for the performance issue.
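
To spot such devices quickly in a long listing, something along these lines might help. It is a sketch only: it assumes the "Device Name:" / "Device Type:" line layout shown in the drdmgr output earlier in this thread, and `non_direct_disks` is a made-up function name. It reads saved output on stdin and lists every device whose type is anything other than "Direct Access IO Disk".

```shell
# non_direct_disks: read drdmgr-style output on stdin and print "name: type"
# for every device that is not reported as a Direct Access IO Disk.
non_direct_disks() {
    awk '
        /Device Name:/ {
            sub(/^.*Device Name:[ \t]*/, "")
            sub(/[ \t\r]*$/, "")
            name = $0
        }
        /Device Type:/ {
            sub(/^.*Device Type:[ \t]*/, "")
            sub(/[ \t\r]*$/, "")
            if ($0 != "Direct Access IO Disk") print name ": " $0
        }
    '
}

# Usage, with drdmgr.out being a file of saved drdmgr output (hypothetical name):
#   non_direct_disks < drdmgr.out
```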

Cheers.
Frequent Advisor

Re: trucluster performance issue.

Hello All.

I only noticed this disk now, and I moved
it with drdmgr to the right machine,
with no change. I also ran caa_stop on the service and started Oracle manually to see
whether CAA is the 20% overhead: same results. No matter which machine is down,
the performance is 20% faster.
I wonder why dsk1 is a served disk and not direct I/O.

Thanks, Shalom.
Esteemed Contributor

Re: trucluster performance issue.

Then it's back to performance and tuning 101 I'm afraid.


- Can you see any difference in behavior between the cases? For instance does one use more system time?
- Does it do more I/O and if so where (check the counters with drdmgr)?
- If you believe it is an I/O problem, can you reproduce the problem with something like "dd"?

Do you have collect data for the two runs?
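
For the "dd" test, a minimal sketch could look like the following. Assumptions: `time_read` is a made-up helper, `date +%s` is available on your system (if not, wrap the dd in `time` instead), and the file you point it at is a large one on the file system in question. Run it once with both members up and once with one member down, and compare.

```shell
# time_read FILE [BLOCK_SIZE]: read FILE sequentially with dd, discard the data,
# and print the elapsed wall-clock time in whole seconds.
time_read() {
    file=$1
    bs=${2:-64k}
    start=$(date +%s)
    dd if="$file" of=/dev/null bs="$bs" 2>/dev/null || return 1
    end=$(date +%s)
    echo $((end - start))
}

# Usage, with the path being a placeholder for one of your own files:
#   time_read /path/to/large/file
```

Keep in mind that a plain dd read goes through the file system cache, so a second run on the same file may be faster for that reason alone.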