Operating System - OpenVMS

Re: OpenVMS Cluster System Performance

 
SOLVED
Feige
Advisor

OpenVMS Cluster System Performance

Dear all,

I set up an Alpha cluster with two nodes, A and B. The configuration is as follows:

Node A: Memory 4 GB, OpenVMS V8.2, Oracle 9.2.0.2
Node B: Memory 4 GB, OpenVMS V8.2, Oracle RAC 9.2.0.2

We boot node A and node B. After checking with $ show cluster that the cluster is OK, I start Oracle on node A and node B separately. The instance names are "YUDB1" and "YUDB2", and the database name is "YUDB".

After Oracle starts successfully, I start the "YUAPP" application in one of two modes, and the results are completely different:

1) Standalone: I start "YUAPP" on node A only. If needed, I first stop "YUAPP" on node A manually and then start it on node B. In this mode performance is very good: CPU utilization is only 5%~10% at most, and page faults are 0~40.
2) Online/standby: I start "YUAPP" in "ONLINE" mode on node A and in "STANDBY" mode on node B. Two "SWITCHMONITOR" processes check the "YUAPP" status every second. Once "ONLINE" is lost, the standby node automatically switches from "STANDBY" to "ONLINE". Unfortunately, in this mode CPU utilization is higher than 25%, and page faults sometimes reach 400.

What's wrong? Please help me, because the customer is challenging me. Thanks in advance.
labadie_1
Honored Contributor

Re: OpenVMS Cluster System Performance

Hello

I am not sure I have understood what you want to do, but you should run some tool in order to have some data to analyze, like
ECP
http://h71000.www7.hp.com/openvms/products/ecp/index.html
or
TDC
http://h71000.www7.hp.com/openvms/products/tdc/index.html

A basic
$ monitor dlock
on both nodes should be interesting, as would
$ monitor process/topcpu
$ monitor process/topfault
$ monitor modes

Feige
Advisor

Re: OpenVMS Cluster System Performance

I set the accounts as follows: SYSTEM with PRCLM=90 and ORACLE with PRCLM=50.

Maybe that is too small? What value would be OK?

Karl Rohwedder
Honored Contributor

Re: OpenVMS Cluster System Performance

I doubt that PRCLM is the culprit; it is the subprocess quota limit. To be sure, just increase it and try.

regards Kalle
Robert Gezelter
Honored Contributor

Re: OpenVMS Cluster System Performance

Feige,

What is running on the node that is "performing poorly"? What processes are actively consuming CPU time and generating page faults?

Much of this information can be determined using the MONITOR utility and supplemental tools such as T4.

- Bob Gezelter, http://www.rlgsc.com
Hein van den Heuvel
Honored Contributor

Re: OpenVMS Cluster System Performance

Typically incorrect process limits will cause an application to fail, not for it to go slower. That is unless the error handler would go into a tight loop trying again and again.
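
To make the "tight loop" failure mode concrete, here is a minimal Python sketch. It is not the poster's code, and the function name, attempt limit and delays are all assumptions chosen for illustration. A handler that retries immediately after every failure burns CPU; one that waits longer after each failure and eventually gives up does not.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call operation() until it succeeds, pausing longer after each failure.

    Hypothetical helper: 'operation' stands for whatever step keeps failing
    (for example, creating a subprocess). A tight loop would retry with no
    pause at all, which is what drives CPU use up.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RuntimeError:
            if attempt == max_attempts:
                raise            # give up and report the error instead of spinning
            sleep(delay)         # yield the CPU before trying again
            delay *= 2           # back off: wait twice as long next time
```

If the real code retries without any pause or limit, that alone could account for one CPU being kept busy.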

Are those 4-cpu boxes?
So could 25+ % cpu be explained by a process looping?

Did it ever work 'correctly'?

Do you have system performance tools running (ECP, T4, ...)?

I would suggest a simple $MONI PROC/TOPC and/or MONI PROC/TOPF to 'see' what process(es) are using the extra resources and take it from there.

I would not be surprised if that was an application process, not Oracle, nor the system.
You may want to contact the support organization for the application. (is that you? :-).


Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting

Feige
Advisor

Re: OpenVMS Cluster System Performance

Hein,

Thank you very much!

Now a subprocess runs for about 10 minutes and then fails and exits. After I restart the subprocess, the same thing happens again. My god!

I thought the ORACLE account's PRCLM was too small (PRCLM=50), so I changed it to PRCLM=200, but I did not run AUTOGEN or restart the Alpha server. It still happens.

Maybe after changing the parameter to PRCLM=200 I must run AUTOGEN and restart?
Feige
Advisor

Re: OpenVMS Cluster System Performance

Hein,
The application process was developed by me. But if I start Oracle on only node A or only node B, everything is OK.

If I start Oracle on both node A and node B, then the problem happens.

Thanks!
Feige
Advisor

Re: OpenVMS Cluster System Performance

Hi, everyone,

Is there anything else it could be? I still can't fix the problem.

Help me. The customer is challenging me...

Thanks!
Hein van den Heuvel
Honored Contributor

Re: OpenVMS Cluster System Performance

>> I still can't fix the problems.
>> Help me.
>> Customer is challenging me.........

Sure. Send money first.

Seriously, if a customer is paying you to sort this out, and you can not, then you may need to engage professional help.
Maybe from Oracle, maybe from HP, maybe independent.
So far I have not seen enough pertinent data to suggest that the problem is in capable hands, nor enough data to allow the willing (and able!) folks here to help beyond basic stuff.

Best regards,
Hein.
Feige
Advisor

Re: OpenVMS Cluster System Performance

Hein,

Thanks!

I understand what you mean. But in my opinion, since I developed the project and I also purchased HP hardware and software, I need to get help from HP. I also know that HP service is the best in IT, especially in China.

If you need any more data, please tell me. Thank you very much!
labadie_1
Honored Contributor

Re: OpenVMS Cluster System Performance

>>> Anything else if it's possible?I still can't fix the problems.

Yes, install ECP or T4 or TDC, collect data and post it, or post the output of the various
$ monitor ...
commands already asked for.

Such a problem is not easy to solve through a forum, by the way.
Jan van den Ende
Honored Contributor

Re: OpenVMS Cluster System Performance

Feige,

>>>
Now a subprocess runs for about 10 minutes and then fails and exits. After I restart the subprocess, the same thing happens again. My god!
<<<

So, this definitely shows PRCLM is NOT the cause. Too low a value can prevent new processes from starting, but does not influence processes that have already started.

I have not yet seen (or did I overlook it?) a process termination status for the failing processes.
$ ACCOUNT /IDEN= /FULL
-> the process will have a final status.

Let's see what we can make out of that.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Robert Gezelter
Honored Contributor

Re: OpenVMS Cluster System Performance

Feige,

First, since this is an OpenVMS cluster, polling is not the most efficient way to do the monitoring of the other task. A Lock (using the OpenVMS Lock Manager) is the correct way to do this. Additionally, this will inherit all of the underlying cluster presumptions without the risk of implementation problems.
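
As a rough sketch of the idea only: the Python below uses an in-process threading.Lock as a stand-in for a cluster-wide exclusive lock. It is not the OpenVMS Lock Manager API, and the function names and the simulated failure are assumptions for illustration. The point is that the standby simply waits in the lock queue, using no CPU, and becomes ONLINE the moment the current holder gives the lock up.

```python
import threading


def run_node(name, lock, events, online, stop):
    """Model of one node: wait for the lock, then act as the ONLINE node."""
    with lock:                          # blocks, without polling, until granted
        events.append(f"{name} ONLINE")
        online.set()                    # signal that this node has taken over
        stop.wait()                     # serve until told to stop
        events.append(f"{name} stopped")
    # leaving the 'with' block releases the lock, which wakes the next waiter


def demo():
    lock = threading.Lock()             # stand-in for the cluster-wide lock
    events = []
    a_online, b_online = threading.Event(), threading.Event()
    stop_a, stop_b = threading.Event(), threading.Event()
    a = threading.Thread(target=run_node, args=("A", lock, events, a_online, stop_a))
    b = threading.Thread(target=run_node, args=("B", lock, events, b_online, stop_b))
    a.start()
    a_online.wait()                     # A holds the lock and is ONLINE
    b.start()                           # B queues behind A as the standby
    stop_a.set()                        # simulate A going away
    b_online.wait()                     # B takes over with no polling loop
    stop_b.set()
    a.join()
    b.join()
    return events
```

Calling demo() returns the sequence of state changes. One difference to keep in mind: in this model the lock is only released when the holder leaves the with block, so a real design would depend on the lock service itself releasing the lock when the holding process goes away.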

I cannot tell whether Oracle is configured properly for a cluster configuration.

If utilization is high in an otherwise idle system, I would suspect an error in the implementation of polling. The only way to understand precisely what is happening is a detailed review of the code involved.

Without more details, it is impossible to understand precisely what is happening. As has been commented, Hein, myself, and others do provide consulting services in these areas.

- Bob Gezelter, http://www.rlgsc.com
Hoff
Honored Contributor
Solution

Re: OpenVMS Cluster System Performance

Oracle RAC on an OpenVMS cluster isn't a cheap configuration, and it isn't going to be cheap to keep it going when something goes wrong.

As a commercial entity, you're now faced with a decision. Specifically, find and fix the issue yourself (and -- free advice -- PRCLM is almost certainly either not involved, or is only peripherally involved), get somebody else to find and fix the issue (which will probably not be free), learn about the environment and how to fix the problem (training), or give your customer their money back.

More free advice: your code is the assumed and most likely culprit here, until proven otherwise. (I'm not intending this position to be derogatory, either. I assume my own code is the culprit, until I can prove the error lurks elsewhere.) Run your code under /DEBUG, or instrument your code, or both. See what's going on within your code.
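
One cheap form of instrumentation, sketched here in Python purely as an illustration (the class name and the idea of metering the monitoring loop are assumptions, not anything taken from the actual application): count how often the polling loop really runs and compare that with the intended once per second.

```python
import time


class PollMeter:
    """Counts loop iterations and reports the average rate per second."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock              # injectable so the meter can be tested
        self.start = clock()
        self.calls = 0

    def tick(self):
        """Call once per iteration of the loop being measured."""
        self.calls += 1

    def rate(self):
        """Average iterations per second since the meter was created."""
        elapsed = self.clock() - self.start
        return self.calls / elapsed if elapsed > 0 else float("inf")
```

Call tick() at the top of the loop and log rate() now and then. A loop meant to run once per second that reports thousands of iterations per second would point straight at the code rather than at Oracle or OpenVMS.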

Create a reproducer. Post it.

Which vendors have the best support -- or even any support -- is irrelevant here, if and when you're writing your own code. That's between you and your customers. But if you (or somebody you formally bring in) can prove Oracle or OpenVMS are broken, then the vendor might or will be interested.


Feige
Advisor

Re: OpenVMS Cluster System Performance

Thanks everyone!

Special thanks to Hoff, Robert and Hein for your support!

It is fixed now.