Operating System - HP-UX

Re: Oracle 10G RAC - crashes under load - consumes free mem

 
Robin T. Slotten
Trusted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

We still have an occasional system panic caused by Oracle evicting a node, with no log entries anywhere. We are currently working with Oracle to track it down. One of the steps we took was to replace the interconnect switch with a 1000 Mb/s (gigabit) switch, after I caught the traffic surging past 100 Mb/s just before a panic. After replacing the switch we saw a great improvement, and we have since tracked a lot of interconnect traffic well over 100 Mb/s. We also received a document from Oracle about setting realtime priority for the cssd process. This makes the interconnect traffic one of the top priorities on the system. This also helped quite a bit. Just an update, in case someone else is fighting this problem.
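
For anyone who wants to watch for the same kind of surge, here is a minimal sketch of the arithmetic, not the exact tooling we used. It assumes you can sample a cumulative byte counter for the interconnect interface at two points in time (how you read that counter is up to you), and the function name, sample values, and 90% alert threshold are made up for illustration. Counter wrap-around is not handled.

```python
def link_utilization(bytes_before, bytes_after, interval_sec, link_mbps):
    """Return (observed Mb/s, fraction of link capacity) between two samples.

    bytes_before / bytes_after are cumulative byte counters for the
    interface (hypothetical inputs; counter wrap-around is ignored).
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    mbps = (bytes_after - bytes_before) * 8 / 1_000_000 / interval_sec
    return mbps, mbps / link_mbps


# Made-up samples: 60,000,000 bytes moved in 5 seconds on a 100 Mb/s link.
observed, fraction = link_utilization(1_000_000_000, 1_060_000_000, 5, 100)
if fraction > 0.9:  # assumed alert threshold
    print(f"interconnect near saturation: {observed:.1f} Mb/s "
          f"({fraction:.0%} of link)")
```

Run from cron or a loop, something like this at least leaves a record of the traffic level just before an eviction.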

Rob...
IF you do it more than twice, write a script.
Robin T. Slotten
Trusted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

BTW, we also increased the total RAM to 48 GB on both nodes. This allowed us to increase the SGA.

Rob...
IF you do it more than twice, write a script.
Hein van den Heuvel
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

Thanks for the update.

It makes sense that you need a Gb interconnect.

I was closely involved with early Oracle RAC work, albeit on Tru64 for Digital/Compaq. We used a dedicated interface called "Memory Channel" (reflective memory) for latency measured in microseconds and high bandwidth. A great technical solution, but too expensive, since it required dedicated hardware. At the same time our competition (in those days) at HP, using HP-UX, was using HyperFabric (if I remember the name correctly), and everyone was considering InfiniBand.

To consider a 100 Mb LAN as a viable alternative seems like a stretch to me, and I am surprised Oracle support/consulting let you go that route.

You see, the RAC interconnect is NOT just an 'I'm alive' heartbeat kind of thing. It is very active, with two flavors of activity:
- Many short lock messages
- Fewer large database page block ships (cache fusion!)

The lock messages would readily saturate a 100 Mb/s link in packets/sec, well before the Mb/s limit is reached.
The block shipping will push the Mb/s limits.
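
To put rough numbers on the two flavors, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the message sizes, the rates, the 38-byte per-frame Ethernet overhead, and a standard 1500-byte MTU without jumbo frames. These are not measured RAC figures, and the function name is made up.

```python
import math

FRAME_OVERHEAD = 38  # assumed per-frame Ethernet overhead in bytes
MTU = 1500           # assumed standard Ethernet MTU, no jumbo frames


def traffic_load(msgs_per_sec, msg_bytes):
    """Return (packets_per_sec, megabits_per_sec) for one traffic class."""
    frames_per_msg = max(1, math.ceil(msg_bytes / MTU))
    wire_bytes = msg_bytes + frames_per_msg * FRAME_OVERHEAD
    pps = msgs_per_sec * frames_per_msg
    mbps = msgs_per_sec * wire_bytes * 8 / 1_000_000
    return pps, mbps


# Hypothetical workload: 200-byte lock messages and 8 KB block ships.
locks = traffic_load(msgs_per_sec=20_000, msg_bytes=200)
blocks = traffic_load(msgs_per_sec=1_500, msg_bytes=8192)

for name, (pps, mbps) in (("lock messages", locks), ("block ships", blocks)):
    print(f"{name}: {pps:,.0f} packets/s, {mbps:.1f} Mb/s")
```

With these made-up inputs the lock traffic generates far more packets per second while staying well under 100 Mb/s, and the block traffic alone already exceeds 100 Mb/s at a much lower packet rate. The real limits depend on what the NICs, switch, and CPUs can actually handle.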

In the final days of Tru64 they even considered a hybrid: Memory Channel for locks, gigabit Ethernet for data.

Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting
Robin T. Slotten
Trusted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

The system was configured by a consultant before I got here, and I have found a number of things I didn't agree with. It has been quite time-consuming to get this cluster tuned and performing well. Some of it is Oracle, some is HP-UX, and of course bad SQL is bad SQL no matter how fast a machine you run it on. Thanks for the insight.
Rob...
IF you do it more than twice, write a script.
Eric Antunes
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

Hi Rob,

Maybe this consultant was also sure about putting the redo logs on RAID 5...

Just a thought, :)

Eric Antunes
Each and every day is a good day to learn.
Vladimir Fabecic
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

Robin
I also had problems with Oracle RAC (on a Tru64 cluster with a Memory Channel interconnect).
Even though the cluster interconnect was the best type available, it was not the only problem.
After some time I saw that the main problem was the applications. RAC is not good for all types of applications. It is good for many "short connections", not for applications that cause a large number of locks.
I spent a lot of time on OS tuning, and the DBA spent a lot of time on database tuning.
But only application tuning did some good.
I once also had a test Tru64 cluster with RAC, but with a gigabit Ethernet cluster interconnect. Performance was much worse than with Memory Channel (a latency problem).
From my experience, gigabit Ethernet is the minimum for a cluster interconnect.
As Hein said, it is not just a heartbeat.
In vino veritas, in VMS cluster