
Re: Oracle 10G RAC - crashes under load - consumes free mem

 
Robin T. Slotten
Trusted Contributor

Oracle 10G RAC - crashes under load - consumes free mem

Oracle RAC 2 node
(2) ia64 hp server rx4640
16 GB Memory with about 4 GB free under normal load
HP-UX 11.23
CRS and ASM
XP12000 storage array
* NOT USING MCSG

LAN = 1000 Full-Duplex
Interconnect = 100 Full-Duplex

LAN is going to a large Cisco Core switch
Interconnect is an isolated 100MB Cisco (2850?) switch with just these 2 machines.

System has worked for some time in development. We started to load test the system and have had a few crashes that appear to be TOC crashes.

Just before the system or systems crash, I can see the free memory suddenly disappear, going from about 4 GB free to 0 free in less than 10 minutes.

One thing I have noticed is the logical disk IO seems especially high and seems to continually increase all the time that Oracle is running ( days and weeks ). Most of this traffic appears to be going through the interconnect.

My obvious question: Is a 100Mb interconnect an issue? I have never been able to catch it pushing more than 50 MB max. Usually it is down around 10-20 MB.
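
For reference, a small sketch of the unit arithmetic behind this question. It only converts a nominal link speed from megabits to megabytes per second and ignores protocol overhead; the function name mbit_to_mbyte is hypothetical, and the units of the traffic figures quoted above are not confirmed anywhere in the thread.

```shell
# Convert a nominal link speed in megabits per second to megabytes per second.
# Assumption: 8 bits per byte, no allowance for protocol overhead.
mbit_to_mbyte() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 8 }'
}

mbit_to_mbyte 100     # the 100 Mb interconnect
mbit_to_mbyte 1000    # the 1000 Mb LAN
```

A 100 Mb link tops out at about 12.5 megabytes per second, so it is worth checking whether the monitoring tool reports megabits, megabytes, or totals per sampling interval before drawing conclusions.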

Has anyone seen this memory consumption issue?

Whatever triggers the event happens so fast that it leaves no dumps and very little information in any logs. Most of my clues have come from Measureware logs.

Rob...


IF you do it more than twice, write a script.
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Oracle 10G RAC - crashes under load - consumes free mem

This is rather tough to track down given the limited data available. The interconnect should never cause the system to crash and if your metrics are accurate, the bandwidth is sufficient. I assume that you have tuned the dbc_max_pct value down to a reasonable level (no more than 10%). Since this box is running out of memory, the very first thing that I would do is reduce maxdsiz_64bit and reduce shmmax so that no one process is able to grab all the memory in sight. I would expect the application to then possibly fail with application errors (or warnings) rather than crashing the system. The system then at least has a chance of telling you what is actually happening.
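
As a rough sketch of what lowering those two tunables could look like on 11.23, the snippet below only prints kctune commands for review and does not execute them. The 2 GB and 1 GB caps are illustrative assumptions, not recommendations, and gb_to_bytes is a hypothetical helper. Whether a change takes effect immediately or at the next boot should be checked against kctune's own output.

```shell
# Convert whole gigabytes to bytes; awk is used so the result does not
# depend on the shell supporting 64-bit arithmetic.
gb_to_bytes() {
    awk -v g="$1" 'BEGIN { printf "%.0f\n", g * 1024 * 1024 * 1024 }'
}

# Dry run: print the commands instead of running them.
# The sizes below are placeholders chosen for illustration only.
echo "kctune maxdsiz_64bit=$(gb_to_bytes 2)"
echo "kctune shmmax=$(gb_to_bytes 1)"
```

Printing first makes it easy to compare the proposed values with the current ones (kctune followed by a tunable name shows the current setting) before committing to anything.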

You should also have a look at MetaLink for any available Oracle patches and/or any recommended HP-UX patches.

Whenever I see huge numbers of logical I/O's related to a database, that immediately suggests inefficient SQL because the system is being asked in essence to re-read data that it should already know. That doesn't cause the system to crash but it does indicate that some SQL tuning is probably in order.
If it ain't broke, I can fix that.
Robin T. Slotten
Trusted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

Thanks Clay,
I always try to maintain dbc_max_pct at about 500MB or less. In this case it is 5% or 746MB.

maxdsiz 1073741824 maxdsiz_64bit 4294967296
shmmax 1073741824

Basically Oracle's target parms across the board.

Patches are current as of last Dec.
The load is somewhat artificial as it is being done in a test mode. I suspect a lot of duplicate queries, so that would be in line with your statement. What I do think is strange is that the logical IO seems to continue to grow even after the load testing has dropped off, as if there is a process stuck in a loop somewhere. System CPU use is proportionally high compared with other systems I have worked with, but I attribute that to ASM running under the control of root.

I'll run this by our team tomorrow and see if we can give it a try.

Thanks,
IF you do it more than twice, write a script.
Yogeeraj_1
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

hi rob,

You may also wish to run a STATSPACK report, or use the Enterprise Manager Database Console, to verify the overall database performance. Any bottlenecks will be highlighted there.


hope this helps too!


kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Eric Antunes
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

Hi Robin,

I think your issue must be related to bad application SQL or to RAC issues.

About RAC, I think MetaLink is the best place to search for bugs, notes, alerts, etc.

About possible bad application SQL, check for it with the following script:

select substr(s.username,1,20) "User Name",
s.osuser "OS User",
s.status "Status",
lockwait "Lock Wait",
substr(s.program,1,30) "Program",
substr(s.machine,1,15) "Machine",
p.program "Process Program",
si.consistent_gets "Consistent Gets",
s.process "Process PID",
p.spid, p.pid, s.serial#, si.sid
from sys.v_$sess_io si, sys.v_$session s, sys.v_$process p
where s.username is not null and
si.sid(+)=s.sid
and p.addr(+)=s.paddr
order by si.consistent_gets desc

If the first rows have much bigger consistent_gets than the others, then it is likely that there is bad SQL.

Best Regards,

Eric
Each and every day is a good day to learn.
Robin T. Slotten
Trusted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

The strange thing about this problem is we don't see a lot of stress on the system other than the logical IOs. We have been running all types of statistics, and the system and Oracle seem to be fairly happy until the "event" that crashes the machine. Thanks for the SQL; we'll give it a shot.

Thanks for the help folks.
IF you do it more than twice, write a script.
Robin T. Slotten
Trusted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

I was finally able to capture a time when the interface LAN was showing 156MB of traffic. We replaced the 100MB hub with a temporary D-Link 1000MB hub. With that problem solved, the application soon consumed memory. Yesterday I installed an additional 32 GB of memory on each machine. (The DBA had tracked a separate issue to lack of SGA memory.)

We will be load testing again soon.
Rob..
IF you do it more than twice, write a script.
Steven E. Protter
Exalted Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

Shalom Rob,

Common Oracle problem.

The two nodes do not have the same OS patches. I'd make sure they have memory leak and consumption patches from HP.

http://www.hpux.ws/system.perf.sh
Might want an idea where all the memory is going.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Bill Hassell
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

If you're not using raw I/O for Oracle data, 11.23 has major buffer cache enhancements that modify the previous recommendations for dbc_max_pct. For 11.23, you may find significant cache performance benefits by increasing the cache size into several Gb. Try 3Gb as a start. I've seen logical I/O rates as high as 125,000 with a 6Gb cache. One great feature of 11.23 is that the cache can be expanded and reduced online and it takes just a few seconds to take effect.

As far as memory usage, I would use measureware and perhaps a once/minute ps analysis of local data for each process, something like this:

#!/usr/bin/sh
date
UNIX95=1 ps -e -o vsz,pid,ruser,args | sort -rn | head -20

Run this script in cron every minute, appending the output to a log file:

* * * * * /usr/contrib/bin/ramusage.sh >> /var/tmp/ramusage.log

The ps list will show any process that suddenly increases local RAM usage. It won't document shared memory, so ipcs -bmop may need to be run in a loop too.
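
A log collected this way gets long quickly. The following is a minimal sketch, not part of the script above, of one way to summarise it. The hypothetical function vsz_growth reports, for each PID, how much vsz changed between the first and last sample in which that PID appeared. It assumes the log holds only date lines and "vsz pid ruser args" rows, that the system awk supports associative arrays, and that PIDs were not reused during the capture.

```shell
# Summarise per-PID vsz growth from a log of repeated
# "vsz pid ruser args" samples separated by date lines.
vsz_growth() {
    awk '
        # Only rows whose first two fields are integers are ps samples;
        # date lines and any ps header line are skipped.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            pid = $2
            if (!(pid in first)) {
                first[pid] = $1
                user[pid]  = $3
                cmd[pid]   = $4      # first word of the command line only
            }
            last[pid] = $1
        }
        END {
            # Output: growth pid user command
            for (pid in first)
                printf "%d %s %s %s\n", last[pid] - first[pid], pid, user[pid], cmd[pid]
        }
    ' "$1" | sort -rn
}

# Example: vsz_growth /var/tmp/ramusage.log | head -10
```

The largest growers sort to the top, which narrows down which processes to look at in the minutes before a crash.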


Bill Hassell, sysadmin
Yogeeraj_1
Honored Contributor

Re: Oracle 10G RAC - crashes under load - consumes free mem

hi Robin,

One further step you can take in analysing the performance of your database is to periodically check v$sqlarea to see which SQL statements are not using bind variables.
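
One common way to spot such statements, sketched here under assumptions, is to group v$sqlarea by a prefix of sql_text and look for prefixes with many near-identical copies. The 40-character prefix and the threshold of 10 copies are arbitrary illustrative values, and literal_sql_query is a hypothetical helper that only prints the query.

```shell
# Print a query that lists SQL prefixes appearing many times in the
# shared pool, a typical sign of literals being used instead of binds.
# The quoted heredoc delimiter keeps the shell from expanding "$sqlarea".
literal_sql_query() {
    cat <<'EOF'
select substr(sql_text, 1, 40) "SQL Prefix",
       count(*) "Copies"
from sys.v_$sqlarea
group by substr(sql_text, 1, 40)
having count(*) > 10
order by count(*) desc;
EOF
}

# Example (assumes a local SYSDBA login is available):
# literal_sql_query | sqlplus -s "/ as sysdba"
```

Statements that differ only in their literal values share the same prefix, so a high count for one prefix is a hint that bind variables would help.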

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)