
Re: Memory problems

 
Bruno Bossier_1
Regular Advisor

Memory problems

We have two rp3440s, each with 2GB of physical memory, both configured in a Serviceguard cluster. We are running 4 Oracle 10g environments distributed over 4 Serviceguard packages.

From time to time, one of the servers just completely hangs. You can't log on anymore and the shell no longer responds, but ping still answers.

When this happened, I did a TC to force a dump, which I sent to HP. HP analysed the dump and told us that the system was working very hard to find free memory. I attached the result of the dump. HP concluded that more memory needs to be added.

However, we are not completely satisfied, because this does not happen all the time. It only happens occasionally.

My question now is whether it is possible to activate some memory diagnostics that could help us find out in more detail which process or processes are consuming a lot of memory.

One process that could be in the picture is a gateway between Oracle and DB2 (on the mainframe). But that is just a guess ....

All help would be appreciated !

Cheers,
Bruno
8 REPLIES
Mark Nieuwboer
Esteemed Contributor

Re: Memory problems

Hi Bruno,

I agree with HP, because Oracle can need a lot of memory from time to time. You only have 1 GB per Oracle environment, and that is not much.

grtz. Mark
Devender Khatana
Honored Contributor

Re: Memory problems

Hi,

The dump clearly showed heavily utilized memory. Some free memory is always required for a system to work smoothly. The problem appears only sometimes because the system is not always equally loaded. You can use other commands to check the utilization from time to time. The best bet would be to run the most heavily loaded package on the other node and see whether similar behaviour is exhibited over a period of time. I do not think suspecting memory in such a scenario will help.

# swapinfo -atm (to see memory/swap utilization)

# sar -w 3 10

# top (to see if vhand is using a lot of resources)

Also check the settings of the dbc_max_pct and dbc_min_pct kernel parameters.

HTH,
Devender


Impossible itself mentions "I m possible"
Keith Bryson
Honored Contributor

Re: Memory problems

Hi Bruno

You don't mention the version of HP-UX, but I had a very similar problem on 11i. Did HP recommend installing patch PHKL_30796? If your system is low on memory, apparently the vhand process runs out of control during heavy utilisation.

Keith
Arse-cover at all costs
Frank de Vries
Respected Contributor

Re: Memory problems

Devender gave a good outline to start with.
Are you sure the system is hanging, or is it just the
inetd daemon that is not working?

You could also check your
/var/adm/syslog/syslog.log
for problems with forking processes or
the inetd daemon.

And have a peek at whether anything unusual appears in:
sar -c (system calls)
sar -q (run and swap queue)
sar -v (kernel nproc and inode)

Also check the values of swchunk and swapmem_on.
(Just because swapinfo does not show swap as full
does not mean it is healthy; it could be fragmented to bits.)

Also, what is the relationship between your
primary and secondary swap?

Do you have any unusual behaviour on your Oracle listeners (check /tmp/.oracle)?

rgds


Look before you leap
Bruno Bossier_1
Regular Advisor

Re: Memory problems

Patch PHKL_30796 is already installed, but thanks for the feedback.

dbc_min_pct = 5
dbc_max_pct = 50

I remember now that HP did indeed suggest lowering these values. Maybe that can help.

sar -c

HP-UX hpux3 B.11.11 U 9000/800 01/09/06

00:00:00 scall/s sread/s swrit/s fork/s exec/s rchar/s wchar/s
00:20:00 125942 127 10263 1.79 1.58 2748990 140795
00:40:00 125752 119 10255 1.74 1.54 2324228 123688
01:00:01 126184 121 10292 1.74 1.54 1217956 120186
01:20:00 125896 111 10261 1.74 1.54 1621087 134914
01:40:00 126174 122 10303 1.75 1.54 2177781 124814
02:00:00 126542 114 10314 1.74 1.54 882546 122032
02:20:00 125990 122 10274 1.75 1.55 2739018 133019
02:40:00 125854 119 10264 1.74 1.54 2520870 122604
03:00:01 125940 110 10257 1.74 1.54 2363198 123314
03:20:00 125531 125 10233 1.75 1.55 2046316 144448
03:40:00 126492 112 10319 1.74 1.53 1221622 122293
04:00:00 126568 120 10331 1.74 1.54 1303983 122393
04:20:00 126274 124 10306 1.75 1.54 2373570 131316
04:40:00 125878 108 10262 1.75 1.54 2461839 123122
05:00:00 126165 123 10286 1.75 1.54 1244144 122774
05:20:00 124691 358 10022 4.76 4.01 2795121 523938
05:40:01 125715 119 10261 1.74 1.54 1640617 121785
06:00:00 126030 120 10277 1.75 1.54 3154739 120281
06:20:00 125864 112 10260 1.74 1.54 2421769 134141
06:40:00 126037 121 10284 1.75 1.54 341847 124584
07:00:00 126086 127 10276 1.75 1.55 3125551 126195
07:20:00 126035 122 10272 1.75 1.54 3308288 133158
07:40:00 126181 121 10294 1.75 1.54 2091309 123733
08:00:01 125537 128 10237 1.75 1.55 2905768 123020
08:20:00 125334 126 10229 1.78 1.57 763117 138359
08:40:00 127264 118 10382 1.84 1.63 706766 2007762
09:00:00 134817 126 10986 1.85 1.63 958527 126508
09:20:00 78571 328 6286 2.01 1.74 3360580 2290756
09:40:00 2465 52 11 1.93 1.66 512079 137739
10:00:00 18286 92 81 1.94 1.71 1159512 625916
10:20:01 52642 129 223 1.79 1.58 1780200 1701507
10:40:00 46502 99 149 1.77 1.56 1380918 1047857
11:00:00 45454 125 215 1.76 1.55 1905085 1737907
11:20:00 2167 50 8 1.82 1.62 650828 131451
11:40:01 7262 119 29 3.82 3.30 1312749 379769
12:00:00 3990 316 161 3.76 3.37 2063376 174314

Average 2471 133 7921 1.97 1.73 1877453 89072
-----
>sar -q

HP-UX hpux3 B.11.11 U 9000/800 01/09/06

00:00:00 runq-sz %runocc swpq-sz %swpocc
00:20:00 5.5 51 9.5 2
00:40:00 4.2 66 9.0 1
01:00:01 4.4 54 0.0 0
01:20:00 5.7 59 9.2 2
01:40:00 5.1 60 9.0 1
02:00:00 2.4 72 2.0 0
02:20:00 3.7 56 9.0 1
02:40:00 5.4 54 0.0 0
03:00:01 3.5 65 0.0 0
03:20:00 3.6 71 10.0 1
03:40:00 3.5 60 0.0 0
04:00:00 4.4 56 0.0 0
04:20:00 4.2 59 9.2 3
04:40:00 4.7 54 0.0 0
05:00:00 3.6 56 2.0 0
05:20:00 3.3 81 1.0 0
05:40:01 3.6 56 0.0 0
06:00:00 5.3 59 7.8 2
06:20:00 3.6 75 9.2 1
06:40:00 3.2 83 7.4 1
07:00:00 3.7 60 8.3 1
07:20:00 2.8 91 7.9 1
07:40:00 4.2 75 9.0 1
08:00:01 3.2 86 7.2 3
08:20:00 3.4 56 7.5 1
08:40:00 5.0 61 0.0 0
09:00:00 3.4 57 7.1 2
09:20:00 1.8 56 7.0 1
09:40:00 1.6 64 16.5 13
10:00:00 1.4 53 9.6 2
10:20:01 2.0 73 3.8 3
10:40:00 1.7 65 12.9 13
11:00:00 2.1 74 9.0 13
11:20:00 2.0 54 0.0 0
11:40:01 1.3 65 8.6 3
12:00:00 1.2 69 6.9 5

Average 3.4 64 10.3 2
-----
>sar -v

HP-UX hpux3 B.11.11 U 9000/800 01/09/06

00:00:00 text-sz ov proc-sz ov inod-sz ov file-sz ov
00:20:00 N/A N/A 2344/4096 0 2013/34816 0 3675/63498 0
00:40:00 N/A N/A 2344/4096 0 2013/34816 0 3675/63498 0
01:00:01 N/A N/A 2344/4096 0 2013/34816 0 3676/63498 0
01:20:00 N/A N/A 2344/4096 0 2013/34816 0 3675/63498 0
01:40:00 N/A N/A 2342/4096 0 2011/34816 0 3667/63498 0
02:00:00 N/A N/A 2344/4096 0 2013/34816 0 3675/63498 0
02:20:00 N/A N/A 2344/4096 0 2013/34816 0 3675/63498 0
02:40:00 N/A N/A 2343/4096 0 2012/34816 0 3670/63498 0
03:00:01 N/A N/A 2347/4096 0 2016/34816 0 3690/63498 0
03:20:00 N/A N/A 2343/4096 0 2012/34816 0 3670/63498 0
03:40:00 N/A N/A 2344/4096 0 2013/34816 0 3674/63498 0
04:00:00 N/A N/A 2343/4096 0 2013/34816 0 3670/63498 0
04:20:00 N/A N/A 2343/4096 0 2012/34816 0 3670/63498 0
04:40:00 N/A N/A 2344/4096 0 2013/34816 0 3674/63498 0
05:00:00 N/A N/A 2348/4096 0 2017/34816 0 3684/63498 0
05:20:00 N/A N/A 2344/4096 0 2014/34816 0 3674/63498 0
05:40:01 N/A N/A 2343/4096 0 2012/34816 0 3670/63498 0
06:00:00 N/A N/A 2344/4096 0 2013/34816 0 3675/63498 0
06:20:00 N/A N/A 2344/4096 0 2014/34816 0 3674/63498 0
06:40:00 N/A N/A 2344/4096 0 2013/34816 0 3674/63498 0
07:00:00 N/A N/A 2347/4096 0 2016/34816 0 3699/63498 0
07:20:00 N/A N/A 2345/4096 0 2014/34816 0 3687/63498 0
07:40:00 N/A N/A 2346/4096 0 2015/34816 0 3691/63498 0
08:00:01 N/A N/A 2344/4096 0 2013/34816 0 3683/63498 0
08:20:00 N/A N/A 2346/4096 0 2019/34816 0 3686/63498 0
08:40:00 N/A N/A 2352/4096 0 2029/34816 0 3702/63498 0
09:00:00 N/A N/A 2358/4096 0 1982/34816 0 3734/63498 0
09:20:00 N/A N/A 243/4096 0 2016/34816 0 1555/63498 0
09:40:00 N/A N/A 256/4096 0 2065/34816 0 1716/63498 0
10:00:00 N/A N/A 232/4096 0 1612/34816 0 1492/63498 0
10:20:01 N/A N/A 229/4096 0 1609/34816 0 1476/63498 0
10:40:00 N/A N/A 230/4096 0 1614/34816 0 1475/63498 0
11:00:00 N/A N/A 231/4096 0 1615/34816 0 1475/63498 0
11:20:00 N/A N/A 234/4096 0 1622/34816 0 1471/63498 0
11:40:01 N/A N/A 193/4096 0 1623/34816 0 1415/63498 0
12:00:00 N/A N/A 205/4096 0 1827/34816 0 1545/63498 0
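
The sar -q figures in this post can also be scanned mechanically for intervals where processes were sitting in the swap queue. The following is only a sketch in Python: the helper name and the threshold of 10 for %swpocc are arbitrary choices for illustration, not HP guidance, and the sample rows are copied from the 09:00 to 11:20 part of the sar -q output above.

```python
# Rows copied from the sar -q output in this post (header plus 09:00-11:20).
SAR_Q_SAMPLE = """\
00:00:00 runq-sz %runocc swpq-sz %swpocc
09:00:00 3.4 57 7.1 2
09:20:00 1.8 56 7.0 1
09:40:00 1.6 64 16.5 13
10:00:00 1.4 53 9.6 2
10:20:01 2.0 73 3.8 3
10:40:00 1.7 65 12.9 13
11:00:00 2.1 74 9.0 13
11:20:00 2.0 54 0.0 0
"""


def busy_swap_intervals(sar_q_text, swpocc_threshold=10.0):
    """Return (time, swpq_sz, swpocc) for rows with %swpocc at or above the threshold.

    The threshold is a hypothetical cut-off chosen for this example.
    """
    hits = []
    for line in sar_q_text.splitlines():
        fields = line.split()
        # Data rows look like: HH:MM:SS runq-sz %runocc swpq-sz %swpocc
        if len(fields) != 5 or fields[0].count(":") != 2:
            continue
        try:
            swpq_sz = float(fields[3])
            swpocc = float(fields[4])
        except ValueError:
            continue  # the column header row is not numeric
        if swpocc >= swpocc_threshold:
            hits.append((fields[0], swpq_sz, swpocc))
    return hits


for when, swpq_sz, swpocc in busy_swap_intervals(SAR_Q_SAMPLE):
    print(f"{when}: swpq-sz={swpq_sz} %swpocc={swpocc:.0f}")
```

With the full sar -q output pasted into the sample string, the same function lists every interval worth comparing against the sar -c and sar -v figures for the same times.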
Devender Khatana
Honored Contributor

Re: Memory problems

Hi,

It is indeed a misconfigured system. Reduce dbc_max_pct to somewhere between 10 and 15 and see the difference. Setting it to 50 means the buffer cache can grow to as much as 50% of memory, leaving only the remaining 50% for processes.

I would suggest changing it to 10, at least for some time, and seeing the difference.
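
As a rough illustration of what those percentages mean on these machines, here is a minimal Python sketch. It assumes 2GB is 2048MB and treats dbc_max_pct as a plain percentage of physical memory that caps the buffer cache; the function name is invented for this example and is not an HP-UX tool.

```python
def buffer_cache_ceiling_mb(phys_mem_mb, dbc_max_pct):
    """Upper bound on the buffer cache in MB for a given dbc_max_pct."""
    return phys_mem_mb * dbc_max_pct / 100


PHYS_MEM_MB = 2048  # assumption: 2GB per rp3440, as stated in the first post

for pct in (50, 15, 10):
    cache_mb = buffer_cache_ceiling_mb(PHYS_MEM_MB, pct)
    rest_mb = PHYS_MEM_MB - cache_mb
    print(f"dbc_max_pct={pct:2d}: cache may reach {cache_mb:.0f} MB, "
          f"leaving about {rest_mb:.0f} MB for everything else")
```

The "everything else" figure includes the kernel as well as the Oracle processes, so the memory actually available to Oracle is lower still.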

Also take some time to assign points to the responses to your earlier threads.

HTH,
Devender
Impossible itself mentions "I m possible"
Alzhy
Honored Contributor

Re: Memory problems

Lowering your dbc_max_pct so the buffer cache is around 600MB to 800MB (the "mythical sweet spot") may not be enough. 2GB on Oracle 10g is barely enough for even the smallest DB instance. How large is your biggest instance's SGA? Also, how are you storing your Oracle data: on filesystems or raw? If on filesystems, do you have them set for direct I/O?
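
For a sense of scale, a buffer cache cap of 600MB to 800MB can be converted back into a dbc_max_pct value. This is a sketch only: it assumes 2048MB of physical memory per node and that the tunable is set in whole percentages, and dbc_max_pct_for_cap is a made-up helper name.

```python
import math


def dbc_max_pct_for_cap(target_cache_mb, phys_mem_mb):
    """Largest whole-number percentage whose cap does not exceed target_cache_mb."""
    return math.floor(target_cache_mb * 100 / phys_mem_mb)


PHYS_MEM_MB = 2048  # assumption: 2GB per node

for target_mb in (600, 800):
    pct = dbc_max_pct_for_cap(target_mb, PHYS_MEM_MB)
    print(f"cap of {target_mb} MB -> dbc_max_pct of about {pct}")
```

On 2GB that works out to roughly 29 to 39 percent, which still leaves less than 1.5GB for the Oracle instances and everything else.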

Also, you may want to check the MWA stats for your memory statistics -- I suppose you have it since you're running ServiceGuard and HP-UX/MC. Most admins and sites overlook the fact that they have MWA and that it has been collecting valuable stats from day 1.
Hakuna Matata.
TwoProc
Honored Contributor

Re: Memory problems

Your dbc_max_pct is way too high. Set it to something like 5 to 10 percent, depending on your workload.

Also, you really need more RAM than that to run Oracle Server. I know many people who have that much RAM in their Windows desktops.
Memory is very cheap compared with the performance gain you'll get just from configuring the OS and the Oracle system correctly. Your Oracle buffer cache and shared pool will thank you with programs that perform much better.
We are the people our parents warned us about --Jimmy Buffett