Operating System - HP-UX

Greg Laws
Advisor

Performance - Is my system thrashing?

Hello everyone,

I've got an rp3440 running HP-UX 11.11, Informix 7.31, and PeopleSoft. Lately we've had some serious issues with batch processing, where jobs are taking 2-3 times as long to finish. I've got OVPA installed, so I can look at historical performance data. One thing we discovered today is that our Informix database is grabbing four shared memory segments. According to our documentation, when this happens on a PA-RISC box it causes thrashing and, of course, very poor performance, due to limitations of PA-RISC.

What I'd like to know is: which OVPA metrics are best for showing that thrashing is going on? Also, has anyone else run into this type of problem with Informix before?

Thanks!
Bill Hassell
Honored Contributor

Re: Performance - Is my system thrashing?

There is nothing in a PA-RISC box where shared memory segments cause thrashing. Like any virtual memory OS, if HP-UX runs out of RAM, portions of programs as well as shared memory areas will be paged out. This is nothing more than a lack of RAM. When you run Glance, type the command m (it selects the memory page) and look at the Page Out rate. If the numbers are two digits or more, especially during batch processing, then the system is badly undersized. You'll probably need to double your RAM (or reduce the number and size of programs you run at the same time).


Bill Hassell, sysadmin
RAC_1
Honored Contributor

Re: Performance - Is my system thrashing?

As Bill said, first check what is causing the thrashing. Excessive page-outs are the measure to check for that. Beyond that, also check CPU, memory, network, and disk utilization, particularly during the batch processing period.
There is no substitute to HARDWORK
Rajesh SB
Esteemed Contributor

Re: Performance - Is my system thrashing?

Hi Greg,

Bill explained the cause of thrashing well. Analyse the memory utilisation metrics.
Gather vmstat metrics as well, so you can confirm a memory shortage by analysing the paging activity that vmstat reports.
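
For anyone who wants to script that check, here is a minimal sketch that scans vmstat output for page-outs. It is an illustration, not a tested HP-UX tool: the function name flag_pageouts is made up, the default threshold of 10 is only the "2 digits or more" rule of thumb from earlier in the thread, and the po column is located by its header name rather than by a fixed position.

```shell
#!/bin/sh
# Flag vmstat samples whose page-out column ("po") meets a threshold.
# Assumption: some header row contains a field literally named "po".
flag_pageouts() {
    threshold=${1:-10}
    awk -v limit="$threshold" '
        # Until the "po" header field is found, treat every row as a header.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "po") col = i
            next
        }
        # Data rows start with a number; repeated header rows are skipped.
        $1 ~ /^[0-9]+$/ {
            n++
            if ($col + 0 >= limit + 0) {
                printf "sample %d: po=%s\n", n, $col
                hits++
            }
        }
        END { printf "%d of %d sample(s) with po >= %d\n", hits + 0, n + 0, limit }
    '
}
```

Typical use would be something like: vmstat 5 12 | flag_pageouts 10, run while the batch jobs are active.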

Thanks & Regards,
Rajesh
Ninad_1
Honored Contributor

Re: Performance - Is my system thrashing?

Hi,

The underlying meaning of the vendor's statement could be that you need a single contiguous shared memory segment rather than shared memory split across several regions of memory. If so, check whether your SHMMAX parameter is larger than the total shared memory your database requires. It is true that a non-contiguous shared memory allocation for a database can cause performance problems.
Next, check whether your memory utilisation is high (above 90-95%) and whether there are page-outs. If so, also check which processes are using the most memory:
UNIX95= ps -e -o "pid,user,sz,args" | sort -nr -k 3 | more
The OVPA metrics to look at would be GBL_MEM_UTIL, GBL_MEM_PAGEOUT_RATE, and GBL_MEM_SWAPOUT_RATE.
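
As a rough sketch of how those three metrics could be summarised once they are out of OVPA: the function below reports the peak value of each one. It assumes the data has already been exported to a comma-separated file whose first row holds the metric names; that file layout and the function name peak_mem_metrics are assumptions for illustration, not something OVPA is guaranteed to produce by default.

```shell
#!/bin/sh
# Report the peak value of each memory metric in a comma-separated export.
# Assumptions: the first row is a header with the metric names, and every
# later row is one sample.
peak_mem_metrics() {
    awk -F',' '
        NR == 1 {
            nf = NF
            for (i = 1; i <= NF; i++) {
                gsub(/[" ]/, "", $i)    # drop quotes and padding
                if ($i ~ /^GBL_MEM_(UTIL|PAGEOUT_RATE|SWAPOUT_RATE)$/) name[i] = $i
            }
            next
        }
        {
            for (i in name)
                if (!(i in peak) || $i + 0 > peak[i]) peak[i] = $i + 0
        }
        END {
            # Print in header order so the output is stable.
            for (i = 1; i <= nf; i++)
                if (i in name) printf "%s peak=%.1f\n", name[i], peak[i]
        }
    '
}
```

Usage would be along the lines of: peak_mem_metrics < export.csv, where export.csv is a placeholder name. A non-zero peak for the page-out or swap-out rate during the batch window is what would point to memory pressure.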

Regards,
Ninad
Darrel Louis
Honored Contributor

Re: Performance - Is my system thrashing?

Hi,

Can you check whether you have a CPU, memory, or I/O bottleneck?
I know from past experience that we had to do a DB reorganisation (export/import) every now and then.
You can check whether that is needed by looking at the number of extents you have per table.
If I remember correctly, it should be something like:
dbaccess sysmaster
select dbsname,tabname,count(*) from sysextents
group by dbsname,tabname
order by 3 desc

Otherwise check it with your DBA.

Darrel

Steve Lewis
Honored Contributor

Re: Performance - Is my system thrashing?

What you were told about is the problem of protection ID thrashing, which occurs when you have 7 or more shared memory segments on a PA-RISC 32- or 64-bit server.

See this fantastic thread for the evidence of how it would happen:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=147349

I am not sure whether this still applies to HP-UX 11.11 or above. I certainly never see it on my larger servers, which have 4-16 CPUs, so maybe it's more of a single-CPU problem.

As everyone above says, you have probably just run out of memory.

As an aside, Informix 7.31 is nearing the end of its life, and Informix 9.x or 10.x will take up much more memory just to run itself, as will Oracle 9 or 10, by the way. The more features they build in, the more memory they consume.

You can temporarily save memory by reducing BUFFERS and increasing the primary V segment. Look at the output of onstat -g seg and compare it with your value of SHMVIRTSIZE in the onconfig file.

If you have extra segments allocated but not used, then a single dodgy query or a peak of extra users may have caused Informix to add segments. You can free those online by using onmode -F.







Greg Laws
Advisor

Re: Performance - Is my system thrashing?

Thanks everyone for your replies!

Memory utilization on the system during normal operation over the last month has ranged from 73% to 81%. According to OVPA there hasn't been a single page-out in that time. CPU utilization averages about 40%-60%, with one spike to 80% for about five minutes; it never hits 100%. Disk utilization is usually around 20%-30%, but there are a few spikes to 100% that last about 5-10 minutes once a week. Swap space utilization stays at about 60%-65% (swapmem_on=1).

Informix shared memory segment sizes are set at 100MB with a max of 2GB total. SHMMAX is set for 190MB in the kernel.
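
As a rough sanity check on those settings, the arithmetic can be sketched as below. It assumes every segment is the same 100MB size and takes 2GB as 2048MB; the function name max_segments is made up for the example.

```shell
#!/bin/sh
# Upper bound on equally sized segments that fit under a total cap.
# Both arguments are in MB; shell integer division rounds down.
max_segments() {
    seg_mb=$1
    total_mb=$2
    echo $(( total_mb / seg_mb ))
}
```

Running max_segments 100 2048 prints 20. So, under those assumptions, the configuration would allow far more than the 7 segments mentioned earlier in the thread, even though only four are in use now. Each 100MB segment is also below the 190MB SHMMAX, so the kernel limit should not force a single segment to be split.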

These metrics don't seem to indicate any hardware bottlenecks or thrashing. Would you agree?

This one is a head scratcher for me, especially since my clients want me to fix the problem with the miracle cure ... a reboot.
A. Clay Stephenson
Acclaimed Contributor

Re: Performance - Is my system thrashing?

One thing that I would look for is a high context-switch rate. That can be caused by someone setting a very low timeslice value rather than leaving it very near the default of 10. A timeslice of 1 can lead to very severe degradation as the system begins to load significantly.
If it ain't broke, I can fix that.
Greg Laws
Advisor

Re: Performance - Is my system thrashing?

Thanks for the additional input Clay.

I'm running some batch processing right now and I get the following from glance CPU Report:

CPU REPORT                                      Users= 7
State              Current    Cumulative        High
--------------------------------------------------------------------------------
Load Average           1.0           0.7         1.2
Syscall Rate       25268.6       25090.7     39189.6
Intrpt Rate         8847.8        6147.3     11794.7
CSwitch Rate       20235.7       11311.2     23065.6

Top CPU user: PID 22476, oninit 44.2% cpu util
Active CPUs: 4

I'm not sure what is considered a high value for CSwitch. This system has 4 CPUs and 12GB of memory.
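
One way to make that figure easier to compare is to normalise it per CPU. This is only a sketch of the arithmetic, with a made-up function name per_cpu_rate; it does not say what counts as "high", it just divides the system-wide rate by the number of active CPUs.

```shell
#!/bin/sh
# Divide a system-wide rate by the CPU count, to one decimal place.
per_cpu_rate() {
    awk -v rate="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", rate / cpus }'
}
```

With the numbers above, per_cpu_rate 20235.7 4 prints 5058.9 context switches per second per CPU for the current rate, and per_cpu_rate 11311.2 4 prints 2827.8 for the cumulative rate.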