Alfred_7
New Member

Cache Monitor

Hi,

I am working on tuning an application for cache misses. Is there any tool for monitoring cache hits/misses for my application, or is there some other mechanism by which I can achieve this?

Thanks,
Alfred
6 REPLIES
Rammig Claus
Frequent Advisor

Re: Cache Monitor

Hi Alfred,

You can get cache hit/miss figures with the command sar -b. Look at %rcache and %wcache.

Or you could use gpm (glance)
- Reports
--- Disk info
----- Disk report
There you can see read and write cache hits.

But these are global values, not values for a single application.

Best regards ...
Claus
No risc no fun
Sridhar Bhaskarla
Honored Contributor

Re: Cache Monitor

Hi Alfred,

Look at the sar -b option.

sar -b 2 20

%wcache and %rcache are the write cache hit ratio and the read cache hit ratio. If you want the underlying numbers, you can get them from the bread/s, lread/s, bwrit/s and lwrit/s columns.
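
For reference, the sar man page describes %rcache as 1 - bread/lread (and %wcache likewise from bwrit and lwrit), so the ratios can be recomputed from the raw columns. Here is a minimal C sketch of that arithmetic. The function name cache_hit_pct is made up for illustration, and returning 0 for an interval with no logical I/O is just a choice made in this sketch.

```c
#include <assert.h>

/* Buffer cache hit ratio in percent, using the relation from the sar
 * man page: hit ratio = 1 - physical/logical.
 *   physical_per_s: bread/s (reads) or bwrit/s (writes)
 *   logical_per_s:  lread/s (reads) or lwrit/s (writes)
 * The function name is hypothetical, not part of any HP-UX API. */
double cache_hit_pct(double physical_per_s, double logical_per_s)
{
    assert(physical_per_s >= 0.0 && logical_per_s >= 0.0);

    /* No logical I/O in this interval: report 0 rather than divide by
     * zero (a choice made for this sketch). */
    if (logical_per_s == 0.0)
        return 0.0;

    return (1.0 - physical_per_s / logical_per_s) * 100.0;
}
```

For example, 25 bread/s against 100 lread/s works out to a 75% read cache hit ratio.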

Look at the man page of "sar" for more options.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Michael Ehrig
Advisor

Re: Cache Monitor

Hi Alfred,
It is not 100% clear from your question whether you are asking about CPU cache optimization or buffer cache optimization. For the buffer cache, use the 'sar' command as pointed out by others. But I suspect you are talking about CPU cache (data/instruction) monitoring. If you have an Itanium based HP-UX system the answer is easy: yes, monitoring is available, use the tool 'caliper'. For PA-RISC based systems it is more complicated. You can use 'Cxperf' for cache miss measurement, BUT you must have V-Class hardware to do that. For other PA-RISC based systems you're pretty much out of luck.
Michael Tully
Honored Contributor

Re: Cache Monitor

If you're concerned about buffer cache usage, you may also need to look at your current kernel parameters. The recommendation is that your buffer cache should be between 300 and 500 MB, no more.
The parameters in question are 'dbc_max_pct' and 'dbc_min_pct'.

Having these any higher may distort your buffer cache hit ratio. As stated, 'sar -b' and glance are the correct tools for monitoring.
If you don't have glance, you can install a 60 day trial copy from your application CD set.
Anyone for a Mutiny ?
Alfred_7
New Member

Re: Cache Monitor

Hi,

Thanks a lot for your replies.

I am actually working on circuit simulator code, and I need to tune it for cache misses. I am running the code on a PA-8600 with 1.5 MB of cache, and it is an A-class machine.

Cxperf is a performance monitoring tool which gives insight into cache misses, but it is available only on V, D and K class machines.

I also tried to get other tools like PAPI (Performance Application Programming Interface) and PCL (Performance Counter Library), but none of them are available for HP-UX on PA-8600 platforms.

Also, I am not looking for the overall performance of the machine in terms of cache misses (sar -b and glance). I am looking specifically at my application/process and how it utilises the cache.

Any info is welcome regarding this issue.

Thanks,
Alfred
Mike Stroyan
Honored Contributor
Solution

Re: Cache Monitor

The most important factor in tuning an application is finding out where most of the execution time is spent. You should get the prospect profiler from http://www.hp.com/go/prospect and use
prospect -P; prospect -V2 -e -e -f my.prospect a.out
to gather a detailed application profile running your application with a real data set. That prospect output will include an instruction profile. Important cache misses will show up as load and store instructions with really high hit counts.
That will go beyond measuring cache misses to finding out which functions have the biggest cache miss penalties.

If you are using big enough data to have a lot of cache misses, you probably need to use the +pd linker option to get large pages. Large pages will prevent TLB thrashing on big address ranges. You should also set the kernel parameters dbc_min_pct and dbc_max_pct to the same value to keep the file system buffer cache from fragmenting physical RAM and interfering with large page allocations.
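
To put rough numbers on the TLB point, here is a small C sketch that counts how many pages (and so how many TLB entries) are needed to map a working set at a given page size. The 64 MB working set and the 4 KB / 4 MB page sizes in the example are assumptions for illustration only, and pages_needed is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages needed to map working_set_bytes with pages of
 * page_bytes each, rounded up. Each mapped page needs its own TLB
 * entry, so fewer pages means less pressure on the TLB.
 * Hypothetical helper for illustration only. */
size_t pages_needed(size_t working_set_bytes, size_t page_bytes)
{
    assert(page_bytes > 0);

    /* Integer ceiling division without risking overflow. */
    return working_set_bytes / page_bytes
         + (working_set_bytes % page_bytes != 0);
}
```

With those assumed sizes, a 64 MB working set takes 16384 pages at 4 KB but only 16 pages at 4 MB, which is why large pages help so much on big address ranges.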

If you identify cache misses the next step will be reducing their impact on performance. You can try to either regroup your data to fit better in cache lines or add cache prefetching to get data into cache just before it is needed. The compiler will do simple cache prefetching with +Odataprefetch. That works for big loops where the compiler can tell what addresses will be used well ahead of the actual use.
If the compiler can't figure out what to prefetch, you can hint at it yourself with "#pragma prefetch" as documented in http://docs.hp.com/hpux/onlinedocs/2212/A-03-37relnotes.html
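
As a rough illustration of both ideas (regrouping data and explicit prefetching), here is a C sketch. The node record, its field names, and the prefetch distance of 16 are all assumptions made up for this example, not anything taken from the simulator in question. The exact "#pragma prefetch" syntax is in the release notes linked above and is not reproduced here; the sketch instead uses GCC's __builtin_prefetch as a stand-in and compiles the hint away on other compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circuit-node record. The hot loop only reads `voltage`,
 * but every record pulls its rarely used fields into cache as well. */
struct node_aos {
    double voltage;
    double cold[7];      /* stand-in for rarely used per-node data */
};

/* Original layout: only 8 of every 64 bytes touched are useful. */
double sum_voltage_aos(const struct node_aos *nodes, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += nodes[i].voltage;
    return sum;
}

/* Regrouped layout: hot values are packed contiguously, so every byte
 * of each fetched cache line is used. */
double sum_voltage_packed(const double *voltage, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += voltage[i];
    return sum;
}

/* Stand-in for a compiler-specific prefetch hint (assumption: GCC or a
 * compatible compiler; elsewhere the macro does nothing). */
#if defined(__GNUC__)
#define PREFETCH(p) __builtin_prefetch((p), 0, 1)
#else
#define PREFETCH(p) ((void)0)
#endif

enum { PREFETCH_AHEAD = 16 };   /* assumed distance, needs tuning */

/* Indirect access: the compiler usually cannot predict these addresses,
 * so the loop hints at the element it will need a few iterations on. */
double sum_indexed(const double *values, const size_t *index, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            PREFETCH(&values[index[i + PREFETCH_AHEAD]]);
        sum += values[index[i]];
    }
    return sum;
}
```

Whether either change pays off depends on the real access pattern, so it is worth re-running the profile after each change rather than assuming a win.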