Operating System - HP-UX

Re: pstat_getprocvm unusual behaviour

 
SOLVED
Virgil Chereches_2
Frequent Advisor

pstat_getprocvm unusual behaviour

Hi everybody!
I encountered a very strange issue today:
after an Oracle 9.2.0.7 incident (a lot of Oracle processes eating 100% CPU time in SYS mode), and after stopping and cleaning them up, we have the following unusual behaviour:
the lsof utility hangs forever when instructed to show all system processes, but lsof -u ^oracleuser returns almost immediately.
After some research we found that lsof spends most of its time in the pstat_getprocvm syscall, so we compiled a small program which calls that function; timex shows roughly six times more time spent in sys mode for an Oracle process compared with other processes, and the behaviour is quite systematic.
Has anyone seen something similar? Any hints about this?

The OS is HP-UX 11iv2 with September 2005 patch bundle installed.
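
A minimal sketch of the kind of region-walking test program described above (not the exact code used here): the calling convention assumed below (target PID passed where elemcount normally goes, one region returned per call) and the pst_vm_status field names follow the pstat(2) examples and should be checked against the local man page.

/*
 * vmwalk.c - walk the memory regions of one process with pstat_getprocvm().
 * Sketch only: calling convention and field names taken from pstat(2)
 * examples; verify against the local header and man page.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/param.h>
#include <sys/pstat.h>

int main(int argc, char *argv[])
{
    struct pst_vm_status pvs;
    int pid, i;

    if (argc != 2) {
        fprintf(stderr, "usage: %s pid\n", argv[0]);
        return 1;
    }
    pid = atoi(argv[1]);

    /* One region per call; pstat_getprocvm() returns 1 while a region
     * exists at index i. */
    for (i = 0; pstat_getprocvm(&pvs, sizeof(pvs), (size_t)pid, i) == 1; i++)
        printf("region %3d: vaddr 0x%016lx length %lu pages type %d\n",
               i, (unsigned long)pvs.pst_vaddr,
               (unsigned long)pvs.pst_length, (int)pvs.pst_type);

    return 0;
}

Built 64-bit and run under timex (for example: timex ./vmwalk <pid>), comparing an Oracle process against a non-Oracle process of similar size should show whether the sys-time difference described above reproduces.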
6 REPLIES
Don Morris_1
Honored Contributor
Solution

Re: pstat_getprocvm unusual behaviour

Is the time consistent for each pstat_getprocvm() call on a problematic process, or does it take more time on particular process regions?

If it is consistent - then it sounds more like you've got contention getting process-wide VM locks... likely because that same process is doing a lot of virtual address space modification requests at the same time (mmap, brk, sbrk, malloc, fork, etc.). For example, if Oracle was eating 100% CPU because it was stuck in a shmat/shmdt loop, you'd see this sort of behavior.

If it isn't consistent (i.e. you always seem to get "stuck" on a particular region type or most especially on a constant offset where that offset is in Shared address space and mapped to all Oracle processes) then you're likely getting contention on sub-object locking because there are I/O operations in flight which are either being issued like crazy or timing out/retrying a lot.

If in fact any single call to pstat_getprocvm() isn't taking all that long -- your total time per process is what's 6 times larger... I have to ask whether the Oracle processes have 6 times as many process regions (you get one region per call... 6x more regions == 6x more calls if you want to visit them all).

Can you post your program that you used and the timing data generated?
Virgil Chereches_2
Frequent Advisor

Re: pstat_getprocvm unusual behaviour

Thank you very much for your well documented response.
I have investigated a little more and found that the system call takes a large amount of time only when executed against the shared memory segment of the instance (we have configured shmmax large enough to accommodate the Oracle SGA in a single shared memory segment, based on Oracle recommendations). The behaviour is consistent with one of the possibilities you mentioned. Moreover, it does not depend on the number of memory regions the process has (for a python program with more than 100 regions, the small program which uses pstat_getprocvm returns much more quickly than for an Oracle process with 40 regions).
Now some numbers: the SGA is almost 16GB, the number of Oracle processes is somewhere between 1000 and 1700, and the system running the workload is an SD partition with 24 CPUs/32GB total RAM, of which 8GB is configured as CLM. The system consumes almost 4GB of RAM.
Any recommendation will be highly appreciated.
Virgil Chereches_2
Frequent Advisor

Re: pstat_getprocvm unusual behaviour

Another significant detail: utilities which parse /dev/kmem to gather information about memory regions (such as kmeminfo and procsize) respond very quickly for the same processes.
Don Morris_1
Honored Contributor

Re: pstat_getprocvm unusual behaviour

Ah... I think I see it.

Being only on the SGA limits the possibilities - you aren't having locking issues (the locks acquired in this path don't care which virtual memory object in the process is being looked at, if you had contention here - you'd see contention on non-SGA objects within the Oracle process as well).

Note that tools which read /dev/kmem take completely different paths -- they don't worry about locks at all (you can sometimes get garbage running on a live system because of this) and they're not always reporting the same things. pstat has to be a good kernel citizen; kmeminfo is reading the raw data and doesn't have to be as polite.

Also (see below), I don't think kmeminfo generates per-page statistics for process scans hence it won't hit what I believe is your scaling problem here. If you have vpsinfo, I would expect it to take longer to process your Oracle processes than others on the system for much the same reason pstat does.

In any event -- the only path that makes any sense for your slowdown is the generation of the page size statistics. That path is large page aware (thank goodness), so this implies to me that your 16Gb SGA is backed by small page sizes. The statistics generation is perforce a linearly scaling algorithm -- it costs time in proportion to the number of unique pages present in the object. Large page usage == fewer pages, so you'd get faster times. For all the smaller virtual objects on your system (and I think it is safe to assume you don't have many more 16Gb objects) there are correspondingly far fewer pages... hence you take less time.
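To put rough numbers on that linear scaling (my arithmetic, assuming a 16GB object scanned page by page):

16GB / 4KB pages   = 4,194,304 pages to visit
16GB / 256KB pages =    65,536 pages
16GB / 1GB pages   =        16 pages

so a small-page SGA means several orders of magnitude more pages for the statistics pass to touch than a large-page one.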
You've got a 16Gb SGA -- is your Oracle binary chatr'ed with a large data page size hint? What are your vps_* tunables set to -- especially vps_chatr_ceiling? You really should try for 4Gb large pages with a 16Gb SGA for performance (it reduces TLB miss rates)... you may not get them, especially if you start Oracle when the system has little free memory to begin with -- or if Oracle is using IPC_MEM_FIRST_TOUCH to get CLM within the SGA and your CLM doesn't have 4Gb left in single pages -- but you definitely want pages as large as possible.
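A hedged example of how that hint and the tunables can be checked or set (the path is a placeholder; confirm the +pd option and values against chatr(1) and kctune(1M) on your system):

# chatr /path/to/oracle            (shows the current page size hints among other settings)
# chatr +pd L /path/to/oracle      (request the largest possible data page size)
# kctune -q vps_ceiling
# kctune -q vps_chatr_ceiling

As I understand it, vps_chatr_ceiling caps the page size a chatr hint can actually deliver.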

If you are configured for large pages, are getting large pages and still see this slowdown, then I would expect the SGA to have holes [where the system hasn't yet created memory because Oracle never referenced that virtual address]. Untranslated virtual pages are equivalent to 4k pages in the scanning method (I don't want to delve into a discussion of alternate scanning methods and the trade-offs made here -- it would get way too internal in nature). I have to confess that I don't expect this to be the case, since Oracle usually locks the SGA in memory (either for async I/O purposes, if you've configured for async, or just for performance - which I thought was the default).

In summary:
What's the page size information you get back from pstat for the SGA, your chatr settings on Oracle and your vps_* tunable settings?
If you have vpsinfo, that output would be handy as well.
Virgil Chereches_2
Frequent Advisor

Re: pstat_getprocvm unusual behaviour

The strange symptom seems to have disappeared without any known change from our side (I mean no restart of Oracle or the OS, and no parameter changes).
I have attached the required information below. Thank you very much for all the insights you gave me.

The page size hint for the shared memory segment is 1024MB (as seen from gpm).
The distribution of different size pages is as follows:
4KB: 2
16KB: 1
64KB: 0
256KB: 184
1MB: 791
4MB: 167
16MB: 11
64MB: 8
256MB: 2
1GB: 13
# chatr /oravapp/product/9.2.0.1/bin/oracle
/oravapp/product/9.2.0.1/bin/oracle:
64-bit ELF executable
shared library dynamic path search:
LD_LIBRARY_PATH enabled first
SHLIB_PATH enabled second
embedded path enabled third /oravapp/product/9.2.0.1/rdbms/lib/:/oravapp/product/9.2.0.1/lib/:/usr/lib/pa20_64:/opt/langtools/lib/pa20_64:
shared library list:
libodm9.sl
libskgxn9.sl
libjox9.sl
libcl.2
librt.2
libpthread.1
libnss_dns.1
libdl.1
libm.2
libc.2
shared library binding:
deferred
global hash table disabled
global hash table size 1103
shared library mapped private disabled
shared library segment merging disabled
shared vtable support disabled
explicit unloading disabled
segments:
index type address flags size
6 text 4000000000000000 z-r-c- 64M
7 data 8000000100000000 ---m-- L (largest possible)
executable from stack: D (default)
static branch prediction enabled
kernel assisted branch prediction enabled
lazy swap allocation for dynamic segments disabled
nulptr references disabled
# kctune -q vps_ceiling
Tunable Value Expression
vps_ceiling 64 64
Virgil Chereches_2
Frequent Advisor

Re: pstat_getprocvm unusual behaviour

Sorry!
I forgot to mention vps_chatr_ceiling:
# kctune -q vps_chatr_ceiling
Tunable Value Expression
vps_chatr_ceiling 1048576 Default