Operating System - HP-UX

A few doubts about Shmmax Semaphore and async driver

 
vinodan
Advisor


Dear All,


We have a system running in a Superdome partition with 64 CPUs and 64 GB RAM. 32 GB has been allocated for the SGA of a 10 TB database (Oracle 9i). Whenever I check ipcs -mbop I can see 1200 processes attached to the shared memory segment (NATTCH). Since we have set a big shmseg value, only one segment has been allocated as a shared memory segment.


1. How are these processes accessing this shared memory segment? One at a time? If so, why does it show 1200 processes attached?

2. What is the role of the semaphore in this? As an arbitrator? How does it do that job?

3. I often see a lot of processes waiting on SEM in glance. Does this indicate any issue related to shared memory?

4. Which is the best practice: setting shmmax as big as possible, or a medium value, thereby creating 3-4 shared memory segments for a single DB instance? The Oracle docs say 3-4 are OK.

5. Does the size of the SGA have any relation to the /dev/async device file? Recently HP recommended that we change the /dev/async device file's minor number from 0x000000 to 0x000100. Does this have any relation to the number of async I/O operations that can access this device?

fuser -u /dev/async shows around 200 processes.

Thanks and Regards,
Vinod
1 REPLY
Don Morris_1
Honored Contributor
Solution

Re: A few doubts about Shmmax Semaphore and async driver

Ok, caveat up front -- I'm a kernel engineer, not an Oracle admin. A lot of your questions I can't answer, but I'll reply to what I can and let others fill in the blanks.

1) Each process shares the segment by having it mapped into its virtual address space. A segment is simply a virtual object, and since HP-UX is a Single Address Space OS, shared objects use the same virtual address across multiple processes [that's how they're shared]. Access to such an attached object is done simply by referencing the virtual memory addresses (just as you would access memory from malloc() or mmap()). Since the object is attached to the process, a fresh access will generate a TLB miss/virtual fault, which is satisfied when the object is found within the process [security check 1: if you don't have the object mapped, you'll fail virtual faults]. A subsequent protection fault will occur because shared objects use unique protection keys (or sets of keys in the case of User Relaxed Isolation) -- when the protection fault locates the object in the process, it loads the key into the process's key set, allowing the access [this is security check 2, since delayed/lazy TLB flushes for performance mean there's a chance you can miss the virtual fault from check 1 -- the key sets are changed on context switch]. So the 1200 processes in your example all have this object within their virtual address space and are potentially accessing the segment concurrently.
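To make the "it's just memory" point concrete, here is a minimal sketch in plain C of a process attaching an existing SysV segment and touching it through ordinary pointers. The key value is hypothetical and this is not Oracle's code -- it only shows the mechanism the shmget()/shmat() interface exposes:

    /* Minimal sketch: attach an existing SysV shared memory segment and
     * access it like ordinary memory.  The key below is made up purely
     * for illustration. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        key_t key = (key_t)0x5e100001;     /* hypothetical key              */
        int   id  = shmget(key, 0, 0600);  /* look up the existing segment  */
        char *seg;

        if (id == -1) { perror("shmget"); return 1; }

        /* shmat() maps the segment into this process's address space; on
         * HP-UX every attacher sees it at the same global virtual address.
         * This is also what bumps the NATTCH count shown by ipcs -mbop.   */
        seg = (char *)shmat(id, NULL, 0);
        if (seg == (char *)-1) { perror("shmat"); return 1; }

        /* From here on, access is plain memory -- the TLB miss / protection
         * key handling described above happens transparently in the kernel. */
        printf("first byte of segment: 0x%02x\n", (unsigned char)seg[0]);

        shmdt(seg);                        /* detach; NATTCH drops by one   */
        return 0;
    }

Each process that calls shmat() on the same segment adds one to NATTCH, and shmdt() (or process exit) removes it -- which is why you see 1200 there.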

Which raises what I think may be your _real_ question here (especially based on your later queries) -- how is synchronization between writers accomplished (normally no one cares about multiple readers since there's no data change; only synchronizing writers with readers and other writers matters)? The answer is: however the processes using the segment wish to... and there are frankly too many ways to enumerate. There could be a single per-segment semaphore, or even a single semaphore for _all_ shared object access. Or each process could have a working set area within the segment which other processes are allowed to read but not write, where stale data is acceptable (say, for a log buffer), which then needs no locks... or sections of the segment, each with their own locks. There's really no way to know unless you check with the implementor -- in this case, Oracle. Hopefully someone more familiar with Oracle can point you to a document if they choose to expose this aspect of their implementation (I frankly wouldn't be surprised if they didn't, since it could change whenever they wish, after all).
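Just to illustrate the "single semaphore as a mutex" flavor of this -- one of many possible schemes, with a made-up key, and not a claim about how Oracle actually does it -- a SysV semaphore sketch looks like:

    /* Sketch of ONE possible scheme: a single SysV semaphore serializing
     * writers of a shared segment.  Purely illustrative; the key is made up. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>

    /* Stand-in for the traditional "union semun" the application declares. */
    union semarg { int val; struct semid_ds *buf; unsigned short *array; };

    static int sem_lock(int semid)          /* P(): wait, then decrement   */
    {
        struct sembuf op;
        op.sem_num = 0; op.sem_op = -1; op.sem_flg = SEM_UNDO;
        return semop(semid, &op, 1);
    }

    static int sem_unlock(int semid)        /* V(): increment, wake waiter */
    {
        struct sembuf op;
        op.sem_num = 0; op.sem_op = 1; op.sem_flg = SEM_UNDO;
        return semop(semid, &op, 1);
    }

    int main(void)
    {
        union semarg arg;
        int semid = semget((key_t)0x5e100002, 1, IPC_CREAT | 0600);
        if (semid == -1) { perror("semget"); return 1; }

        arg.val = 1;                        /* 1 == unlocked               */
        if (semctl(semid, 0, SETVAL, arg) == -1) { perror("semctl"); return 1; }

        if (sem_lock(semid) == 0) {
            /* ... a writer would update the shared segment here ... */
            sem_unlock(semid);
        }
        return 0;
    }

If everything funneled through one semaphore like that, processes blocked in the semop() wait would be the kind of thing glance reports as waiting on SEM -- which ties into your question 3.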

2) See above. There's no implicit relationship between a shared object (SysV shared memory or otherwise) and any semaphore [well, there are kernel per-page semaphores that will come into play... but that's true for any physical memory, those locks are used during translation build/teardown/modification].

3) If so, only indirectly. If the implementation uses a single semaphore per segment and all access must acquire that semaphore, then 1200-way concurrency would doubtless cause such contention.

4) Only Oracle would know [or experienced Oracle admins, which I'm not]. See original caveat.

5) None that I'm aware of. The /dev/async device file is simply used to control/access the async driver. The async driver can access SGAs large or small (well, it works on virtual objects/ranges as far as I know).

Checking the 0x100 setting against documentation such as http://docs.hp.com/en/11iv3IOPerf/IOPerformanceWhitePaper.pdf : without it, the async driver always locks the virtual range in memory when such a range is added to the driver's working set (memory locking will fault in any physical backing needed, check permissions, and prevent translation state changes beyond that point). With 0x100 set, the range locking is done when an actual I/O request is generated instead -- which in your case is quite likely to result in faster Oracle startup times. If the whole SGA is added to the async set and locked as the process starts, that's a lot of processing just to make sure all 32 GB is locked; if each I/O is only 1 MB or so, the locking can be done as needed instead.
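If you want to confirm which minor number is actually in place, ll /dev/async should show the major number and the minor in hex; a tiny stat()-based check (illustrative only -- changing the minor is still done by recreating the device file per HP's documentation) would be:

    /* Illustrative check of the device number on /dev/async.  On HP-UX the
     * minor number is the low 24 bits of st_rdev (which is why minors are
     * written as 0x000100 and the like). */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat sb;
        if (stat("/dev/async", &sb) == -1) { perror("stat /dev/async"); return 1; }
        printf("st_rdev = 0x%08lx, minor = 0x%06lx\n",
               (unsigned long)sb.st_rdev,
               (unsigned long)(sb.st_rdev & 0xffffff));
        return 0;
    }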

There's no bearing between 0x100 and the number of processes using async at a time that I see; that's much more a matter of the max_async_ports(5) kernel tunable.