
SOLVED
Thomas Thacker
Occasional Advisor

High MPSYNC - help

We have recently been experiencing periods of very high MPSYNC activity. I'd like to know the primary (typical) causes of high MPSYNC activity.

We are running OpenVMS 7.3-2 on a 3-node cluster of GS1280 systems. The system in question is a 16-CPU system with 96GB of memory. We are running Cerner's Millennium software suite, using an Oracle database.

We use host-based shadowing and have HBMM enabled. The system in question has 68 shadow disk sets.

Cluster communication is through dual CIPCA adapters (the second one is for failover).

The disks are dual-fiber connected to a Storageworks SAN (HSG80s).

The activity has gone from approximately 100% (which was normal) to periods of 800-900% in MPSYNC mode (as reported by MONITOR MODES).

Any information and/or hints where I should start looking would be appreciated.

There have been no major changes in hardware or software configuration (including VMS updates) since last August.
Jim_McKinney
Honored Contributor
Solution

Re: High MPSYNC - help

Oracle... lots of I/Os, lots of locking, lots of CPUs. Wild guess here... are you using the dedicated CPU lock manager? If not, check the LCKMGR_MODE parameter:

$ MCR SYSGEN SHOW LCKMGR_MODE

It's dynamic, so you might experiment... and if you're not already using FAST_PATH for the CIPCAs, take a look at that SYSGEN parameter as well (it's not dynamic).
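
To actually turn it on for the running system, something like the following should do it (a sketch only - the LCKMGR_MODE value semantics are from the 7.3-2 documentation, so verify against your version, and add the change to MODPARAMS.DAT if you want it to survive a reboot):

$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SET LCKMGR_MODE 2     ! dedicate a CPU when at least 2 CPUs are active; 0 disables
SYSGEN> WRITE ACTIVE          ! apply to the running system (dynamic parameter)
SYSGEN> EXIT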
Andy Bustamante
Honored Contributor

Re: High MPSYNC - help


Along with Jim's comment, make sure to look at the SYSGEN parameter LCKMGR_CPUID. You don't want the dedicated lock manager running on a FAST_PATH CPU. I've heard 8.3 will check for conflicts before assigning a lock manager CPU.

For our in-house database application, the dedicated lock manager makes a noticeable performance improvement.

LCKMGR_CPUID and LCKMGR_MODE are dynamic and can be modified on the fly. We tested the dedicated lock manager on GS-80s and GS-1280s.
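
A quick way to check for that conflict (assuming the SHOW FASTPATH command is available on your version, which it should be on 7.3-2):

$ SHOW FASTPATH                    ! which CPUs the Fast Path ports are assigned to
$ MCR SYSGEN SHOW LCKMGR_CPUID     ! which CPU the dedicated lock manager will claim

If they collide, LCKMGR_CPUID is dynamic too:

$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SET LCKMGR_CPUID 15    ! example value only - pick a non-Fast-Path CPU
SYSGEN> WRITE ACTIVE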

Andy Bustamante
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
John Gillings
Honored Contributor

Re: High MPSYNC - help

Thomas,

>what are the primary (typical) causes
>of high MPSYNC activity.

Generic answer - contention on spinlocks between CPUs. The question is WHICH spinlocks.

You can use the SDA Spinlock Tracing Utility to see which spinlocks are being hit. See

$ ANALYZE/SYSTEM
SDA> SPL

for some cursory documentation. See Chapter 8 of the "HP OpenVMS System Analysis Tools Manual" for more details.
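
A rough outline of a tracing session (the exact subcommands and qualifiers are in the manual chapter above - treat this as a sketch, not gospel):

$ ANALYZE/SYSTEM
SDA> SPL LOAD            ! load the spinlock tracing execlet
SDA> SPL START TRACE     ! start collecting spinlock activity
SDA> SPL SHOW TRACE      ! see which spinlocks are hot, and who is spinning on them
SDA> SPL STOP TRACE
SDA> SPL UNLOAD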

Talk to your local CSC if you need further assistance in driving the tool, or interpreting the results.

What you do depends on which spinlocks are most contentious.

If it's the LCKMGR spinlock, using a dedicated lock manager CPU is a good idea. A 16P system is a likely candidate. The obvious question is "so why should I pay for an entire CPU to run the VMS lock manager?". The answer is "so lock management doesn't cost you MORE than an entire CPU!"

At the moment you're using 8 or 9 CPUs just spinning waiting for spinlocks. You could STOP/CPU up to 6 or 7 CPUs and expect IMPROVED performance from the system as a whole!

It's all about how code scales to multiple CPUs. Starting from the workload that a single CPU can perform, adding a second CPU gives "almost" 2x performance, because of resource contention between the multiple streams of execution. Each additional CPU adds slightly less than the last one, until you reach a plateau where an extra CPU adds nothing. This is the "knee" in the scaling curve for your particular workload. Beyond that point, adding more CPUs will REDUCE overall throughput, because contention increases by more than the additional compute power added. At 800-900% MPSYNCH, your workload could be past the knee and well on its way to the ankle ;-)
A crucible of informative mistakes
Volker Halle
Honored Contributor

Re: High MPSYNC - help

Thomas,

there is a procedure SYS$EXAMPLES:SPL.COM which collects SPINLOCK information and also has some comments and background information included.
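
It needs suitable privileges and is simply invoked as a command procedure; the comment block at the top of the file describes the available options:

$ @SYS$EXAMPLES:SPL.COM    ! collects a spinlock trace and reports contention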

Volker.
Thomas Thacker
Occasional Advisor

Re: High MPSYNC - help

Thanks to all for the great responses. It's exactly the kind of info I was looking for.

FYI - we have FAST_PATH enabled, but not dedicated lock manager for this node. I'll try that today.

Based on the spinlock data I saw, it looks like the MQ interface that Cerner uses is the biggest spinlock user...

Regards,
Tom
Art Wiens
Respected Contributor

Re: High MPSYNC - help

Thomas, such expert, timely, support level advice (for free!) is surely worth assigning some points! Don't worry, they're free as well ;-)

Cheers,
Art
Thomas Thacker
Occasional Advisor

Re: High MPSYNC - help

I dedicated a CPU to the lock manager this morning.

So far, I've noticed no MPSYNC improvement. Still seeing periods of over 800% MPSYNC time... not good. At times, MPSYNC spikes to consume almost all 16 CPUs!
Volker Halle
Honored Contributor

Re: High MPSYNC - help

Thomas,

maybe it's time to use SYS$EXAMPLES:SPL.COM and provide the output for us to look at...

T4 would also be a very useful tool to collect system performance information in such a situation. OpenVMS engineering is using this tool for performance analysis.

http://h71000.www7.hp.com/OpenVMS/products/t4/index.html

Volker.
Thomas Thacker
Occasional Advisor

Re: High MPSYNC - help

We do collect T4 data. I don't see anything in T4 that might help determine the cause of the MPSYNC issue. The data does show that the worst MPSYNC activity occurs between 10 and 11 AM. I'll fire up the SPL procedure during that window tomorrow morning and post the results here.

I suspect that MQ may be part of the problem, but I have no proof. We are a bit behind in VMS patches; the last update was V7.3-2 Update 4. I was trying to determine whether MQ V3.0 was included in Update 4 or not. If not, we are two patches behind on MQ. The last MQ update I can see in the patch history on our system is MQ V2.0. Unfortunately, I've not been able to find information online about previous updates.

Maybe there are other patches that address spin-lock issues that we have not applied yet?

Regards,
Tom