Operating System - OpenVMS

High Interrupt CPU when shadow copy

 
Cass Witkowski
Trusted Contributor


We have a 4-node ES40 cluster running OpenVMS V7.3. The disks are on HSG80s.
When performing shadow copy operations I see high CPU usage. On a single ES40 node with 4 CPUs I see the following:

CPU   Interrupt stack %
0     84
1     1
2     89
3     1

CPU 2 is the fastpath for the FGA0 device.

Do these Interrupt stack values seem within reason?

SHADOW_MAX_COPY = 2
SHAD$MERGE_DELAY_FACTOR = 400

Thanks

Cass
Jur van der Burg
Respected Contributor
Solution

Re: High Interrupt CPU when shadow copy

That's a known issue. I fixed this just before I left HP. It has to do with pool management by SYS$DKDRIVER: the driver allocates and zeroes huge packets in nonpaged pool, and if the free list is fragmented, the size of those packets means it has to walk that list over and over again, which causes the high interrupt stack time. The latest version has some smarts in the buffer management, which makes a real difference. You can verify the free list size by going into SDA and doing a CLUE MEM/FREE. If it's more than a couple of thousand entries, you will start to notice this.
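To illustrate why the free list length matters, here is a toy model in Python. It is only a sketch under stated assumptions: it assumes a plain first-fit scan over a list of free block sizes, with many small fragments and a single block large enough for the request. The function names, the 256-byte fragment size and the 65536-byte request are hypothetical, and this is not the actual OpenVMS pool allocator or the SYS$DKDRIVER code.

```python
import random


def first_fit_walk(free_list, request):
    """Scan free_list in order and return (index, entries_examined) for the
    first block that is at least `request` bytes, or (None, entries_examined)
    if no block fits. This is a toy first-fit scan, not the real allocator."""
    steps = 0
    for index, size in enumerate(free_list):
        steps += 1
        if size >= request:
            return index, steps
    return None, steps


def average_walk(num_fragments, request=65536, fragment_size=256,
                 trials=50, seed=1):
    """Average number of entries examined when the free list holds
    num_fragments entries, all too small except one at a random position.
    All sizes here are made-up illustration values."""
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(trials):
        free_list = [fragment_size] * num_fragments
        # Place the only block big enough for the request somewhere at random.
        free_list[rng.randrange(num_fragments)] = request
        _, steps = first_fit_walk(free_list, request)
        total_steps += steps
    return total_steps / trials


if __name__ == "__main__":
    for fragments in (2000, 38000):
        print(fragments, "free entries ->",
              round(average_walk(fragments)), "entries examined on average")
```

In this model the cost of each large allocation grows roughly in proportion to the number of entries on the free list, which is the effect described above when the list has to be walked over and over.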

Upgrade to at least V7.3-2 and apply the latest fibre_scsi patchkit.

Jur.
Cass Witkowski
Trusted Contributor

Re: High Interrupt CPU when shadow copy

Would this also affect normal operations? I did the CLUE MEM/FREE and had over 38,000 elements.
Jur van der Burg
Respected Contributor

Re: High Interrupt CPU when shadow copy

Sure, with 38000 it's amazing that the system still runs. Make sure you have the system parameters NPAG_GENTLE and NPAG_AGGRESSIVE set to 100 (default in V8.2).

If you want to play then I have a program to fragment nonpaged pool even worse :-)

Jur.
Ian Miller.
Honored Contributor

Re: High Interrupt CPU when shadow copy

Cass,
Do you have NPAG_AGGRESSIVE and NPAG_GENTLE both set to 100? You should, as this turns off some pool management algorithms which can lead to fragmented pool.

____________________
Purely Personal Opinion
Volker Halle
Honored Contributor

Re: High Interrupt CPU when shadow copy

Cass,

With that many packets on the free packet list, you are quite likely to run into CPUSPINWAIT crashes when one CPU needs to traverse that list to obtain or release a nonpaged pool packet while another CPU is trying to acquire the POOL spinlock.

Volker.
Hoff
Honored Contributor

Re: High Interrupt CPU when shadow copy

Above and beyond the fix in the ECO that Jur points to, an upgrade to V7.3-2 is going to show multiple other performance benefits, and these can be quite significant. The shadowing- and SMP-related performance updates may well assist with your goals and requirements here, for instance. Various HP customer folks have publicly reported significant speed-ups.

V7.3-2 is a PVS release.

As it is not a major upgrade for OpenVMS Alpha (and was centrally release-numbered to track OpenVMS I64 release numbering), I'd also look at an upgrade to the current release; to V8.3. The OpenVMS Alpha V8.3 release is in the roadmap as being the current release and then probably (if the current HP CVS/PVS process holds) being PVS-eligible (ECOs) through 2011 or so.


Cass Witkowski
Trusted Contributor

Re: High Interrupt CPU when shadow copy

NPAG_GENTLE was 85 and NPAG_AGGRESSIVE was 50.

I have set these to 100. Will this help defragment the nonpaged pool, or will I have to reboot?

Cass Witkowski
Trusted Contributor

Re: High Interrupt CPU when shadow copy

Hoff,

As for upgrading to a newer version, we are working with our customer to that end.

It is not my job to run the train,
or even ring the bell.
But watch this baby jump the tracks,
and see who catches hell!
Jur van der Burg
Respected Contributor

Re: High Interrupt CPU when shadow copy

If you changed the parameters you have to reboot to get it fixed.

Jur.
Cass Witkowski
Trusted Contributor

Re: High Interrupt CPU when shadow copy

Thanks again for the quick and helpful responses.

The solution is to upgrade to at least OpenVMS V7.3-2 plus the Fibre_SCSI patches, and to set the NPAG_GENTLE and NPAG_AGGRESSIVE SYSGEN parameters to 100.