Operating System - OpenVMS

Re: CPU in high interrupt mode

 
SOLVED
Frank Zecca
Occasional Contributor

CPU in high interrupt mode

Hello,

I have a two-processor ES40 server running VMS 7.3, part of a cluster, that is experiencing high interrupt-mode times. The CPUs are spending 30-50% of their time in interrupt state and MP synchronization. Any pointers as to what to look for? Thanks.

Frank
8 REPLIES
Martin P.J. Zinser
Honored Contributor

Re: CPU in high interrupt mode

Given the information you have provided so far it is difficult to tell, but an upgrade to 7.3-1 should at least help somewhat (a number of shared spinlocks were improved in this release). Otherwise, some information on the application load (mix, SMP enabled, etc.) would be useful.
Frank Zecca
Occasional Contributor

Re: CPU in high interrupt mode

The high interrupt state occurs during peak usage (between 8:00am and 5:00pm) of a home-written webmail application that runs under CSWS (Apache). If I stop the Apache web server, the system runs normally; as soon as Apache is restarted, the high interrupt state returns. I guess what I really want to find out is what tasks the CPUs are performing when in interrupt state, or what devices are causing the interrupts. Thanks again.

Frank
Willem Grooters
Honored Contributor

Re: CPU in high interrupt mode

First you should try to find out what causes this high CPU usage. One possible cause, sometimes overlooked, is distributed lock management. There are some good articles on this subject on OpenVMS.ORG, I think by Keith Parris. They also give some hints on how to find the problem, how to repair it, and how to prevent it.
Willem Grooters
OpenVMS Developer & System Manager
Martin P.J. Zinser
Honored Contributor

Re: CPU in high interrupt mode

In the case of Apache, IP obviously also becomes an issue (again, 7.3-1/5.3 is your friend). You might also want to look into subprocess creation.
Åge Rønning
Trusted Contributor

Re: CPU in high interrupt mode

Frank,
First of all, I agree with Martin that you should start planning an upgrade to V7.3-1, or even better V7.3-2 when it ships in a few weeks.
You haven't mentioned what tools you're using for performance monitoring. If you're only using Monitor, I would suggest you have a closer look at both Availability Manager and OpenVMS Data Collector and Performance Advisor. Both are free to use and available at http://h71000.www7.hp.com/openvms/system_management.html


VMS Forever
Craig A Berry
Honored Contributor

Re: CPU in high interrupt mode

Since this is your own application, you have a number of options for poking around in the offending process, including use of the OpenVMS Debugger and/or the system dump analyzer. You might also do a quick review of the source code and check for possible problems like calling system services in a tight loop, or, in other words, polling operations that might be better performed some other way.
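
To make the polling point concrete, here is a small generic sketch (plain Python, not VMS code, and not taken from Frank's application) that contrasts a tight polling loop with a blocking wait. The names `wait_by_polling`, `wait_by_blocking` and `run` are made up for this illustration, and the `threading.Event` only stands in for whatever completion status the real code is checking.

```python
import threading

def wait_by_polling(event):
    """Spin until the event is set; return how many status checks were made."""
    checks = 0
    while not event.is_set():   # stands in for a status query issued in a tight loop
        checks += 1
    return checks

def wait_by_blocking(event):
    """Block until the event is set; only one wait call is issued."""
    event.wait()                # the caller sleeps until it is notified
    return 1

def run(waiter, delay=0.05):
    """Run a waiter against an event that is set after `delay` seconds."""
    event = threading.Event()
    timer = threading.Timer(delay, event.set)  # simulates the work completing
    timer.start()
    result = waiter(event)
    timer.join()
    return result

if __name__ == "__main__":
    print("polling checks: ", run(wait_by_polling))
    print("blocking checks:", run(wait_by_blocking))
```

The polling version burns CPU and repeats the status call many times for a single completion, while the blocking version issues one wait. If the webmail code has loops of the first kind around system services, that is the sort of thing worth looking for in a source review.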
Keith Parris
Trusted Contributor
Solution

Re: CPU in high interrupt mode

To specifically identify the source(s) of interrupt state time, one can use PC Sampling, such as provided by Computer Associates' Unicenter Performance Management for OpenVMS software (known earlier by such names as Polycenter Performance Solution Data Collector, DECps, etc.). This gives a report of what areas of VMS code are being executed while in interrupt state, which can give clues about the origin of interrupts.

Your 2-CPU ES40 is an SMP system, and VMS 7.3 supports Fast_Path for several types of devices, including CI, some SCSI adapters, and Fibre Channel. Fast_Path is a way to shift some of the interrupt-state workload off of the Primary CPU (typically CPU 0) onto a non-Primary CPU (CPU 1 in your 2-CPU case). If you are using a device with Fast_Path support and Fast_Path is not enabled, consider enabling it to move some of that workload from CPU 0 to CPU 1.

To identify the major sources of MP Synch time, use the SDA extension SPL ($ANALYZE/SYSTEM and then type SPL at the SDA> prompt) to take a trace of spinlock activity and report on that trace.

I concur with the poster who recommended an upgrade to 7.3-1 if possible, as it had improvements both in the areas of interrupt-state and MP_Synch time.
Lokesh_2
Esteemed Contributor

Re: CPU in high interrupt mode

Hi Frank,

Please find below a summary of interrupt stack time. Hope this helps in your analysis:

*** Interrupt stack time (Interrupt State Time on Alpha) consists of processor time spent processing interrupts.
1. The interrupts are typically initiated by a hardware request.
2. This request may be an I/O completion or a clock interrupt.
3. To schedule lower-priority processing within the context of an interrupt, the software processing a hardware interrupt may request a software interrupt.
4. Software interrupts include the lowering of the priority of a device driver processing an interrupt (fork processing), software timer processing, I/O completion, and rescheduling.


*** Both hardware- and software-requested interrupts have a special priority associated with them.
a) This priority is referred to as interrupt priority level (IPL).
b) Activities processed at an elevated IPL (that is, an IPL greater than 2) block all process-related activities and inhibit scheduling.


*** Interrupts are preemptive in blocking all process activities and are, therefore, causes of response time degradation.
Of the CPU time spent processing activities in the given modes, interrupt stack time is the most costly because it affects all processes on the system, not just the current process.

*** Causes of interrupt stack time are as follows:

1) Processing I/O requests
-- The most common cause of excessive interrupt stack time is I/O-related activity.
Minimizing the number of I/O requests by requesting larger I/Os less frequently is probably the best solution to this problem.

2) Rescheduling and Timer Processing
-- The primary cause of increased overhead here is a low setting for the SYSGEN parameter QUANTUM.
QUANTUM is measured in units of 10-ms intervals.
Thus, even at its smallest setting of 2, QUANTUM should not cost more than 2 to 3 percent additional interrupt stack time.

3) Distributed lock requests
-- The processing of distributed locks in an OpenVMS Cluster is performed on the interrupt stack. Distributed locking strategies will be discussed later.

4) Mass Storage Control Protocol (MSCP) serving
-- The serving of disk blocks in an OpenVMS Cluster to a remote node is performed on the interrupt stack.
Interrupt stack time costs for serving these blocks can be fairly high.
With a large number of server requests, a boot member or server node can become quickly bogged down in terms of CPU time.
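
The arithmetic behind causes 1 and 2 above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names and sample values are made up for the example, and the only inputs taken from the summary are that QUANTUM is counted in 10-ms units and that larger, less frequent I/Os mean fewer requests. The 512-byte figure is the standard OpenVMS disk block size.

```python
import math

TICK_MS = 10        # QUANTUM is expressed in 10-ms units
BLOCK_BYTES = 512   # size of one OpenVMS disk block

def max_quantum_ends_per_second(quantum):
    """Upper bound on quantum expirations per second on one CPU."""
    return 1000 / (quantum * TICK_MS)

def io_requests_needed(total_bytes, blocks_per_request):
    """Number of I/O requests needed to move total_bytes at a given transfer size."""
    return math.ceil(total_bytes / (blocks_per_request * BLOCK_BYTES))

if __name__ == "__main__":
    for quantum in (2, 20):  # 2 is the smallest setting; 20 is just a sample value
        print(f"QUANTUM={quantum}: at most "
              f"{max_quantum_ends_per_second(quantum):.0f} quantum ends per second")
    one_mb = 1024 * 1024
    for blocks in (1, 64):   # sample transfer sizes, in blocks
        print(f"{blocks:>2} blocks per request: "
              f"{io_requests_needed(one_mb, blocks)} requests per MB")
```

With QUANTUM at 2, a CPU can see at most 50 quantum expirations per second, which is why the extra cost stays small. Moving 1 MB one block at a time takes 2048 requests, against 32 requests at 64 blocks each, and each request avoided is one less I/O completion to process in interrupt state.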
What would you do with your life if you knew you could not fail?