
Re: losing ASTs rapidly

 
Volker Halle
Honored Contributor

Re: losing ASTs rapidly

Tim,

Did this process ever work correctly on OpenVMS I64? When did it start to behave like this? What was done to the system prior to the first 'failure'?

Volker.
Volker Halle
Honored Contributor

Re: losing ASTs rapidly

Tim,

Consider using ANAL/SYS to collect the following information from this process once it has 'lost' a couple of ASTs:

$ ANAL/SYS
SDA> READ SYSDEF
SDA> SET PROC/ID=
SDA> SHOW PROC
SDA> SHOW PROC/PHD
SDA> FORM PCB
SDA> EXIT

Collect the output into a .TXT file and attach it to your next reply.

Volker.
tim lloyd_1
Frequent Advisor

Re: losing ASTs rapidly

Thanks for all the advice, folks. This system was ported to iA64 and went live in January 2007. The system worked fine until we dropped in a new software release in April 2008.

The problem I face is that this issue occurs sporadically (on average once a month; the system is up 7 days a week). Due to the nature of the application I cannot interrogate the problem when it occurs.

I am going down the IOSB path. In other places we do use this facility. I can't see why we don't do it here, in one of the most critical parts of the whole system!


Cheers
Volker Halle
Honored Contributor

Re: losing ASTs rapidly

Tim,

Does the process crash when the problem happens? Or does it just hang? I assume you have mechanisms in place to restart the process if the problem happens.

If it crashes, run it with /DUMP or issue a SET PROC/DUMP before starting the image.

If it hangs, include a SET PROC/DUMP=NOW before stopping and re-starting the process.

If it just issues an error message and exits by itself, call LIB$STOP(SS$_IMGDMP); this will force an image dump.

You can then do the analysis offline in the image dump with ANAL/PROC. Most of the process-related system data is also available in the image dump.

Do you disable AST delivery somewhere in the application? And not re-enable it?

Volker.
tim lloyd_1
Frequent Advisor

Re: losing ASTs rapidly

Volker, the program hangs rather than crashing. I will integrate the suggestions you make.

I had not thought about disabling ASTs. Obviously this would not be intentional, but it is possible. Any ideas how I would check for this?
John Gillings
Honored Contributor

Re: losing ASTs rapidly

tim,

>So, I am treating this as a continuous leak

I'd still recommend testing your program with a higher limit, just to make sure you're not experiencing a spike in load. It's unlikely to cause any resource problems, and you may find your program recovers itself.

Instead of assuming it's a leak, make sure!
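
One rough way to "make sure" is to sample the process's remaining AST quota at intervals and look at the level it returns to between bursts. The sketch below assumes you already have such samples as a plain list of integers (for example collected with the ASTCNT item of F$GETJPI); the function names and the windowing approach are my own illustration, not anything from the application. If the best value per window keeps dropping, that points to a leak; if it keeps coming back to the same level, it looks more like load spikes.

```python
def idle_levels(remaining, window):
    """Highest remaining-AST-quota value seen in each consecutive window of samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [max(remaining[i:i + window]) for i in range(0, len(remaining), window)]


def looks_like_leak(remaining, window):
    """True if the best value per window keeps dropping from one window to the next."""
    levels = idle_levels(remaining, window)
    return len(levels) >= 2 and all(
        later < earlier for earlier, later in zip(levels, levels[1:])
    )
```

The window size is a judgment call: it should be long enough to span a burst of activity, so that a busy moment alone does not look like lost quota.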
A crucible of informative mistakes
Robert Gezelter
Honored Contributor

Re: losing ASTs rapidly

Tim,

Concerning Disabling ASTs.

In short, read the code. Also, scan the code base for references to $SETAST or SYS$SETAST.

Also check the callers of any routine that invokes those services, and their callers in turn, paying particular attention to error paths.
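
As a starting point for that scan, here is a minimal sketch that walks a source tree and lists every line mentioning $SETAST (which also catches SYS$SETAST and the $SETAST_S macro form). The function name and the list of file suffixes are assumptions for illustration; adjust the suffixes to whatever languages the application is written in. It only finds direct references, so the callers still have to be traced by reading the code.

```python
from pathlib import Path

# Hypothetical default: source suffixes that might be present in the code base.
DEFAULT_SUFFIXES = (".c", ".h", ".for", ".mar", ".bas", ".pas", ".cob")


def find_setast_calls(root, suffixes=DEFAULT_SUFFIXES):
    """Return (file, line number, stripped line) for each line mentioning $SETAST."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive match covers sys$setast, SYS$SETAST and $SETAST_S.
            if "$setast" in line.lower():
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for filename, lineno, line in find_setast_calls("."):
        print(f"{filename}:{lineno}: {line}")
```

For each hit, the thing to check is whether every path out of the routine, including the error paths, re-enables ASTs.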

I make several recommendations about how to do AST programming with a fair degree of safety in my DECUS presentation [mentioned earlier in this thread].

One good rule: Always use an IOSB that cannot be recycled before the AST is processed, AND never use event flags in conjunction with ASTs.

Another good rule is to include a logic check in the program to ensure that a buffer/IOSB combination is not recycled while it has a pending operation. Such a logic check often identifies an incorrect set of logic in the program long before the evidence is disturbed.
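
To illustrate the kind of logic check meant here, the following is a language-neutral sketch of the bookkeeping, written in Python only for brevity. The names (IoSlot, start_io, complete_io, SlotReuseError) are hypothetical; in the real program the same test would sit in the routine that queues the I/O and in the completion AST, in whatever language the application uses.

```python
class SlotReuseError(RuntimeError):
    """Raised when a buffer/IOSB pair is used inconsistently."""


class IoSlot:
    """Hypothetical stand-in for one buffer/IOSB pair."""

    def __init__(self, name):
        self.name = name
        self.pending = False


def start_io(slot):
    # Logic check: refuse to reuse a pair that still has an operation outstanding.
    if slot.pending:
        raise SlotReuseError(f"{slot.name} recycled while an operation is pending")
    slot.pending = True
    # ... the real code would queue the I/O with this buffer and IOSB here ...


def complete_io(slot):
    # Would be called from the completion AST for this pair.
    if not slot.pending:
        raise SlotReuseError(f"{slot.name} completed with nothing pending")
    slot.pending = False
```

Failing loudly at the moment of reuse is the point: the program stops while the offending call chain is still on the stack, rather than weeks later when an AST never arrives.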

- Bob Gezelter, http://www.rlgsc.com
GuentherF
Trusted Contributor

Re: losing ASTs rapidly

What about the AST queue hanging off the PCB when the process hangs?

SDA> READ SYSDEF
SDA> SHOW SUMMARY ! to get the process index
SDA> SET PROCESS/INDEX=...
SDA> VALIDATE QUEUE PCB+PCB$L_ASTQFL_U
SDA> VALIDATE QUEUE PCB+PCB$L_ASTQFL_E

And a couple of..
SDA> FORMAT PCB+PCB$L_ASTQFL_U
SDA> FORMAT @.

Also...
SDA> SHOW CALL
SDA> SHOW CALL/NEXT

Get a linker map of the program image and find where the PCs are in the source code.

This is all under the assumption that the process is indeed running out of ASTLM.

/Guenther
tim lloyd_1
Frequent Advisor

Re: losing ASTs rapidly

Hi All, Robert's paper on ASTs talks about "access modes", e.g. "queueing by access mode", and then 5 queues: special kernel, kernel, executive, supervisor, user.

My system has about 35 sub processes hanging off a main process. These sub processes have varying priorities from 4 to 15.

Basically, what I am trying to establish is that if one of these processes calls $SETAST to disable ASTs, this affects the whole bunch rather than just the process itself. Does that sound correct?
Volker Halle
Honored Contributor

Re: losing ASTs rapidly

Tim,

AST quota is NOT a pooled quota. It is PER PROCESS not PER JOB.

You said that 'the program hangs'. What is the state of this process (or these processes) as reported by SHOW SYSTEM/PROC=xxx?

As this problem shows up only very intermittently, capturing a process dump is the most important work item. Then you can check and answer all the questions about where the outstanding ASTs may be pending, whether ASTs are disabled, etc.

Volker.