Operating System - OpenVMS

OpenVMS 8.3-1H1 Itanium SYS$SCHDWK call

 
Dan R Farrell
Occasional Visitor

We are using a SYS$SCHDWK call to run a process 50 times a second. It works, except that every six hours from boot time it misses a few cycles. This is very repeatable. It does not matter whether the process has been running for hours or for a few minutes: six hours after boot time, and every six hours after that (almost to the millisecond), it misses cycles. I think the system time may be getting reset, but we have shut down almost everything on the system except for VMS and the network, and it still happens. NTP is not running. Has anyone seen this issue before?
21 REPLIES
Robert Gezelter
Honored Contributor

Re: OpenVMS 8.3-1H1 Itanium SYS$SCHDWK call

Dan,

Have you verified that the computation of wake times does not have some slip in it?

- Bob Gezelter, http://www.rlgsc.com
Richard Whalen
Honored Contributor

Re: OpenVMS 8.3-1H1 Itanium SYS$SCHDWK call

50 times a second is very often!

I know that VMS on ia64 (not necessarily 8.3-1H1) stores the value of the TOY clock on disk approximately every 6 hours. I suspect that your problem is related to this.
Jon Pinkley
Honored Contributor

Re: OpenVMS 8.3-1H1 Itanium SYS$SCHDWK call

Dan,

Anything that can block scheduling has the potential to cause a wake up to be delayed.

Or as Bob suggested, depending on how you are computing the next wakeup, you may be getting rounding errors. For example, the LIB$CVTF_TO_INTERNAL_TIME may be subject to rounding errors with small intervals. If you are using integer math, then I doubt that is the cause of the problem.
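
To make the "slip" question concrete, here is a minimal sketch in portable C (not VMS-specific, and not a claim about what Dan's code does). It compares a wake time taken from a fixed grid with one that is recomputed each cycle as "time I actually ran, plus one period". The helper names and the late_by parameter are made up for illustration; the only VMS fact assumed is that system time counts 100-nanosecond units.

```c
#include <stdint.h>
#include <assert.h>

/* OpenVMS system time counts 100-nanosecond units, so 1 msec = 10000 ticks. */
#define TICKS_PER_MSEC 10000LL

/* Hypothetical helper: wake time of cycle n on a fixed grid anchored at
   start. Integer math only, so there is no accumulated error. */
int64_t wake_absolute(int64_t start, int64_t period, int64_t n)
{
    assert(period > 0 && n >= 0);
    return start + n * period;
}

/* Hypothetical helper: wake time of cycle n when every wake is scheduled
   relative to the moment the previous one was serviced, assumed here to be
   a constant late_by ticks after it was due. The lateness accumulates. */
int64_t wake_relative(int64_t start, int64_t period, int64_t late_by, int64_t n)
{
    assert(period > 0 && late_by >= 0 && n >= 0);
    int64_t t = start;
    for (int64_t i = 0; i < n; i++)
        t = (t + late_by) + period;   /* ran late, then asked for one more period */
    return t;
}
```

With a 20 msec period (200000 ticks) and an assumed 5 microseconds of lateness per cycle (50 ticks), six hours is 1,080,000 cycles, and the relative schedule ends up 54,000,000 ticks (5.4 seconds) behind the grid. If the repeat interval is handed to $SCHDWK once and the system does the rescheduling, this kind of slip should not come from the application's own arithmetic.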

There are several possibilities, either higher priority processes could be preventing the process from being scheduled or some high IPL code could be blocking process scheduling or even the hardware clock interrupt.

I am not sure how often the BBW battery backed up watch gets updated, but hopefully that couldn't cause cycles to be lost, and I wouldn't expect any disk I/O to be blocking scheduling.

If it is that repeatable, I would fire up the PRF SDA extension to collect samples starting 10 seconds or so prior to a 6-hour epoch and see what is happening. I'm reasonably sure it isn't driven off the HWCLK interrupt, so I believe it has a chance of seeing code executing at HWCLK IPL. If PRF is using EXE$GQ_SYSTIME in its time stamp calculations, it may lead to false conclusions about "when" something happened.

Do you have other timer based code running, or some performance data collector that could be running something at high IPL periodically?

Jon
it depends
John Gillings
Honored Contributor

Re: OpenVMS 8.3-1H1 Itanium SYS$SCHDWK call

Dan,
Just to clarify...

I'm assuming you're calling $SCHDWK with a "reptim" value of 20msec, as opposed to calling it every cycle with a "daytim" of 20msec?

Could you show us the actual code?

A few things I'd worry about...

1) First, 20msec is only 2 quanta. I'd want my value to be as close to the real value as possible. If I cared about it a lot, I'm not sure I'd trust a $BINTIM conversion to do that for me. I'd be checking the bits in the time value.

2) The timing of the $WAKE has no influence on when the target process responds (ie: actually wakes up). How can you distinguish between the $WAKE being late, and the process "sleeping in"?

3) $HIBER/$WAKE seems like a rather blunt instrument to use if you require high precision ticks. Maybe you should consider other possibilities? For really accurate, high frequency timing, you pretty much have to dedicate a CPU and busy wait.
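
As a sketch of what "checking the bits" in point 1 could look like: a VMS delta time is a negative 64-bit quadword counted in 100-nanosecond units, so a 20 msec repeat interval should come out as exactly -200000. The portable C below builds that value with integer math and splits it into the two longwords you would see in a dump. The function names are hypothetical, not VMS library routines.

```c
#include <stdint.h>
#include <assert.h>

/* VMS time quadwords count 100-nanosecond units: 10000 per millisecond. */
#define VMS_TICKS_PER_MSEC 10000LL

/* Hypothetical helper: build a delta-time quadword for a whole number of
   milliseconds. Delta times are stored as negative values. */
int64_t vms_delta_from_msec(int64_t msec)
{
    assert(msec > 0);
    return -(msec * VMS_TICKS_PER_MSEC);
}

/* Low and high longwords of the quadword, for comparing against a dump. */
uint32_t quad_low(int64_t q)  { return (uint32_t)((uint64_t)q & 0xFFFFFFFFu); }
uint32_t quad_high(int64_t q) { return (uint32_t)((uint64_t)q >> 32); }
```

For 20 msec the expected pattern is low longword FFFCF2C0 and high longword FFFFFFFF. If the quadword produced by a $BINTIM or floating-point conversion differs from that in the low bits, the interval is not exactly 20 msec.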

Things to try...

What happens if you double the frequency, i.e. use a 10msec interval?

If you haven't done so already, build an absolutely minimal test program. On waking, don't do anything other than sample the time and put the results in a ring buffer.
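
A minimal sketch of the ring-buffer part of such a test program, in portable C. It assumes the caller supplies each time stamp as a 64-bit count of 100-nanosecond units (on VMS that would come from a time service after each wake; that call is left out here). The type and function names are invented for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

#define RING_SIZE 1024u   /* power of two, so the index can be masked */

typedef struct {
    int64_t  stamp[RING_SIZE];  /* time of each wake, 100 ns units */
    uint64_t count;             /* total samples ever recorded */
} wake_ring;

void ring_init(wake_ring *r) { r->count = 0; }

/* Call once per wake and do nothing else; oldest samples are overwritten. */
void ring_record(wake_ring *r, int64_t now)
{
    r->stamp[r->count & (RING_SIZE - 1u)] = now;
    r->count++;
}

/* Count intervals between retained samples that exceed period + tolerance. */
size_t ring_count_late(const wake_ring *r, int64_t period, int64_t tolerance)
{
    assert(period > 0 && tolerance >= 0);
    uint64_t held  = r->count < RING_SIZE ? r->count : RING_SIZE;
    uint64_t first = r->count - held;   /* index of oldest retained sample */
    size_t late = 0;
    for (uint64_t i = first + 1; i < r->count; i++) {
        int64_t gap = r->stamp[i & (RING_SIZE - 1u)]
                    - r->stamp[(i - 1) & (RING_SIZE - 1u)];
        if (gap > period + tolerance)
            late++;
    }
    return late;
}
```

The wake loop only calls ring_record; the analysis (ring_count_late, or a dump of the raw stamps) runs afterwards, so the measurement itself adds as little work per cycle as possible. With 1024 slots at 50 wakes a second the buffer holds about 20 seconds of history, enough to bracket a six-hour mark if it is dumped shortly after.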

Are you running with multiple CPUs? Have you tried using affinity?
A crucible of informative mistakes
Hoff
Honored Contributor

Re: OpenVMS 8.3-1H1 Itanium SYS$SCHDWK call

I'm actually somewhat surprised this works as well as it does and that you're only missing a few cycles every six hours; this looks to be a polling-based design, though somewhat cloaked in the garb of a multiprocessing application. And I'd expect to see a few cycles going to other tasks here and there.

I might well look to abscond with a core here and go to full-on polling, rather than a 50 Hz (60 Hz in the US?) solution. That, or (depending on what is going on) I'd look to start dealing with the cruft in an out-board processor here, as those are cheap. There are also ways to release the processor through the scheduler interface, too.

Do call HP, as they're the arbiters of this sort of thing and (if you're doing 50 process activations a second) you probably have a support contract.