batch jobs starting before they are supposed to
10-12-2010 06:00 AM
Our batch jobs are starting 15 seconds before they are supposed to on a node in our cluster.
The queue manager runs on a different node, but the times are all the same on all nodes (we run NTP), and I've checked: all time appears to be in sync throughout the cluster.
Has anyone seen this before?
Thanks
10-12-2010 07:10 AM
Solution

$ SET NOON
$ MC SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO SHOW TIME
SYSMAN> CONFIGURATION SHOW TIME
SYSMAN> EXIT
$ EXIT
Just to *prove* that this is the case.
Which node does the queue manager run on?
Craig
10-12-2010 07:12 AM
Re: batch jobs starting before they are supposed to
A local rule of thumb: always look under the rocks that "appear" unrelated to the problem, and always look at anything that "appears" correct. Confirm that the area is or is not correct.
And here specifically, confirm the (lack of) skew in the cluster:
SYSMAN> SET ENVIRONMENT /CLUSTER
SYSMAN> DO SHOW TIME
Whether the NTP servers are locked is of less interest, as I've seen many cases of skewed ntp times among pools of servers, too.
10-12-2010 07:17 AM
Re: batch jobs starting before they are supposed to
Can you post:
1) SHO QUE/A (for the scheduled job)
2) Accounting info that shows the time the job actually started.
Thanks,
Dan
10-12-2010 07:19 AM
Re: batch jobs starting before they are supposed to
The "CONFIG SHOW TIME" showed a difference on this node.
I need to look into why NTP didn't correct this time difference.
Thanks so much for replying. I really do appreciate it.
10-12-2010 07:40 PM
Re: batch jobs starting before they are supposed to
No matter how good your synching, there is always the possibility of a time discrepancy between cluster nodes. Maybe (hopefully!) not as large as 15 seconds, but certainly large enough to potentially fail a test like:
$ IF F$CVTIME(F$TIME()).LTS.F$CVTIME(ExpectedStartTime)
Remember that synchronizing time across nodes is not continuous, and there's always a tolerance threshold, most likely larger than the granularity of display time format (0.01 seconds).
I would usually code some tolerance into a test like the above. Choose the maximum skew your time-synch setup will allow and write the test like this (I've assumed 5 seconds):
$ IF F$CVTIME(F$TIME()+"+0-0:0:5.0").LTS.F$CVTIME(ExpectedStartTime)
$ THEN
$ ! job has started too early
$ ENDIF
10-13-2010 03:01 AM
Re: batch jobs starting before they are supposed to
Dan
10-13-2010 06:20 AM
Re: batch jobs starting before they are supposed to
If you'd like to verify this, queue a job, then move the clock past the batch job's release time on any other cluster node, and watch your batch job start. Try moving the clock ahead on various cluster nodes, one test at a time; the job should kick off regardless.
This is a known VMS cluster thingy (technical term), although I don't remember seeing it documented anywhere. The first node in the cluster that reaches the batch job's release time will cause the job to run, regardless of which node the job is queued on.
One thing I have not tried is to use multiple queue managers within the cluster, although I'm guessing the behavior will be identical.
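That experiment could be sketched roughly like this (the queue name, file name, and times are illustrative assumptions, not from the thread, and SET TIME requires suitable privileges):

```
$ ! On node A: submit a job to release comfortably in the future
$ SUBMIT/QUEUE=SYS$BATCH/AFTER="12:00" TESTJOB.COM
$
$ ! On a DIFFERENT cluster node: push that node's clock past the
$ ! release time, then watch whether the job starts early
$ SET TIME=12:01
$ SHOW QUEUE SYS$BATCH/ALL
```

Remember to put the clock back (or let NTP do it) after the test, one node at a time, so you can tell which node's clock triggered the release.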
10-13-2010 12:57 PM
Re: batch jobs starting before they are supposed to
>The first node in the cluster that reaches
>your batch job release time will cause the
>job to start.
I don't think this is correct. My understanding is it is the clock on the node running QUEUE_MANAGER which determines when a job will start. The issue is, you can't necessarily predict which node will be the queue manager.
Think about the implementation: do you really think anyone would replicate all the queue timer events on every cluster node and then attempt to deal with all the potential race conditions? Especially when there is a pervasive assumption in OpenVMS that clocks across a cluster will always be synchronized. Such a model is WAY more complex than necessary and would cause significantly more problems than it would resolve (indeed, what problem would it be a solution for?).
The documentation is fairly specific about the possibility of jobs starting early (indeed, it hints that Paul's observation may be correct; if so, it's news to me!).
See $ HELP SUBMIT/AFTER
...
In an OpenVMS Cluster, a batch job submitted to execute at a specific time may begin execution a little before or after the requested time. This occurs when the clocks of the member systems in the OpenVMS Cluster are not synchronized. For example, a job submitted using the DCL command SUBMIT/AFTER=TOMORROW may execute at 11:58 P.M. relative to the host system's clock.
This problem can occur in a cluster even if a job is run on the same machine from which it was submitted, because the redundancy built into the batch/print system allows more than one job controller in the cluster to receive a timer asynchronous system trap (AST) for the job and, thus, to schedule it for execution. Moreover, this behavior is exacerbated if the batch job immediately resubmits itself to run the next day using the same SUBMIT command. This can result in having multiple instances of the job executing simultaneously because TOMORROW (after midnight) might be only a minute or two in the future.
A solution to this problem is to place the SUBMIT command in a command procedure that begins with a WAIT command, where the delta-time specified in the WAIT command is greater than the maximum difference in time between any two systems in the cluster. Use the SHOW TIME command on each system to determine this difference in time. Use the SYSMAN command CONFIGURATION SET TIME to synchronize clocks on the cluster. For complete information on the SYSMAN command CONFIGURATION SET TIME, see the HP OpenVMS System Management Utilities Reference Manual.
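The workaround that HELP text describes could be sketched as a self-resubmitting procedure like the one below. The file name, queue name, and the 30-second delta are illustrative assumptions; pick a delta larger than the worst-case skew you measure with SYSMAN's DO SHOW TIME:

```
$ ! NIGHTLY.COM - hypothetical self-resubmitting batch job
$ ! Wait out any cluster clock skew before resubmitting, so that
$ ! TOMORROW cannot resolve to a time only moments away on a node
$ ! whose clock is slightly ahead.
$ WAIT 00:00:30
$ SUBMIT/QUEUE=SYS$BATCH/AFTER=TOMORROW NIGHTLY.COM
$ ! ... the actual work of the job goes here ...
```

The WAIT comes first, before the SUBMIT, which is what prevents the "multiple instances after midnight" failure mode the HELP text warns about.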