Operating System - OpenVMS

Batch job completes before start time?

 
Mike Smith_33
Super Advisor

Batch job completes before start time?

I searched the archives but did not find anything. An application analyst brought this to my attention: they have a self-submitting job that has been completing before it should have started. The submit command in the job is:

$ submit/noid/que=sys$batch/log=sys_log:g_sub_eod.log /nonoti/noprin/parameters=('env')/after="tomorrow+07:00:00" sys_com:g_sub_eod.com

The entry in the queue indicates a 7am start.
  Entry  Jobname         Username     Blocks  Status
  -----  -------         --------     ------  ------
    156  G_SUB_EOD       XXXXXX               Holding until 31-JUL-2007 07:00:00
         On idle batch queue XXX$BATCH
         Submitted 30-JUL-2007 07:00:00.27 /KEEP /LOG=DSA102:[XXXXXX.][NEWS.PROD.LOG]G_SUB_EOD.LOG; /PARAM=("PROD") /NOPRINT /PRIORITY=100
         File: _DSA102:[XXXXXX.NEWS.PROD.COM]G_SUB_EOD.COM;8

The mod and creation times on the last few logs show:
G_SUB_EOD.LOG;1008 30-JUL-2007 06:57:43.68 30-JUL-2007 06:57:47.62
G_SUB_EOD.LOG;1007 29-JUL-2007 06:57:44.54 29-JUL-2007 06:57:48.33
G_SUB_EOD.LOG;1006 28-JUL-2007 06:57:45.44 28-JUL-2007 06:57:48.97
G_SUB_EOD.LOG;1005 27-JUL-2007 06:57:46.31 27-JUL-2007 06:57:50.05

Inside the logs, the termination time is:

XXXXXX job terminated at 30-JUL-2007 06:57:47.61

Is there something simple I am missing? Have you had this issue before? Any help is appreciated. The 3 mins early is causing a problem with their calculations.
13 REPLIES
Andy Bustamante
Honored Contributor

Re: Batch job completes before start time?

Are you in a cluster without either ntp or dtss running? The start time may or may not reflect the time on the node where the job is running.
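
One quick way to compare the clocks is to ask every member for its time in a single SYSMAN session. This is only a sketch of an interactive session, and it assumes the account has enough privilege to run SYSMAN commands cluster-wide:

```dcl
$ ! Sketch: display the current time on every cluster member.
$ mcr sysman
SYSMAN> set environment/cluster
SYSMAN> do show time
SYSMAN> exit
```

If the times reported by the nodes differ by minutes, the scheduled start time and the timestamps in the job log can disagree by the same amount.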


Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Mike Smith_33
Super Advisor

Re: Batch job completes before start time?

Andy, you are correct and I seem to remember seeing something written on this subject. One node did show a 2 min difference in time but I dismissed it as it did not seem to be a part of this problem.
Robert Gezelter
Honored Contributor

Re: Batch job completes before start time?

Mike,

I concur with Andy. Has the time been checked on all members of the cluster?

- Bob Gezelter, http://www.rlgsc.com
Mike Smith_33
Super Advisor

Re: Batch job completes before start time?

After Andy's email, I set the time consistently across all three nodes. I will see what it does in the morning.

Jon Pinkley
Honored Contributor

Re: Batch job completes before start time?

Mike Smith >>>"$ submit/noid/que=sys$batch/log=sys_log:g_sub_eod.log /nonoti/noprin/parameters=('env')/after="tomorrow+07:00:00" sys_com:g_sub_eod.com

The 3 mins early is causing a problem with their calculations."

------------------------

The correct thing to do is to make sure your clocks are synchronized.

If they want to make sure the job does not start before the local clock has reached 7:00, the simple fix is to pass the next run time as one of the parameters and, when the job starts to run, have it wait until that time.

For example:

Add "$ wait "''p2'" to beginning of g_sub_eod.com

Do something like the following to get the time of the next run.

$ this_proc = f$environment("procedure")
$! if you want latest version instead of current version
$ this_proc = f$element(0,";",this_proc) ! assumes ";" only valid as version separator
$ next_run = f$cvtime("TOMORROW+07:00:00","ABSOLUTE")
$ submit 'this_proc'/noid/que=sys$batch/log=sys_log:g_sub_eod.log /nonoti/noprin/parameters=('env',"''next_run'")/after="''next_run'"

NOTE WELL: This is using an undocumented behaviour of wait, specifically waiting for an absolute time. For this to work, you must pass 2 parameters, with a space between the date and the time.

I.e.

$ wait 30-JUL-2007 15:40:00.00 ! trailing digits are significant, if omitted, the fields from the current time will be used.

This does not solve the problem of the job starting 3 minutes late, which is what happens if the clock on the node running the queue manager is 3 minutes behind the clock on the node where the job runs.

Here's an example using wait in the undocumented fashion.

In these examples, I typed show time into the typeahead buffer while the wait was in progress.

$ wait 30-jul-2007:15:35 ! the blank between date and time is required.
%DCL-W-IVDTIME, invalid delta time - use DDDD-HH:MM:SS.CC format
\0 30-JUL-2007:15:35\
$ wait 30-jul-2007 15:35 ! entered at 15:34:24
SIGMA::JON 15:34:35 (DCL) CPU=00:01:50.92 PF=50635 IO=345965 MEM=268
SIGMA::JON 15:34:44 (DCL) CPU=00:01:50.92 PF=50635 IO=345966 MEM=268
SIGMA::JON 15:34:55 (DCL) CPU=00:01:50.92 PF=50635 IO=345967 MEM=268
SIGMA::JON 15:35:16 (DCL) CPU=00:01:50.92 PF=50635 IO=345968 MEM=268
$ show time
30-JUL-2007 15:35:24
$ wait 30-jul-2007 15:36:00.00
$
SIGMA::JON 15:36:39 (DCL) CPU=00:01:50.93 PF=50636 IO=345979 MEM=269
$ wait 30-jul-2007 15:37:00.00
$ show time
30-JUL-2007 15:37:00
$
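
If relying on the undocumented form of WAIT is a concern, a polling loop built only from documented pieces might look like the sketch below. It assumes, as in the scheme above, that P2 carries the intended start time in absolute format; the labels and the 5-second poll interval are illustrative choices, not anything from the original procedure.

```dcl
$ ! Sketch: hold the job until the local clock reaches the time in P2.
$ ! P2 is assumed to look like "31-JUL-2007 07:00:00.00".
$ if p2 .eqs. "" then goto start_work   ! first (manual) submission: no wait
$ target = f$cvtime(p2)                 ! comparison format: yyyy-mm-dd hh:mm:ss.cc
$check:
$ if f$cvtime() .ges. target then goto start_work
$ wait 00:00:05                         ! documented delta-time form of WAIT
$ goto check
$start_work:
$ ! ... rest of the procedure ...
```

The comparison works as a plain string compare because F$CVTIME returns times in a sortable format. The job can still overshoot the target by up to one poll interval.
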
it depends
Art Wiens
Respected Contributor

Re: Batch job completes before start time?

The time on the node running the QUEUE_MANAGER process will be used to determine when the jobs start. The time on the node on which the queue is running might be different and cause these irregularities. I see that from time to time (no pun intended ;-). We have a job that runs every two minutes, and the time on the nodes tends to drift apart by just a bit more than two minutes every 6 months or so. The node with the execution queue will seem really busy, but you won't really see what is taking all the CPU: it's the job constantly starting, resubmitting, starting again, etc.
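
To see which two clocks are involved, something along these lines could be run from inside the batch job. It is a sketch: the first command reports where the queue manager is running, and the other two record the node and local time the job itself sees.

```dcl
$ ! Sketch: compare the queue manager's node with the node executing this job.
$ show queue/manager/full
$ write sys$output "This job is running on node ", f$getsyi("NODENAME")
$ write sys$output "Local time on this node is ", f$time()
```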

Do an occasional (every couple of months?):

$ mcr sysman set env/cluster
SYSMAN> config set time hh:mm:ss

using some reliable time source to keep 'em close.

Cheers,
Art
Andy Bustamante
Honored Contributor
Solution

Re: Batch job completes before start time?

The easiest option is to use NTP and synch time on all nodes in the cluster. If you don't have external access to a time server, pick a node to be the "master" time server and configure the other nodes to synch to the master. Clocks will drift over time, and will drift differently between nodes.

I've also seen an (older) recommendation to add a WAIT in sylogin.com for batch jobs or to use a "set time" job to keep cluster time.


Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Wim Van den Wyngaert
Honored Contributor

Re: Batch job completes before start time?

We had the problem once on 6.2 that jobs started 0.1 sec before their scheduled time.
We think it was due to DTSS that is/was used.
We added a wait 00:00:00.10 to sylogin to solve the problem.
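
A line of roughly this shape in SYS$MANAGER:SYLOGIN.COM would do that. This is a sketch: the 0.10 second delay is an assumed value, and the F$MODE test is added here so that only batch jobs are delayed.

```dcl
$ ! Sketch for SYLOGIN.COM: delay batch jobs briefly before they begin work.
$ ! The delta time is in hh:mm:ss.cc format; 0.10 s is an assumed value.
$ if f$mode() .eqs. "BATCH" then wait 00:00:00.10
```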

Don't have that problem on 7.3 anymore (or is our sylogin, or the VMS 7.3 login procedure in general, heavier than on 6.2?).

Fwiw

Wim
Wim
Bojan Nemec
Honored Contributor

Re: Batch job completes before start time?

Jon,

Interesting undocumented behavior of WAIT (probably a direct correlation with SYS$SETIMR).

Mike,

Having the system times synchronized is the better way to solve the problem.
If the time is not synchronized, you can probably put a line like this at the beginning of the command procedure (using the undocumented wait behavior):

$ WAIT 'F$GETQUI("DISPLAY_JOB","AFTER_TIME",,"THIS_JOB")'

or without the undocumented behavior:

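$ START_TIME = F$GETQUI("DISPLAY_JOB","AFTER_TIME",,"THIS_JOB")
$ NOW = F$TIME()
$! Skip the wait if the scheduled start time has already passed on this node
$ IF F$CVTIME(START_TIME) .LES. F$CVTIME(NOW) THEN GOTO DONE
$! F$DELTA_TIME takes the earlier time first, then the later time
$ WAIT 'F$DELTA_TIME(NOW,START_TIME)'
$DONE: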

Bojan
Wim Van den Wyngaert
Honored Contributor

Re: Batch job completes before start time?

Test on a GS160, 2 CPUs, 7.3.

I submitted 20 jobs and compared accounting with the schedule date. They all started 0.15 sec BEFORE the scheduled date.

Even after removal of login.com, they reached the first script line 0.06 sec AFTER the scheduled time. And logout said that they had been active for 0.06 sec (not 0.06 + 0.15).

Same test on AS 500 but sylogin and login removed. They all started about 0.02 sec after the scheduled time (in accounting) and needed 0.14 sec to get to the first statement.

Fwiw

Wim
Wim
Hoff
Honored Contributor

Re: Batch job completes before start time?

Consider cron (freeware), Kronos (freeware), CA Schedule IT (formerly DECscheduler; commercial) or another available process scheduler.

Batch queues and the queue manager are, and have been, a comparatively coarse mechanism: set-up and management are entirely manual and local, and failures are less than easy to deal with.

Here are links to cron and Kronos: http://64.223.189.234/node/97

Dean McGorrill
Valued Contributor

Re: Batch job completes before start time?

I used to have a self-submitting job at midnight and I'd have it wait several minutes before resubmitting, so it would not run twice on the cluster. BTW, nice to know the wait feature! It implies wait until xxx. Nice! Dean
Mike Smith_33
Super Advisor

Re: Batch job completes before start time?

Thanks everyone for all the great points that were made. The obvious problem which I initially dismissed was a time difference of a little over two minutes between two of the nodes. I will check with the NT guys to see if we have an NTP server that everyone should be using. That way everyone will be consistent.