
DECSCHEDULER

 
Kevin Raven (UK)
Frequent Advisor

DECSCHEDULER

Where does Scheduler get its history data from?
We have been running the following command for the last 3 years with no issues, but now it fails with an ugly error message.
Is this an indication that the Scheduler database file is about to go pop?

$ ! get history of all job which ran in the last 24 hours
$ ! -----------------------------------------------------
$ minus1day = F$CVTIME("-1-00:00:00","ABSOLUTE")
$ SCHED SHOW HIST/GROUP=*/RECORD-
/OUT=RDTS_LOG:SCHED_200907102240.LIS-
/COMPLETION=ALL/USER=AUTO_OP/START_TIME="9-JUL-2009 22:40:05.34"
%NSCHED-I-NORUNREC, No specified run records found for job 6
%NSCHED-I-NORUNREC, No specified run records found for job 7
%NSCHED-I-NORUNREC, No specified run records found for job 41
%NSCHED-I-NORUNREC, No specified run records found for job 61
%NSCHED-I-NORUNREC, No specified run records found for job 33
%NSCHED-I-NORUNREC, No specified run records found for job 32
%NSCHED-I-NORUNREC, No specified run records found for job 31
%NSCHED-I-NORUNREC, No specified run records found for job 52
%NSCHED-I-NORUNREC, No specified run records found for job 35
%NSCHED-I-NORUNREC, No specified run records found for job 34
%NSCHED-I-NORUNREC, No specified run records found for job 55
%NSCHED-I-NORUNREC, No specified run records found for job 53
%NSCHED-I-NORUNREC, No specified run records found for job 27
%NSCHED-I-NORUNREC, No specified run records found for job 42
%NSCHED-I-NORUNREC, No specified run records found for job 62
%NSCHED-I-NORUNREC, No specified run records found for job 63
%NSCHED-I-NORUNREC, No specified run records found for job 43
%NSCHED-I-NORUNREC, No specified run records found for job 44
%NSCHED-I-NORUNREC, No specified run records found for job 38
%NSCHED-I-NORUNREC, No specified run records found for job 58
%NSCHED-I-NORUNREC, No specified run records found for job 40
%NSCHED-I-NORUNREC, No specified run records found for job 60
%NSCHED-I-NORUNREC, No specified run records found for job 54
%NSCHED-I-NORUNREC, No specified run records found for job 37
%NSCHED-I-NORUNREC, No specified run records found for job 39
%NSCHED-I-NORUNREC, No specified run records found for job 57
%NSCHED-I-NORUNREC, No specified run records found for job 59
%NSCHED-I-NORUNREC, No specified run records found for job 46
%NSCHED-I-NORUNREC, No specified run records found for job 65
%NSCHED-I-NORUNREC, No specified run records found for job 28
%NSCHED-I-NORUNREC, No specified run records found for job 45
%NSCHED-I-NORUNREC, No specified run records found for job 64
%NSCHED-I-NORUNREC, No specified run records found for job 36
%NSCHED-I-NORUNREC, No specified run records found for job 56
%NSCHED-I-NORUNREC, No specified run records found for job 29
%NSCHED-I-NORUNREC, No specified run records found for job 4
%NSCHED-I-NORUNREC, No specified run records found for job 5
%NSCHED-I-NORUNREC, No specified run records found for job 30
%NSCHED-I-NORUNREC, No specified run records found for job 49
%NSCHED-I-NORUNREC, No specified run records found for job 48
%NSCHED-I-NORUNREC, No specified run records found for job 51
%NSCHED-I-NORUNREC, No specified run records found for job 50
$
$ ! create a new log if necessary (every 90 DAYS)
$ DAYS_90 = F$CVTIME("-90-00:00:00","COMPARISON",)
$ schedlog_file = f$trnlnm("NSCHED$LOGFILE")
$ if f$search(schedlog_file) .nes. ""
$ then
$ if F$CVTIME(F$file(schedlog_file,"CDT"),"COMPARISON",) .les. DAYS_90
$ endif
$ endif
$
$
All of the above works fine. The procedure then checks whether the job is being run on a Friday and ....

$ SCHED SHO JOB GET_STATS_DATA_WKLY/SYM ! will set sched$type
$ if SCHED$TYPE .eqs. "OFFLINE" then - ! if offline job (fri)
SCHED SHOW HIST/GROUP=* /COMPLETION=ALL/USER=AUTO_OP/OUT=RDTS_LOG:SCHED_HIST_200907102240.LIS
SUB VSS$GET_HISTORY: Error 51 at line 29 ?Integer error or overflow
%SYSTEM-F-HPARITH, high performance arithmetic trap, Imask=00000000, Fmask=00000504, summary=12, PC=0000000000000003, PS=82B29380
-SYSTEM-F-FLTINV, floating invalid operation, PC=0000000000000003, PS=82B29380
-SYSTEM-F-FLTUND, arithmetic trap, floating underflow at PC=0000000000000003, PS=82B29380
$ L_ERROR:
$ My_status = $status
$ if My_status .eq. My_aborted then goto L_ABORTED
$ log "SCHED-E-GET_STATS_DATA_WKLY terminated. Error status : %X00000504"
19 REPLIES 19
Peter Elliott
Occasional Advisor

Re: DECSCHEDULER

Hi,
Not too sure about the actual error message you are receiving... But generally, I believe Scheduler writes History records to a Default logfile of NSCHED$:.log
This also requires the logical NSCHED$DEFAULT_LOG to be set up in the Scheduler startup routine (normally to 5)
I've always found most scheduler issues can be fixed by running NSCHED$:DB_UTILITY and compressing the Database.
(Try the command: Sched Show Delete - to see how many deleted records there are in the DB)
Also - if you haven't already - maybe look to upgrade to the latest version provided by CA: http://supportconnectw.ca.com/public/unijobmgtopenvms/unijobmgtopenvms_supp.asp

Hope this helps...

Peter.
Peter Elliott
Occasional Advisor

Re: DECSCHEDULER

Sorry - the link I sent is probably not the best - The data sheets on the following link are better:

http://www.ca.com/us/products/product.aspx?id=1485#documents
Kevin Raven (UK)
Frequent Advisor

Re: DECSCHEDULER

We rebuild the DB on a regular basis, once every 6 months.
Do the history stats come from VSS.DAT?
We have a record of jobs from 2006 when the history command is run on another system (a mirror of production; production is the system that has the issue).


(LNM$SYSTEM_TABLE)

"NSCHED$" = "NSCHED$DATA"
= "NSCHED$COM"
= "NSCHED$EXE"
"NSCHED$CLEAR_RESTART_PARAM" = "TRUE"
"NSCHED$COM" = "SYS$COMMON:[NSCHED.COM]"
"NSCHED$DATA" = "SYS$COMMON:[NSCHED.DATA]"
"NSCHED$DEFAULT_JOB_MAX" = "10"
"NSCHED$EXE" = "SYS$COMMON:[NSCHED.EXE]"
"NSCHED$LBAL$CPU_WEIGHT" = "0.5"
"NSCHED$LBAL$INTERVAL" = "0 00:00:30"
"NSCHED$LBAL$MEMORY_WEIGHT" = "0.5"
"NSCHED$LOGFILE" = "SYS$COMMON:[NSCHED]DECSCHEDULER.LOG"
"NSCHED$MAILBOX" = "_MBA58:"
"NSCHED$REMOTE_SUPPORT_ENABLED" = "TRUE"
"NSCHED$TERM_MAILBOX" = "_MBA63:"
"NSCHED$UID" = "NSCHED$:SCHEDULER$MOTIF.UID"
"NSCHED_DEFAULT_SD_ACTION" = "SKIP"
"RDTS_SCHED" = "SYS_SYSTEM:[OMN$SCHED]"
"RDTS_SCHEDULER_REPORTS" = "SYS_STATS:[RDTS.REPORTS.SCHEDULER]"
"SCHED$FLAG" = "OMN$FLG"
"SCHED$LBAL_GROUP_HOST" = "RDTSC RDTSD"
"SCHEDULER$NODE" = "RDTSC::"
"SCHED_RTL" = "NSCHED$:SCHED_RTL.EXE"
"SCHED_RTL_TV" = "NSCHED$:SCHED_RTL.EXE"
"SCHED_SHUTDOWN" = "AUTO-RESTART"

(LNM$SYSCLUSTER_TABLE)

(SYSMAN$NODE_TABLE)
$ DIR NSCHED$:*.LOG/SIZE/DATE

Directory SYS$COMMON:[NSCHED.DATA]

RDTSA.LOG;2 3 14-NOV-2006 10:40:01.45
RDTSA_REMOTE_EXECUTOR.LOG;2
1 14-NOV-2006 10:40:02.35
RDTSB.LOG;2 3 14-NOV-2006 10:41:40.16
RDTSB_REMOTE_EXECUTOR.LOG;2
1 14-NOV-2006 10:41:40.18
RDTSC.LOG;35 4 22-NOV-2006 15:29:09.19
RDTSC_REMOTE_EXECUTOR.LOG;34
1 22-NOV-2006 15:29:09.27
RDTSD.LOG;21 3 22-NOV-2006 15:29:10.52
RDTSD_REMOTE_EXECUTOR.LOG;21
1 22-NOV-2006 15:29:10.58
SCHED$ERR.LOG;1 0 10-NOV-2006 11:58:45.84
VERMONT_CREAMERY.LOG;1
288 10-NOV-2006 11:40:47.78

Total of 10 files, 305 blocks.

Directory SYS$COMMON:[NSCHED.EXE]

SCHED$ERR.LOG;1 0 13-NOV-2006 16:10:35.21

Total of 1 file, 0 blocks.

Grand total of 2 directories, 11 files, 305 blocks.
$

The log file gets rolled over every 90 days see below ....
$ DIR SYS$COMMON:[NSCHED]DECSCHEDULER.LOG/SIZE/DATE

Directory SYS$COMMON:[NSCHED]

DECSCHEDULER.LOG;1 9744 4-MAY-2009 21:24:59.96

Total of 1 file, 9744 blocks.
$

What is the VERMONT_CREAMERY.LOG;1 file used for ???? Funny name .....lol



Peter Elliott
Occasional Advisor

Re: DECSCHEDULER

I think (having read up some more) that, unless you specify otherwise, the default log file which contains job history is Vermont_Creamery.log (I believe they were the first company to run Scheduler).
There are also suggestions that on some older versions event history was sometimes written to the Vermont_Creamery.old file instead of the .log

What version of scheduler are you running ?
($Sched show stat)
marsh_1
Honored Contributor

Re: DECSCHEDULER

hi,

Vermont Creamery - that's where they started with it, well that's the story they gave us when we had DECscheduler when Digital still had it.

fwiw

Kevin Raven (UK)
Frequent Advisor

Re: DECSCHEDULER

Looking at our system ......

The logical to assign the scheduler log file is set ...see below and data is being logged into the log file.
This file is rolled over every 90 days ....
The vermont Creamery log file is dormant and not being logged into.

The VSS.DAT file was last rebuilt in October 2007 .....

So when I run the History command, where is it picking the data up from?
The history shows job runs prior to the date stamps on both the log file and the VSS.DAT file.

Version as Below .....

$ sched show stat
Node Version Started Jobs Jmax Log Pri Rating
RDTSC v3.0 15-JUL-2009 14:02:18 0 10 5 4 283 <-- Default
RDTSD v3.0 15-JUL-2009 14:02:19 0 10 5 4 261
$

$ show log *sched*log*

(LNM$PROCESS_TABLE)

(LNM$JOB_80CFEEC0)

(LNM$GROUP_000102)

(LNM$SYSTEM_TABLE)

"NSCHED$LOGFILE" = "SYS$COMMON:[NSCHED]DECSCHEDULER.LOG"

(LNM$SYSCLUSTER_TABLE)

(SYSMAN$NODE_TABLE)
$

Rolled over every 90 days ....

Directory SYS$COMMON:[NSCHED]

DECSCHEDULER.LOG;1 File ID: (5215,9685,0)
Size: 9744/9744 Owner: [SYSTEM]
Created: 4-MAY-2009 21:24:59.96
Revised: 15-JUL-2009 14:02:14.56 (5)
Expires:
Backup:
Effective:
Recording:
Accessed:
Attributes:
Modified:
Linkcount: 1
File organization: Indexed, Prolog: 3, Using 2 keys
Shelved state: Online
Caching attribute: Writethrough
File attributes: Allocation: 9744, Extend: 500, Maximum bucket size: 2
Global buffer count: 0, No version limit
Record format: Variable length, maximum 156 bytes, longest 0 bytes
Record attributes: Carriage return carriage control
RMS attributes: None
Journaling enabled: None
File protection: System:RWED, Owner:RWED, Group:RE, World:
Access Cntrl List: None
Client attributes: None

Total of 1 file, 9744/9744 blocks.
$

Vermont log file ....not an active log file because of the logical being defined....
Directory SYS$COMMON:[NSCHED.DATA]

VERMONT_CREAMERY.LOG;1 File ID: (1181,2,0)
Size: 288/288 Owner: [SYSTEM]
Created: 10-NOV-2006 11:40:47.78
Revised: 14-NOV-2006 15:28:00.69 (4)
Expires:
Backup:
Effective:
Recording:
Accessed:
Attributes:
Modified:
Linkcount: 1
File organization: Indexed, Prolog: 3, Using 2 keys
Shelved state: Online
Caching attribute: Writethrough
File attributes: Allocation: 288, Extend: 500, Maximum bucket size: 2
Global buffer count: 0, No version limit
Record format: Variable length, maximum 156 bytes, longest 0 bytes
Record attributes: Carriage return carriage control
RMS attributes: None
Journaling enabled: None
File protection: System:RWED, Owner:RWED, Group:RE, World:
Access Cntrl List: (IDENTIFIER=*,ACCESS=READ+WRITE)
Client attributes: None

Total of 1 file, 288/288 blocks.
$


Date stamps on VSS.DAT ---- Scheduler database ?


$ dir SYS$COMMON:[NSCHED...]vss.dat/full

Directory SYS$COMMON:[NSCHED.DATA]

VSS.DAT;1 File ID: (4794,8,0)
Size: 2208/2208 Owner: [SYSTEM]
Created: 31-OCT-2007 11:57:54.00
Revised: 15-JUL-2009 15:10:34.71 (7696)
Expires:
Backup:
Effective:
Recording:
Accessed:
Attributes:
Modified:
Linkcount: 1
File organization: Indexed, Prolog: 3, Using 6 keys
Shelved state: Online
Caching attribute: Writethrough
File attributes: Allocation: 2208, Extend: 300, Maximum bucket size: 3
Global buffer count: 0, No version limit
Record format: Fixed length 1024 byte records
Record attributes: Carriage return carriage control
RMS attributes: None
Journaling enabled: None
File protection: System:RWED, Owner:RWED, Group:, World:
Access Cntrl List: (IDENTIFIER=*,ACCESS=READ+WRITE)
Client attributes: None

Total of 1 file, 2208/2208 blocks.
$


Output from History command .....(This is all on a system where it works) ...mirror of production ....
$ SCHED SHOW HIST/GROUP=* /COMPLETION=ALL/USER=AUTO_OP/out=a.a
$ type a.a/page

Current Minimum Average Maximum Scale
Elapsed Time : 0.00 0.22 7.96 15.81 Seconds
CPU Time : 0.00 0.10 0.39 0.61 Seconds
Page Faults : 0 1906 4715 5798
Pk Wk Set Size : 0 8192 8224 8496
Buffered I/O : 0 446 1007 1220
Direct I/O : 0 240 839 1132
Mounted Vols : 0 0 0 0
--------------------------------------------------------------------------------
Job : 21 - BACKUP_DB_EDUMP_WKLY Owner : AUTO_OP
Count : 167 SUCCESS and FAILURE Records Combined Server : Local
Earliest Login Time : 11-NOV-2006 17:57:17.25
Last Completion Time : 10-JUL-2009 22:30:37.77
Current Login Time : Job not running

Current Minimum Average Maximum Scale
Elapsed Time : 0.00 0.01 57.21 6610.08 Minutes
CPU Time : 0.00 0.00 1.55 7.28 Seconds
Page Faults : 0 0 3230 3515
Pk Wk Set Size : 0 0 42854 47792
Buffered I/O : 0 0 3072 14952
Direct I/O : 0 0 12518 70813
Mounted Vols : 0 0 0 0



The History command has data on jobs that first ran in 2006, yet VSS.DAT and DECSCHEDULER.LOG both postdate this.
So where is the History command getting its data from?

Wim Van den Wyngaert
Honored Contributor

Re: DECSCHEDULER

Is it possible that it is exactly what it says ? No run record found = the job was aborted before it started ?

Wim without VMS
Wim
Thomas Ritter
Respected Contributor

Re: DECSCHEDULER

Raven,
we use nsched$:VSS_REPORTS.EXE to produce reports.
Kevin Raven (UK)
Frequent Advisor

Re: DECSCHEDULER

I have been digging and found the following ....

The history stats do indeed come from the DECSCHEDULER log file.
Because we use the /summ switch when we roll the log file over, it passes the run history from the old log into the new log.
Thus even with a log file less than 90 days old, it still knows about job history back in 2006.

The command to produce the history stats worked on the 3rd of July.
It now falls over when run a week later on the 10th of July.

It bombs out just prior to reporting on a job that has run over 260,000 times since 2006 (?) ....
So I think we might have hit a magic number between the history command working on the 3rd and not on the 10th:
a run count greater than 262143, which is 2^18 - 1, or 111111111111111111 in binary (eighteen 1 bits).

So I have written a test script to run via scheduler every 1/2 a second ...script is below...
$!
$exit
$!

lol ....

So now waiting to see what happens on a server where the history command works ...when this sched job has run more than 262143 times.
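
As a quick sanity check on the numbers in this post, here is a small sketch in Python (not DCL, purely so the arithmetic can be run anywhere). It only confirms that 262143 is the largest value that fits in 18 bits and estimates how long the half-second test job would need to pass that count. It assumes the job really does complete one run every 0.5 seconds with no gaps, which Scheduler may not achieve in practice, and the 18-bit limit itself is only the hypothesis above, not documented Scheduler behaviour. The function names are made up for illustration.

```python
# Illustrative arithmetic only; nothing here talks to Scheduler.
SUSPECTED_LIMIT = 262143      # run count quoted in the post (hypothesis)
RUN_INTERVAL_SECONDS = 0.5    # test job interval quoted in the post (assumed exact)


def bits_needed(value: int) -> int:
    """Number of binary digits needed to represent a non-negative integer."""
    return value.bit_length()


def is_all_ones(value: int) -> bool:
    """True when value has the form 2**n - 1, i.e. every bit is set."""
    return value > 0 and (value & (value + 1)) == 0


def hours_to_exceed(limit: int, interval_seconds: float) -> float:
    """Hours until run number limit + 1 completes, at one run per interval."""
    return (limit + 1) * interval_seconds / 3600


if __name__ == "__main__":
    print(bin(SUSPECTED_LIMIT))
    print(bits_needed(SUSPECTED_LIMIT), is_all_ones(SUSPECTED_LIMIT))
    print(round(hours_to_exceed(SUSPECTED_LIMIT, RUN_INTERVAL_SECONDS), 1))
```

Under those assumptions the test job would need roughly a day and a half of uninterrupted running to pass 262143 runs, so the result on the working system should not take long to show up.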