Operating System - OpenVMS

how to optimize queue parameters

 
J Ancheta
Advisor

how to optimize queue parameters

Greetings. We replaced our AS1200 with another one that has two CPUs, 2GB memory, and RAID drives. The prior setup was 1 CPU, 768MB memory, and SCSI drives. We also upgraded from OpenVMS 7.1 (I think) to 7.3-2.

We run a maintenance program that submits a lot of jobs to one queue. (Each job contains a script with CONV and COPY commands for one file.) We expected that the maint program would run much faster; however, it appears that we only shave off 10 percent compared to before.

Queue:
$ INIT/QUE/BATCH/START/JOB_LIMIT=4 -
/BASE_PRIORITY=4 -
/WSDEFAULT=1024 -
/WSQUOTA=8192 -
/WSEXTENT=40000 -
/CPUMAX=INFINITE MQUE

My question is, can this queue be optimized? I noticed that these parameters are the same as on our old machine, but perhaps they can be tweaked now that we have a better equipped machine. I've tried to research this on my own; however, I'm not familiar with much of the OpenVMS terminology.

I've included an attachment that provides info on our system (sh work, sh cpu, sh mem)

TIA.
Jan van den Ende
Honored Contributor
Solution

Re: how to optimize queue parameters

J.

Firstly, does this system also do interactive work? In that case, I would lower the queue base priority (to 3 or possibly 2) to achieve better EXPERIENCED performance.

Second, I do not see any system utilisation figures, but if "a lot" of jobs means enough that jobs sit waiting in the queue, then JOB_LIMIT might be enlarged.

But, most importantly, WHAT are the (SYSUAF) params for the account that runs the jobs?

If you drive a Ferrari, but the driver only uses 1st gear, speed records will be rare!

Please post the output of

$ MCR AUTHORIZE SHOW <username> /FULL (for the account that runs the jobs), and perhaps we can get somewhat further.

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.
Hein van den Heuvel
Honored Contributor

Re: how to optimize queue parameters

The attachment mentions 20% cpu from monitor topcpu.
That is wide open for interpretation, but let's assume that was for the 4 jobs during the maintenance window.
Well, then 1 cpu can handle all the work, but serializes some, so only a modest improvement can be expected.
So one potential way to speed this up is to set the JOB_LIMIT to 8 or 10, allowing more jobs to run.
Check $ HELP SET QUE /JOB

I suspect though that you will soon run into a CPU

How many jobs are there to run through ? 'a lot' does not mean too much.

To get a better overview of where time is spent, you want to see resource consumption, either from accounting (check SHOW ACCOUNTING) or, to start with, from a simple search through all batch job logs for a day:
$ SEAR/SINCE *.LOG terminated,elapsed

Attach a text file in a reply if need be.
It should show you which jobs took long and give a first indication why (cpu or not).

Maybe one long job 'hides' the speedup in the rest?

Now about the jobs.... can those not be tweaked to become faster? With new disks, maybe the IO load can be spread wider (CONVWORK, SORTWORK, 'temp'...)?

You mention files are converted and also copied? Why ?
Can the convert not put the file in the right place? ( logical names for directories/devices? )

If you have many similar jobs, then can you show the core commands for one as example?

During the batch jobs, check out MONI MODE and MONI DISK.
Better still, try to get T4 going (google: openvms t4)

Good luck!
Hein van den Heuvel
HvdH Performance Consulting
Hein van den Heuvel
Honored Contributor

Re: how to optimize queue parameters


Somehow I posted>>> I suspect though that you will soon run into a CPU

Meant to write:

"I suspect though that you will soon run into an IO bottleneck instead."

Hein
J Ancheta
Advisor

Re: how to optimize queue parameters

Jan, thanks for replying. My replies:
Firstly... Yes. I terminate user sessions and jobs running in our main queue (sys$batch), and part of the maint program shuts down other queues.
Second... the maint program throws in about 200 jobs. I've noticed that some jobs complete within a couple of minutes, while one particular job took 3 hours.
For example:
Prior to Maintenance
--- CILINE.DAT;1 3427270/3427270 29-APR-2010 03:52:52.53
After Maintenance
--- CILINE.DAT;1 2950045/3097570 27-AUG-2010 07:37:52.23

... which comes to mind, what if I seclude these jobs that process large files into a separate queue with similar setup as MQUE?

I've attached the sysuaf parameters. I run it under user JOE but I can run also under another user REOTRC but I don't know the difference between the two.
J Ancheta
Advisor

Re: how to optimize queue parameters

Hein, thanks for replying. My replies:
I was looking at the graph on $ MONITOR process/topcpu. I've also attached graphs. The BATCH_xxxx are the jobs that are running the maint.

...How many jobs? - about 200 or more.

I typed in $ show accounting
Accounting is not currently enabled.

... IO load spread wider - I don't know what that means. Create another queue to run the other half of jobs?

... core commands - I don't know what you mean by core commands.
abrsvc
Respected Contributor

Re: how to optimize queue parameters

Based upon the limited information here, I'd suspect an I/O bound series of jobs. It looks like there are only 2 drives involved. A careful look at the individual jobs and what they are doing is required here. Perhaps a visit by an OpenVMS expert is in order? While we can suggest things to hopefully improve overall performance, a close examination and understanding of the tasks at hand will be required to provide the best suggestions.

At a minimum:

1) Hardware configuration (CPU/disk/etc)
2) Specifics of file size, location, type.
3) An idea of what these batch jobs do.
4) Are the "other" disks usable as temporary file areas?

Dan
J Ancheta
Advisor

Re: how to optimize queue parameters

Dan, thanks for your reply. A script goes through various directories on the VMS system and passes parameters (I think the file name) to another program, which creates the jobs in the queue. (I've attached REORG_ONE.TXT.)
Hein van den Heuvel
Honored Contributor

Re: how to optimize queue parameters

From the UAF records the PGFLQUO is lowish, which may tell the sort, which is part of convert, to not use all (memory) resources available. Set to a million, allowing 500MB memory work space?
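As a rough check of the figure above, assuming PGFLQUOTA is counted in 512-byte pagelets (as it is on Alpha), a quota of one million comes out a little under 500 MB. The helper name is made up for illustration:

```python
PAGELET_BYTES = 512  # assumption: PGFLQUOTA is expressed in 512-byte pagelets


def pagelets_to_mib(pagelets: int) -> float:
    """Convert a pagelet count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return pagelets * PAGELET_BYTES / (1024 * 1024)


if __name__ == "__main__":
    # One million pagelets, the value suggested above.
    print(f"{pagelets_to_mib(1_000_000):.0f} MiB")
```

The quota is only a ceiling on virtual memory for the process; whether sort actually uses that much depends on the working set settings and the file being converted.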

>> Accounting is not currently enabled.

So folks are not really interested in performance, they just want it to become better miraculously. (throw HW money at the problem)
Get that going! No excuse not to. Minimal overhead, lots of data.

>> IO load spread wider - I don't know what that means

For example, the provided example shows:
"$ ASSIGN DKC400:[WORK] SYS$SCRATCH"
Is that the same for all?
Are there other disks?
Can each job 'rotate' through a list, perhaps based on target file, or even the lowest digit of the job number or something else?

Start using TEMP1 !
Start using the system disk more!
Start using those DR drives more, define an extra (RAID-0) one for more scratch?
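To make the "rotate through a list" idea concrete, here is a minimal sketch in Python rather than DCL, purely to show the selection logic. The job numbers are hypothetical, and DKC200:[WORK] is assumed to exist with enough free space; only DKC400:[WORK] appears in the posted procedure.

```python
def pick_scratch(job_number: int, scratch_dirs: list[str]) -> str:
    """Pick a scratch directory from the lowest decimal digit of the job number."""
    lowest_digit = job_number % 10
    return scratch_dirs[lowest_digit % len(scratch_dirs)]


if __name__ == "__main__":
    # Assumed scratch locations; adjust to the disks that really have room.
    dirs = ["DKC400:[WORK]", "DKC200:[WORK]"]
    for job in (1234, 1235):  # hypothetical job numbers
        print(f"job {job}: $ ASSIGN {pick_scratch(job, dirs)} SYS$SCRATCH")
```

With two directories this simply alternates on even and odd job numbers; a longer list spreads the scratch I/O further, as long as every listed disk has the free space.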


Also check out CONVWORK to a drive different from SORTWORK/SYS$SCRATCH

>> core commands - I don't know what you mean by core commands.

Those that do the real / time-consuming work.
Hint: check out: $ HELP SET PREFIX EXAMPLE

Looks like the core commands were:

$CONV/NOSORT/FAST/FDL=FDL:'P2'.FDL/EXCE='P1''P2'.EXC -
/SECOND='YY' 'P1''P2'.DAT 'TDEV''P2'.NEW
$ COPY/CONT/ALLO='ALLOC' 'TDEV''P2'.NEW 'P1''P2'.DAT

That COPY seems costly, avoidable and scary.
Scary because you have half a file for a while. I prefer RENAME to bring a file in production.
Costly because there are 3 copies of the data: OLD, TMP, DAT
Avoidable by having properly maintained FDLs getting the 'right' allocations/attributes.

Note 1... where is the /STAT[=FULL] on those converts. Use that free data!

Note 2... Blindly using /SEC=NOK can cause performance issues due to excessive CONVWORK sizes.

>> .. which comes to mind, what if I seclude these jobs that process large files into a separate queue with similar setup as MQUE?

You want to order the jobs by descending anticipated run time, which is likely related to the file size. Get the largest one going first; the little ones will 'fill the holes' and self-balance towards the end.
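The effect of submitting the longest job first can be seen in a toy simulation. This is a sketch under simplifying assumptions: the durations are made up (twenty 10-minute jobs plus one 3-hour job), the queue runs 4 jobs at a time, and jobs do not slow each other down, which real I/O-bound jobs will.

```python
import heapq


def makespan(durations, job_limit):
    """Elapsed time when jobs start in list order on `job_limit` execution slots."""
    slots = [0] * job_limit  # time at which each slot becomes free
    heapq.heapify(slots)
    for duration in durations:
        start = heapq.heappop(slots)  # next job takes the earliest free slot
        heapq.heappush(slots, start + duration)
    return max(slots)


if __name__ == "__main__":
    jobs = [10] * 20 + [180]  # minutes; the 3-hour job happens to be submitted last
    print("as submitted :", makespan(jobs, 4))
    print("longest first:", makespan(sorted(jobs, reverse=True), 4))
```

With the long job last, it only starts after the short ones have drained, so the window stretches to 230 minutes; started first, it overlaps with everything else and the window is the 180 minutes of that one job.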

Regards,
Hein van den Heuvel ( at gmail )
HvdH Performance Consulting
Jan van den Ende
Honored Contributor

Re: how to optimize queue parameters

J.

The first thing I would try is to check for [work] on dkc200, and create it if it does not exist.

Next in REORG_ONE.TXT change
$ ASSIGN DKC400:[WORK] SYS$SCRATCH
into
$ ASSIGN DKC200:[WORK] SYS$SCRATCH

(note: please do this ONLY if DKC200 has enough free space! )

Now check how much of the IO to DKC400 gets split over the two devices.

And I already asked: Please publish
MC AUTHORIZE SHOW JOE /FULL

I have the impression that you have the equivalent of an entire football field at your jobs' disposal, but restrict the user to never move more than 1 meter from his start position.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
J Ancheta
Advisor

Re: how to optimize queue parameters

Jan,

Thanks again for your reply. I posted it in my first reply on this thread - unless you're asking me to run a different command.

If I understand correctly, you're suggesting to split the workload, for example: 100 jobs process in DKC400 and the other 100 jobs process in DKC200, right?
J Ancheta
Advisor

Re: how to optimize queue parameters

Hein,

Thanks for your reply again. A lot for me to digest and absorb.

Off the bat, you mentioned increasing a UAF quota, which will make more memory available. Can you please give me a hint on how to do that?

Jan van den Ende
Honored Contributor

Re: how to optimize queue parameters

J.

>>>
Thanks again for your reply. I posted it in my first reply on this thread - unless you're asking me to run a different command.
<<<

Yeah, my bad. I overlooked your first attachment.

And somehow Hein's and my reply intertwined.

As is to be expected, his answer is much more detailed, especially on RMS related stuff.

I meant more or less the same, but his wording is much better and more detailed.

DO try to digest it, and ask detailed questions on anything unclear!

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
John Gillings
Honored Contributor

Re: how to optimize queue parameters

J,
Shave off 10% is a pretty good result!

>can this queue be optimized?

The word "optimized" should never appear in any question without being immediately followed by "for ". There is no such thing as generic optimization; you have to optimize for something, for example, reduced elapsed time, reduced CPU time, fastest I/O, minimum memory, etc... By definition, improving one metric will cost some other metric. All you can do is move the bottleneck around.

I'll assume the metric you're interested in is ELAPSED TIME.

Queues don't really have much impact on performance, so they're not really a good place to perform tuning, with the SOLE exception of the JOB_LIMIT. This will control how many jobs can run in parallel.

CONVERT and COPY tend to be I/O bound, so it's possible you can run several in parallel (which you can control with JOB_LIMIT), however, if you have multiple I/O streams from or to the same disks, they may get in each other's way and REDUCE performance over single stream. You can also end up fragmenting files unnecessarily.

Without knowing the hardware configuration, or the sources and destinations of your files, it's not possible to give specific advice, but if you think about it as if it's a road network, you may be able to determine a good sequence of jobs. For example, suppose you have 4 disks, A, B, C, D and you have 4 jobs, two of which are A->B and two C->D. If you run them two at a time you could run the two A->B in parallel, then the C->Ds in parallel. This would maximise contention and probably perform poorly. If you go the other way and run A->B against C->D they don't get in each other's way, so perform better.

(This is simplistic, as it doesn't consider controllers, busses and other I/O infrastructure, but hopefully you get the idea)
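The four-disk example can be counted out in a few lines. This is only a toy model: a job is a (source, destination) pair of disk names, and "contention" is taken to be the number of disks shared by jobs running in the same round. The function names are invented for the illustration.

```python
from itertools import combinations


def shared_disks(job_a, job_b):
    """Number of disks that two jobs have in common."""
    return len(set(job_a) & set(job_b))


def contention(rounds):
    """Total number of disks shared between jobs that run in the same round."""
    return sum(
        shared_disks(a, b)
        for jobs_in_round in rounds
        for a, b in combinations(jobs_in_round, 2)
    )


if __name__ == "__main__":
    ab, cd = ("A", "B"), ("C", "D")
    same_route = [[ab, ab], [cd, cd]]   # both A->B together, then both C->D
    mixed_route = [[ab, cd], [ab, cd]]  # one A->B alongside one C->D, twice
    print("same route :", contention(same_route))
    print("mixed route:", contention(mixed_route))
```

Running the two A->B jobs together has them share both disks in each round, while pairing A->B with C->D shares none, which is the point of the road network picture.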

The key to performance for this type of job is the exact sequence of operations, to prevent contention. You can't really do that if you just generate a bunch of jobs and dump them on a single queue.

What you may want to consider is to analyze your environment to determine which operations can be done in parallel. Create a separate queue for each "contention group" and give it /JOB_LIMIT=1.

Now divide up your jobs and submit each to the appropriate queue. Now you can run as many jobs as possible in parallel (one in each queue), while minimising contention, and thereby minimising total elapsed time.
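One possible way to work out the "contention groups" is sketched below, assuming each job can be described by a name plus the source and destination disk it touches. Jobs that share a disk, directly or through a chain of other jobs, land in the same group; each group would then get its own queue with /JOB_LIMIT=1. The job names and the grouping rule are assumptions for illustration, not something taken from the actual maintenance procedure.

```python
def contention_groups(jobs):
    """Group jobs that touch a common disk, directly or through other jobs."""
    parent = {}

    def find(disk):
        # Union-find lookup with path halving.
        parent.setdefault(disk, disk)
        while parent[disk] != disk:
            parent[disk] = parent[parent[disk]]
            disk = parent[disk]
        return disk

    # Merge the two disks of every job into one set.
    for _name, src, dst in jobs:
        parent[find(src)] = find(dst)

    # Collect job names per disk set, keeping submission order.
    groups = {}
    for name, src, _dst in jobs:
        groups.setdefault(find(src), []).append(name)
    return list(groups.values())


if __name__ == "__main__":
    jobs = [("J1", "A", "B"), ("J2", "A", "B"), ("J3", "C", "D"), ("J4", "C", "D")]
    for number, group in enumerate(contention_groups(jobs), start=1):
        # One queue per group, jobs within a group run one at a time.
        print(f"queue {number} (JOB_LIMIT=1): {group}")
```

If every job ends up in a single group, for example because all of them use the same scratch disk, that is a sign the scratch and work areas need spreading out before separate queues can help.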
A crucible of informative mistakes
J Ancheta
Advisor

Re: how to optimize queue parameters

Hein, Jan, and John,

Thank you all for posting and for your guidance. I now understand how our maintenance program works, I see the loopholes you've indicated, and I have been actively researching the terminology.

I created new scripts using the old scripts as a basis, applied some of your suggestions, and am letting it rip at the moment. I know this is quite a bold attempt, but I was extremely careful not to modify the core commands.

I apologize, as I realize that I mislabelled this post, but I had no clue what was going on, what to look for, or where to start (newbie).

I will return to post how much time was saved.

Cheers.
Hein van den Heuvel
Honored Contributor

Re: how to optimize queue parameters

Sounds like you are doing great!
We are proud of you!

:-)

Seriously, it sounds like you are picking up terminology and understanding quite fast.

Your next step, if needed, is to review the FDL files used.
Specifically, you may want to do a monthly/yearly ANALYZE/RMS/FDL and EDIT/FDL/NOINTERACTIVE cycle for those files which are not 'hand tuned' already.

Hint: do the ANAL/RMS outside the maintenance window, possibly after all converts are done, on the OLD files.

Cheers,
Hein
J Ancheta
Advisor

Re: how to optimize queue parameters

The new maintenance procedure that I ran took approximately 2 1/2 hours; it used to take 5 hours!
abrsvc
Respected Contributor

Re: how to optimize queue parameters

Cutting the elapsed time in half is a wonderful start. With the new procedures, take a look at paging and disk activity. As John said earlier, making sure that the disks are not fighting each other is important. Check the IO stats for activity as well.

Careful use of the DCL SYNC can help to coordinate jobs as well.

One side note: I'm not sure if contiguous files are as important these days as in the past. You may want to test whether or not that requirement is necessary. Depending upon how the individual "bits" of the file are accessed, being contiguous may not be buying you much in performance.

Dan