Operating System - OpenVMS

Jobs stay in status starting

 
Wim Van den Wyngaert
Honored Contributor

Jobs stay in status starting

This morning I found a 7.3 cluster with all jobs in starting status.

I tried to restart the queue manager but the command hung.

I did a stop/id of the queue manager. It restarted but the jobs stayed in starting.

Nothing in system log.

I rebooted the queue manager node.

Problem solved.

Found that at the moment that the first job stayed in starting, a new accounting file was created.

Tried to simulate it by submitting jobs and doing set acc/new in a loop. After 3000 new accounting files no problems arose.
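
A minimal sketch of such a test loop, for anyone who wants to repeat the attempt. The command procedure TEST_JOB.COM and the queue SYS$BATCH are assumed names, not taken from the system in question, and SET ACCOUNTING needs OPER privilege:

```dcl
$! Sketch only: TEST_JOB.COM and SYS$BATCH are hypothetical names
$ count = 0
$LOOP:
$ SUBMIT/NOPRINTER/QUEUE=SYS$BATCH TEST_JOB.COM   ! queue one batch job
$ SET ACCOUNTING/NEW_FILE                         ! force a new accounting file
$ count = count + 1
$ IF count .LT. 3000 THEN GOTO LOOP
$ WRITE SYS$OUTPUT "Created ''count' new accounting files"
```

Afterwards, SHOW QUEUE/ALL on the batch queue shows whether any job is left in starting.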

Has anyone seen something like this before?

Has anyone seen a new accounting file being created without doing "set acc/new"?

Wim
Wim
John Abbott_2
Esteemed Contributor

Re: Jobs stay in status starting

Not seen it.
How badly fragmented was your original acc file? Maybe this had something to do with it, although I wouldn't have expected a new acc file to be created automatically after an extend error (one possibility).

Kind Regards
John.
Don't do what Donny Dont does
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

13770 fragments. I have other accounting files with 14094 and 14202 fragments.

Wim
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Uhm, skip that.

It's 346 fragments. But I have files with 390 fragments.

Wim
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Found something extra.

Accounting continued in the original accounting file, and the new one, created yesterday, only started to be used today at boot time.
But the original file contains no batch jobs after the time the new file was created.

Wim
Wim
John Abbott_2
Esteemed Contributor

Re: Jobs stay in status starting

If you copy away (save) the original acc file and then try to write several records to the end of the original file, does it work or do you get an error? Just a thought.

PS. We create a new acc file every month as the acc file was the most fragmented file on our system, so I'm unlikely to see your problem.
Don't do what Donny Dont does
Marc Van den Broeck
Trusted Contributor

Re: Jobs stay in status starting

Wim,

is it possible that someone just created a new acc file with create or with an editor instead of using set acc/new?

Rgds
Marc
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

The blocks allocated to the original files are not yet all used. So yes, I can write to it. Our disks are defragmented weekly.

Marc : everything is possible. I tried to create the file by hand and it gave no side effects on accounting or qman.

Wim
(suffering from very slow internet)
Wim
Ian Miller.
Honored Contributor

Re: Jobs stay in status starting

Could people log in at that time (before you rebooted)?
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Yes Ian. All processes could continue running and could be created. Except batch ones. And the queue manager was running without problems too.

It gives the impression that the queue manager (or job_control?) was waiting, forever, for the new accounting file to become operational.

Wim
Wim
Jan van den Ende
Honored Contributor

Re: Jobs stay in status starting

Wim,


All processes could continue running and could be created. Except batch ones.


How about PRINT jobs?
...trying to nail down or discard the queue manager...

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Jan

No print jobs in starting. But I will check Friday in accounting if any were executed during the night.

Wim
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

No print jobs. But there were no print jobs yesterday either. So, no conclusion.

Wim
Wim
Jean-François Piéronne
Trusted Contributor

Re: Jobs stay in status starting

Jan,

maybe not related, but

may I suggest using SDA to take a look at the
QUEUE_MANAGER process,
for example:
show process, show lock, show process/channel
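
A minimal sketch of such an SDA session on the running system. For a saved dump, ANALYZE/CRASH_DUMP with the dump file specification would replace the first command; the dump file location is site-specific and not assumed here:

```dcl
$ ANALYZE/SYSTEM                  ! needs CMKRNL privilege on a live system
SDA> SET PROCESS QUEUE_MANAGER    ! make the queue manager the current process
SDA> SHOW PROCESS                 ! state, quotas, wait status
SDA> SHOW PROCESS/LOCKS           ! locks owned or waited for
SDA> SHOW PROCESS/CHANNEL         ! open channels and files
SDA> EXIT
```

The same sequence with SET PROCESS JOB_CONTROL gives the view of the job controller.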

What is the size of SYS$QUEUE_MANAGER.QMAN$JOURNAL?
If the file is very large you can try
$ mcr jbc$command
JBC$COMMAND> diag 7


Jean-François
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

JF,

Quotas are monitored. No problems. Channels idem. And I did a diag 7. Nothing changed.

Wim
Wim
Steve Schultz_1
Advisor

Re: Jobs stay in status starting

I have seen the same situation, with batch jobs stuck in starting status. The problem was caused by someone who left their session in authorize and locked the sysuaf file. It only affected batch jobs; users were still able to log in. Once the authorize session was ended, the batch jobs executed. HP says this cannot happen, but it has happened a few times now. When I try to recreate the problem, though, I cannot make it happen.
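
One way to check for this while jobs are stuck is to list the files open on the disk holding SYSUAF.DAT. A minimal sketch, assuming the file is on the system disk (a site may have relocated it with the SYSUAF logical name) and that the account has enough privilege to see other processes' files:

```dcl
$! Sketch only: assumes SYSUAF.DAT is on SYS$SYSDEVICE
$ SHOW LOGICAL SYSUAF             ! reports whether the file has been relocated
$ PIPE SHOW DEVICE/FILES SYS$SYSDEVICE: | SEARCH SYS$INPUT SYSUAF
```

Any process listed against SYSUAF.DAT for a long time is a candidate for a forgotten authorize session.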

Steve
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

I still have the crash dump. Will check if anyone was in sysuaf.

Wim
Wim
Ian Miller.
Honored Contributor

Re: Jobs stay in status starting

In the crash dump see what the Q manager was doing.
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Steve : no process was using authorize.exe. But it was a good idea.

Ian : I even restarted the queue manager (with stop/id). What do you mean by "was doing"?

Wim
Wim
Jim_McKinney
Honored Contributor

Re: Jobs stay in status starting

In addition to the QUEUE_MANAGER process, the JOB_CONTROL process requires access to the ACCOUNTNG file. You might take a look at your dump and see if it was hung. I've had some strange experiences with batch jobs where I've flipped the noacnt bit in the process header with regard to the JOB_CONTROL process. I've found that when this occurs, the process can never exit until that bit is once again set. Similar to your observed behavior in a way (can't start vs can't stop). (In case you wonder why I might flip this bit: to shorten a long story, it was to inhibit image accounting on one particular process on a system where image accounting must remain active.)
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Job_control had no channel open to the accounting file. (already checked that but checked again).

Wim
Wim
Volker Halle
Honored Contributor

Re: Jobs stay in status starting

Wim,

only JOB_CONTROL writes to ACCOUNTNG.DAT, no other process.

There is a TIMA article, which explains some of the communication between QUEUE_MANAGER and JOB_CONTROL:

http://h18000.www1.hp.com/support/asktima/operating_systems/CTI_SRC930602002229.html

The batch processes are to be created by JOB_CONTROL. As are interactive processes, but the notification mechanisms are different (IPC or mailbox msg).

Have any of the BATCH processes already been created? LOGINOUT would then report back to the QUEUE_MANAGER and the job would not be in 'starting' anymore.

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: Jobs stay in status starting

Volker,

The processes were not created. And multiple jobs, started over a time range of about 18 hours, were all in starting.

Wim
Wim
Volker Halle
Honored Contributor

Re: Jobs stay in status starting

Wim,

this implies that the key problem must be between QUEUE_MANAGER and JOB_CONTROL, most likely in JOB_CONTROL itself, as that process is the one involved in the stopping/starting of ACCOUNTNG.

One would need to dig through the sources to find some kind of internal work-queue(s) for requests to JOB_CONTROL and then verify this in the forced crash.

Volker.