Re: jobs sitting in queues

 
SOLVED
Lucinda_1
Frequent Advisor

jobs sitting in queues

OpenVMS 7.3-2 three-node cluster: two ES45 production nodes and one DS25 development node. All patches are current.
Jobs, both print and batch, will sit in the queue and do nothing. The log files indicate nothing; the job simply stops and stays there. There is no adverse I/O on the system. The job will sit there until it is killed. Run the job again and all is fine. I cannot pin down any one constant, i.e. a user, a particular job, or even a particular application area. It seems to have something to do with the queues, the file system, or something else that is involved "overall". But again, there are no errors to help lead to a solution.
Ian McKerracher_1
Trusted Contributor

Re: jobs sitting in queues

Hello Lucinda,

I would be tempted to check again that your patches are current as there has been a similar problem in the past and an ECO was released.


Regards,

Ian

Daniel Fernandez Illan
Trusted Contributor

Re: jobs sitting in queues

Hi Lucinda.
Have you recently changed any node names in your environment?
If so, you should check the queue definitions (the /ON qualifier).
Regards.
Daniel.
Volker Halle
Honored Contributor

Re: jobs sitting in queues

Lucinda,

When this problem happens, what is the status of the jobs, and what is the status of the queues they are in?

Are there any errors in OPERATOR.LOG?

Volker.
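
For reference, here is a sketch of one way to capture that state while a job is stuck. It assumes the operator log is in its default location, SYS$MANAGER:OPERATOR.LOG, and that queue manager and job controller messages carry the %QMAN- and %JBC- prefixes; adjust both if your site differs.

```dcl
$! Show every queue with full attributes and all users' jobs
$ SHOW QUEUE/FULL/ALL_JOBS
$! Narrow the listing to batch queues and to executing or pending jobs
$ SHOW QUEUE/BATCH/ALL_JOBS/BY_JOB_STATUS=(EXECUTING,PENDING)
$! Look for queue manager or job controller messages in the operator log
$ SEARCH SYS$MANAGER:OPERATOR.LOG "%QMAN-","%JBC-"
```

Seeing all other users' jobs with /ALL_JOBS may require operator privilege on the account issuing the command.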
Lucinda_1
Frequent Advisor

Re: jobs sitting in queues

I am sorting through the few patches that are not installed and am not finding any that look relevant.
There have been no modifications to node names, and there are no errors in any log file. There is one job log, and a second log is kept in the user's directory; no errors were found in either. I have submitted this problem to Compaq as well as to the company that provides support for our file system, with no help so far.
Arch_Muthiah
Honored Contributor

Re: jobs sitting in queues

Hi Perry,

How is your printer configured?

Is it configured as an LPD printer or a Telnet printer?

In any case, it would help if you sent the output of:

$ SHOW QUEUE/MANAGER ---> confirms that your queue manager is running

$ SHOW QUEUE ---> lists the submitted print jobs and the status of each, and shows which node your TCPIP$LPD queue is running on.

From the output of the above commands you may be able to figure out the problem yourself; otherwise, send it to us.

Otherwise, I would suggest trying the following.

Reconfigure your LPD printer.
Before you configure the TCPIP$LPD printer, your queue manager should be started and running:

$ start/queue/manager ---> only if your queue manager is not already running

Reconfigure the LPD printer using the sys$manager:tcpip$config.com procedure.
As you know, LPD printers use port 515.

Now submit a print job.
If $ SHOW QUEUE still shows the job stuck in a pending state, please send the output of

$ SHOW QUEUE

Alternatively (although I would not recommend a Telnet printer over an LPD-configured printer),
if you would like to test whether you can communicate with your printer using TELNET,
try the following:

$ init/queue/start/processor=TCPIP$TELNETSYM -
      /on="IP-address-of-your-printer:port#" -
      symbol_for_your_printer

This will create and initialize a new Telnet printer queue.

Now test the connection to that printer using
$ TELNET IP_Address_of_your_printer port#

Note: your system administrator will know the port number assigned to this printer; for HP-branded printers it is usually 9100.


Archunan

Regards
Archie
Lucinda_1
Frequent Advisor

Re: jobs sitting in queues

This is not just a printer problem; it happens in batch queues as well. I have LPD, LAT, and TCPIP$TELNETSYM printers.
Ian Miller.
Honored Contributor

Re: jobs sitting in queues

The output of the SHOW commands suggested earlier would be interesting. Also, is the queue journal file large?
____________________
Purely Personal Opinion
Lucinda_1
Frequent Advisor

Re: jobs sitting in queues

Queue journal file?
Lucinda_1
Frequent Advisor

Re: jobs sitting in queues

I am downloading patches and have to schedule the install. I will try the printer directions and let you know the results. The tough part is that it doesn't always hang.
Ian Miller.
Honored Contributor

Re: jobs sitting in queues

The queue journal file is
SYS$QUEUE_MANAGER.QMAN$JOURNAL

There have been bugs in the past where this file would grow large and cause trouble.

Is the queue on the same node that the queue manager is running on, or on another node? If it is on another node, could there be an issue with the cluster communications between the nodes?
(Queues use IPC.)
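
A minimal sketch of how both points could be checked. The directory below is an assumption: it is the default location of the queue database, and a site may have moved it, in which case the output of SHOW QUEUE/MANAGER/FULL should show where it actually lives.

```dcl
$! Confirm which node runs the queue manager and where its database lives
$ SHOW QUEUE/MANAGER/FULL
$! Check the size and dates of the journal file (default location assumed)
$ DIRECTORY/SIZE=ALL/DATE SYS$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$JOURNAL
```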
____________________
Purely Personal Opinion
Robert_Boyd
Respected Contributor

Re: jobs sitting in queues

Lucinda,

Another troubleshooting approach on this problem -- when a job is hanging, try enabling operator messages at your session:

$ REPLY/ENABLE

Then issue

$ REPLY/STATUS

And see what if anything you get.

Also, if you have operator logging enabled then you might look in SYS$MANAGER:OPERATOR.LOG or whatever file OPC$LOGFILE_NAME is pointing at.

Also, if you see a job stuck like that you might go into ANALYZE/SYSTEM and do a
SHOW PROCESS/ID=pid/CHANNEL
for the pid of the hung job and see what I/O channels are open and busy (if any).

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Art Wiens
Respected Contributor

Re: jobs sitting in queues

I have had trouble where the system time on the cluster member running the queue manager was several minutes out from cluster members that were doing the submitting/printing. Jobs that were holding for a particular time did not start as expected, because it wasn't time yet according to the queue manager.
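
One way to compare the clocks across the cluster in a single pass is SYSMAN, roughly as follows. This is only a sketch of the check; it reports each node's time so that any skew is visible, and it does not change anything.

```dcl
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO SHOW TIME
SYSMAN> EXIT
```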

Art
John Yu_1
Valued Contributor
Solution

Re: jobs sitting in queues

Might be process limits or quotas?
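
A rough sketch of how to look at this while a job is hung. The process ID and the queue name SYS$BATCH are placeholders, not values from this thread; substitute the PID reported for the stuck job and the queue it is running in.

```dcl
$! Find the process ID of the stuck batch job
$ SHOW SYSTEM/BATCH
$! Show the remaining quotas for that process (hypothetical PID)
$ SHOW PROCESS/QUOTAS/ID=2040011C
$! Show the limits set on the queue itself, such as the job limit
$ SHOW QUEUE/FULL SYS$BATCH
```

A quota that shows as exhausted, or a process sitting in a resource wait state in the SHOW SYSTEM output, would point in this direction.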
Artificial intelligence is rarely a match for natural stupidity.
Lawrence Czlapinski
Trusted Contributor

Re: jobs sitting in queues

Lucinda:
1. For the printers,
$show log/sys lat$symparameter
We had problems where sometimes a symbiont would hang. When a printer queue hangs, you could check whether the printers are using the same symbiont. You might try lowering the value of LAT$SYMPARAMETER.

$! * Define the maximum number of printer symbionts for LAT queues
$! ( 1 means new symbiont each time a queue stops ...
$! ( 4 is an o.k. alternative if performance suffers [default 16?]
$ define/sys/exec lat$symparameter "1"
2. Are all three nodes using the same queue manager? If they aren't, we have seen the journal file grow quite large on one of our nodes and we had assorted queue problems.
3. Since you are having problems with both printers and batches, it might be a problem with the queue manager journal file. You could check to see whether it has grown quite large.
4. I'm assuming your CPUs are not running at 100% on the node(s) that have the batch queues since that could keep low priority batch jobs from getting a time slice.
Lawrence
Lucinda_1
Frequent Advisor

Re: jobs sitting in queues

The QMAN$JOURNAL file is not large. I did check LAT$SYMPARAMETER; it is at the default of 16 on both nodes in question. CPU is fine. I did find an Ask the Wizard article that also suggested
DEFINE/SYSTEM tcpip$TELNETSYM_STREAMS 16
which I have now defined on both systems. I have scheduled patches for Nov 11, but in reading through them I did not find any that alluded to my problems. All three nodes use the same queue manager. I have many nightly jobs that run under the "operator" user account, and at times they sit in the queue (there are other jobs run by other users as well). I have bumped up that user's quotas to equal the system account's (it is effectively a system account anyway). It is definitely not a time issue, as many of the jobs are submitted by the user to run immediately. The problem happens on both of the production nodes; the queue manager runs on just one.
Lawrence Czlapinski
Trusted Contributor

Re: jobs sitting in queues

Lucinda:
1. Having a separate batch queue might help the Operator jobs.
2. You could try SHOW PROCESS/CONTINUOUS/ID=id_number to see if the process is looping or waiting for something.
Lawrence