%QMAN-E-CREPRCSTOP

 
Lub
New Member

%QMAN-E-CREPRCSTOP

Hi,

We are having trouble with queues that show the status "stopped pending", and at the same time we see these messages:

%%%%%%%%%%% OPCOM 29-OCT-2004 12:56:12.46 %%%%%%%%%%%
Message from user SYSTEM on COYOTE
%QMAN-E-CREPRCSTOP, failed to create a batch process, queue COYOTE_P_GEM$BATCH will be stopped

%%%%%%%%%%% OPCOM 29-OCT-2004 12:56:12.46 %%%%%%%%%%%
Message from user SYSTEM on COYOTE
-QMAN-I-QUEAUTOOFF, queue COYOTE_P_GEM$BATCH is now autostart inactive

I checked BALSETCNT (it's OK) and AUTOGEN doesn't
see any problem.

COYOTE_SYS> sho mem/slo
System Memory Resources on 3-NOV-2004 14:18:51.06

Slot Usage (slots): Total Free Resident Swapped
Process Entry Slots 320 218 102 0
Balance Set Slots 318 218 100 0

Does anybody have an idea?

OpenVMS V7.3-1
Thanks.
Regards.
Luc BOUGEANT.
13 REPLIES
Himanshu_3
Valued Contributor

Re: %QMAN-E-CREPRCSTOP

Hi Bougeant,

Since you have checked BALSETCNT and the problem doesn't seem to be there, you can try the following to troubleshoot the problem:

1) Use MONITOR SYSTEM/INT=2 and note the total number of processes
displayed in the right-hand corner near the top.
You can also use SYSMAN, SYSGEN, or F$GETSYI to display your MAXPROCESSCNT
and check whether the process count is approaching that limit.

2) Check the system parameter SPTREQ, which shouldn't be set too low.
When a process is created, pages in system space are required to hold
temporary data during process creation. This means that some entries
in the system page table must be available. If they aren't (or are too few),
the process creation fails.

The appropriate value for SPTREQ varies from system to system. Try increasing this value.

3) To check the peak number of processes running,
you can also do

$ @SYS$UPDATE:AUTOGEN SAVPARAMS

and search SYS$SYSTEM:AGEN$FEEDBACK.DAT for PEAK

Or run

$ @SYS$UPDATE:AUTOGEN SAVPARAMS TESTFILES

and inspect SYS$SYSTEM:AGEN$PARAMS.REPORT
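
A minimal sketch of the checks in steps 1 and 3, assuming a suitably privileged account. The SEARCH step assumes AUTOGEN SAVPARAMS has already been run so that the feedback file exists:

```dcl
$ ! Configured limits on process slots (read from the running system)
$ WRITE SYS$OUTPUT "MAXPROCESSCNT = ", F$GETSYI("MAXPROCESSCNT")
$ WRITE SYS$OUTPUT "BALSETCNT     = ", F$GETSYI("BALSETCNT")
$ ! The same parameter as reported by SYSGEN
$ MCR SYSGEN SHOW MAXPROCESSCNT
$ ! Peak values recorded in the AUTOGEN feedback data
$ SEARCH SYS$SYSTEM:AGEN$FEEDBACK.DAT PEAK
```

Comparing these limits with the process count shown by MONITOR SYSTEM should indicate whether process slots are the bottleneck.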


Hope this helps,
Regards,
HP
Ian Miller.
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Is nonpaged pool OK? Can you post the results of SHOW MEMORY?

Could there be a disk problem? Are any disks shown in unusual states in SHOW DEVICE D?
____________________
Purely Personal Opinion
Volker Halle
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Luc,

there should have been an additional message after the -QMAN-I-QUEAUTOOFF, which should give the system service failure code.

If that doesn't help, could you do a SET QUE /RETAIN=ERROR on the queue and check with SHO QUE/FULL/ALL, once the error has recurred, whether an additional service failure status is reported?
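
Spelled out in full, the two commands might look like this. This is a sketch only: the queue name is taken from the OPCOM messages in the original post, and the account used is assumed to have the privilege (or access) needed to manage the queue:

```dcl
$ ! Keep failed jobs in the queue so their completion status can be inspected
$ SET QUEUE/RETAIN=ERROR COYOTE_P_GEM$BATCH
$ ! Later, once the error has recurred, list all jobs with full detail
$ SHOW QUEUE/FULL/ALL_JOBS COYOTE_P_GEM$BATCH
```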

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Check the accounting file. Normally the reason why the process creation failed is in it.
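
For example, a sketch that lists the batch process records around the time of the first OPCOM message. The ten-minute window is an assumption chosen for illustration, and the default accounting file is used:

```dcl
$ ! Full accounting records for batch processes near 29-OCT-2004 12:56
$ ACCOUNTING/FULL/PROCESS=BATCH/SINCE=29-OCT-2004:12:50/BEFORE=29-OCT-2004:13:00
```

The final status field of each record is the place to look for the failure reason.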

Wim
Lub
New Member

Re: %QMAN-E-CREPRCSTOP

Thanks for your replies.
Reply 1:
Our systems are AlphaServers (ES45/GS140); isn't the SPTREQ parameter only for VAX?
As for the peak number of processes running, here is the value given by AUTOGEN:
MAXPROCESSCNT parameter information:
Feedback information.
Old value was 320, New value is 320
Maximum Observed Processes: 170
- AUTOGEN parameter calculation has been overridden.
The calculated value was 256. The value 320
will be used in accordance with the following requirements:
MAXPROCESSCNT has been specified by a hard-coded value of 320.

Reply 2:
Nonpaged pool (NPP) usage was about 50% (as seen with Unicenter Performance Manager).
COYOTE_SYS> sho mem
System Memory Resources on 4-NOV-2004 13:36:05.69

Physical Memory Usage (pages): Total Free In Use Modified
Main Memory (16.00GB) 2097152 624128 1464114 8910

Extended File Cache (Time of last reset: 11-AUG-2004 02:37:24.15)
Allocated (GBytes) 7.79 Maximum size (GBytes) 8.00
Free (GBytes) 0.91 Minimum size (GBytes) 0.00
In use (GBytes) 6.87 Percentage Read I/Os 131%
Read hit rate 100% Write hit rate 0%
Read I/O count 4294967295 Write I/O count 3273570759
Read hit count 4294967295 Write hit count 0
Reads bypassing cache 1105585617 Writes bypassing cache 197570589
Files cached open 1376 Files cached closed 11190
Vols in Full XFC mode 0 Vols in VIOC Compatible mode 114
Vols in No Caching mode 2 Vols in Perm. No Caching mode 0

Granularity Hint Regions (pages): Total Free In Use Released
Execlet code region 1536 0 1125 411
Execlet data region 360 0 359 1
S0/S1 Executive data region 20976 0 20976 0
Resident image code region 1536 659 877 0
Resident image data region 1024 0 0 1024

Slot Usage (slots): Total Free Resident Swapped
Process Entry Slots 320 203 117 0
Balance Set Slots 318 203 115 0

Dynamic Memory Usage: Total Free In Use Largest
Nonpaged Dynamic Memory (MB) 162.11 106.37 55.74 63.11
Bus Addressable Memory (KB) 528.00 508.68 19.31 504.00
Paged Dynamic Memory (MB) 19.07 10.06 9.00 9.90
Lock Manager Dyn Memory (MB) 116.46 78.70 37.76

Buffer Object Usage (pages): In Use Peak
32-bit System Space Windows (S0/S1) 2 3
64-bit System Space Windows (S2) 29 60
Physical pages locked by buffer objects 31 51

Memory Reservations (pages): Group Reserved In Use Type
Total (0 Bytes reserved) 0 0

Swap File Usage (8KB pages): Index Free Size
PAGE_COYOTE:[000000]SWAPFILE0.SYS;1
1 21872 21872
PAGE_COYOTE:[000000]SWAPFILE1.SYS;1
2 21872 21872
PAGE_COYOTE:[000000]SWAPFILE2.SYS;1
3 21872 21872
PAGE_COYOTE:[000000]SWAPFILE3.SYS;1
4 21872 21872

Total size of all swap files: 87488

Paging File Usage (8KB pages): Index Free Size
PAGE_COYOTE:[000000]PAGEFILE3.SYS;1
251 1062496 1062496
PAGE_COYOTE:[000000]PAGEFILE2.SYS;1
252 1062496 1062496
PAGE_COYOTE:[000000]PAGEFILE1.SYS;1
253 1062496 1062496
PAGE_COYOTE:[000000]PAGEFILE0.SYS;1
254 1062496 1062496

Total size of all paging files: 4249984
Total committed paging file usage: 50067580

Of the physical pages in use, 142452 pages are permanently allocated to OpenVMS.

I didn't see any disk problems.

Reply 3 and 4:
I found these messages in operator.log for the same day (and none on any other day since):
%%%%%%%%%%% OPCOM 29-OCT-2004 14:08:40.93 %%%%%%%%%%%
Message from user SYSTEM on COYOTE
%QMAN-E-CREPRCSTOP, failed to create a batch process, queue COYOTE_P_GEM$BATCH will be stopped

%%%%%%%%%%% OPCOM 29-OCT-2004 14:08:40.93 %%%%%%%%%%%
Message from user SYSTEM on COYOTE
-QMAN-I-QUEAUTOOFF, queue COYOTE_P_GEM$BATCH is now autostart inactive

%%%%%%%%%%% OPCOM 29-OCT-2004 14:08:40.93 %%%%%%%%%%%
Message from user SYSTEM on COYOTE
-JBC-F-NOSUCHJOB, no such job

I don't understand why, but when the queue was restarted, entry 655 was run twice (all fields of its batch runner, entry 111, are empty in accounting).

See the extract from accounting in the attachment.

Regards.
Luc BOUGEANT.
Antoniov.
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Luc,
reading your attachment, I saw the NOSUCHJOB error and a light went on: maybe you purged the command file? If you purge, you can have trouble, because the batch queue stores the filename and version of the command file.

Antonio Vigliotti
Antonio Maria Vigliotti
Volker Halle
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Luc,

it looks like something happened (to the queue?) on 29-OCT-2004 12:56:12.47: batch jobs 655 and 111 (these are different jobs, see the job names!) terminated at about the same time with an unusual final status of FFFFFFFF.

At 14:08, job 655 seems to have been restarted, but LOGINOUT failed with %JBC-F-NOSUCHJOB.

At 14:15, that job 655 then finally finished with a successful exit status.

Note that if you restart a batch job (SET ENT/NOHOLD 655) that had been retained in the queue due to /RET=ERROR or other such methods, the job will keep its entry number.

Volker.
Jan van den Ende
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Antonio,


"the batch queue stores the filename and version of the command file"


Sorry, but that is NOT true! What IS stored, is the file ID. That means that even if you know the original version number, it does not help to recreate a file with the original version number.

Cheers.

Have one on me.

Jan
Don't rust yours pelled jacker to fine doll missed aches.
labadie_1
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Jan, you are absolutely right. And you give the reason why, when you have submitted a .COM file and then modified it, a simple COPY/OVERLAY works: it keeps the LBN. Of course, you can also delete the entry and submit the modified .COM file.
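
A rough sketch of that workflow, assuming (per Jan's point) that the queue entry refers to the command file by file ID. The file names MYJOB.COM and MYJOB_NEW.COM are hypothetical:

```dcl
$ ! Note the file ID of the command file before submitting
$ DIRECTORY/FILE_ID MYJOB.COM
$ SUBMIT/HOLD MYJOB.COM
$ ! Overwrite the contents in place instead of creating a new version
$ COPY/OVERLAY MYJOB_NEW.COM MYJOB.COM
$ ! The file ID should be unchanged, so the pending entry still finds the file
$ DIRECTORY/FILE_ID MYJOB.COM
```

By contrast, editing the file in the usual way creates a new version with a new file ID, and purging the old version then removes the file the entry points to.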

Bonjour Luc !
Volker Halle
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Luc,

was there a 3rd OPCOM message at 12:56:12.46? This may have shown the system service failure status for the first occurrence of the problem.

Volker.
Lub
New Member

Re: %QMAN-E-CREPRCSTOP

Volker,

There's no 3rd OPCOM message at 12:56:12.46!

Antoniov.
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

Jan,
you are absolutely right; I only wanted to point out the problem of purging before the entry starts.

Antonio Vigliotti
Antonio Maria Vigliotti
labadie_1
Honored Contributor

Re: %QMAN-E-CREPRCSTOP

As a friend pointed out, I should have said "keeps the file ID", of course.

I should give up alcohol :-)