
COBOL EOF Processing and the START Command

 
Paul Raulerson
Advisor


Is there something "special" about how COBOL processes end of file under OpenVMS? I have run into this reproducible issue, and I am not at all sure whether it is ignorance on my part, something loaded on the system causing it (RAID, Apache, etc.), or "normal" behavior for HP COBOL.

Sample code duplicating the issue is attached.

Here are the details:

This file is normally used with random reads to validate codes key-entered by users in various screens. However, there are maintenance screens where a user can walk through the file record by record and choose to edit, delete, or add new records. Nothing unusual or the least bit tricky, stuff we all have done thousands of times.

During maintenance processing, if the user encounters end of file, the code merely sets an 88-level entry (EOF) to true. If a READ NEXT request is sent from the user and EOF is true, no read is attempted; the maintenance program merely displays a message like "at end of file" and returns. This appears to work great.

Of course, the user can, at any point, request a read of the previous record, which is then processed in a similar way. If EOF is detected in a READ PREVIOUS, an 88-level entry (BOF) is set to true, and subsequent READ PREVIOUS requests will not attempt to perform a real read.

In both the read-next and read-prev subroutines, there is a check for the opposite condition. Again, nothing odd or tricky. If the read-next processing detects a BOF, it will perform a start on the file like this:
move spaces to dt-code
start dtmaster key >= dt-code .... (with appropriate invalid key checking of course)

If the read-prev processing detects an EOF condition, it attempts the corresponding action,
(the value of dt-code here is the key value from the last record in the file)
start dtmaster key <= dt-code .... (with appropriate invalid key checking of course)
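For readers without the attachment, the navigation logic described above might look roughly like the fragment below. This is only a sketch, not the attached reproducer: DTMASTER, DT-CODE, EOF and BOF come from the post, while DT-POSITION, IN-FILE, the paragraph names and the message texts are assumptions made for illustration. It also assumes terminal source format and the READ ... PRIOR spelling of the read-previous request, and it relies on the FILE-CONTROL and FD entries shown later in this post.

```cobol
* Hypothetical sketch of the read-next / read-prev logic.
* DT-POSITION, IN-FILE and the paragraph names are assumptions.
WORKING-STORAGE SECTION.
01  DT-POSITION             PIC X VALUE SPACE.
    88  EOF                 VALUE "E".
    88  BOF                 VALUE "B".
    88  IN-FILE             VALUE SPACE.

PROCEDURE DIVISION.
READ-NEXT-RECORD.
    IF EOF
        DISPLAY "at end of file"
    ELSE
        IF BOF
* Reposition at the first record before reading forward.
            MOVE SPACES TO DT-CODE
            START DTMASTER KEY >= DT-CODE
                INVALID KEY SET EOF TO TRUE
            END-START
        END-IF
        IF NOT EOF
            READ DTMASTER NEXT RECORD
                AT END SET EOF TO TRUE
                NOT AT END SET IN-FILE TO TRUE
            END-READ
        END-IF
    END-IF.

READ-PREV-RECORD.
    IF BOF
        DISPLAY "at beginning of file"
    ELSE
        IF EOF
* DT-CODE still holds the key of the last record in the file.
            START DTMASTER KEY <= DT-CODE
                INVALID KEY SET BOF TO TRUE
            END-START
        END-IF
        IF NOT BOF
            READ DTMASTER PRIOR RECORD
                AT END SET BOF TO TRUE
                NOT AT END SET IN-FILE TO TRUE
            END-READ
        END-IF
    END-IF.
```

Keeping EOF and BOF as condition names on a single item means setting one automatically clears the other, so the two paragraphs cannot disagree about the current position.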

This all works as expected until the second time an EOF is processed, at which point the
"start dtmaster key <= dt-code" will catastrophically fail. There is a long pause
and the following error message is produced.


Reading Previous Record
%DEBUGBOOT-W-EXQUOTA, process quota exceeded
%SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=0000000073457EB0, PC=000000007C324940, PS=0000001B

Improperly handled condition, image exit forced.
Signal arguments: Number = 0000000000000005
Name = 000000000000000C
0000000000000000
0000000073457EB0
000000007C324940
000000000000001B
Register dump:
R0 = 0000000000000000 R1 = 0000000000000000 R2 = 000000007C347180
R3 = 0000000000050C40 R4 = 00000000000A5468 R5 = 000000007C3383B0
R6 = 000000007AE25918 R7 = 00000000000A5998 R8 = 0000000000018039
R9 = 0000000000018029 R10 = 0000000000018011 R11 = 0000000000000001
R12 = 0000000000000006 R13 = 0000000000003030 R14 = 0000000000000000
R15 = 0000000000000000 R16 = 000000000000002E R17 = 00000000000A5536
R18 = 0000000000000006 R19 = 0000000000000004 R20 = 0000000000000004
R21 = 0400000000000000 R22 = 0000000000000000 R23 = 0000006400000000
R24 = 0000000000000001 R25 = 0000000000000005 R26 = 0000000073458E80
R27 = 0000000073457EB0 R28 = 0000000000000000 R29 = 0000000073458EB0
SP = 0000000073458EB0 PC = 000000007C324940 PS = 300000000000001B


The file is defined thusly:

FILE-CONTROL.
    SELECT DTMASTER
        ASSIGN TO DYN-FILE-NAME
        ORGANIZATION IS INDEXED
        ACCESS MODE IS DYNAMIC
        RECORD KEY IS DT-CODE
        LOCK MODE IS MANUAL WITH LOCK ON MULTIPLE RECORDS
        FILE STATUS IS DT-MASTER-STATUS.

FD  DTMASTER VALUE OF ID IS DYN-FILE-NAME.
01  DT-RECORD.
    05  DT-CODE         PIC X(6).
    05  DT-DESCRIPTION  PIC X(40).


---------------------------
I worked around this problem with a jury-rigged solution I am not at all happy with.
In the read-prev processing, I replaced the suspect START with an indexed
read of the last record, since the key value of the record current when EOF was detected is preserved.
I left the START in the BOF handling, which always works as expected without a failure.
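As a rough illustration of that workaround, the repositioning step could be written as a keyed READ in place of the START. This is a sketch only: the paragraph name and the message text are assumptions, and it relies on DT-CODE still holding the key of the last record, as described above.

```cobol
* Hypothetical sketch: reposition with a keyed READ, not a START.
* The paragraph name and message text are assumptions.
REPOSITION-BY-KEYED-READ.
* DT-CODE still holds the key of the last record that was read.
    READ DTMASTER RECORD KEY IS DT-CODE
        INVALID KEY
            DISPLAY "last record not found, status "
                    DT-MASTER-STATUS
    END-READ.
```

With ACCESS MODE IS DYNAMIC, a successful keyed read also establishes the current position, so subsequent sequential reads continue from that record.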

I strongly suspect I have just coded something wrong, but similar code does work on other platforms,
so maybe not. The code is terribly over-commented at this point, and it also has some displays in it to
make the problem very obvious.

In any case, thanks for taking the time to look. :)

-Paul





Hein van den Heuvel
Honored Contributor


Hello (again) Paul.

The program dies with a VM quota error, but the problem is triggered through locking.

This can be seen nicely with SHOW PROC/CONT. Hit 'q' to switch to the quota screen, and just before the program dies you'll see something like:

ASTs remaining 246/250 ( 98%)
Timer entries remaining 10/10 (100%)
PGFL quota count/limit 0/32000 ( 0%)
ENQ quota count/limit 1999/2000 ( 99%)

A simple ^T will of course also reveal a high page-fault rate and memory accumulation.

The COBOL RTL seems a little confused, possibly due to the condition:
%RMS-S-OK_ALK, record is already locked

Using SET PROC/SSL=STATE=ON and a little post processing we see the program end with:

13:35:43.05 00018039 U SYS$RMS_FIND RMS
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00018039 U SYS$RMS_GET RMS
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000001 K SYS$CRETVA_64 SYS$VM
13:35:43.05 00018039 U SYS$RMS_GET RMS
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00000689 E SYS$$ENQ LOCKING
13:35:43.05 00018039 U SYS$RMS_FIND RMS
13:35:43.05 00000689 E SYS$$ENQ LOCKING


The easiest WORKAROUND is to comment out the manual lock holding.

"lock mode is manual with lock on multiple records."

Of course this might not be acceptable for the real usage.

The next best workaround is to toss in an
"unlock dtmaster." before the read that craters.
In the real code, an unlock all when hitting EOF or BOF may well be reasonable.
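A minimal sketch of where that unlock might go, assuming the failing statement is the START issued by read-prev after EOF. The paragraph name and the message text are made up for illustration; the UNLOCK and START statements are spelled as they appear in this thread.

```cobol
* Hypothetical sketch: release leftover locks before the START.
* The paragraph name and message text are assumptions.
REPOSITION-WITH-UNLOCK.
* Drop any record lock still held on DTMASTER so the START
* and the reads that follow do not trip over it.
    UNLOCK DTMASTER
    START DTMASTER KEY <= DT-CODE
        INVALID KEY
            DISPLAY "start failed, status " DT-MASTER-STATUS
    END-START.
```

This keeps LOCK MODE IS MANUAL WITH LOCK ON MULTIPLE RECORDS in place for the rest of the program, at the cost of also releasing any locks the program took deliberately on this file.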

Looks like you found a bug, most likely in the COBRTL.

Great reproducer! It failed for me on OpenVMS 8.3 Itanium, Cobol V2.8-1444.

I would urge you to report this to HP Support.

Hope this helps some, maybe more later...

Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting


Paul Raulerson
Advisor


Hi Hein-
Well, I thought that manual locking was just that, so I am a bit confused over why the system would be locking records I did not tell it to. Must have something to do with the EOF status.

I guess I should have put the compiler version in here, but I am glad you duplicated it on Itanium!

Compiler: VMS 8.3 Alpha with 0300 patch levels, Compaq COBOL V2.8-1286.

-Paul


Hein van den Heuvel
Honored Contributor
Solution


Paul, you are right. When specifying manual locking and not using READ WITH LOCK, there should be no record locked.

The 'natural' way to use RMS is automatic locking. Not locking is actually implemented with ROP=NLK, which tells RMS to unlock the record right after it gets it and before returning to the user code.

If you were to SET FILE/STAT on the test file and watch your program with ANAL/SYS... SHOW PROC/RMS=FSB, you would see an equal number of ENQueues and DEQueues.

But on the first "start dtmaster key <= dt-code" there is no DEQ and the last record in the file is left locked (RFA=3,1 as it is the first record inserted in the reproducer).

I _suspect_, but this is a stretch/gamble, that this is caused by the internal RTL routine 'set_filepos', which does rab->rab$v_rrl = 1 without rab->rab$v_nlk = 1, where those mostly need to go hand in hand.

This could leave a record locked, which down the road would cause an OK_ALK, which is pretty much an alternate success status that is not tested for in the internal routines. Externally it is mapped to "00".

Now the root cause of the problem itself would almost have to be in the COBOL read-prior code not knowing how to deal with ALK while building its RFA cache. (ALK could be legit after all.)

What's all that mumbo jumbo you say?

Well, RMS does not offer a sequential read prior. It does offer a keyed read in reverse, but if there are duplicates, that would not work. COBOL itself, not RMS, protects against that by building an array of records to walk back through. The reallocation of that array is likely what pops VM... but it should not get there (there are no dups!).

I'll send our friend John R in OpenVMS Engineering a pointer to this topic, in case he has not spotted it already. He is likely to desire/appreciate a formal problem report though.

Cheers,
Hein.
Paul Raulerson
Advisor


Ah, actually that makes absolutely perfect sense, even if I am not all that familiar yet with the internals of RMS processing.

At least I don't feel quite so stupid. I originally caught it because I am using the regression test checklists to validate ported code, but I just figured I had fat-fingered a line of code or something.

I've really got to learn more about RMS; you found and validated in a few minutes what I spent several hours convincing myself was real. ;)

-Paul