Operating System - OpenVMS

Re: Indexed file as a FIFO buffer

SOLVED
Michael Moroney
Frequent Advisor

Indexed file as a FIFO buffer

I am dealing with some VMS software that generates messages to be sent to another system. In order not to lose the messages if the other system is down, it doesn't send the messages directly. It writes the messages to a temporary indexed file, which is just used as a FIFO buffer. Another process loops reading records from the file, sends them to the remote system, and if successful, deletes the record. If there is no record, it waits and tries again.

The indexed file has one key, the date/time. It's just there to keep the records in order.

The problem is, the file grows and grows, and access gets slower and slower, despite the fact it's usually empty and rarely has more than a couple of records in it. They have a hack that does the equivalent of a $CONVERT/RECLAIM every so many records, but there must be a better way. Is there some FDL magic to tune a file where it is expected that every record is deleted? Or is there a better way to implement a file-based FIFO on VMS? I say file-based since any data (or the fact there is no data) must survive reboots.
21 REPLIES
Hein van den Heuvel
Honored Contributor

Re: Indexed file as a FIFO buffer

There are better ways.

Specifically, they should probably keep a last-time-looked key value, possibly in a lock value block, and use that for a KGE lookup 99 times out of 100.
Then, once every 100 elements or 1000 seconds (whichever comes first), scan the file from the beginning to catch out-of-order arrivals.
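Hein's saved-key loop can be sketched in a few lines. This is a toy model in Python, not VMS code: ToyFifo stands in for the indexed file, and read_ge/delete mimic an RMS $GET with KGE key match and $DELETE; all names here are invented.

```python
# Toy model of the saved-key scheme: instead of always doing a KGE
# lookup from key 0, remember the last key processed and look up from
# there, rescanning from the start every N records to catch
# out-of-order arrivals.
import bisect

class ToyFifo:
    """In-memory stand-in for the indexed file (sorted primary key)."""
    def __init__(self):
        self.keys = []          # sorted key list (the primary index)
        self.recs = {}          # key -> payload

    def insert(self, key, data):
        bisect.insort(self.keys, key)
        self.recs[key] = data

    def read_ge(self, key):
        """First key >= the given key (RMS KGE match), or None."""
        i = bisect.bisect_left(self.keys, key)
        return self.keys[i] if i < len(self.keys) else None

    def delete(self, key):
        self.keys.remove(key)
        del self.recs[key]

def drain(fifo, send, rescan_every=100):
    """Send and delete records in key order; return the payloads sent."""
    last_key = ""               # "" sorts before any real key: full scan
    since_rescan = 0
    sent = []
    while True:
        if since_rescan >= rescan_every:
            last_key, since_rescan = "", 0   # periodic full rescan
        key = fifo.read_ge(last_key)
        if key is None:
            return sent          # empty; real code would wait and retry
        if not send(fifo.recs[key]):
            return sent          # remote down; real code retries later
        sent.append(fifo.recs[key])
        fifo.delete(key)
        last_key = key           # next KGE starts here, not at key 0
        since_rescan += 1
```

The point of last_key is that RMS only has to navigate the index from the saved position, instead of wading through the deleted-record-riddled front of the file on every single read.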

Also, review the bucket size and the duplicate-key requirement.
Switching dups off and selecting a large bucket size may keep this file under control.

I suspect that this will already help enough. If not, a day of consulting should do it. Several folks in this forum would be eager to help with that (for a fee). Or maybe acquire a special program I happen to have for these cases. Send me Email!

This topic has similarities to another question today: 1189881. Check that out. Post an ANALYZE/RMS/FDL/OUT example?!

Hope this helps some,

Hein van den Heuvel ( at gmail dot com )
HvdH Performance Consulting
Hoff
Honored Contributor

Re: Indexed file as a FIFO buffer

So you're writing some RTR-like middleware. OK. (Writing custom local middleware wouldn't be my first choice when I could purchase same and let somebody else deal with support and upkeep, but there can be cases when rolling your own can be necessary.)

An indexed file is not the best RMS structure choice for this case, since you care less about the sorted access indexed files provide than about a sequence of messages. Your file (as you've discovered) grows.

First pass design: two queues, with pending and free lists. Probably using a relative file, or a block-structured or fixed-length-record file, for the static storage; or using in-memory queues with a file-backed section and periodic flushes -- and the flushes could be coded to increase in frequency in response to the remote end dropping off-line. This is standard AST-driven server stuff.

End-points would want to (or have to) maintain sliding windows for processing potentially duplicate data. These sliding windows can be based on what your indexed files are presently using for keys. (Indexed files don't deal with sliding windows all that well.)
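A minimal sliding-window duplicate filter of the kind described above might look like the following (Python sketch; the window size and all names are assumptions, and the keys could be the same timestamps the indexed file uses today):

```python
# Receiver-side duplicate suppression: remember the last `size` keys
# seen and reject redelivery of any of them.
from collections import deque

class SlidingWindow:
    def __init__(self, size=1000):
        self.size = size
        self.order = deque()   # keys in arrival order, for eviction
        self.seen = set()      # same keys, for O(1) membership tests

    def accept(self, key):
        """True the first time a key is seen within the window."""
        if key in self.seen:
            return False               # duplicate delivery: discard
        self.order.append(key)
        self.seen.add(key)
        if len(self.order) > self.size:
            self.seen.discard(self.order.popleft())  # evict oldest
        return True
```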

Biggest wrinkle would be what to do when the far end receiver was slower than your storage was big. You can only enbiggen for so long before you have to decide to stall (eg: back-pressure) or to drop.

Also cromulent would be the use of 2PC and transactions, whether XA or otherwise with DECdtm.

Middleware such as RTR would be the easiest. This stuff is already tied in with DECdtm.

A different design might parallel how Erlang and its Mnesia database avoid writing to disk entirely, by always keeping and always replicating the information in memory across multiple hosts, and assuming that some subset of the hosts will survive. Microsoft Research has something logically similar with its Niobe replication platform.

I have pointers to some of this stuff posted at the web site, if you can't find details elsewhere. (And assuming you're not going straight after the queues mentioned earlier, or after the approach that Hein suggests, or after RTR or other middleware.)

Stephen Hoffman
HoffmanLabs LLC
David Jones_21
Trusted Contributor

Re: Indexed file as a FIFO buffer

Doesn't RMS have a rule about not re-using RFAs (record file addresses) in a file, so even if you delete a record it still consumes a little bit of space? You end up with buckets full of deleted-record markers. Does turning off 'allow duplicates' affect this?

I wonder if it would help if you used 2 keys, a primary key whose values are recycled (records re-written) and the timestamp in a secondary key. You never delete records, just overwrite them. Robustly tracking the free list of primary key values for the next write/rewrite would be the biggest issue.
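David's recycled-primary-key idea, reduced to a toy (Python, all names invented): a fixed pool of primary-key values that get rewritten rather than deleted, with the timestamp carried inside the record so a secondary key could order it. The free list is just an in-memory set here; making that tracking crash-safe on disk is exactly the hard part he flags.

```python
# Recycle a fixed pool of primary-key values (slot numbers) instead
# of ever deleting records, so the index never accumulates deletions.
class RecycledStore:
    def __init__(self, nslots):
        self.free = set(range(nslots))   # recyclable primary-key values
        self.slots = {}                  # slot -> (timestamp, payload)

    def write(self, timestamp, payload):
        slot = min(self.free)            # any free slot would do
        self.free.discard(slot)
        self.slots[slot] = (timestamp, payload)   # rewrite, never delete
        return slot

    def oldest(self):
        """Slot of the oldest in-use record by timestamp, or None."""
        used = [s for s in self.slots if s not in self.free]
        if not used:
            return None
        return min(used, key=lambda s: self.slots[s][0])

    def release(self, slot):
        """'Delete' = return the key value to the free list."""
        self.free.add(slot)
```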
I'm looking for marbles all day long.
Hein van den Heuvel
Honored Contributor

Re: Indexed file as a FIFO buffer

Hoff is absolutely right in suggesting the many alternatives. A 'circular' buffer in a sequential or relative file may well be the better design. Optionally add to that a (clusterwide) lock value block to hold the next free, last finished numbers to be updated to disk every so often.
And several products address this: RTR, MQSeries, ...

My recommendations were geared to making the most of the current setup with minimal changes. All too often that is the requirement for getting anything done at all!

David>> Doesn't RMS have a rule about not re-using RFAs (record file addresses) in a file so even if you delete a record it still consumes a little bit of space?

It does have that rule, but it is implemented through a simple 'next free id' concept. This is a fixed word in the bucket header, always there. It does force a bucket split every 65K records targeted for the same data bucket!

Deleted records are nicely expunged EXCEPT:
a) The last (first!? :-) record in a bucket.
b) When duplicates are NOT allowed.
c) An odd key-compression case where deleting the record would cause the NEXT record's key to become too big to fit.
In all the above cases RMS leaves the record header plus the (compressed) primary KEY behind.

>> You end with buckets full of deleted record markers. Does turning off 'allow duplicates' affect this.

You can get those (single bytes!) for ALTERNATE KEY SIDR entries, not for primary keys.

>> I wonder if it would help if you used 2 keys, a primary key whose values are recycled (records re-written) and the timestamp in a secondary key. You never delete records, just overwrite them.

Then why not create a circular buffer.
Those 'alternate keys' you suggest would become a header record. Record 0. That's all.

>> Robustly tracking the free list of primary key values for the next write/rewrite would be the biggest issue.

Ayup!

Hein.
Michael Moroney
Frequent Advisor

Re: Indexed file as a FIFO buffer

I'm not writing anything new, just dealing with an existing application. We're not about to do a full redesign or rewrite now, just looking for tuning tricks or ideas for minor changes. Right now they have to close the file, call CONV$RECLAIM on it and reopen the file every 5000 writes.

Right now the records are read with a KGE lookup with a key of 0 to get the oldest record, if any. Would just saving this key and doing the next KGE read with it set to the last key rather than zero help?

The remote end is a PC with an Oracle database.

Attached is an FDL of an "empty" version of the file. It has grown to 3231 blocks here.
Hein van den Heuvel
Honored Contributor

Re: Indexed file as a FIFO buffer

>> I'm not writing anything new, just dealing with an existing application.

I expected as much.

>> Right now they have to close the file, call CONV$RECLAIM on it and reopen the file every 5000 writes.

That's just crazy!

>> Would just saving this key and doing the next KGE read with it set to the last key rather than zero help?

Oh yeah. You will not recognize it.
The size will not change, but the need to convert will just about go away, or be reduced to daily or so.

>> Attached is a FDL of an "empty" version of the file.

Can you snarf a copy (back/ignore=interlock?) while there are still records?
Or take an empty file and plunk in 1 dummy record just to trigger ANAL/RMS to output more data (DCL write)?
Or use my tune_check!

Better still... Run the attached program against (a copy of) the empty file and share the results in an attached txt file, or in an Email to me.

Cheers,
Hein.
Robert Gezelter
Honored Contributor

Re: Indexed file as a FIFO buffer

Michael,

I can think of several ways of dealing with indexed records that would not give rise to the need to reorganize the file on an ongoing basis.

The core of the problem as you describe it is the use of the date/time as the key. There are several alternative ways of organizing this to prevent a constantly changing primary key, which is effectively the source of your problem.

Depending on how the sources are implemented, the correction could be straightforward, or it could be more involved. As my colleagues would certainly agree, speculating on the ease of modifying sources without a thorough review of those sources is not a sound way to proceed.

I would agree with Hein's original comment, re: retaining a suitably experienced consultant to review the sources and suggest changes [Disclosure: such services are within our scope of practice].

Some of the approaches that I can imagine would virtually eliminate the need to reorganize the file, period.

- Bob Gezelter, http://www.rlgsc.com
Jon Pinkley
Honored Contributor

Re: Indexed file as a FIFO buffer

If messages must be delivered FIFO, how do you guarantee FIFO order when the time is changed backward in the fall (assuming you are under daylight saving/summer time rules)?
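For what it's worth, the usual mitigations (not from this thread) are to key on UTC rather than local time, or to guard the key generator so it never emits a key lower than the previous one. A toy sketch, where clock is a hypothetical timestamp-string source:

```python
# Monotonic key generator: if the clock repeats or goes backward
# (e.g. the DST fall-back), reuse the last timestamp and bump a
# zero-padded sequence suffix so keys still sort in issue order.
def make_keygen(clock):
    state = {"t": "", "seq": 0}

    def next_key():
        t = clock()
        if t <= state["t"]:
            t = state["t"]          # clock stalled or went backward:
            state["seq"] += 1       # keep last time, bump tiebreaker
        else:
            state["t"], state["seq"] = t, 0
        return "%s%04d" % (t, state["seq"])

    return next_key
```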

Jon
it depends
Michael Moroney
Frequent Advisor

Re: Indexed file as a FIFO buffer

Hein,

I took a copy of the "empty" file and added a bunch of legitimate data and generated the FDL again.

ANALYSIS_OF_KEY 0
DATA_FILL 94
DATA_KEY_COMPRESSION 83
DATA_RECORD_COMPRESSION -4
DATA_RECORD_COUNT 849
DATA_SPACE_OCCUPIED 144
DEPTH 1
INDEX_COMPRESSION 54
INDEX_FILL 2
INDEX_SPACE_OCCUPIED 12
LEVEL1_RECORD_COUNT 12
MEAN_DATA_LENGTH 84
MEAN_INDEX_LENGTH 22
LONGEST_RECORD_LENGTH 132

Your little program produces:

$ xxx -v cadsegment.sfl2

* 3-JAN-2008 16:24:15.45 ALQ=6528 BKS=12 GBC=0 cadsegment.sfl2

Bucket      VBN  Count  Key
------  -------  -----  ----------------------------------
     1        3     76  20080103161046130000
     2       15     77  20080103161046150015
     3       27     76  20080103161046170015
     4       39     75  20080103161046190017
     5       51     71  20080103161046210014
     6       63     75  20080103161046230004
     7       75     79  20080103161046240040
     8       87     77  20080103161046260038
     9       99     73  20080103161046280034
    10      111     77  20080103161046300028
    11      123     75  20080103161046320031
    12      135     30  20080103161046340032