Re: RMS Fast Deletes

HDS · 02-26-2009

Hello.

I am executing the attached TEST_DEL.FOR as a test sample run to time the removal of records from a specific indexed file. The ANALYZE/RMS/FDL report for that target file is also included at the bottom of that attached text file.

What I am seeing is that it performs over 3.3 million DIOs to delete the ~300K records contained in that target file. The records are being read, locked and deleted at a rate of about 30,000 a minute. With Fast Delete in effect, I expected that, once the deletions completed, index entries would be removed from the primary key but would still remain for the alternate keys. However, I am seeing zero entries in all keys when all is done. Given the 3M+ DIOs needed to delete the records (no split IOs; the file has only 6 extents), it seems that all indexes are being processed on each record delete.

Am I misunderstanding the purpose or the usage of the RAB$V_FDL attribute?

A $MONITOR RMS shows the following (no global buffers enabled for this file):
Peak DIO Rate = 4K-5K iops
Local cache hit rates of ~35%
Local cache attempt rate of ~4K-to-5K+
$GET rate of ~470+
$DELETE rate equal to that of $GET
Local Buffer Write rate at ~1K+
Local Buffer Read rate at ~3K+
Locking ENQ/DEQ rates of ~3500 each
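
As a rough cross-check of the figures above, the per-record cost can be worked out directly. This is only back-of-the-envelope arithmetic on the rounded numbers quoted in this post (3.3 million DIOs, ~300K records, ~30,000 deletes a minute); the variable names are made up for the illustration.

```python
# Rounded figures quoted in the post; these are not precise measurements.
total_dios = 3_300_000        # direct I/Os for the whole run
records = 300_000             # records deleted
deletes_per_minute = 30_000   # observed deletion rate

dios_per_delete = total_dios / records
deletes_per_second = deletes_per_minute / 60
run_minutes = records / deletes_per_minute
avg_dio_rate = total_dios / (run_minutes * 60)

print(f"DIOs per delete:    {dios_per_delete:.1f}")
print(f"Deletes per second: {deletes_per_second:.0f}")
print(f"Run time (minutes): {run_minutes:.0f}")
print(f"Average DIO rate:   {avg_dio_rate:.0f} per second")
```

That works out to about 11 DIOs per deleted record and about 500 deletes a second, which is in the same range as the $GET and $DELETE rates shown by $MONITOR RMS.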

Any information regarding the purpose and/or usage of the Fast Delete attribute will be appreciated.

Note that I also tried the FDL attribute CONNECT FAST_DELETE because I was trying it from an FDL angle in addition to using the RAB attribute. This was to see if the observed results would differ. They did not.

Many thanks in advance.

-Howard-

Hoff · 02-26-2009

In no particular order...

This is apparently not working entirely as documented, but the behavior demonstrated here appears to be a proper subset of what is documented.

And I'd assume that reverting to allowing what is documented may break some existing applications. Even though that may well be permitted, based on the documentation.

What are the application parameters and goals and time requirements here? Is sharing within this file active? Clustering?

I'll assume that this isn't using DECnet FAL.

If there are time constraints on the processing, going to faster disks and larger and more aggressive caching is the usual solution for these cases. Or toward sharding the data. Or toward reworking how the data is stored and accessed.

What does tuning the file have to say about its particular current design? ANALYZE /RMS /FDL and EDIT /FDL, et al.

Hein van den Heuvel · 02-26-2009

Thanks for a well documented topic!
How refreshing :-).

The only missing, critical, bit of information is SHOW RMS, and specifically the default number of buffers for an indexed file.
If you did not know about that, then you may want to re-run your test with $SET RMS/IND/BUF=100

This may take you down from 10 IOs per delete to 5 IOs per delete, or thereabouts.

Locking should go away when not sharing the file. Do you need to share it? If you do need to share it, then DEFERRED WRITE can help a bunch. Call SYS$FLUSH every 1000 records or so, if that makes you feel better.
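
The "flush every 1000 records or so" pattern can be sketched as below. This is a language-neutral illustration in Python, not OpenVMS code: `delete_record` and `flush` are hypothetical callbacks standing in for the application's delete call and for SYS$FLUSH.

```python
def bulk_delete(record_ids, delete_record, flush, flush_every=1000):
    """Delete records, calling flush() after every flush_every deletions."""
    deleted = 0
    for rec_id in record_ids:
        delete_record(rec_id)
        deleted += 1
        if deleted % flush_every == 0:
            flush()
    if deleted % flush_every != 0:
        flush()          # push out the final partial batch
    return deleted

# Stand-in callbacks: the delete does nothing, the flush is just counted.
flushes = []
count = bulk_delete(range(2500), delete_record=lambda r: None,
                    flush=lambda: flushes.append(1))
print(count, len(flushes))
```

With 2500 records and the default interval, the sketch flushes three times: twice on the 1000-record boundary and once for the remaining 500.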

>> Because of the Fast Delete, I thought that I would expect to see index items removed from the primary key, but still remaining for the alternate keys.

Fast Delete only works when Duplicates are allowed. The provided FDL shows they are not.
From the listings (RM3DELETE) :

" If a fast-delete is requested, terminate the deletion of the current secondary key only if this secondary key allows duplicates.
If this secondary key does not allow duplicates, then a fast delete of it can not be done, since the error caused by a later attempt to insert a record with a secondary key that is a duplicate of this one would go undetected. "
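
The rule in that comment can be pictured with a toy model. This is only a sketch of the decision as the listing comment describes it, not RMS code; the `AltKey` class, the key names and the assumption that processing simply moves on to the next key are all made up for the illustration.

```python
from dataclasses import dataclass, field

@dataclass
class AltKey:
    name: str
    allows_duplicates: bool
    entries: set = field(default_factory=set)   # record ids still indexed

def delete_record(rec_id, primary, alt_keys, fast_delete):
    """Toy model: remove rec_id, optionally leaving alternate-key entries behind."""
    primary.discard(rec_id)                # the primary entry always goes
    for key in alt_keys:
        if fast_delete and key.allows_duplicates:
            continue                       # entry left dangling for later cleanup
        key.entries.discard(rec_id)        # no duplicates allowed: remove it now

primary = {1, 2, 3}
keys = [AltKey("ALT_NODUP", allows_duplicates=False, entries={1, 2, 3}),
        AltKey("ALT_DUP", allows_duplicates=True, entries={1, 2, 3})]
for rec in (1, 2, 3):
    delete_record(rec, primary, keys, fast_delete=True)

for key in keys:
    print(key.name, "entries left:", len(key.entries))
```

In this model only the key that allows duplicates keeps its entries after a fast delete. With a file where no alternate key allows duplicates, as in the posted FDL, every index ends up empty, which matches what was observed.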

>> Note that I also tried the FDL attribute CONNECT FAST_DELETE

That's a waste of time. The FDL CONNECT section is only useful if FDL$PARSE is used to create the FAB & RAB. It does NOT set a file permanent attribute. You use Fortran, which provides its own RAB.

How about an indication / description of what the 'real' problem is? There may, or may not, be better techniques. Is this a partial datafile purge? A complete clean-out?
What is the goal of the cleanout (business, space, maintenance)?

Btw 1... be careful with the RAB versus RAB64.
No problem with the ROP field though.

Btw 2... The application may not need "KEY 2" as RMS supports descending lookup without an explicit key through the RAB$V_REV option.

Regards,
Hein van den Heuvel ( at gmail dot com )
HvdH Performance Consulting

Hein van den Heuvel · 02-26-2009

>> And I'd assume that reverting to allowing what is documented may break some existing applications. Even though that may well be permitted, based on the documentation.

Nah, the implementation is just half-baked, and no customer complained enough about it to get it fixed.

As it is, 'dangling' Secondary Index Data Array (SIDR) entries pointing to deleted records are only cleaned up when those pointers are followed to attempt a 'get'.

What is missing is this: on a PUT (or UPDATE), when the check for a duplicate key is made, then IF, and only if, a duplicate key error is about to be returned, a quick check needs to be done on whether the record pointed to was in fact deleted, in which case the SIDR entry can simply be re-used for the new record.

This would NOT slow down regular inserts; it would just move the deferred price of the delete to the 'synergy' time. That is, RMS is already looking at the target SIDR bucket. Following the pointer to the data record is the smallest possible extra expense in the grand scheme.

Yeah there are 'details' with bucket and record locks to consider, but when I last analyzed this it seemed fair game. Just a (big!) missed opportunity.
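
The missing check described above can be pictured with a toy model. This is a sketch of the idea only, not how RMS is implemented: the two dictionaries, the function names and the record ids are hypothetical, and all locking is ignored.

```python
def put(records, sidr, rec_id, alt_key_value):
    """Toy model of a no-duplicates alternate key insert with a stale-entry check.

    records: dict of live records, rec_id -> alt_key_value
    sidr:    dict modelling the alternate index, alt_key_value -> rec_id
    """
    existing = sidr.get(alt_key_value)
    if existing is not None:
        if existing in records:
            raise ValueError("duplicate key")   # genuine duplicate
        # Entry points at a deleted record: reuse it for the new record.
    sidr[alt_key_value] = rec_id
    records[rec_id] = alt_key_value

def fast_delete(records, rec_id):
    """Remove the record only; its alternate index entry is left dangling."""
    del records[rec_id]

records, sidr = {}, {}
put(records, sidr, 1, "A")
fast_delete(records, 1)          # sidr["A"] now dangles
put(records, sidr, 2, "A")       # succeeds by reusing the stale entry
try:
    put(records, sidr, 3, "A")   # record 2 is live, so this is a real duplicate
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(sidr, duplicate_rejected)
```

The extra lookup happens only on the path that was about to report a duplicate, so ordinary inserts with new key values take the same route as before.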

Hein
( who is still sorry, he couldn't get permission/time to fix this back in '91 when I was working in RMS Engineering :-)

HDS · 02-26-2009

What wonderful information from you both, Hoff and Hein. Thank you.

Funny how I documented all of that info and forgot to include my true underlying purpose. In any case, the purpose of that test code was to investigate the report that, and I quote, "Record deletions [from that file] are taking 15-20 times longer than they should." It is hard to get an understanding of "what should be" and how this can be "15-20 times" worse. So, I figured that I would throw together some code to just delete that file's records. In actuality, I am not quite sure how anyone justified that measurement, but I was able to at least see that deletes were taking longer than I would have expected. So, my research led me to that question about Fast Deletes...since we do have that attribute enabled on our production runs (the ones reportedly running so slow).

That said, allow me to respond to your questions:
- The file is sharable and must remain so across a cluster of 8 varying ALPHA systems. Note that the test run info provided was on a much smaller cluster of 4 smaller development ALPHA systems.

- I am under the assumption that the OPEN defaults to a deferred write; that I do not need to specifically enable FAB$V_DFW...but I suppose I could be wrong.

- The goal here is to speed up record deletes. "Speed up" is vague; I understand, but that is what is truly being asked.

- I am attaching a text file showing the $SHOW RMS display. I will be trying out the suggestion to use the 100 indexed buffers and will let you know how that goes. Thank you for that suggestion.

- The FDL that I used to create ($CONVERT) that file prior to that test run was generated by using the following:
$ ANALYZE/RMS/FDL/OUT= file.dat
$ EDIT/FDL/NOINTER/GRAN=DOUBL/EMPH=FLATT -
/ANAL=/OUT=FILE.FDL -
basis_fdl.fdl
I was going to try the interactive OPTIMIZE script, but I was really looking for some alternative "hit" at this one.

- Yes, I understand and agree with the word of caution on the RAB and RAB64 stuff...it does confuse me sometimes, admittedly.

- I will try the explicit SYS$FLUSH to see its effect.

- Correct...no DECnet in play here.

- I agree with the idea of more aggressive caching. I suppose that I will get to evaluating options on that once I get a better grasp on what I am seeing with things as they are now.

- The option of faster disks and IO subsystem components is in play...but I think that that report of "15-to-20 times" is based on a constant hardware baseline. Not sure, though.

Thanks again for the assistance. As soon as I get a chance, I will be trying what has been discussed.

-Howard-

Hoff · 02-26-2009

My long-standing rule of thumb is to dig out the size of the index data and use that as the initial size for the global buffers setting.

ANALYZE /RMS /STATISTICS gets the bucket size and gets the block counts in each of the indices. Total up the block counts in each of the indices and (assuming the same bucket size throughout the file) divide by the bucket size, and round up some to account for index growth (if appropriate) and for some data buckets.
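
As a worked example of that rule of thumb, here is a small sketch. The block counts, the bucket size and the 25% `headroom` allowance are made-up values for illustration; the real inputs would come from the ANALYZE /RMS /STATISTICS output for the file.

```python
import math

def initial_global_buffers(index_blocks, bucket_size_blocks, headroom=0.25):
    """Estimate a starting global buffer count from index sizes.

    index_blocks:       block count of each index, from ANALYZE /RMS /STATISTICS
    bucket_size_blocks: bucket size in blocks (assumed equal throughout the file)
    headroom:           hypothetical allowance for growth and some data buckets
    """
    index_buckets = sum(index_blocks) / bucket_size_blocks
    return math.ceil(index_buckets * (1 + headroom))

# Made-up block counts for three indexes with 12-block buckets.
print(initial_global_buffers([1200, 840, 600], bucket_size_blocks=12))
```

For these made-up numbers the indexes occupy 220 buckets, and the estimate with headroom is 275 buffers. It is a starting point to be refined against real cache hit rates, not a tuned value.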

Also get some T4 and XFC hot-file data and (if you have it) DECset PCA application performance data, and see what's really going on. I've had more than a few cases when the actual hot files or the actual slow code was, um, entirely elsewhere.

Hein van den Heuvel · 02-26-2009

>> Funny how I documented all of that info and forgot to include my true underlying purpose.

:-)

"Record deletions [from that file] are taking 15-20 times longer than they should."

Vague, but not un-reasonable.
It will be hard to help singleton deletes,
but bulk deletes can be helped through CACHING, in turn helped with ORDER.

It may be worthwhile to sort the records to be deleted by alternate key and use that to drive the deletes.
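
Why ordering helps caching can be seen in a small simulation. This is a sketch under simplifying assumptions, not a model of RMS or the XFC: one alternate index, a fixed number of entries per bucket, a plain LRU cache, and made-up sizes throughout.

```python
import random
from collections import OrderedDict

def bucket_reads(keys, keys_per_bucket, cache_buckets):
    """Count simulated bucket reads for a key sequence with an LRU cache."""
    cache = OrderedDict()
    reads = 0
    for key in keys:
        bucket = key // keys_per_bucket      # bucket holding this index entry
        if bucket in cache:
            cache.move_to_end(bucket)        # cache hit
        else:
            reads += 1                       # cache miss: one simulated read
            cache[bucket] = True
            if len(cache) > cache_buckets:
                cache.popitem(last=False)    # evict least recently used
    return reads

rng = random.Random(42)                      # fixed seed for repeatability
alt_keys = list(range(10_000))               # made-up alternate key values
rng.shuffle(alt_keys)                        # arrival order unrelated to key order

unsorted_reads = bucket_reads(alt_keys, keys_per_bucket=50, cache_buckets=20)
sorted_reads = bucket_reads(sorted(alt_keys), keys_per_bucket=50, cache_buckets=20)
print("unsorted:", unsorted_reads, "sorted:", sorted_reads)
```

In sorted order each of the 200 simulated buckets is read exactly once, while the shuffled order keeps evicting buckets that are needed again later. A real file has several indexes and can only be driven in the order of one of them, so the gain in practice would be smaller than in this single-index sketch.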

>> - The file is sharable and must remain so across a cluster of 8 varying ALPHA systems.

OK. Please realize that concurrent writes to the file on another node WILL flush the XFC cache and will then cause REAL IOs, which surely explains the perceived slowness for the real application.
You might NOT see this in a test.

>> - I am under the assumption that the OPEN defaults to a deferred write;

I would expect NOT. DFW has the potential to cause stale, and even corrupt, data on a process/system crash. Typical OpenVMS defaults do not opt for the worrisome case.

>> The goals here to speed up record deletes. "Speed up" is vague; I understand, but that is what is truly being asked.

>> - I am attaching a text file showing the $SHOW RMS display.

Hmmm.. if RMS is a serious performance component to this application then either you folks are really smart or way overdue for some performance review & tuning.
I know some folks who are real good at that. Send Email! :-)

>> I will be trying out the suggestion to use the 100 indexed buffers and will let you know how that goes.

100 is 'just a big number'. It need not be optimal, but it will help a lot WHEN ACTIVELY SHARING. In a test case it may hurt because the XFC will help so much!

>> - I will try the explicit SYS$FLUSH to see its effect.

That's just to mitigate any concerns about stale buffers. It will NOT speed up matters.

>> - I agree with the idea of more aggressive caching.

RMS GLOBAL buffers are the ultimate tool for this.

I don't expect T4 to show much interesting stuff here, as the operations are so predictable.

But as Hoff suggests you should use SHOW MEMORY /CACHE=FILE=xxx
Use that on the PRODUCTION NODES for the PRODUCTION files, execute it at more or less the same time on all nodes, and do it a few times, seconds or minutes apart. This will help you understand whether the XFC cache is likely to be flushed during the bulk delete, and whether the solution is perhaps as simple as running the delete on a specific node.

Cheers,
Hein.

Hein van den Heuvel · 03-04-2009

HDS,

The 'points' suggest that you are happy now, but could you perhaps reply with the solution(s) opted for and perhaps some numbers behind them (preferably from production)?

If details would be too much, too ill organized, or inappropriate to share here, then please consider sending me a private Email with some follow up?
Thanks!
Hein van den Heuvel (at gmail dot com )

HDS · 03-04-2009

Good morning.

I did assign the points, but am actually still working this item. I became occupied with another project for the moment, so I could not provide the proper attention to this. My apologies for the delay. I do absolutely intend on providing a full response as soon as I get a chance. For now, I can say that, as a result of the information that has been provided, I was able to improve the performance dramatically. I do have some other questions, though, but for the most part things look good.

I do need to complete this other task first, however. I promise that I will respond over the next few days.

Again, I thank you (all) for your advice.

-Howard-

HDS · 03-27-2009

Hello.

Just a brief update.
I have been away from this project for a while because of other immediate priorities and personal health issues. My sincerest apologies for such a delay in responding to you.

All is well now and this item is once again near the top of my project list, so I will be back on it shortly.

I cannot thank all of you enough for your advice. Your information has proven to be very helpful. I am grateful and, speaking professionally, do owe you a response and an update.

I will be in touch. I ask for your patience and request that this thread remain open for a short time longer. It's funny how one set of answers leads to other questions... questions which I do wish to post, and will do so as soon as I can get the details together.

Much obliged.

-H-
