Re: extremely slow read performance on indexed file

mpradhan · ‎10-30-2015

hi,

We have a file with the following FDL:

IDENT FDL_VERSION 02 "23-OCT-2015 08:56:11 OpenVMS ANALYZE/RMS_FILE Utility"

SYSTEM
SOURCE OpenVMS

FILE
ALLOCATION 19745088
BEST_TRY_CONTIGUOUS yes
BUCKET_SIZE 52
CLUSTER_SIZE 32
CONTIGUOUS no
EXTENSION 59124
FILE_MONITORING no
NAME "<NAME REMOVED>"
ORGANIZATION indexed
OWNER [<OWNER REMOVED>]
PROTECTION (system:RWED, owner:RWED, group:RE, world:)
GLOBAL_BUFFER_COUNT 0
GLBUFF_CNT_V83 0
GLBUFF_FLAGS_V83 none

RECORD
BLOCK_SPAN yes
CARRIAGE_CONTROL carriage_return
FORMAT variable
SIZE 25083

AREA 0
ALLOCATION 19745072
BEST_TRY_CONTIGUOUS yes
BUCKET_SIZE 52
EXTENSION 59124

KEY 0
CHANGES no
DATA_KEY_COMPRESSION no
DATA_RECORD_COMPRESSION no
DATA_AREA 0
DATA_FILL 100
DUPLICATES no
INDEX_AREA 0
INDEX_COMPRESSION no
INDEX_FILL 90
LEVEL1_INDEX_AREA 0
NAME "LOTID"
NULL_KEY no
PROLOG 3
SEG0_LENGTH 12
SEG0_POSITION 0
TYPE string

KEY 1
CHANGES yes
DATA_KEY_COMPRESSION no
DATA_AREA 0
DATA_FILL 90
DUPLICATES yes
INDEX_AREA 0
INDEX_COMPRESSION no
INDEX_FILL 90
LEVEL1_INDEX_AREA 0
NAME "ACTIVITYKEY"
NULL_KEY no
SEG0_LENGTH 11
SEG0_POSITION 12
SEG1_LENGTH 2
SEG1_POSITION 31
TYPE string

KEY 2
CHANGES yes
DATA_KEY_COMPRESSION no
DATA_AREA 0
DATA_FILL 90
DUPLICATES yes
INDEX_AREA 0
INDEX_COMPRESSION no
INDEX_FILL 90
LEVEL1_INDEX_AREA 0
NAME "COMCLASS"
NULL_KEY no
SEG0_LENGTH 1
SEG0_POSITION 22
SEG1_LENGTH 2
SEG1_POSITION 31
TYPE string

KEY 3
CHANGES yes
DATA_KEY_COMPRESSION no
DATA_AREA 0
DATA_FILL 90
DUPLICATES yes
INDEX_AREA 0
INDEX_COMPRESSION no
INDEX_FILL 90
LEVEL1_INDEX_AREA 0
NAME "TIMESTAMPKEY"
NULL_KEY yes
NULL_VALUE 0
SEG0_LENGTH 8
SEG0_POSITION 23
TYPE string

ANALYSIS_OF_AREA 0
RECLAIMED_SPACE 0

ANALYSIS_OF_KEY 0
DATA_FILL 8
DATA_KEY_COMPRESSION 0
DATA_RECORD_COMPRESSION 0
DATA_RECORD_COUNT 137409
DATA_SPACE_OCCUPIED 7274904
DEPTH 2
INDEX_COMPRESSION 0
INDEX_FILL 64
INDEX_SPACE_OCCUPIED 6604
LEVEL1_RECORD_COUNT 139902
MEAN_DATA_LENGTH 2383
MEAN_INDEX_LENGTH 15
LONGEST_RECORD_LENGTH 6823

ANALYSIS_OF_KEY 1
DATA_FILL 20
DATA_KEY_COMPRESSION 0
DATA_RECORD_COUNT 9725
DATA_SPACE_OCCUPIED 25688
DEPTH 1
DUPLICATES_PER_SIDR 13
INDEX_COMPRESSION 0
INDEX_FILL 31
INDEX_SPACE_OCCUPIED 52
LEVEL1_RECORD_COUNT 494
MEAN_DATA_LENGTH 263
MEAN_INDEX_LENGTH 17

ANALYSIS_OF_KEY 2
DATA_FILL 18
DATA_KEY_COMPRESSION 0
DATA_RECORD_COUNT 7054
DATA_SPACE_OCCUPIED 24492
DEPTH 1
DUPLICATES_PER_SIDR 18
INDEX_COMPRESSION 0
INDEX_FILL 12
INDEX_SPACE_OCCUPIED 52
LEVEL1_RECORD_COUNT 471
MEAN_DATA_LENGTH 319
MEAN_INDEX_LENGTH 7

ANALYSIS_OF_KEY 3
DATA_FILL 0
DATA_KEY_COMPRESSION 0
DATA_RECORD_COUNT 134651
DATA_SPACE_OCCUPIED 12377924
DEPTH 2
DUPLICATES_PER_SIDR 0
INDEX_COMPRESSION 0
INDEX_FILL 49
INDEX_SPACE_OCCUPIED 10452
LEVEL1_RECORD_COUNT 238037
MEAN_DATA_LENGTH 17
MEAN_INDEX_LENGTH 11

DIRECTORY/FULL on the file:

Size: 9.10GB/9.35GB Owner: [<OWNER REMOVED>]

Created: 7-JUL-2011 16:19:55.32

Revised: 30-AUG-2015 16:04:50.21 (581)

Expires: <None specified>

Backup: <No backup recorded>

Effective: <None specified>

Recording: <None specified>

Accessed: <None specified>

Attributes: <None specified>

Modified: <None specified>

Linkcount: 1

File organization: Indexed, Prolog: 3, Using 4 keys

Shelved state: Online

Caching attribute: Writethrough

File attributes: Allocation: 19626816, Extend: 59124, Maximum bucket size: 52, Global buffer count: 0, No version limit, Contiguous best try

Record format: Variable length, maximum 25083 bytes, longest 0 bytes

Record attributes: Carriage return carriage control

RMS attributes: None

Journaling enabled: None

File protection: System:RWED, Owner:RWED, Group:RE, World:

Access Cntrl List: None

Client attributes: None

We are trying the sample program snippets below, which read the file using stream I/O and using RMS. In both cases the read takes an extremely long time to finish, whereas FTP and BACKUP/SAVE_SET on the same file complete in no time.

File Read using stream:

FILE *fp = fopen("filename", "rb", "ctx=stm", "shr=get");

while(!EOF)

fread

File Read using RMS

sys$open(FAB);

sys$connect(RAB);

while(!EOF)

sys$get(RAB);

Is there anything we can do to change the way we are reading the file, to match FTP, BACKUP's speed?

Thanks,

Manoj

Hein van den Heuvel · ‎10-30-2015

>> extremely long time .

oh c'mon. You can do better than that!

hh:mm:ss each method?

cpu time used (lib$show_timer, ^T)

First thing to try:

Convert the file with these KEY 0 settings: DATA_RECORD_COMPRESSION yes, DATA_FILL 90, and BUCKET_SIZE 63 for the data area.

Add an AREA 1 with bucket size 32 or 16 for the rest (the index and the alternate keys).

Change the record access test to use RAB$V_NQL ( or NLK+RRL )

Next, consider a third-party product like Vselect from EGH, which can read RMS indexed file records quicker than RMS can, using no locking and large IOs.

Possible explanation:

Backup and FTP will read the file in BLOCK mode, no locks. Minimal CPU time.

How many IO's do they take? ( ^T, or logout stats )

Both your RMS tests perform record IO, with full lock protection.

I suspect you have high (kernel) CPU, possibly even bottlenecked at CPU.

For every $GET requested, RMS will roughly do this:

- DEQ prior record lock; CNV EX bucket lock; ENQ EX record; CNV NL bucket lock.

- For every new bucket, it will DEQ some old bucket lock, and ENQ NL bucket lock.

Now the file. It shows:

DATA_FILL 8, so just 8 % of the space is used.
DATA_RECORD_COUNT 137409
DATA_SPACE_OCCUPIED 7274904 --> divided by bucketsize=52 --> 139902

So we actually have MORE buckets than records. Empty buckets, which will be read!

(admittedly BACKUP and FTP also read those).
LEVEL1_RECORD_COUNT 139902 --> confirms math.
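A minimal sketch of the arithmetic above, in C, for anyone who wants to run the same check against their own ANALYZE/RMS_FILE output. The function names are made up for illustration. The inputs are the DATA_SPACE_OCCUPIED, BUCKET_SIZE and DATA_RECORD_COUNT values from the posted FDL.

```c
#include <stdint.h>

/* Buckets implied by a space figure reported in blocks
 * (DATA_SPACE_OCCUPIED) and the bucket size in blocks (BUCKET_SIZE). */
uint64_t buckets_from_blocks(uint64_t blocks_occupied, uint32_t bucket_size_blocks)
{
    return bucket_size_blocks ? blocks_occupied / bucket_size_blocks : 0;
}

/* Average data records per bucket, scaled by 1000 so the result
 * stays in integer arithmetic (982 means 0.982 records per bucket). */
uint64_t records_per_bucket_x1000(uint64_t records, uint64_t buckets)
{
    return buckets ? (records * 1000u) / buckets : 0;
}
```

With the posted numbers, buckets_from_blocks(7274904, 52) gives 139902 and records_per_bucket_x1000(137409, 139902) gives 982, i.e. slightly less than one record per data bucket.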

So now what you have is a full bucket ENQ/DEQ per record, not just the converts.

Plus you have the record locks that you don't really need.

Packing more records in the bucket (data compression, large fill) will, besides the obvious IO savings, switch that to mostly converts for the bucket locks, which is much cheaper.

NQL will remove the record locks.

That should reduce the kernel mode usage a lot, leaving just the EXEC mode usage for actual RMS work.
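The lock traffic described above can be put into a rough back-of-the-envelope model. The C sketch below is only that: it takes the "roughly" per-$GET sequence at face value (two record lock operations and two bucket lock converts per record, one ENQ plus one DEQ per new bucket) and assumes NQL removes only the record lock operations. The struct and function names are hypothetical. The packed-bucket estimate ignores bucket and record overhead and any gain from compression, so treat it as an order-of-magnitude guess and not as what CONVERT will actually produce.

```c
#include <stdint.h>

/* Estimated lock manager calls for one sequential pass over the file. */
struct lock_ops {
    uint64_t record_lock_ops;  /* DEQ prior record lock + ENQ EX record lock */
    uint64_t bucket_converts;  /* CNV EX + CNV NL on the bucket lock */
    uint64_t bucket_enq_deq;   /* DEQ old bucket lock + ENQ NL new bucket lock */
};

/* no_query_locking != 0 models RAB$V_NQL: record lock ops drop to zero. */
struct lock_ops estimate_lock_ops(uint64_t records, uint64_t buckets,
                                  int no_query_locking)
{
    struct lock_ops ops;
    ops.record_lock_ops = no_query_locking ? 0 : 2 * records;
    ops.bucket_converts = 2 * records;
    ops.bucket_enq_deq  = 2 * buckets;
    return ops;
}

/* Assumed estimate of data buckets after a CONVERT: total record bytes
 * divided by the usable bytes per bucket at the given fill percentage,
 * rounded up. Overhead and compression are deliberately ignored. */
uint64_t estimate_packed_buckets(uint64_t records, uint64_t mean_record_bytes,
                                 uint32_t bucket_size_blocks, uint32_t fill_percent)
{
    uint64_t usable = (uint64_t)bucket_size_blocks * 512u * fill_percent / 100u;
    uint64_t total  = records * mean_record_bytes;
    return usable ? (total + usable - 1) / usable : 0;
}
```

For the file as posted (137409 records, 139902 data buckets), the model gives 279804 bucket ENQ/DEQ calls on top of 274818 record lock operations. With MEAN_DATA_LENGTH 2383, 63-block buckets and 90% fill, estimate_packed_buckets returns 11280, which would cut the bucket ENQ/DEQ calls to 22560 under the same assumptions.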

Good luck!
Hein

Hoff · ‎10-30-2015

Depends on your goals, though: why you're choosing a sequential record read over block reads, for instance.

If you're just looking to copy the file around, then...

Map the file into memory, map the target file into memory, perform a big byte copy, close?
Use the callable BACKUP API?
Use Carl's callable copy, or Bob Sampson's Fast I/O copy; both linked at http://labs.hoffmanlabs.com/node/417

Hein van den Heuvel · ‎11-03-2015

Manoj, any feedback?

You spent considerable time creating this topic,

I spent considerable time trying to answer.

Please don't leave us hanging.

Hein

Hoff · ‎11-04-2015

It's easy to match the speed of the wildly-insecure FTP transfers, with the use of RMS block mode.

Going by the posted source code (not really source code, unfortunately), I'd expect it is probably using record reads and not block reads. That's easy to verify with some instrumentation in the code: check whether the final transfer count matches the number of blocks transferred or the number of records transferred.

Here's an ancient callable copy that might help. This C code shows how to use RMS block mode.

This twenty+ year old software configuration is missing more than a few patches, performance updates and security fixes, too. It's far too old for callable BACKUP or other functional enhancements. It is missing more than a few C fixes as well: this is a barely functional C RTL, particularly given that OpenVMS V6.1 was one of the first releases with the DEC C RTL even present.

Hein van den Heuvel · ‎11-04-2015

Hi Hoff,

I'm not sure why you keep referring to copy?!

It seems clear to me that the OP wants to GET the records, the actual data bytes, not the raw blocks.

OP used FTP and BACKUP as a COMPARISON to prove that the whole file, indexed file overhead and all, can be sucked off the disk quickly enough, compared to telling RMS to just read the records.

mpradhan · ‎03-17-2016

Guys,

Sorry, I have been trying to come back here and thank you all for the help and guidance, but for some reason the forum would not let me log in, failing with the error "IP validation failed".

Finally I could log in.

Thanks, I went ahead with the block I/O option and things are working fine.

Thanks,

Manoj
