SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Mark Corcoran · ‎03-01-2010

I've been looking at a problem to do with the post-processing of table files unloaded from RDB using RMU.

The table content is in the order in which data was added to the table, not sorted in order.

For some tables, the amount of time taken to "index" then after they are unloaded, can be significantly longer than "normal", regardless of whether or not there are significantly more or less rows in the table.

I had been convinced that the issue was the fact that:

a) all of the tables were being unloaded at the same time
b) ...and to the same "disk" (it's actually a shadowset "disk" in a SAN farm, which I'm not responsible for, so who knows how exactly it is configured)
c)after the unload is finished, where a table needed to be indexed, a CONV/SORT would be performed on it, with the output file (and sometimes the sort work and convert work files) being output to the same disk as the input file.

On having examined the input file to CONV /SORT for one table, on a few occasions where the CONV has been running for 12+ hours, I was convinced that the problem was because the input file was badly fragmented (~2000 fragments, for ~45m records across 11.5m blocks)

I've since manually copied the fragmented file to another empty disk, making the file contiguous.

Attempting the CONVERT /SORT still causes it to take hours, or at least, longer than I'm prepared to wait, so I abort it.

Before aborting, I've set the protection on the sort and convert work files, so that CONVERT wouldn't tidy them up on rundown, and analysing them, shows them to have ~200 fragments each.

SORTing the input file first of all (the sorted output file is also contiguous) takes about 13 minutes, and then CONVERT /NOSORT takes about a further 45 minutes.

I aborted both the SORT and the CONVERT /NOSORT operations (on separate runs), after changing the protection on the work files, and could see that they had a similar number of fragments when just a CONVERT /SORT was attempted.

I've tried CONVERT /SORT with differing numbers of work files, and using a different disk for SYS$SCRATCH, but it doesn't seem to make any difference.

All the evidence appears to point to CONVERT/SORT being manifestly less efficient than SORT + CONVERT/NOSORT.

I can't find anything in the documentation that would explain why this should be the case (in both the CONVERT /SORT and /NOSORT cases, the same FDL file is used) - I'm convinced that it shouldn't be the case.

Does anyone have any suggestions as to why I might be observing the problems I'm seeing?

Mark

Jan van den Ende · ‎03-01-2010

Mark,

I am not really sure that it will be THE issue, but the large number of fraagments for the work files DOES play a part.
Can you use a drive with EXTEND size in the order of the file size for workfiles? That would lead to workfiles with at most 3 fragments, making a significant ( but how much ? ) difference.
And, just to make sure, have a second look at highwatermarking being turned OFF!!

hth

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.

Hoff · ‎03-01-2010

In no particular order...

If you're going to look to run with the older OpenVMS tools and settings, then look to create the output file as one huge hunk. And look to allocate the file in (larger) hunks.

45 million records of about 150 characters isn't all that much these days. See if you can sort it in memory. (This depends on which box you're dealing with; Alpha is slow, as are the older Itanium processors. And many of these boxes are under-configured for physical memory and for working set.)

I'd look at several things here. Working set as mentioned; both what this process is authorized and what the system is configured to allow. I'd look at parallelism; Knuth et al. Split these 45 million records across however many processors you have. The standard VMS libraries aren't parallelized, AFAIK. (But just for giggles, see if Hypersort helps.)

OpenVMS I/O isn't fast, and older hardware can be a bottleneck. Fragmented disks (which you appear to have here) and dinky allocation sizes make this worse, and the typical SAN stuff is fairly glacial. The typical 2 Gb HBAs and EVAs are fairly slow these days.

Make sure you don't have highwater marking (HWM) lit.

I'd guess that the core limit here is I/O or paging, but I'd profile the process activity and see where it's spending its time.

If your key strings are close or if your FDL design doesn't allow for the numbers of records here, you can end up with some pretty nasty RMS file structures, too. FDL dates back to when 45 million was very big, so its defaults might need explicit coaching to get you decent performance here.

Tokyo Cabinet or other key-value storage such may well be faster than messing around with the classic stuff. Depending on what you're up to, some of the newer tools can segment this stuff across hosts nicely, too; you can scale up smaller boxes. (I haven't ported it, nor have I seen a Tokyo Cabinet port for OpenVMS, though a port is reportedly in progress. It does run on the other boxes I deal with, and it's fast.)

http://00f.net/2009/an-overview-of-modern-sql-free-databases
http://1978th.net/tokyocabinet/
http://www.johndapps.com/

Mark Corcoran · ‎03-01-2010

Gak! I had attempted to reply earlier, and after clicking on Submit, I focused on doing some other work; come back to the session to find the browser blank but with no indication that Forums had accepted my update (I have a vague recollection that it doesn't like you taking a long time between Reply and Submit); I then did some more browsing, so can't find it in my history.

Oh well, here's having a go at remembering what I put:

Thanks Jan (and now Stephen).

I don't think in this case that fragmentation is really the issue - from a clean disk with essentially nothing else on it
(apart from BITMAP.SYS, 000000.DIR yada yada), I have performed SORT & CONVERT /NOSORT, got the stats, deleted the files, then performed a CONVERT /SORT, got the stats and compared them.

The volume characteristics (which would determine the extent size of files created on it) is alas not within my team's purview, but that wouldn't stop me from doing a SET RMS_DEFAULT/EXTEND= prior to the CONVERT command...

With the input & output and work files all on the same disk, and the input file being contiguous, both methods resulted in output files that were contiguous too (in the SORT scenario, there's obviously 3 files - input, sorted, sorted & converted - these were all contiguous).

The fragmentation of the sort work files appears to be similar between methods (~200 framents each).

However, SORT & CONVERT/NOSORT appears to be consistently quicker/more efficient than CONVERT /SORT

I've looked at the 7.3 RMS Record Management Services Reference Manual, and I have to say I'm confused by the wording used when it describes CONVERT, which probably doesn't help me work out it's meant to function...

"1. The primary keys are sorted...

...If your input records are already ordered by the primary key or if the primary key of the input and output files is the same, you should specify /NOSORT.
This qualifier ensures that the primary keys are not sorted again."

The input file in my case is not indexed/keyed.

However, the manual talks about the primary key of the input/output files being the same (in which case, use /NOSORT), but I'd infer that an input file can only have a primary key that matches the primary key of an output file if the input file is actually indexed/keyed.

It also says that the /NOSORT qualifier ensures that the primary keys "are not sorted again", but it's not really clear in which step (of the 6 in the manual) that the primary keys might be sorted a second or subsequent time.

It also goes on to talk about "secondary key" and "the first alternate key" -
Is "the first alternate key" the "secondary key" (and the second alternate key is the "tertiary key")?
Or is "the first alternate key" the first key after the secondary key (i.e. the tertiary key, or numerically speaking, key=2)?

I've pasted the FDL below (with key &etc. names changed, for security reasons).

I can't see anything obviously wrong about it, except for the fact that it has not been regularly updated as data has grown over the years (the Area 0 allocation should be ~11500000 instead of 5000000).

Perhaps the contents of the FDL would make it obvious as to why CONVERT /SORT appears to be slower - I don't find the description in the RMS manual particularly helpful in visualising how CONVERT does what it does (an example would have been very useful).

TITLE "X.unl A B indexing"

IDENT "22-JAN-2003 12:40:51 OpenVMS FDL Editor"

SYSTEM
SOURCE "OpenVMS"

FILE
NAME "X.unl"
ORGANIZATION indexed

RECORD
CARRIAGE_CONTROL carriage_return
FORMAT variable
SIZE 200

AREA 0
ALLOCATION 5000000
BEST_TRY_CONTIGUOUS yes
BUCKET_SIZE 18
EXTENSION 65520

AREA 1
ALLOCATION 9828
BEST_TRY_CONTIGUOUS yes
BUCKET_SIZE 18
EXTENSION 2466

KEY 0
CHANGES no
DATA_AREA 0
DATA_FILL 100
DATA_KEY_COMPRESSION yes
DATA_RECORD_COMPRESSION yes
DUPLICATES no
INDEX_AREA 1
INDEX_COMPRESSION no
INDEX_FILL 100
LEVEL1_INDEX_AREA 1
NAME "A"
PROLOG 3
SEG0_LENGTH 12
SEG0_POSITION 0
TYPE string
KEY 1
CHANGES no
DATA_AREA 0
DATA_FILL 100
DATA_KEY_COMPRESSION yes
DATA_RECORD_COMPRESSION yes
DUPLICATES no
INDEX_AREA 1
INDEX_COMPRESSION no
INDEX_FILL 100
LEVEL1_INDEX_AREA 1
NAME "B"
PROLOG 3
SEG0_LENGTH 20
SEG0_POSITION 42
TYPE string
KEY 2
CHANGES no
DATA_AREA 0
DATA_FILL 100
DATA_KEY_COMPRESSION yes
DATA_RECORD_COMPRESSION yes
DUPLICATES yes
INDEX_AREA 1
INDEX_COMPRESSION no
INDEX_FILL 100
LEVEL1_INDEX_AREA 1
NAME "C"
PROLOG 3
SEG0_LENGTH 11
SEG0_POSITION 68
TYPE string

Mark Corcoran · ‎03-01-2010

Incidentally, in reply to some of Stephen's points:

>If you're going to look to run with the older OpenVMS tools and settings, then look to create the output file as one huge hunk. And look to allocate the file in (larger) hunks.
I had actually been working on an update to the command files used to unload and index the files, so that they would specify suitable (per-table) values for the initial allocation and extension sizes in RMU UNLOAD, and then a similar thing with the CONVERT (modifying the FDL file as the initial unload file gets bigger).

Some colleagues in the meantime had been messing around with SORT & CONVERT/NOSORT, and found that it was quicker - I couldn't believe that it could be the case, because CONVERT uses the same old SORT shareable image just the same as SORT does.

I was convinced that they weren't comparing apples & pears, and that fragmentation was still an issue, until I managed to reproduce it myself.

>But just for giggles, see if Hypersort helps
I seem to remember reading that HYPERSORT can't be used with indexed files (does this apply to the input file or the output or both)?

>Fragmented disks (which you appear to have here) and dinky allocation sizes make this worse, and the typical SAN stuff is fairly glacial.
Hmm. Not that you knew, but I was working on changing this for all the table unloads, so that the initial allocation size would be commensurate with the current table unload size, and the extent size either 10% of the unload size, or 65000 whichever was the smaller.

It's interesting to note that the sort work files appear to be similarly fragmented irrespective of what method is chosen, yet the output file(s) remain contiguous (on this "test" disk I'm using), but the CONVERT/SORT always appears to be a no-brainer.

Someone has suggested that "these problems always [seem to] go away after a reboot, and that the memory pools are fragmented, causing issues for sort"; I'm a hardened sceptic however, and I think it's just a case of coincidence resulting in people adding 2 + 2 to make 5.

>Tokyo Cabinet or other key-value storage such may well be faster than messing around with the classic stuff
Thanks for the info on this and the URLs; unfortunately, it's the normal case of "money & is it approved" vs "Yes it might well fix the problem".

Hoff · ‎03-01-2010

Some tests...

Burn some disk space and run three one-key files on three disks.

Run a tuning pass on a fully-populated file and see what that turns up for an index depth suggestion.

Depending on what you're up to, try turning off key compression.

Check non-paged and related.

Given the lack of mention of details and the mention of price sensitivity, I'd infer this is Alpha (read: slow) hardware, too. It'd be entertaining trying a prototype of this on one of the mid-vintage Itanium boxes that show up fairly regularly for under US$1000 on the used market. Between DAS SAS I/O and the faster processors and the extra memory, these used Itanium boxes can blow away most Alpha boxes. (Though a ProLiant x86-64 box will likely beat Itanium on price-performance, if not on single-stream performance.)

Hein van den Heuvel · ‎03-01-2010

I don't have enough time right now but want to make a few quick observations to help a little perhaps.

- OpenVMS version is important.
-- CONVERT changed from file mode sort to record mode
-- CONVERT added stats.
I think this was all backin 7.2, but check.

>> longer than I'm prepared to wait, so I abort it.

It might have been 'almost done'. :-). I typically use PROCIO to gage where I am, or DIR/SIZE=ALL on the works files, or ANAL/SYST... SHOW PROC/RMS=(FAB,RAB) to actually understand how far it is, how far to go.
Of course with a current CONVERT and interactive session, you can just use ^T (Control-T) to have Convert tell you where it thinks it is.

Use CONVERT/STAT=FULL to see how much TIME the SORT within COVNERT took.

Try with a smaller cut of the input file?

Use CONVERT/SECO=3 for this file to make CONVERT use just one, bigger, CONVWORK.

Maybe more later...

Hein

Hein van den Heuvel · ‎03-01-2010

A few more minutes, so I'll tackle Mark's reply.

>> Gak! I had attempted to reply earlier, and after clicking on Submit, I focused on doing some other work; come back to the session to find the browser blank

Happened a good few times to me. I try to remember to ^A ^C before hitting submit. If I forget, and it failed, then ALT-back to go back to the prior pages has often brought back my work. Too late for you now....

>> I have performed SORT & CONVERT /NOSORT, got the stats,

CONVERT/STATS=FULL gives nice details.

deleted the files, then performed a CONVERT

>> "1. The primary keys are sorted...

...If your input records are already ordered by the primary key

or

if the primary key of the input and output files is the same, you should specify"

>> The input file in my case is not indexed/keyed.

Fine. Often convert is used from old-indexed to new-indexed.

The 'or' in the manual sentence is critical.
The first part has an implied expectation of a sequential input file.

>> I'd infer that an input file can only have a primary key that matches the primary key of an output file if the input file is actually indexed/keyed.

Right, a common situation, but not your case.

>> primary keys might be sorted a second or subsequent time.

Primary keys are sorted once, nor not at all, if you request so.

>> Is "the first alternate key" the "secondary key" (and the second alternate key is the "tertiary key")?

alternate and secondary are interchangeable for convert. I guess they started with secondary as different from primary. But 3rd secondary sounds too od, so they called it alternate from then onwards?

> the Area 0 allocation should be ~11500000 instead of 5000000

Well, that's a pretty big. That's a lot of potential, and avoidable, extents right there. It also suggests that the BUCKETsize may be stuck in some past value.

>> Perhaps the contents of the FDL would make it obvious as to why CONVERT /SORT appears to be slower

Not really.

>> - I don't find the description in the RMS manual particularly helpful in visualising how CONVERT does what it does (an example would have been very useful).

Hmm, there may have been a release note when /SECON was implemented. $ HELP CONV FILE/SECON is a good read.

Hein

John Gillings · ‎03-01-2010

Mark,

Forum timeouts and lost input are more common than successful posts from this side of the planet (great advertising for HP hardware!)

Standard practice is to select all text and copy before hitting "Submit". Sometimes hitting BACK will recover your text, otherwise you should still have it in the paste buffer.

A really dumb question about this issue... what does SORT of an indexed file actually achieve? I thought the whole point of an indexed file is that it's already sorted (at least on the primary key).

What keys are you sorting on? Maybe if you post the complete commands you're executing I'll understand.

One thing to sanity check is the LRL of the file you're sorting. If the LRL value of the input file is significantly higher than the actual length of the longest record, this can result in severe performance degradation of SORT or CONVERT/SORT (this was/is an issue with STREAM_LF files produced by C I/O). The output file from a CONVERT should have an accurate LRL, so if it's different from the input LRL, investigate the source of the file - why is the LRL inaccurate?

A crucible of informative mistakes

Phil.Howell · ‎03-01-2010

You converted file has two secondary keys, depending on the key data, this can impact performance.
You should create an intermediate fdl removing the B and C keys and convert to that first, then re-convert using your 3 key fdl.
You could also run this with the pre-sort as well for comparison.

Phil

Categories

Company

Local Language

Forums

Discussions

Forums

Discussions

Discussions

Forums

Discussions

Forums

Discussions

Forums

Forums

Discussions

Forums

Discussions

Forums

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Community

Resources

Other HPE Sites

Discussions

Forums

Blogs

SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???

Re: SORT plus CONVERT /NOSORT is quicker than CONVERT /SORT ???