
OPENVMS SORT problem

 
Desmond Or
Advisor


We have approximately 1.3 million records in an initial file. Several SORT commands are executed against this file, using /KEY, /DESCENDING, /ASCENDING and /NODUPLICATES, in order to remove duplicate records from the final version of the file.

We are finding that SORT is not generating the results we expect. Records we expect to be removed as duplicates remain in the file, and the record we wanted to keep in the final file has been removed.

Does anyone know of a third-party SORT utility I can use on OpenVMS? Or does anyone have another suggestion for sorting this data?
Hoff
Honored Contributor

Re: OPENVMS SORT problem

Just because the SORT command results are unexpected does not mean that the results are wrong; you're going to want to (need to) provide rather more context here.

Posting the DCL SORT command, whether or not you're using HyperSort, and a small reproducer data file matching the command would go a very long way toward describing what is happening here and what you expect to have happen within the example.

SORT is good at handling duplicates (with and without /STABLE), but it has no innate knowledge of which record is locally considered the duplicate, either.

It's likely feasible to custom-create tools for this task, but (beyond tools such as SORT) there are few generic tools toward this goal; the RMS file and record formats all tend to be local.
Hein van den Heuvel
Honored Contributor

Re: OPENVMS SORT problem

This is documented behaviour. From HELP:

SORT /DUPLICATES Full_Description

By default, Sort/Merge retains records with equal keys. The /NODUPLICATES qualifier eliminates all but one record with equal keys. The retained records may not appear in the same order as they appeared in the input file. If you want to specify which duplicate record to keep, invoke Sort at the program level and specify an equal-key routine.

The /STABLE and the /NODUPLICATES qualifiers are mutually exclusive.
-----------------------------

One could hope that /STABLE would perhaps make SORT keep the first occurrence, but as per the above, that combination is not allowed.

Using /STABLE on its own could of course prepare an output file which could subsequently easily be filtered for duplicates.

Maybe you should ask yourself exactly how you decide which record, among those duplicated by key, is the right one to keep. Perhaps there is a way to express that in a 'specification file'?
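
To make the "which record is the right one" question concrete, here is a minimal Python sketch of a keep rule written as a function. It is not the Sort/Merge equal-key routine interface, only an illustration of the idea. The record layout (a 4-character key in columns 1-4 and a YYYYMMDD date in columns 6-13) and the names `keep_preferred`, `key_of` and `newer` are assumptions made up for this example.

```python
from typing import Callable, Dict, Iterable, List


def keep_preferred(
    records: Iterable[str],
    key_of: Callable[[str], str],
    prefer: Callable[[str, str], bool],
) -> List[str]:
    """Keep one record per key, chosen by the caller's rule.

    prefer(candidate, current) returns True when the candidate should
    replace the record currently kept for that key.
    """
    kept: Dict[str, str] = {}
    for record in records:
        key = key_of(record)
        if key not in kept or prefer(record, kept[key]):
            kept[key] = record
    # Return the survivors in ascending key order.
    return [kept[key] for key in sorted(kept)]


# Hypothetical layout: key in columns 1-4, date (YYYYMMDD) in columns 6-13.
def key_of(record: str) -> str:
    return record[0:4]


def newer(candidate: str, current: str) -> bool:
    # Rule for this example only: keep the record with the latest date.
    return candidate[5:13] > current[5:13]


if __name__ == "__main__":
    data = [
        "A001 20080301 newer",
        "B002 20080115 only",
        "A001 20080101 older",
    ]
    for line in keep_preferred(data, key_of, newer):
        print(line)
```

Once the rule can be written down this precisely, it can often be turned into an extra sort key so that the wanted record always comes first within its group.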

hth,
Hein.
Jon Pinkley
Honored Contributor

Re: OPENVMS SORT problem

Desmond,

Can you create a small file (10 records should be big enough to reproduce) and show us what you are doing, and what results you are seeing that you didn't expect?

Is there a single key? Evidently you are making multiple sort passes. You don't specify that you are using /STABLE. That may be a requirement. See help sort /stable

Your description is not detailed enough for me to understand the problem, so I (and probably others) can't offer any useful suggestions, only guesses.

Jon

it depends
Jon Pinkley
Honored Contributor

Re: OPENVMS SORT problem

As Hein pointed out, you can't use /NODUPLICATES and /STABLE together. Also, I am not aware of any VMS utility that finds the first occurrence of each record with a given key.

Once the file is sorted, it isn't hard to write a program that will write the first occurrence of each record for a given /KEY value, but it would be nice if a utility like the following existed:

$ UNIQUE /KEY=(...) INFILE OUTFILE ! does not exist
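
A minimal sketch of such a filter, in Python rather than DCL, might look like the following. It assumes the input has already been sorted on the key (with /STABLE, if the original order within equal keys matters), so that records with equal keys are adjacent. The function name `unique_by_key` and the fixed-position key layout are assumptions for illustration, not an existing utility.

```python
from typing import Iterable, Iterator, Optional


def unique_by_key(records: Iterable[str], pos: int, size: int) -> Iterator[str]:
    """Yield the first record of each run of equal keys.

    Assumes the records are already sorted on the key, so equal keys
    are adjacent. pos is 1-based, like POSITION in SORT /KEY.
    """
    previous_key: Optional[str] = None
    for record in records:
        key = record[pos - 1 : pos - 1 + size]
        if key != previous_key:
            yield record
            previous_key = key


if __name__ == "__main__":
    # Hypothetical sorted input: 4-character key in columns 1-4.
    sorted_records = [
        "A001 first",
        "A001 second",
        "B002 only",
        "C003 first",
        "C003 second",
    ]
    for line in unique_by_key(sorted_records, pos=1, size=4):
        print(line)
```

Because it only remembers the previous key, the filter streams through the file in one pass and does not need to hold 1.3 million records in memory.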

Jon

it depends
Graham Burley
Frequent Advisor

Re: OPENVMS SORT problem

What's the difference between records you want to keep and those you don't? If you can describe this in terms of the data in the records then it's highly likely that SORT can do what you want it to.

SORT using Specification Files can do more than just sort the file; it can /INCLUDE or /OMIT records based on test /CONDITIONS on /FIELDS in the record (including binary data).

$ help sort spec spec
SORT
Specification_File_Qualifiers
Specification_File_Example
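
For readers unfamiliar with the idea, the effect of omitting records by a condition on a field can be sketched in Python as below. This does not use specification file syntax; the function name `omit_where` and the layout (a one-character status flag in column 6) are assumptions invented for the example.

```python
from typing import Iterable, List


def omit_where(records: Iterable[str], pos: int, size: int, value: str) -> List[str]:
    """Drop records whose field at (pos, size) equals value.

    pos is 1-based, mirroring how field positions are usually given.
    """
    return [r for r in records if r[pos - 1 : pos - 1 + size] != value]


if __name__ == "__main__":
    # Hypothetical layout: key in columns 1-4, status flag in column 6.
    data = ["A001 X stale", "A001 K good", "B002 K good"]
    for line in omit_where(data, pos=6, size=1, value="X"):
        print(line)
```

If the unwanted duplicates can be recognised by a test like this, they can be dropped before the keys are ever compared.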