Operating System - OpenVMS

Re: Comparing two files and extract data

 
Ian McWhirter
New Member

Comparing two files and extract data

Hi,

Please can someone help me? I'm really stuck! I have two text files that I want to compare. If the same data is in both, I want to extract that data.

I want the following output, as the data is in both files:

SEE ATTACHED FILE

Hope someone can help me!!!
Thank you,
Ian
7 REPLIES
Ken Robinson
Valued Contributor

Re: Comparing two files and extract data

What you're looking for is the intersection of the two files, i.e. a file containing only those records that are contained in both files.

Unfortunately, DCL doesn't have a built-in function or command that does that.

Here's a "quick & dirty" command procedure I just wrote & tested really quickly. There's not much error checking, so you probably don't want to use it "as is".

Ken

intersection.com

$ open/read f1 'p1'.txt
$ copy nl: 'p3'.txt
$ open/app out 'p3'.txt
$ read f1 line
$ write out line ! takes care of the header line
$rl:
$ read/end=done f1 line
$ sear/noout 'p2'.txt "''line'"
$ if $status .eqs. "%X00000001" then write/sym out line
$ goto rl
$done:
$ close out
$ close f1
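
For anyone who wants to see the logic of the procedure outside DCL, here is a minimal Python sketch. It is not something Ken posted: the function name `intersect_by_search` and the list-of-lines interface are made up for illustration. Like SEARCH, it treats each line of the first file as a string to look for anywhere inside the second file, ignoring case, so a short line can match inside a longer one.

```python
def intersect_by_search(lines1, lines2):
    """Return the header of lines1 plus every later line found inside some line of lines2."""
    if not lines1:
        return []
    out = [lines1[0]]  # the header line is copied unconditionally, as in the procedure
    lowered2 = [other.lower() for other in lines2]
    for line in lines1[1:]:
        needle = line.lower()
        # Substring test, not whole-line equality: this mirrors how SEARCH matches.
        if any(needle in other for other in lowered2):
            out.append(line)
    return out
```

If whole-record matches are what you need, the substring behaviour is worth keeping in mind, since a record such as "02R_346" would also match a longer record that merely contains it.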
Steven Schweda
Honored Contributor

Re: Comparing two files and extract data

It sounds as if you may need to write a real
program.

If you have (or can get) a copy of GNU
"diff" (part of the GNU Diffutils package),
then you might get close using something
like this:

pipe gdiff -y V1.TXT V2.TXT | search /match = nor sys$input "<", ">"

For example:

alp $ pipe gdiff -y V1.TXT V2.TXT | search /match = nor sys$input "<", ">"
BIN ITEM BIN ITEM
02R_346 11012447 02R_346 11012447
02S_234 00125774 02S_234 00125774
03B_233 99002567 03B_233 99002567
04P_459 00389256 04P_459 00389256

"gdiff -y" does "Output in two columns", and
you get lines with "<" or ">" if a line
appears in only one file, but lines which
appear in both files will appear twice per
line in the output. A little DCL
post-processing could cure that.

For large files, it may help to sort the
input files before sending them through
"diff".

http://www.gnu.org/software/diffutils/
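
The post-processing mentioned above could look something like the following Python sketch. It assumes the side-by-side output has already been filtered as in the example, so each remaining line is simply the same record printed twice. The helper name `left_half` is hypothetical, and runs of blanks inside a record are collapsed to single spaces.

```python
def left_half(side_by_side_line):
    """Return the record once if the line is one record repeated twice, else None."""
    tokens = side_by_side_line.split()
    half, odd = divmod(len(tokens), 2)
    # Reject lines that cannot be split into two identical halves.
    if odd or half == 0 or tokens[:half] != tokens[half:]:
        return None
    return " ".join(tokens[:half])
```

A line that still carries a "<" or ">" marker has an odd number of fields in this sample data and comes back as None, which gives a second line of defence if the SEARCH filter is left out.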
Phil.Howell
Honored Contributor

Re: Comparing two files and extract data

try this
Phil

$copy file1.txt files.txt
$append file2.txt files.txt
$sort files.txt sorted.txt
$sort/nodup files.txt nodup.txt
$diff/para/match=1 sorted.txt nodup.txt



copy file2.txt
Phil.Howell
Honored Contributor

Re: Comparing two files and extract data

(I don't know where that last line came from)
Phil
Hein van den Heuvel
Honored Contributor

Re: Comparing two files and extract data

Soooo many ways....
The easiest, once you have GNV installed:

$ mcr USERS:[SYS0.SYSCOMMON.GNV.bin]GREP -f file2.tmp file1.tmp

The 'native way' albeit with some noise:

$ conv/fdl="fil; org ind; rec; form fix; size 18; area 0; buc 60; key 0; seg0_l 18"/fast/excep=xxx.tmp/sort/pad/tru file1.tmp,file2.tmp tmp.tmp

pre 8.3 that would be:

$ conv/fdl=sys$input/fast/excep=both.txt/sort/pad/trun
file; org ind; rec; form fix; size 18; area 0; buc 60; key 0; seg0_l 18
file1.tmp,file2.tmp tmp.tmp
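
To make the CONVERT trick a little less magic, here is a rough Python sketch of what it appears to rely on. This is my reading rather than anything stated above: the whole 18-byte record is the primary key, that key does not allow duplicates, and so the second copy of any record is rejected into the exception file. The function name `load_unique` is made up for illustration.

```python
def load_unique(records, size=18):
    """Mimic loading records into a file whose primary key forbids duplicates."""
    accepted = {}     # stands in for the indexed output file (tmp.tmp)
    exceptions = []   # stands in for the exception file
    # Pad or truncate every record to the fixed size, then load in sorted order.
    for rec in sorted(r.ljust(size)[:size] for r in records):
        if rec in accepted:
            exceptions.append(rec)  # duplicate key: rejected, so it was in both inputs
        else:
            accepted[rec] = True
    return list(accepted), exceptions
```

The "noise" shows up here too: the rejected records come back padded to 18 characters, and a header line present in both files is reported like any other duplicate.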


With perl, this script will do the job:

----- both.pl -------
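# First file: remember every line.
$file = shift @ARGV;
open FILE, "<$file" or die "Could not open $file";
while (<FILE>) {
$seen{$_}++;
}
# Second file: print each line that has been seen before.
$file = shift @ARGV;
open FILE, "<$file" or die "Could not open $file";
while (<FILE>) {
print if $seen{$_}++;
}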
$ perl both.pl file2.txt file1.txt


And finally with DCL:

------------- both.com --------
$open/read file 'p1
$loop1:
$read/end=file2 file record
$x = f$edit(record,"COLLAPSE")
$x'x = 1
$goto loop1
$
$file2:
$close file
$open/read file 'p2
$loop2:
$read/end=done file record
$x = f$edit(record,"COLLAPSE")
$if f$type(x'x).eqs."INTEGER" then write sys$output record
$goto loop2
$
$done:
$close file

$@both file2.tmp file1.tmp
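
For comparison, the same idea as both.com in a few lines of Python, as a sketch only. The DCL version uses symbol names as a lookup table and F$EDIT(...,"COLLAPSE") to strip blanks and tabs first; here a set plays the part of the symbol table. The function name `both_collapsed` is hypothetical, and unlike DCL symbol names the sketch compares case-sensitively.

```python
def both_collapsed(lines1, lines2):
    """Return the lines of lines2 whose whitespace-stripped form also occurs in lines1."""
    def collapse(line):
        # Remove all whitespace, similar in spirit to F$EDIT(line, "COLLAPSE").
        return "".join(line.split())

    seen = {collapse(line) for line in lines1}
    return [line for line in lines2 if collapse(line) in seen]
```

Because the comparison is made on the collapsed text, records that differ only in spacing count as the same record, which may or may not be what you want.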


cheers,
Hein.
Steven Schweda
Honored Contributor

Re: Comparing two files and extract data

And yet another lame method:

> $copy file1.txt files.txt
> $append file2.txt files.txt

copy file1.txt, file2.txt files.txt

sort files.txt sorted.txt
sort /nodup files.txt nodup.txt

pipe diff /match = 1 /merged = 0 /nonumber -
sorted.txt nodup.txt | search /match = nor -
sys$input "******", "Number of difference", -
"DIFFERENCES ", ";"

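Why the sort and /nodup comparison works: after the two files are appended, any record present in both shows up twice in sorted.txt but only once in nodup.txt, so the difference between the two is exactly the repeated records. A small Python sketch of that counting idea follows; the function name `repeated_lines` is made up for illustration.

```python
from collections import Counter

def repeated_lines(lines1, lines2):
    """Return, in sorted order, every line that occurs more than once across both inputs."""
    counts = Counter(lines1 + lines2)
    return sorted(line for line, n in counts.items() if n > 1)
```

One caveat the sketch shares with the DCL version: a record duplicated inside a single file is also reported, even if the other file does not contain it.
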
Robert Gezelter
Honored Contributor

Re: Comparing two files and extract data

Ian,

I definitely agree with Hein: using CONVERT is the easiest way to do it.

- Bob Gezelter, http://www.rlgsc.com