11-19-2002 12:48 AM
comparing files - not sorted
I am trying to compare two large files (hence trying bdiff), but I think I am asking for a bit too much.
The files I want to compare are just lists of reference numbers, but the sorted nature of the files means I get an inaccurate picture, e.g.:
file 1   file 2
------   ------
a1       a1
b1       b1
c1       b2
d1       c1
e1       d1
         e1
I want to find out which records in each file do not exist in the other, regardless of order. If I use bdiff, four discrepancies are flagged, when I only want to know that b2 exists in one file and not the other. Is this possible, or am I asking too much? The actual files have 3 million entries.
thanks a lot
john
11-19-2002 12:51 AM
Re: comparing files - not sorted
3 million entries?
I would start thinking about loading this into a database, creating the necessary indexes, and querying the database for the required results.
good luck,
Thierry.
11-19-2002 01:17 AM
Re: comparing files - not sorted
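Warning: not tested.
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

# Keep the counts in an on-disk hash so 3 million keys need not fit in memory.
unlink "f1_tie";    # start from an empty database, not one left by an earlier run
tie my %f1, "DB_File", "f1_tie" or die "Cannot tie f1_tie: $!";

# Count every line of f1.
@ARGV = ("f1");
while (<>) {
    chomp;
    $f1{$_}++;
}

# Walk f2: "=" marks a line also present in f1, ">" a line only in f2.
@ARGV = ("f2");
while (<>) {
    chomp;
    if (exists $f1{$_}) {
        print "= $_ $f1{$_}\n";
        $f1{$_} = 0;    # mark as matched
    }
    else {
        print "> $_\n";
    }
}

# Whatever still has a non-zero count was never seen in f2.
# each() avoids building a list of all keys at once.
while (my ($key, $count) = each %f1) {
    $count or next;
    print "< $key $count\n";
}
untie %f1;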
11-19-2002 01:31 AM
Re: comparing files - not sorted
If you do a unique sort of both files (sort -u), 'comm' will tell you which records appear in only one of the files.
man comm
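For what it's worth, here is a minimal sketch of that approach on the small example from the first post. The file names (file1, file2, the .sorted copies and the only_in_* outputs) are made up for illustration, and setting LC_ALL=C is an assumption on my part to keep sort and comm agreeing on the ordering:

```shell
#!/bin/sh
# Recreate the two small example files from the original post
# (hypothetical names, written to the current directory).
printf '%s\n' a1 b1 c1 d1 e1 > file1
printf '%s\n' a1 b1 b2 c1 d1 e1 > file2

# comm expects sorted input; sort -u also drops duplicate lines.
# LC_ALL=C keeps sort and comm on the same byte-wise collation order.
LC_ALL=C sort -u file1 > file1.sorted
LC_ALL=C sort -u file2 > file2.sorted

# -23 suppresses columns 2 and 3, leaving lines found only in file1.
LC_ALL=C comm -23 file1.sorted file2.sorted > only_in_file1
# -13 suppresses columns 1 and 3, leaving lines found only in file2.
LC_ALL=C comm -13 file1.sorted file2.sorted > only_in_file2

echo "only in file1:"; cat only_in_file1
echo "only in file2:"; cat only_in_file2
```

On this example only b2 is reported, under "only in file2". The same commands should apply unchanged to the full-size files.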
Regards,
John
11-19-2002 03:05 AM
Re: comparing files - not sorted
I've just tried a quick test using:
file1   file2
-----   -----
a       c
b       d
c       b
a
a
When I run
$ comm -12 file1 file2
it gives me: c
But obviously a, b, and c appear in both files.
Bit confused!
cheers
John
11-19-2002 03:27 AM
Re: comparing files - not sorted
If you sort your example files and then run comm -12, you'll get the correct answer (b and c; a isn't in your second file!).
Actually, going by your original post, you should be using comm -23 to list the records that are in file1 but not in file2.
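A quick sketch of that check, assuming the test files were file1 = a, b, c, a, a and file2 = c, d, b (my reading of the test post). The names t1, t2, common_lines and only_in_t1 are hypothetical, and LC_ALL=C is again an assumption to keep the ordering consistent:

```shell
#!/bin/sh
# Rebuild the two unsorted test files (assumed contents, hypothetical names).
printf '%s\n' a b c a a > t1
printf '%s\n' c d b > t2

# Sort both files first; -u collapses the repeated "a" lines in t1.
LC_ALL=C sort -u t1 > t1.sorted
LC_ALL=C sort -u t2 > t2.sorted

# -12 suppresses columns 1 and 2, leaving only lines common to both files.
LC_ALL=C comm -12 t1.sorted t2.sorted > common_lines
# -23 suppresses columns 2 and 3, leaving lines found only in the first file.
LC_ALL=C comm -23 t1.sorted t2.sorted > only_in_t1

echo "common:"; cat common_lines
echo "only in t1:"; cat only_in_t1
```

With sorted input the common lines come out as b and c, and a shows up as the record that is only in the first file.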
Regards,
John