Using perl to compare large lists ...
02-07-2006 12:59 AM
Against large files, I get "Out of memory!" using the code below. The basic idea is, I want the items in list A which are not in list B for very large lists. Are there options outside of breaking down the files into smaller chunks?
Thanks, all.
#!/usr/bin/perl -w
open (FILE0,$ARGV[0]) || die "Problem with $ARGV[0]\n";
open (FILE1,$ARGV[1]) || die "Problem with $ARGV[1]\n";
while ( defined($item0=<FILE0>) )
{
seek FILE1, 0, 0;
@x=grep { /$item0/ } <FILE1>;
if ( $#x != 0 ) { print $item0 } # For items in argv0, but not in argv1.
}
close (FILE1);
close (FILE0);
02-07-2006 01:10 AM
Re: Using perl to compare large lists ...
If the small file is still too large, use a tied hash:
#!/usr/bin/perl
use strict;
use warnings;
use DB_File; # provides the tie class used below
my $f1 = shift or die "usage: $0 file1 file ...\n";
open my $f, "<", $f1 or die "$f1: $!\n";
# Keep the keys in a file on disk instead of in memory
tie my %f1, "DB_File", "/tmp/keys$$" or die "tie failed: $!\n";
while (<$f>) {
    $f1{$_}++;
}
close $f;
while (<>) { # The rest of the files
    if (exists $f1{$_}) {
        # This line is also in the first file
    }
    else {
        # This is not
    }
}
untie %f1;
unlink "/tmp/keys$$";
-->8---
Enjoy, Have FUN! H.Merijn
02-07-2006 01:16 AM
Re: Using perl to compare large lists ...
Perhaps, instead of building a list whose elements are the full record, you could build a hash of the keys to the records you want to report. Then, walk the hash and print the items of interest using the keys to seek the full record.
Regards!
...JRF...
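A minimal sketch of that idea, under assumptions that are not in the thread: the key is taken to be the first whitespace-delimited field of each line, and `records_only_in_a` is a hypothetical helper name. Only keys and byte offsets are held in memory; the full records are fetched with seek at the end.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print the records of $file_a whose key does not
# occur in $file_b. Assumption: the key is the first whitespace-delimited
# field of a line. If a key repeats in $file_a, the last record wins.
sub records_only_in_a {
    my ($file_a, $file_b, $out) = @_;
    my %offset;    # key => byte offset of its record in $file_a

    open my $fa, '<', $file_a or die "$file_a: $!\n";
    while (1) {
        my $pos  = tell $fa;    # offset of the line about to be read
        my $line = <$fa>;
        last unless defined $line;
        my ($key) = split ' ', $line;
        $offset{$key} = $pos if defined $key;
    }

    # Drop every key that also shows up in the second file.
    open my $fb, '<', $file_b or die "$file_b: $!\n";
    while (defined(my $line = <$fb>)) {
        my ($key) = split ' ', $line;
        delete $offset{$key} if defined $key;
    }
    close $fb;

    # Walk the surviving keys in file order and seek to the full records.
    for my $key (sort { $offset{$a} <=> $offset{$b} } keys %offset) {
        seek $fa, $offset{$key}, 0 or die "seek: $!\n";
        my $record = <$fa>;
        print {$out} $record;
    }
    close $fa;
}

records_only_in_a(@ARGV, \*STDOUT) if @ARGV == 2;
```

Called as `script fileA fileB`, this prints the records of fileA whose key is absent from fileB. The hash can still outgrow memory if the keys themselves are very long or very numerous.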
02-07-2006 01:20 AM
Re: Using perl to compare large lists ...
Not a Perl reply, but did you have a look at "comm"?
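For reference, `comm -23 fileA fileB` prints the lines that appear only in fileA, provided both files are sorted. The same merge idea can be sketched in Perl as below. This assumes both inputs are sorted in plain byte order (for example with `LC_ALL=C sort`), so that Perl's `lt` agrees with the sort order; `sorted_only_in_a` is a hypothetical name, and repeated lines are treated as set members rather than paired one to one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print lines of $file_a that are not in $file_b.
# Assumption: both files are already sorted in byte order.
# Only one line per file is held in memory at any time.
sub sorted_only_in_a {
    my ($file_a, $file_b, $out) = @_;
    open my $fa, '<', $file_a or die "$file_a: $!\n";
    open my $fb, '<', $file_b or die "$file_b: $!\n";

    my $b_line = <$fb>;
    chomp $b_line if defined $b_line;

    while (defined(my $a_line = <$fa>)) {
        chomp $a_line;
        # Advance the second file past everything that sorts before $a_line.
        while (defined $b_line && $b_line lt $a_line) {
            $b_line = <$fb>;
            chomp $b_line if defined $b_line;
        }
        print {$out} "$a_line\n"
            unless defined $b_line && $b_line eq $a_line;
    }
    close $fa;
    close $fb;
}

sorted_only_in_a(@ARGV, \*STDOUT) if @ARGV == 2;
```

Each file is read exactly once, so memory use does not grow with file size; the cost is that the inputs have to be sorted first.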
02-07-2006 03:35 AM
Re: Using perl to compare large lists ...
Check this link, it could be helpful
http://www.codecomments.com/archive234-2005-5-497414.html
-Arun
02-07-2006 08:30 PM
Re: Using perl to compare large lists ...
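Why not use:
grep -vf small_file big_file
The files do not have to be sorted for this (that requirement applies to comm), but grep treats each line of small_file as a pattern.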
HTH,
Art
02-13-2006 04:40 AM
Re: Using perl to compare large lists ...
grep: not enough memory
02-13-2006 05:04 AM
Re: Using perl to compare large lists ...
Enjoy, Have FUN! H.Merijn
02-13-2006 05:26 AM
Re: Using perl to compare large lists ...
First, did you try Merijn's first suggestion? I see no 'points' to indicate that the answer was useful.
Secondly, is this a once-in-a-lifetime-never-mind-the-processing-time cleanup, or a job to be scheduled frequently?
Third, really large lists don't just happen.
They have some meaning, often order, often some key, some 'objectness' or record size.
If you describe that to us, then we may be able to help better.
For example, you may be able to build an array of key values and seek addresses, so as not to store the whole 'records/blobs', and then go back for them later.
If they are sorted, you can do a classic 'some from the left, some from the right' compare. See below.
How much difference do you expect, and how much is still useful to report?
How about not breaking down the actual list in chunks, but just the processing?
Some algorithm where you read in N records (100,000?) from one file, then start reading the other file and deleting matches until fewer than M (10,000?) records are left.
Now read N-M more from the first file, and continue reading the second file from where you were, or restart the second file from the beginning, until you are again down below M records. Repeat until the end of the first file.
When at the end, make one more sweep through the second file. Admittedly this is a lot of handwaving, but something along those lines, the details heavily depending on what you know about the files.
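One way to make that concrete is the simplified variant sketched below. It is not the N/M refill scheme itself: it reads a fixed-size chunk of the first file, rescans the second file from the start for every chunk, and prints whatever was not matched. The name `chunked_only_in_a` and the default chunk size are assumptions for illustration. A line is reported once per chunk, so a line repeated in different chunks of the first file can be printed more than once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print lines of $file_a that are not in $file_b,
# holding at most $chunk_size lines of $file_a in memory at a time.
sub chunked_only_in_a {
    my ($file_a, $file_b, $out, $chunk_size) = @_;
    $chunk_size ||= 100_000;    # assumed default, tune to available memory

    open my $fa, '<', $file_a or die "$file_a: $!\n";
    open my $fb, '<', $file_b or die "$file_b: $!\n";

    until (eof $fa) {
        my (%pending, @order);
        # Load the next chunk of the first file.
        while (@order < $chunk_size and defined(my $line = <$fa>)) {
            chomp $line;
            push @order, $line unless $pending{$line}++;
        }

        # Sweep the second file and strike out every match.
        seek $fb, 0, 0 or die "seek: $!\n";
        while (defined(my $line = <$fb>)) {
            chomp $line;
            delete $pending{$line};
            last unless %pending;    # whole chunk matched, stop early
        }

        # Whatever survived is only in the first file; keep input order.
        print {$out} "$_\n" for grep { exists $pending{$_} } @order;
    }
    close $fa;
    close $fb;
}

chunked_only_in_a(@ARGV, \*STDOUT) if @ARGV == 2;
```

Memory use is bounded by the chunk size, at the price of one pass over the second file per chunk. No sorting is required.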
Below you'll find a script I made and used to walk two largish parameter value lists I needed to compare and report on. This is a case where I knew the params were sorted by a key. So I split the lines from f1 and f2 into their keys k1 and k2 and values v1 and v2. Then compare k1 and k2. If they are equal, compare the values v1 and v2 and report. If k1 sorts before k2, the parameter is missing from f2; if k2 sorts before k1, it is missing from f1. Report it and read on in the file that is behind.
The actual script below is probably not useful (unless you are comparing SAP benchmark r3.out files ;-), but the principle may become clearer reading it. (Then again, the added processing like 'known to be variable' substitutions and pretty-printing may confuse the concept beyond recognition :^)
Hope this helps some,
Hein.
#!/bin/perl
#
$f1 = $ARGV[0];
$f2 = $ARGV[1];
$ALL = $ARGV[2];
die "Must provide two R3.out files to compare" unless $f2;
open (F1, $f1) || die "Error open file 1: $f1";
open (F2, $f2) || die "Error open file 2: $f2";
# Find system ID and first parameter
while (<F1>) {
$S1 = $1 if (/SAP System\s+(\w+)\s/);
$I1 = $1 if (/^INSTANCE_NAME\s+\(!\) (\w+)/);
last if (/^Param/);
}
while (<F2>) {
$S2 = $1 if (/SAP System\s+(\w+)\s/);
$I2 = $1 if (/^INSTANCE_NAME\s+\(!\) (\w+)/);
last if (/^Param/);
}
$format = "%-30.30s %s %-20s %s %-20s\n";
print "\nColumn \"?\" legend: \"|\" = default, \"X\" = changed, \" \" = missing.\n\n";
printf $format, "Parameter", " ", "$S1 - $I1", " ", "$S2 - $I2";
printf $format, "------------------------------","?","--------------------",
"?","--------------------";
while (<F1>) {
$v1 = " ";
if (/^(\S+).*( |\(!\)) (.{1,20})/) {
$k1 = $1;
$d1 = ($2 eq " ") ? "|" : "X";
$v1 = $3;
$v1 =~ s/$S1/{SID}/g;
$v1 =~ s/$I1/{INST}/g;
}
while ($k2 lt $k1) {
last unless ($_ = <F2>);
$v2 = " ";
if (/^(\S+).*( |\(!\)) (.{1,20})/) {
$k2 = $1;
$d2 = ($2 eq " ") ? "|" : "X";
$v2 = $3;
$v2 =~ s/$S2/{SID}/g;
$v2 =~ s/$I2/{INST}/g;
}
if ($k2 lt $k1) {
printf $format, $k2, " ", " ", $d2, $v2;
}
}
if ($k1 eq $k2) {
printf $format, $k1, $d1, $v1, $d2, $v2 if
($ALL || ($v1 ne $v2 && ($d1.$d2 ne "||") ) );
} else {
printf $format, $k1, $d1, $v1, " ", " ";
}
}
Sample output:
Column "?" legend: "|" = default, "X" = changed, " " = missing.
Parameter                        BE2 - D11              RAC - D02
------------------------------ ? -------------------- ? --------------------
DIR_ORAHOME                    | /oracle/{SID}        X /oracle/{SID}/901_64
ES/SHM_MAX_PRIV_SEGS           X 63                   | 2
ES/TABLE                       X SHM_SEGS             | UNIX_STD
SAPDBHOST                      X b009                 X sapsd