02-01-2004 10:17 PM
awk parsing 2 files help
I have two big files. I want to get the fields that match between my 1st and 2nd files, as well as those that do not match.
File1:
xxx 10 hello
yyy 20 hello
xxx 20 hello
File2:
xxx thanks 10
xxx please 20
zzz thanks 10
OUTPUT:
xxx 10 thanks hello
xxx 20 please hello
zzz 10 thanks hello
zzz 10 thanks
yyy 20 hello
Fields 1 and 2 of file1 should match fields 2 and 4 of file2.
thanks.
02-01-2004 10:24 PM
Re: awk parsing 2 files help
ooops, output should be:
xxx 10 thanks hello
xxx 20 please hello
zzz 10 thanks _____
yyy 20 ______ hello
02-01-2004 10:39 PM
Re: awk parsing 2 files help
Can you clarify this a bit more? I still don't know what you want.
Michael
02-01-2004 11:08 PM
Re: awk parsing 2 files help
Have a look at the "join" command instead. It matches fields from two files and prints selected fields from both files.
02-02-2004 12:08 AM
Re: awk parsing 2 files help
If "join", as suggested, is no good, try the man pages for "comm" and "uniq".
-- Graham
02-02-2004 12:30 AM
Re: awk parsing 2 files help
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=372886
Regards,
Peter
02-02-2004 03:55 AM
Re: awk parsing 2 files help
> I have two big files
Define big! For less than 10MB or so I would definitely just write a Perl (not awk!) script that remembers all lines and columns and prints them out (optionally sorted) after everything has been read. For an example see below.
For files larger than 1000MB you would need to pre-sort and do a classic merge join
(read one, read the other until it is larger than the first, read the first until it is larger than the other, and so on). That is readily done with awk (as long as the input is sorted, unlike your example!).
While you sort, or in addition to the sort, you could perhaps rearrange the join fields such that the standard join tool can do the final work.
hth,
Hein.
>>>> Fields 1 and 2 of file1 should match fields 2 and 4 of file2.
You meant 1 and 2 matching 1 and 3 right?
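open (FILE, "<File1") or die "File1: $!";   # file name assumed; it was lost from the original post
while (<FILE>) {
chomp;
($k1,$k2,$c) = split;
$x1{$k1." ".$k2} = "------";   # placeholder until File2 supplies a value
$x2{$k1." ".$k2} = $c;
}
close (FILE);
open (FILE, "<File2") or die "File2: $!";   # file name assumed, as above
while (<FILE>) {
chomp;
($k1,$c,$k2) = split;
$x1{$k1." ".$k2} = $c;
$x2{$k1." ".$k2} = "------" unless (exists $x2{$k1." ".$k2});
}
close (FILE);
foreach $k (sort keys %x1) {
print "$k $x1{$k} $x2{$k}\n";
}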
xxx 10 thanks hello
xxx 20 please hello
yyy 20 ------ hello
zzz 10 thanks ------
Without the sort in the foreach the lines come out in hash order, so you'd get something like:
xxx 10 thanks hello
zzz 10 thanks ------
yyy 20 ------ hello
xxx 20 please hello
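Since the thread asks about awk, the same in-memory idea can be sketched in awk as well. This is only an illustration under assumptions: the function name full_outer_join is made up here, the layouts are those of the first post (File1: key key value; File2: key value key), keys are unique within each file, and the first file is not empty (the NR == FNR test relies on that).

```shell
# full_outer_join FILE1 FILE2   (hypothetical helper name)
# FILE1 lines: key1 key2 value   e.g. "xxx 10 hello"
# FILE2 lines: key1 value key2   e.g. "xxx thanks 10"
full_outer_join() {
    awk '
        # First file: remember its value under the two-part key.
        NR == FNR { v1[$1 " " $2] = $3; seen[$1 " " $2] = 1; next }
        # Second file: the key is fields 1 and 3, the value is field 2.
        { v2[$1 " " $3] = $2; seen[$1 " " $3] = 1 }
        END {
            for (k in seen) {
                a = (k in v2) ? v2[k] : "------"   # value from FILE2
                b = (k in v1) ? v1[k] : "------"   # value from FILE1
                print k, a, b
            }
        }
    ' "$1" "$2" | LC_ALL=C sort    # awk array order is unspecified, so sort
}
```

Called as full_outer_join File1 File2, it prints one line per distinct key, with "------" in the column of whichever file lacks that key. Like the Perl version, it keeps both files in memory, so it only suits inputs that fit there.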
02-02-2004 06:47 PM
Re: awk parsing 2 files help
Next, the order of the output. Should it be alphabetically ordered, or is the order unimportant? If so, I can think of a nice solution, so please give more info on this.
And, to keep procura from 'nagging' about the solution I have in mind: can empty lines be ignored? Do only lines with 3 fields matter/exist?
02-03-2004 05:42 PM
Re: awk parsing 2 files help
xxx 10 hello
xxx 20 hello
xxx 10 bybye
yyy 10 oopsy
Or in File2:
xxx thanks 10
xxx please 10
xxx please 20
In that case all solutions that put one of the files in memory will fail, and a new solution would have to be written.
02-03-2004 06:37 PM
Re: awk parsing 2 files help
Sorry for the late response... Let me clarify my question.
File1 is the main file, meaning every row from this file will be part of the output. For example:
File1:
xxx thanks 10
xxx thanks 20
yyy please 10
zzz help 10
Fields 1 and 3 of File1 have to be matched with fields 1 and 2 of File2. Rows that match get an additional field, which comes from File2. So if the contents of File2 are:
xxx 10 hello
xxx 20 hello
zzz 10 ok
zzz 20 ok
and rows whose fields do not match get a default field of "notmatched", then the output will be:
10 xxx thanks hello
20 xxx thanks hello
10 yyy please notmatched
10 zzz help notmatched
I hope this time, I'm clear enough.
Thank you very much for the help. The first solution you gave me was really great and it made my script really fast. :-)
02-03-2004 08:07 PM
Re: awk parsing 2 files help
try the attachment.
Michael
02-03-2004 09:18 PM
Re: awk parsing 2 files help
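Now for the solution. What I suggest is to combine the files again, but this time we do it a bit differently. I have not tested this on large files, but the script would become:
# Tag File2 lines with 1 and File1 lines with 2, then sort so that each tag-1 line
# precedes the tag-2 lines with the same key. As written, File1 is read as
# "key1 key2 value" and File2 as "key1 value key2"; swap the field lists if your layout differs.
( awk '{printf "1 %s %s %s\n",$1,$3,$2}' < File2 ; awk '{printf "2 %s %s %s\n",$1,$2,$3}' < File1 ) | sort -k 2,3 -k 1 | awk '$1=="1" { last1=$2;last2=$3;last3=$4 }
$1=="2" { if ($2==last1 && $3==last2)
{ printf "%s %s %s %s\n",$2,$3,$4,last3 }
else {printf "%s %s %s NOTMATCHED\n",$2,$3,$4}}'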
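If the lookup file is small enough to hold in memory and each key occurs in it only once, the sort can be skipped. The following is a rough sketch, not a tested production script: lookup_join is a hypothetical name, MAIN uses the "key1 value key2" layout and LOOKUP the "key1 key2 extra" layout from the clarified question, and the output column order (key2 key1 value extra) copies the requested output.

```shell
# lookup_join MAIN LOOKUP   (hypothetical helper name)
# MAIN lines:   key1 value key2   e.g. "xxx thanks 10"
# LOOKUP lines: key1 key2 extra   e.g. "xxx 10 hello"
lookup_join() {
    awk -v lookup="$2" '
        BEGIN {
            # Load the lookup file into an array keyed on "key1 key2".
            while ((getline line < lookup) > 0) {
                n = split(line, f, " ")
                if (n >= 3) extra[f[1] " " f[2]] = f[3]
            }
            close(lookup)
        }
        NF >= 3 {
            key = $1 " " $3
            e = (key in extra) ? extra[key] : "notmatched"
            print $3, $1, $2, e    # every MAIN row is printed, in input order
        }
    ' "$1"
}
```

With the sample data from the clarified question, the row "zzz help 10" does find "zzz 10 ok" in the lookup file, so this sketch prints "10 zzz help ok" for it rather than "notmatched". Duplicate keys in the lookup file would silently keep only the last value, which is the limitation raised earlier in the thread.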
02-03-2004 09:42 PM
Re: awk parsing 2 files help
The problem, as described, can be much simplified with some pre-processing of the data. By reordering the fields and merging the matching fields into one field in each file, you can do a simple join and then split the fields in the output. Try the following:
awk '{ printf "%s#%s %s\n", $1, $3, $2 }' xxx | sort >xxx1
awk '{ printf "%s#%s %s\n", $1, $2, $3 }' yyy | sort >yyy1
join -1 1 -2 1 -o 1.1,1.2,2.2 -a 1 xxx1 yyy1 | tr "#" " "
It is not a final solution but may give you some ideas.
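One way the idea above might be completed is with join's -e option, which fills in a default for fields missing from unpaired lines. This is a sketch under assumptions: left_join_default and the temporary file names are invented for illustration, "#" never occurs in the data, and sort and join both run under LC_ALL=C so that they agree on the collation order.

```shell
# left_join_default MAIN LOOKUP   (hypothetical helper name)
# MAIN lines:   key1 value key2   e.g. "xxx thanks 10"
# LOOKUP lines: key1 key2 extra   e.g. "xxx 10 hello"
left_join_default() {
    m="${TMPDIR:-/tmp}/ljd_main.$$"      # scratch files, names are arbitrary
    l="${TMPDIR:-/tmp}/ljd_lookup.$$"
    # Merge the two key fields into one "key1#key2" field and sort on it.
    awk '{ printf "%s#%s %s\n", $1, $3, $2 }' "$1" | LC_ALL=C sort -k 1,1 > "$m"
    awk '{ printf "%s#%s %s\n", $1, $2, $3 }' "$2" | LC_ALL=C sort -k 1,1 > "$l"
    # -a 1 keeps unmatched MAIN rows; -e supplies the default for field 2.2.
    LC_ALL=C join -a 1 -e notmatched -o 1.1,1.2,2.2 "$m" "$l" | tr '#' ' '
    rm -f "$m" "$l"
}
```

The output is sorted by key and has the columns key1 key2 value extra. Because join does a merge over sorted input, neither file has to fit in memory, and duplicate keys produce one output line per pairing instead of being overwritten.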
02-04-2004 02:07 AM
Re: awk parsing 2 files help
Bah humbug.
This is a completely different requirement description from the initial:
> ooops, output should be:
>
> xxx 10 thanks hello
> xxx 20 please hello
> zzz 10 thanks _____
> yyy 20 ______ hello
That line 'zzz' could have only originated from file 2.
Now you tell us that file 1 is a driver, and the 'unmatched' can only appear in the last output column.
Much simpler! Boring even, and essentially answered in all prior replies.
Kindly ask the right question and study the replies!
Cheers,
Hein.
02-04-2004 05:13 PM