Pull data from file with script tool?
10-19-2007 02:03 PM
123 789
234 500
999 662
373 881
474 662
611 200
515 789
809 424
234 500
I want to return each row that has a value in field two that appears with one or more different values in field one. So from this list, I want:
999 662
474 662
123 789
515 789
I do not want:
234 500
234 500
because that value matches. I want to see when a value repeats in field two with a different value in field one. Make sense?
I'm sure some awk or perl expert can knock this one out quick. Thanks for the help in advance!
10-19-2007 02:42 PM
Solution
- when the first pairs were the same "432 500"
- when they were not "456 789"
Here is one way:
perl -n x.pl x
--- x.pl ---
($left,$right)=split;
$prior = $seen{$right};
if (defined $prior) {
if ($left ne $prior) {
print "$prior $right\n$left $right\n";
delete $seen{$right}
}
} else {
$seen{$right} = $left;
}
-------------
So this ignores repeats and tosses a pair once printed. Output is in order of appearance.
Or this....
Sort first by key 2, then look and remember.
This breaks on a 3rd pair as written.
$ sort -k 2 x | perl -ne '($a,$b)=split; print "$x $y\n$a $b\n" if $a ne $x and $b eq $y; $x=$a; $y=$b'
474 662
999 662
123 789
515 789
I think you want this....
----------- x.pl ----------
($left,$right)=split;
$prior = $seen{$right};
if (defined $prior) {
if ($left ne $prior) {
print "$prior $right\n" unless $header{$right}++;
print "$left $right\n";
}
} else {
$seen{$right} = $left;
}
----------------------------
Hein.
10-19-2007 03:40 PM
I just need to see every combination once where the second field has more than one value in field one.
So from this list:
999 662
474 662
333 662
123 457
474 662
811 363
474 662
return:
999 662
474 662
333 662
That help?
10-19-2007 04:27 PM
perl -n x.pl x
Did you try?
Hein.
10-20-2007 03:03 AM
Here's another approach:
# cat ./filter
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
my ( $first, $second );
while (<>) {
( $first, $second ) = split;
push( @{ $seen{$second} }, $first );
}
for $second ( sort keys %seen ) {
next if scalar( @{ $seen{$second} } ) == 1;
for $first ( sort @{ $seen{$second} } ) {
print "$first $second\n";
}
}
1;
...run as:
# sort -u file | ./filter
333 662
474 662
999 662
The code begins by using a 'sort' to eliminate duplicate lines. The Perl script splits the lines into two pieces. The second element is used as a hash key and each first element pushed into an array associated with that key.
When all the data has been assimilated, arrays with only one element are skipped. The remaining elements of each array are then printed as associated with their hash key.
Regards!
...JRF...
10-20-2007 05:44 AM
awk '
{
x[$2]++
if (f[$2]) {
if ($0!=f[$2]) f[$2]=f[$2]"\n"$0
else delete f[$2]
} else f[$2]=$0
} END {for (i in f) if (x[i]>1) print f[i]}' file
10-20-2007 06:20 AM
Instead of having to externally sort and pipe the sorted file to the Perl script, you can use this version:
# cat ./filter
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
my ( $first, $second );
open( FH, "-|", "sort", "-u", @ARGV ) or die;
while (<FH>) {
( $first, $second ) = split;
push( @{ $seen{$second} }, $first );
}
for $second ( sort keys %seen ) {
next if scalar( @{ $seen{$second} } ) == 1;
for $first ( sort @{ $seen{$second} } ) {
print "$first $second\n";
}
}
1;
...now, simply do:
# ./filter file
333 662
474 662
999 662
Regards!
...JRF...
10-20-2007 06:35 AM
999 662
474 662
333 662
123 457
474 662
811 363
474 662
Disregard my last post and instead use the script posted below...
# sort -u file | awk '{i=$2;x[i]++;f[i]=f[i]?f[i]"\n"$0:$0}END{for(j in f) if(x[j]>1) print f[j]}'
333 662
474 662
999 662
~hope it helps
10-21-2007 05:57 AM
JRF, Sandman,...
Once one decides on sorting anyway, then sort it properly and there is no longer a need to remember anything but the last value pair! So no 'end' processing is needed.
sort -k 2 -k 1 -u file | awk 's2!=$2 {s1=$1;s2=$2;next} s1!="" {print s1,s2; s1=""} {print}'
explanation:
- sort by second column first, first column next, eliminate dups.
- feed into awk (or perl or...)
- if saved-second differs from the current second field, then save the current value pair and be done, issuing a 'next' to move on to the next line.
- (else saved-second was equal to current second)
- if saved-first is not empty, then print the saved value pair and mark it as printed by clearing saved-first.
- print current line
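The steps above can be sketched as a small shell function, run here against the second sample list from earlier in the thread. The name pairs_sorted is made up for illustration, and the sketch assumes fields separated by a single space, as in the samples.

```shell
#!/bin/sh
# pairs_sorted is a hypothetical wrapper name, not from the original post.
pairs_sorted() {
  # Sort on field two, then field one, and drop duplicate lines.
  sort -k 2 -k 1 -u "$@" |
  awk '
    # New field-two value: remember the pair and print nothing yet.
    s2 != $2 { s1 = $1; s2 = $2; next }
    # Same field-two value again: flush the remembered pair once.
    s1 != "" { print s1, s2; s1 = "" }
    # Print the current pair.
    { print }
  '
}

printf '%s\n' '999 662' '474 662' '333 662' '123 457' \
  '474 662' '811 363' '474 662' | pairs_sorted
```

On that list it prints the three 662 pairs (333, 474, 999) and nothing for 457 or 363, since each of those has only one field-one value.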
I like the way JRF makes the 'seen' array entries lists.
Here is a solution using that technique, but not requiring a sort, though the output will be in order of input, with the requested restrictions:
#!/usr/bin/perl
use strict;
my %seen;
while (<>) {
my ($left,$right)=split;
if (defined $seen{$right}) { # seen before?
my @prior = @{$seen{$right}}; # grab current list
next if grep { $_ eq $left } @prior; # eliminate dups (exact match, not prefix)
print "$prior[0] $right\n" if (1==@prior); # first time?
print;
}
push ( @{$seen{$right}}, $left); # remember this one
}
Regards,
Hein.
10-21-2007 02:55 PM
474 662
999 662
123 789
515 789
Hein, the last code you gave seems to start running and then hangs, like this:
perl -n perlHein.pl /tmp/testdata
999 662
474 662
Not sure why that is happening.
Thanks much for sharing your skill!
10-21-2007 03:08 PM
The x.pl is the perl script text
The x was the data file name
The -n tells perl to create an implied loop to process each input line.
The last solution was presented as a full program, invoking perl itself, and with the loop explicitly coded.
Loop: while (<>) {
That solution should be invoked with:
./script_name file_name
Sorry for not making that clear.
And I forgot to mark the 'retain formatting' option while posting, so the loop was not visible with the indenting either!
The 'hang' is reading from STDIN for more data to process.
Sorry 'bout that confusion!
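What happens when -n is combined with a script that already has its own loop can be seen in a small sketch. The file names data and loop.pl are hypothetical, created in a scratch directory, and perl is assumed to be on the PATH. Stdin is redirected from /dev/null so the last read returns at once instead of waiting on the terminal.

```shell
#!/bin/sh
# Hypothetical demo: file names and contents are made up for illustration.
demo_nested_loop() {
  dir=$(mktemp -d) || return 1
  printf '%s\n' 'line1' 'line2' 'line3' > "$dir/data"
  # A script that already contains its own explicit read loop.
  printf '%s\n' 'while (<>) { print "inner saw: $_" }' > "$dir/loop.pl"
  # -n wraps the whole script in a second, implicit while (<>) loop.
  perl -n "$dir/loop.pl" "$dir/data" < /dev/null
  rm -rf "$dir"
}

demo_nested_loop
```

The implicit outer loop reads line1, the inner loop then consumes line2 and line3, and once the file is exhausted the outer loop's next read falls back to STDIN. At a terminal that last read is what looks like a hang.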
Hein.
10-21-2007 06:45 PM
# cat file
123 789
234 500
999 662
373 881
474 662
611 200
515 789
809 424
234 500
# awk '{i=$2;if(x[i] && x[i]!=$1) print x[i],i"\n"$0;x[i]=$1}' file
999 662
474 662
123 789
515 789
10-21-2007 11:28 PM
Two rules to practice this week...
1) As simple as possible, but no simpler
2) Leave well enough alone
Your new example only works for certain data sets. It is much like my first solution and fails for much the same reason.
It remembers only one left value for a given right value, where there might be a list.
And on each 'fresh' value it reprints the saved value, potentially making fresh dups as it goes. Try this data set...
999 662
474 662
333 662
123 457
474 662
811 363
474 662
123 789
234 500
999 662
373 881
474 662
611 200
515 789
809 424
234 500
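To see the effect, the one-liner from the previous post can be run against this data set and its total output compared with its distinct output. The function names sample_data and one_slot are made up for this sketch.

```shell
#!/bin/sh
# The 16-line data set quoted above (sample_data is a hypothetical helper).
sample_data() {
  printf '%s\n' '999 662' '474 662' '333 662' '123 457' '474 662' '811 363' \
    '474 662' '123 789' '234 500' '999 662' '373 881' '474 662' '611 200' \
    '515 789' '809 424' '234 500'
}

# The awk from the previous post: x[i] holds only the most recent
# field-one value seen for each field-two value.
one_slot() {
  awk '{i=$2; if (x[i] && x[i]!=$1) print x[i],i"\n"$0; x[i]=$1}'
}

sample_data | one_slot | wc -l            # lines printed in total
sample_data | one_slot | sort -u | wc -l  # distinct lines among them
```

This gives 12 lines in total but only 5 distinct ones, so more than half of the output is repeats.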
Cheers,
Hein.
10-22-2007 04:19 AM
Each record in the input file is replaced by the succeeding one and since the file is not sorted the dups fall thru the cracks. But this horse ain't dead yet ;) so here is the final version of the awk construct.
awk '{
if (x[$0]!=$0) {
y[$2]=(y[$2]?y[$2]"\n"$0:$0); z[$2]++
} x[$0]=$0
}END{
for (i in y) if (z[i]>1) print y[i]
}' file
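As a quick check, the same awk program can be wrapped in a function and run against the larger data set from the previous post. The names distinct_groups and sample_data are made up for this sketch, and the trailing sort is an addition, there only because awk does not define the order of "for (i in y)".

```shell
#!/bin/sh
# The 16-line data set from the previous post (hypothetical helper name).
sample_data() {
  printf '%s\n' '999 662' '474 662' '333 662' '123 457' '474 662' '811 363' \
    '474 662' '123 789' '234 500' '999 662' '373 881' '474 662' '611 200' \
    '515 789' '809 424' '234 500'
}

# distinct_groups is a hypothetical wrapper around the awk program above.
distinct_groups() {
  awk '{
    if (x[$0] != $0) {                       # first time this exact line is seen
      y[$2] = (y[$2] ? y[$2] "\n" $0 : $0)   # append it to the group for field two
      z[$2]++                                # count distinct lines in that group
    }
    x[$0] = $0
  } END {
    for (i in y) if (z[i] > 1) print y[i]    # only groups with 2+ distinct lines
  }' "$@" | sort -k 2 -k 1                   # added here to make the order stable
}

sample_data | distinct_groups
```

On that data set it prints each qualifying pair once: the three 662 pairs followed by the two 789 pairs. The repeated "234 500" lines are left out because that group holds only one distinct line.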