Re: Pull data from file with script tool?

TDW · ‎10-19-2007

If I have a file like this:

123 789
234 500
999 662
373 881
474 662
611 200
515 789
809 424
234 500

I want to return each row that has a value in field two that appears with one or more different values in field one. So from this list, I want:

999 662
474 662
123 789
515 789

I do not want:

234 500
234 500

because that value matches. I want to see when a value repeats in field two with a different value in field one. Make sense?

I'm sure some awk or perl expert can knock this one out quick. Thanks for the help in advance!

Hein van den Heuvel · ‎10-19-2007

What needs to happen with a 3rd value pair?
- when the first pairs were the same: "432 500"
- when they were not: "456 789"

Here is one way:

perl -n x.pl x
--- x.pl ---
($left,$right)=split;
$prior = $seen{$right};
if (defined $prior) {
    if ($left ne $prior) {
        print "$prior $right\n$left $right\n";
        delete $seen{$right};
    }
} else {
    $seen{$right} = $left;
}
-------------

So this ignores repeats and tosses a pair once printed. Output is in order of appearance.

Or this....
Sort first by key 2, then look and remember.
This breaks on a 3rd pair as written.

$ sort -k 2 x | perl -ne '($a,$b)=split; print "$x $y\n$a $b\n" if $a ne $x and $b eq $y; $x=$a; $y=$b'
474 662
999 662
123 789
515 789

I think you want this....

----------- x.pl ----------
($left,$right)=split;
$prior = $seen{$right};
if (defined $prior) {
    if ($left ne $prior) {
        print "$prior $right\n" unless $header{$right}++;
        print "$left $right\n";
    }
} else {
    $seen{$right} = $left;
}
----------------------------

Hein.

TDW · ‎10-19-2007

I want every line returned where the same value in field two has multiple values in field one. So there could be two or more lines returned for each value in field two. Does that explain it? Thanks for your assistance!

TDW · ‎10-19-2007

After reading your response again, I think I need to explain a little more.

I just need to see every combination once where the second field has more than one value in field one.

So from this list:

999 662
474 662
333 662
123 457
474 662
811 363
474 662

return:

999 662
474 662
333 662

That help?

Hein van den Heuvel · ‎10-19-2007

That's what my final piece of code will do when put into a file and executed as...

perl -n x.pl x

Did you try?

Hein.

James R. Ferguson · ‎10-20-2007

Hi:

Here's another approach:

# cat ./filter
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
my ( $first, $second );
while (<>) {
    ( $first, $second ) = split;
    push( @{ $seen{$second} }, $first );
}
for $second ( sort keys %seen ) {
    next if scalar( @{ $seen{$second} } ) == 1;
    for $first ( sort @{ $seen{$second} } ) {
        print "$first $second\n";
    }
}
1;

...run as:

# sort -u file | ./filter
333 662
474 662
999 662

The code begins by using a 'sort' to eliminate duplicate lines. The Perl script splits the lines into two pieces. The second element is used as a hash key and each first element pushed into an array associated with that key.

When all the data has been assimilated, arrays with only one element are skipped. The remaining elements of each array are then printed as associated with their hash key.
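For comparison, the same grouping idea can be sketched in awk without the external 'sort -u'. This is only a rough sketch: the function name multi_first is made up for illustration, duplicates are detected by comparing whole lines as text (as 'sort -u' does), and rows within a group come out in input order rather than sorted.

```shell
#!/bin/sh
# Sketch only: group rows by field two and keep the groups that have
# more than one distinct row. multi_first is a hypothetical helper name.
multi_first() {
  awk '
    !dup[$0]++ {                       # act only on the first copy of each line
      count[$2]++                      # distinct rows seen so far for this key
      rows[$2] = (count[$2] > 1) ? (rows[$2] "\n" $0) : $0
    }
    END {
      for (key in rows)                # awk does not define this iteration order
        if (count[key] > 1) print rows[key]
    }
  ' "$@"
}

# Sample data from earlier in the thread
printf '%s\n' '999 662' '474 662' '333 662' '123 457' \
              '474 662' '811 363' '474 662' | multi_first
```

With several qualifying keys the order of the groups is unspecified, so pipe the result through sort if a stable order matters.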

Regards!

...JRF...

Sandman! · ‎10-20-2007

Try the awk construct below. It does what you are looking for:

awk '
{
x[$2]++
if (f[$2]) {
if ($0!=f[$2]) f[$2]=f[$2]"\n"$0
else delete f[$2]
} else f[$2]=$0
} END {for (i in f) if (x[i]>1) print f[i]}' file

James R. Ferguson · ‎10-20-2007

Hi (again):

Instead of having to externally sort and pipe the sorted file to the Perl script, you can use this version:

# cat ./filter
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
my ( $first, $second );
# Read the file(s) named on the command line through 'sort -u'
open( FH, "-|", "sort", "-u", @ARGV ) or die;
while (<FH>) {
    ( $first, $second ) = split;
    push( @{ $seen{$second} }, $first );
}
for $second ( sort keys %seen ) {
    next if scalar( @{ $seen{$second} } ) == 1;
    for $first ( sort @{ $seen{$second} } ) {
        print "$first $second\n";
    }
}
1;

...now, simply do:

# ./filter file
333 662
474 662
999 662

Regards!

...JRF...

Sandman! · ‎10-20-2007

Based on the sample input provided, viz...

999 662
474 662
333 662
123 457
474 662
811 363
474 662

Disregard my last post and instead use the script posted below...

# sort -u file | awk '{i=$2;x[i]++;f[i]=f[i]?f[i]"\n"$0:$0}END{for(j in f) if(x[j]>1) print f[j]}'
333 662
474 662
999 662

~hope it helps

Hein van den Heuvel · ‎10-21-2007

( Is this horse dead yet? :-)

JRF, Sandman,...

Once one decides on sorting anyway, then sort it properly and there is no longer a need to remember anything but the last value pair! So no 'end' processing is needed.

sort -k 2 -k 1 -u file | awk 's2!=$2 {s1=$1;s2=$2;next} s1!="" {print s1,s2; s1=""} {print}'

explanation:

- sort by the second column first, the first column next, and eliminate dups.
- feed into awk (or perl or...)
- if saved-second is not the current second, then save the current value pair and be done, issuing a 'next' ready for the next record.
- (else saved-second is equal to the current second)
- if saved-first is not empty, then print the saved value pair and mark it as printed by clearing saved-first.
- print the current line
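The steps above can be tried against TDW's second sample roughly as follows. This is a sketch under a few assumptions: the wrapper name adjacent_pairs is invented for illustration, and LC_ALL=C is my addition so that the sort order is the same everywhere.

```shell
#!/bin/sh
# adjacent_pairs is a hypothetical wrapper around the sort | awk pipeline.
adjacent_pairs() {
  # Assumption: LC_ALL=C forces byte-order collation for reproducible output.
  LC_ALL=C sort -k 2 -k 1 -u "$@" |
  awk '
    s2 != $2 { s1 = $1; s2 = $2; next }   # new field-two value: hold this pair back
    s1 != "" { print s1, s2; s1 = "" }    # second distinct pair: release the held one once
    { print }                             # every later pair in the group prints directly
  '
}

printf '%s\n' '999 662' '474 662' '333 662' '123 457' \
              '474 662' '811 363' '474 662' | adjacent_pairs
# prints:
# 333 662
# 474 662
# 999 662
```

The lone pairs "811 363" and "123 457" are each held back and then overwritten when the next field-two value arrives, so they never print.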

I like the way JRF makes the 'seen' array entries lists.
Here is a solution using that technique that does not require a sort, so the output is in input order, with the requested restrictions:

#!/usr/bin/perl
use strict;
my %seen;
while (<>) {
    my ($left,$right)=split;
    if (defined $seen{$right}) {                    # seen before?
        my @prior = @{$seen{$right}};               # grab current list
        next if grep { $_ eq $left } @prior;        # eliminate dups (exact match, not prefix)
        print "$prior[0] $right\n" if (1==@prior);  # first time?
        print;
    }
    push ( @{$seen{$right}}, $left);                # remember this one
}
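
For anyone who prefers awk, the same no-sort, input-order idea might look roughly like this. It is a sketch rather than a drop-in replacement: the function name first_seen_order and the array names are invented for illustration, and it assumes whitespace-separated two-field lines.

```shell
#!/bin/sh
# first_seen_order is a hypothetical name; one pass, no sort, input order kept.
first_seen_order() {
  awk '
    ($1, $2) in pair { next }               # this exact pair was already handled
    {
      pair[$1, $2] = 1
      n = ++count[$2]                       # distinct field-one values so far for this key
      if (n == 1) { held[$2] = $0; next }   # first sighting: hold the line back
      if (n == 2) print held[$2]            # second distinct value: release the held line
      print
    }
  ' "$@"
}

# First sample from the thread
printf '%s\n' '123 789' '234 500' '999 662' '373 881' '474 662' \
              '611 200' '515 789' '809 424' '234 500' | first_seen_order
# prints:
# 999 662
# 474 662
# 123 789
# 515 789
```

Memory use grows with the number of distinct pairs, which is the price for skipping the sort.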

Regards,
Hein.
