Operating System - Linux

Re: file with duplication ignor anything where there is a duplicate.

 
SOLVED
rmueller58
Valued Contributor

file with duplication ignor anything where there is a duplicate.

I have a flat file with "names" in it.

See below:

aanderson
abergman
abergman
aboell
aboell
abone
abridwell
abridwell
aburks
achowdhury

For names that appear more than once, I want to ignore them altogether and get only the names that appear exactly once.
The file runs a-z, so I can't just do a simple grep to ignore them.
Any insight appreciated.

Rex Mueller - Unix System ESU#3
15 REPLIES
Peter Nikitka
Honored Contributor

Re: file with duplication ignor anything where there is a duplicate.

Hi,

Since it seems that the file is sorted, you can use 'uniq' (see the man page, in particular the -u option).

Best regards, Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
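
As a minimal sketch of the uniq suggestion above, using the sample names from the question: uniq only compares adjacent lines, so the input is sorted first, and the -u option prints only lines that are not repeated. The variable name `singles` is just an illustration.

```shell
# Feed the sample names from the question through sort and uniq -u.
# sort guarantees that equal names are adjacent; uniq -u then drops
# every name that has a duplicate and keeps the ones seen once.
singles=$(printf '%s\n' aanderson abergman abergman aboell aboell \
    abone abridwell abridwell aburks achowdhury | sort | uniq -u)

printf '%s\n' "$singles"
```

With a real file this would be `sort filename | uniq -u`.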
James R. Ferguson
Acclaimed Contributor

Re: file with duplication ignor anything where there is a duplicate.

Hi Rex:

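# cat ./report
#!/usr/bin/perl
use strict;
use warnings;
my %names;    # number of occurrences of each input line
while (<>) {
    # The key is the whole line, including its newline and any trailing whitespace.
    $names{$_}++;
}
for my $key (sort keys %names) {
    # Print only the lines that were seen exactly once.
    print $key if $names{$key} == 1;
}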

...run as:

# ./report filename

Regards!

...JRF...
Sandman!
Honored Contributor
Solution

Re: file with duplication ignor anything where there is a duplicate.

The requirement is to ignore names that appear more than once in the input file and print only those that occur exactly once? If that's the case, try the awk construct below (assuming the file has single-column records only):

# awk '{x[$1]++}END{for(i in x) if(x[i]==1) print i}' file
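
One thing worth noting about the awk construct: the order in which `for (i in x)` visits the array is unspecified, so the names may not come out alphabetically. A small sketch, using the sample names from the question and a hypothetical variable `singles`, that pipes the result through sort:

```shell
# Count each name in x[], print those counted exactly once,
# then sort because awk's "for (i in x)" order is unspecified.
singles=$(printf '%s\n' aanderson abergman abergman aboell aboell \
    abone abridwell abridwell aburks achowdhury |
    awk '{ x[$1]++ } END { for (i in x) if (x[i] == 1) print i }' |
    sort)

printf '%s\n' "$singles"
```

Unlike uniq, this approach does not need the input to be sorted beforehand.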
James R. Ferguson
Acclaimed Contributor

Re: file with duplication ignor anything where there is a duplicate.

Hi (again) Rex:

If you prefer, the Perl script I offered can be reduced to a commandline script:

# perl -ne '$names{$_}++;END{for $key (sort keys %names) {print $key if $names{$key}==1}}' filename

Regards!

...JRF...
OldSchool
Honored Contributor

Re: file with duplication ignor anything where there is a duplicate.

Perhaps something like:

sort filename | uniq -u > outfilename

would work for you?
rmueller58
Valued Contributor

Re: file with duplication ignor anything where there is a duplicate.

Jim,

I tried the script, but the duplicated names remain. Any ideas?
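
One possible cause, offered as an assumption since the thread does not confirm it: the Perl script keys on the whole line, so names that differ only by trailing blanks or a DOS carriage return count as different names. A quick hedged check, shown here on made-up sample lines, is to count the lines that end in whitespace:

```shell
# Made-up sample: the second line has a trailing space and the
# third ends in a carriage return, as a DOS-format file would.
suspect=$(printf 'abergman\nabergman \nabone\r\naburks\n' |
    grep -c '[[:space:]]$')

echo "lines ending in whitespace: $suspect"
```

A non-zero count on the real file (`grep -c '[[:space:]]$' filename`) would suggest stripping trailing whitespace before counting.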

Sandman!
Honored Contributor

Re: file with duplication ignor anything where there is a duplicate.

Did you try the awk script I posted? Does the file contain mixed-case names or does it have all lowercase names?
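
If the file did turn out to contain mixed-case names or stray trailing characters, a hedged variant of the awk approach could normalise each name before counting. This is a sketch on made-up input, not something the thread tested; `k` and `singles` are illustrative names.

```shell
# Made-up input: "Abergman" differs from "abergman " only by case and
# a trailing space, and "abone" carries a DOS carriage return.
singles=$(printf 'Abergman\nabergman \nabone\r\naburks\n' |
    awk '{
             sub(/\r$/, "")      # drop a trailing carriage return, if any
             k = tolower($1)     # $1 ignores surrounding blanks; fold case
             x[k]++
         }
         END { for (i in x) if (x[i] == 1) print i }' |
    sort)

printf '%s\n' "$singles"
```

Here the two spellings of abergman are treated as one duplicated name, so only abone and aburks are printed.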

rmueller58
Valued Contributor

Re: file with duplication ignor anything where there is a duplicate.

Sandman You DA MAN!!! I will run it past the recipient to see if this is the data they are looking for.

THANKS!! Kudos to all