Operating System - HP-UX

Re: identify duplicates in a file

Anand_30
Regular Advisor

identify duplicates in a file

Hi,

I have a file with around 500 numbers, some of which are duplicates. Is there any way to find out which numbers have duplicate entries in the file?

Thanks,
Anand.
RAC_1
Honored Contributor

Re: identify duplicates in a file

Check the man page of uniq.

cat your_file | uniq -d

will print the entries that are repeated (assuming the file is plain ASCII text).
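As a quick sketch of the suggestion above (with a made-up numbers.txt; note that uniq -d only reports duplicates that sit on adjacent lines, so the sample data here is already sorted):

```shell
# Hypothetical sample file -- duplicates are adjacent, as uniq requires.
printf '10\n10\n20\n30\n30\n' > numbers.txt

# Print each line that appears more than once in a row.
uniq -d numbers.txt
# prints:
# 10
# 30
```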
There is no substitute to HARDWORK
Hein van den Heuvel
Honored Contributor

Re: identify duplicates in a file


Ayup uniq will do the trick.

Now if you want to do something more than just print the numbers, you might go with perl:

perl -e 'while (<>){ if (defined($x{$_})) { print } else { $x{$_}=1 }}' < yourfile

Replace the 'print' with something weird or wonderful at your whim.

Hein.
Graham Cameron_1
Honored Contributor

Re: identify duplicates in a file

uniq will only work for adjacent lines.
I.e., it will not find the duplicated "line 3" in

line 1
line 2
line 3
line 4
line 3

I would use sort and sort -u to create 2 files, and diff to compare them.

sort file > f1
sort -u file > f2
diff f1 f2

This will show all duplicate lines, prefixed with "<".
If you want to take out the noise, use

diff f1 f2|grep "^<"|cut -c 3-
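Running the full pipeline above on the five-line sample from this post shows the non-adjacent duplicate being caught:

```shell
# The sample file from the post: "line 3" repeats, but not adjacently.
printf 'line 1\nline 2\nline 3\nline 4\nline 3\n' > file

sort file > f1        # all lines, duplicates now adjacent
sort -u file > f2     # unique lines only

# Lines present in f1 but not f2 are the duplicates; strip the "< " prefix.
diff f1 f2 | grep "^<" | cut -c 3-
# prints:
# line 3
```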

-- Graham
Computers make it easier to do a lot of things, but most of the things they make it easier to do don't need to be done.
Mark Grant
Honored Contributor
Solution

Re: identify duplicates in a file

Maybe we could just do it the simple way, by combining several of the options above.

cat file | sort -n | uniq -d
Never preceed any demonstration with anything more predictive than "watch this"
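The accepted answer in action on a small hypothetical file: sort -n puts the numbers in numeric order (making duplicates adjacent), and uniq -d then reports each repeated value once:

```shell
# Hypothetical unsorted numbers with non-adjacent duplicates.
printf '12\n7\n12\n100\n7\n12\n' > numbers.txt

# Numeric sort groups the repeats; uniq -d prints each repeated value once.
sort -n numbers.txt | uniq -d
# prints:
# 7
# 12
```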