Operating System - HP-UX

finding out unique values from a file??

 
sekar_1
Advisor

Hi,
Consider a file that contains values like this:
a=1
a=2
a=5
a=1
a=1
a=1
a=10
a=5
....
....
....

How can I get only the unique values from this file?
For example, for the file above,
a=1
a=2
a=5
a=10
is what I mean by unique values.

Thank you.
Dennis Handly
Acclaimed Contributor

Re: finding out unique values from a file??

You can of course use "sort -u" to keep just one copy of each distinct line.
If you only want to compare particular fields, you can use sort's -k option to define them.
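
A minimal sketch of both suggestions, assuming the sample data from the question is saved in a scratch file (the name values.txt is only illustrative). The second command also assumes that '=' is the field separator and that the part after it is numeric.

```shell
# Sample data from the question, written to a scratch file (name is illustrative).
cat > values.txt <<'EOF'
a=1
a=2
a=5
a=1
a=1
a=1
a=10
a=5
EOF

# Whole-line uniqueness. Lines come out in collation order, not numeric order,
# so a=10 will normally sort ahead of a=2.
sort -u values.txt

# Split on '=' and compare field 2 numerically; with -u, one line is kept
# per distinct key, and the output is in numeric order.
sort -t= -k2,2n -u values.txt
```

The keyed form is worth the extra typing only when the numeric ordering matters or when lines can differ outside the field being compared.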
sekar_1
Advisor

Re: finding out unique values from a file??

Ohh... that sort... I forgot about that ;)
Thanks.
I was thinking I would have to do some difficult "find and grep" for a string.
Kapil Jha
Honored Contributor

Re: finding out unique values from a file??

Just do
sort file | uniq > new_file

new_file is your unique file.
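
One caveat, sketched below with the sample data from the question in an illustrative scratch file: uniq only compares neighbouring lines, so it removes all duplicates only when the input is already sorted.

```shell
# Sample data from the question (the file name is illustrative).
printf 'a=1\na=2\na=5\na=1\na=1\na=1\na=10\na=5\n' > values.txt

# Unsorted input: only adjacent repeats collapse, so a=1 and a=5
# each still appear twice.
uniq values.txt

# Sorted input: every repeat is adjacent, so each distinct line appears once.
sort values.txt | uniq
```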

BR,
Kapil
I am in this small bowl, I wane see the real world......
Hein van den Heuvel
Honored Contributor
Solution

Re: finding out unique values from a file??

As pointed out, sort -u is often the right tool for this.

Associative arrays in perl or awk can also be very useful for this kind of task, notably if small transformations need to be done on the data before comparing, or when, for example, the number of occurrences needs to be counted.


For example in awk...

$ awk 'END {for (line in x){print line}} {x[$0]++}' x

For example in perl...

$ perl -ne '$x{$_}++ }{ foreach (sort keys %x) {print}' x

So in both cases we take each line and increment (creating it if needed) the element of the array x that is keyed by that line.
At the end we report all the keys.
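
Note that awk does not promise any particular order for "for (line in x)". If the original order of first appearance matters, a common variant of the same associative-array idea is sketched below; the array name "seen" and the file name are only illustrative.

```shell
# Sample data from the question (the file name is illustrative).
printf 'a=1\na=2\na=5\na=1\na=1\na=1\na=10\na=5\n' > values.txt

# seen[$0]++ evaluates to 0 (false) the first time a line appears, so
# !seen[$0]++ is true only then, and awk's default action prints the line.
awk '!seen[$0]++' values.txt
```

This prints each distinct line once, in the order it was first read, without sorting the file.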

With count:

$ perl -ne '$x{$_}++ }{ foreach (sort keys %x) {print qq($x{$_}; $_)}' x
4; a=1
1; a=10
1; a=2
2; a=5

Sorted by value:

$ perl -ne '$x{$_}++ }{ foreach (sort {(split(/=/,$a))[1] <=> (split(/=/,$b))[1]} keys %x) {print}' x

Enjoy!
Hein.
sekar_1
Advisor

Re: finding out unique values from a file??

great..will check that idea..thanks..
Steve Post
Trusted Contributor

Re: finding out unique values from a file??

one more very short answer....

cat myfile | sort | uniq -c | sort -n | more

This will give you a count of the uniq stuff and put the single entries on top. You could use it to find a misspelled werd....wort.... I mean word.
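
A small sketch of that pipeline on the sample data from the question, without cat or the pager; the file name is only illustrative.

```shell
# Sample data from the question (the file name is illustrative).
printf 'a=1\na=2\na=5\na=1\na=1\na=1\na=10\na=5\n' > values.txt

# sort groups identical lines together, uniq -c prefixes each distinct line
# with its count, and sort -n orders by that count, rarest first.
sort values.txt | uniq -c | sort -n
```

Here the lines that occur once (a=2 and a=10) come first and a=1, with a count of 4, comes last.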


James R. Ferguson
Acclaimed Contributor

Re: finding out unique values from a file??

Hi Sekar:

You can reduce the number of processes you spawn by not using 'cat' simply to read a file!

Standard Unix commands read their input from a pipe or from a file (or files) specified on the command line.

Thus:

# cat filename|sort

...wastes a process ('cat') to open and read a file simply to allow 'sort' to receive input. Instead:

# sort filename

...saves a process.
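
The equivalence can be sketched as follows, using a throwaway file literally named "filename" as in the commands above. The redirection form is an extra shown for completeness, for commands that read only standard input.

```shell
# A throwaway input file, named as in the commands above.
printf 'b\na\nc\n' > filename

cat filename | sort   # two processes: cat and sort
sort filename         # one process: sort opens the file itself
sort < filename       # one process: the shell opens the file for sort
```

All three print the same sorted lines.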

Regards!

...JRF...
Steve Post
Trusted Contributor

Re: finding out unique values from a file??

Thanks.

I seem to get more knowledge than I can dish out.
Hein van den Heuvel
Honored Contributor

Re: finding out unique values from a file??

Steve Post wrote : "I seem to get more knowledge than I can dish out. "

That would be why I keep trying to help some, and surely it is the same reason for many others. In this particular topic I was reminded of the -c option in uniq.

We should frame Steve's reply in some "intro to forums document", along the line of, or incorporated in:
/66.34.90.71/ITRCForumsEtiquette/

Cheers,
Hein.