sort & uniq

Ravinder Singh Gill · ‎12-09-2005

In a script I have the following:

while read fakeuser
do
    grep "$fakeuser" passgvts >> queryusers
done < testnotokusers

However, I get a lot of repeats in the file queryusers, for two reasons: testnotokusers contains the same entry twice (e.g. gillar), and passgvts contains two very similar entries (e.g. gillars and gillarst). Each entry is therefore grepped twice and written to queryusers twice, so the queryusers file may look like:

gillars
gillarst
gillars
gillarst

As there are hundreds of entries, I cannot go through the whole file deleting repetitions by hand. It has been suggested that I add something like the following at the bottom of the script:

cat queryusers | sort | uniq > newfile

This would sort the queryusers file and send the unique entries to newfile. Can someone please tell me what options I need for "sort" and "uniq", if any? Thanks.

Bill Hassell · ‎12-09-2005

If the queryusers file has the name as the first entry on the line, then |sort|uniq will reduce all occurrences of the same name to just one. If the problem is that you are looking for an exact word (i.e. gillar) and do not want anything else (like gillars and gillarst), use the -w option in grep. This option is only available with the latest patch for grep.

Bill Hassell, sysadmin
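
To illustrate the -w suggestion above, here is a minimal sketch. The sample file contents are hypothetical (the real layout of passgvts is not shown in this thread), and it assumes a grep that supports -w, which on HP-UX may require the patch mentioned above.

```shell
# Hypothetical sample data standing in for the real passgvts file.
tmp=$(mktemp -d)
printf '%s\n' gillar gillars gillarst > "$tmp/passgvts"

# Plain grep matches the substring, so every line containing "gillar" counts.
substring_hits=$(grep -c 'gillar' "$tmp/passgvts")

# -w only accepts matches bounded by non-word characters or line edges.
word_hits=$(grep -w 'gillar' "$tmp/passgvts")

echo "substring matches: $substring_hits"
echo "whole-word matches: $word_hits"
rm -rf "$tmp"
```

In the original loop this would mean using grep -w "$fakeuser" passgvts, so that gillar no longer pulls in gillars and gillarst.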

Hein van den Heuvel · ‎12-10-2005

ah... a continuation/parallel thread to:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=980856

As Bill says... you may want to grep explicitly for a word, or use an egrep that anchors your words at the beginning of the line and follows them with whitespace, or anything else that makes it do exactly what you want.
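
A rough sketch of the anchored approach, assuming (this is a guess, the thread does not show the file) that each line of passgvts starts with the user name followed by a colon, whitespace, or end of line. grep -E is used here as the POSIX spelling of egrep.

```shell
tmp=$(mktemp -d)
# Hypothetical passwd-style sample; the real layout of passgvts is an assumption.
printf '%s\n' 'gillar:x:101' 'gillars:x:102' 'gillarst:x:103' > "$tmp/passgvts"

fakeuser=gillar
# ^ pins the name to the start of the line; the group requires a
# delimiter (colon or whitespace) or end of line right after the name.
anchored=$(grep -E "^${fakeuser}(:|[[:space:]]|\$)" "$tmp/passgvts")

echo "$anchored"
rm -rf "$tmp"
```

Note that the name is interpolated into a regular expression, so names containing characters such as . or * would need escaping first.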

Now if you do want to post-process that queryusers file, and not simply avoid the dups as per the other topic, then you can just tell uniq to look only at the first N characters:

sort queryusers | uniq -w 6

But then why not go one step further and just tell sort exactly what you want:

Something like:

sort -k 1.1,1.6 -u queryusers

Caveat... I did not have access to an HP-UX box just now, so I only tested this on my CD player, which runs Red Hat Linux 2.4.
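
For comparison, a small sketch running both commands on sample names modeled on the question. The temporary file is only for illustration, and uniq -w is a GNU extension, so it may not exist on HP-UX.

```shell
tmp=$(mktemp -d)
# Sample lines modeled on the question.
printf '%s\n' gillars gillarst gillars gillarst > "$tmp/queryusers"

# GNU uniq only: compare at most the first 6 characters of each line.
by_uniq=$(LC_ALL=C sort "$tmp/queryusers" | uniq -w 6)

# Restrict the sort key to characters 1-6 of field 1; -u keeps one line per key.
by_sort=$(LC_ALL=C sort -k 1.1,1.6 -u "$tmp/queryusers")

echo "uniq -w 6:          $by_uniq"
echo "sort -k 1.1,1.6 -u: $by_sort"
rm -rf "$tmp"
```

Both commands print a single line for this sample, because gillars and gillarst share the prefix "gillar". That is what you want only if the first six characters really identify the user; otherwise distinct accounts get merged.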

Hein.

Arturo Galbiati · ‎12-11-2005

Hi Ravinder,
sort -uo newfile queryusers
should be sufficient to remove duplicates from queryusers if this file contains only the names of the users. Otherwise it is necessary to give sort the key to compare on, using the -k option.
You can type man sort to look at this.
HTH,
Art
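
A quick sketch of the sort -u -o form on sample names modeled on the question (the temporary directory is only for illustration):

```shell
tmp=$(mktemp -d)
printf '%s\n' gillars gillarst gillars gillarst > "$tmp/queryusers"

# -u drops lines that compare equal as a whole; -o names the output file.
sort -u -o "$tmp/newfile" "$tmp/queryusers"

deduped=$(cat "$tmp/newfile")
echo "$deduped"
rm -rf "$tmp"
```

For whole-line duplicates this generally gives the same result as sort queryusers | uniq > newfile, without the extra processes. Unlike the prefix-based variants, it keeps gillars and gillarst as separate entries.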
