<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how to take out duplicate ones and keep the sequences in the file in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005157#M297948</link>
    <description>You have a rather annoying practice of not knowing how to do something and then specifying that it be done not only in the shell but in a particular shell. This task would be MUCH prettier and more elegant in Perl, but we can leverage sort, uniq, and grep -q to do what you want in the Korn shell.&lt;BR /&gt;&lt;BR /&gt;----------------------------------------&lt;BR /&gt;#!/usr/bin/ksh&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;TDIR=${TMPDIR:-/var/tmp}&lt;BR /&gt;UNIQUES=${TDIR}/F${$}.uniq&lt;BR /&gt;DUPS=${TDIR}/F${$}.dup&lt;BR /&gt;TFILE=${TDIR}/F${$}.tmp&lt;BR /&gt;&lt;BR /&gt;trap 'eval rm -r ${UNIQUES} ${DUPS} ${TFILE}' 0 1 2 3 15&lt;BR /&gt;&lt;BR /&gt;# Copy stdin to a temp file&lt;BR /&gt;rm -f ${TFILE} ${DUPS}&lt;BR /&gt;while read X&lt;BR /&gt;  do&lt;BR /&gt;    echo "${X}" &amp;gt;&amp;gt; ${TFILE}&lt;BR /&gt;  done&lt;BR /&gt;# Sort temp file and find unique words&lt;BR /&gt;sort ${TFILE} | uniq -u &amp;gt; ${UNIQUES}&lt;BR /&gt;echo "\c" &amp;gt; ${DUPS} # null file&lt;BR /&gt;# Now read temp file; if word is unique echo it&lt;BR /&gt;cat ${TFILE} | while read X &lt;BR /&gt;  do&lt;BR /&gt;    grep -q "${X}" ${UNIQUES}&lt;BR /&gt;    STAT=${?}&lt;BR /&gt;    if [[ ${STAT} -eq 0 ]]&lt;BR /&gt;      then&lt;BR /&gt;        echo "${X}" &lt;BR /&gt;      else&lt;BR /&gt;#       not found in Unique file; see if it is in dups&lt;BR /&gt;        grep -q "${X}" ${DUPS}&lt;BR /&gt;        STAT=${?}&lt;BR /&gt;        if [[ ${STAT} -ne 0 ]]&lt;BR /&gt;          then # not already written; echo to stdout and insert in dups file&lt;BR /&gt;            echo "${X}"&lt;BR /&gt;            echo "${X}" &amp;gt;&amp;gt; ${DUPS} &lt;BR /&gt;          fi&lt;BR /&gt;      fi&lt;BR /&gt;  done&lt;BR /&gt;exit 0  &lt;BR /&gt;-----------------------------------------&lt;BR /&gt;&lt;BR /&gt;Use it like this:&lt;BR /&gt;removedups.sh &amp;lt; infile &amp;gt; outfile&lt;BR /&gt;&lt;BR /&gt;What it does is first copy each line of stdin to a temporary file.
Next, that temporary file is sorted and passed to uniq -u to create a second temporary file containing only unique lines. Now we reread the temporary file and use grep -q to determine if the line is unique. If so, we echo it to stdout. If not, we now need to determine if this is the first time that the duplicate word has been echo'ed. We use grep to examine a third temporary file to see if the word is found; if not, we echo the line to stdout and also append it to the third temporary file. When finished, a trap removes all the temporary files; your duplicates have been removed and the original order has been preserved.&lt;BR /&gt;&lt;BR /&gt;NOTE: This still should have been done in Perl.&lt;BR /&gt;</description>
    <pubDate>Tue, 22 May 2007 14:57:08 GMT</pubDate>
    <dc:creator>A. Clay Stephenson</dc:creator>
    <dc:date>2007-05-22T14:57:08Z</dc:date>
    <item>
      <title>how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005152#M297943</link>
      <description>I have a file that contains multiple items (words), one item per line. Some of the items are duplicated. I want to remove the repeated ones and leave a single copy of each in the file. I also want to keep the original sequence of these items (which means I cannot use sort -u). How do I achieve that using ksh?&lt;BR /&gt;&lt;BR /&gt;Thanks in advance</description>
      <pubDate>Tue, 22 May 2007 14:23:17 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005152#M297943</guid>
      <dc:creator>Hanry Zhou</dc:creator>
      <dc:date>2007-05-22T14:23:17Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005153#M297944</link>
      <description>Hi Hanry,&lt;BR /&gt;&lt;BR /&gt;What about the "uniq" command?&lt;BR /&gt;&lt;BR /&gt;Robert-Jan</description>
      <pubDate>Tue, 22 May 2007 14:41:01 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005153#M297944</guid>
      <dc:creator>Robert-Jan Goossens</dc:creator>
      <dc:date>2007-05-22T14:41:01Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005154#M297945</link>
      <description>A script like this could do the job:&lt;BR /&gt;&lt;BR /&gt;OLDFILE=/tmp/original_file&lt;BR /&gt;NEWFILE=/tmp/new_file&lt;BR /&gt;&lt;BR /&gt;touch $NEWFILE&lt;BR /&gt;for LINE in `cat $OLDFILE`&lt;BR /&gt;do&lt;BR /&gt;  EXISTS=`grep -w $LINE $NEWFILE | wc -l`&lt;BR /&gt;  if [ $EXISTS -eq 0 ]&lt;BR /&gt;  then&lt;BR /&gt;  # The word is not in the new file yet&lt;BR /&gt;  echo $LINE &amp;gt;&amp;gt; $NEWFILE&lt;BR /&gt;  fi&lt;BR /&gt;done</description>
      <pubDate>Tue, 22 May 2007 14:41:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005154#M297945</guid>
      <dc:creator>Ivan Ferreira</dc:creator>
      <dc:date>2007-05-22T14:41:31Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005155#M297946</link>
      <description>The problem with uniq is that the words must be sorted (so that duplicates are adjacent), and he does not want to change the order.</description>
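      <content:encoded><![CDATA[
A small demonstration of the point above, as a sketch only: uniq collapses repeated lines only when they are adjacent. The function name demo_uniq_unsorted and the sample words are made up for illustration.

```shell
# demo_uniq_unsorted is a hypothetical name used only for this illustration.
# The first "apple" is separated from the others by "pear", so uniq keeps it;
# only the two adjacent "apple" lines at the end collapse into one.
demo_uniq_unsorted() {
  printf '%s\n' apple pear apple apple | uniq
}

demo_uniq_unsorted
```

On unsorted input like this, "apple" still shows up twice in the output, which is why uniq alone does not answer the question.
]]></content:encoded>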
      <pubDate>Tue, 22 May 2007 14:44:01 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005155#M297946</guid>
      <dc:creator>Ivan Ferreira</dc:creator>
      <dc:date>2007-05-22T14:44:01Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005156#M297947</link>
      <description>Hi Hanry:&lt;BR /&gt;&lt;BR /&gt;# perl -ne 'push @list,$_ unless $found{$_}++;END{print for (@list)}' file&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt;&lt;BR /&gt;...JRF...</description>
      <pubDate>Tue, 22 May 2007 14:55:40 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005156#M297947</guid>
      <dc:creator>James R. Ferguson</dc:creator>
      <dc:date>2007-05-22T14:55:40Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005157#M297948</link>
      <description>You have a rather annoying practice of not knowing how to do something and then specifying that it be done not only in the shell but in a particular shell. This task would be MUCH prettier and more elegant in Perl, but we can leverage sort, uniq, and grep -q to do what you want in the Korn shell.&lt;BR /&gt;&lt;BR /&gt;----------------------------------------&lt;BR /&gt;#!/usr/bin/ksh&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;TDIR=${TMPDIR:-/var/tmp}&lt;BR /&gt;UNIQUES=${TDIR}/F${$}.uniq&lt;BR /&gt;DUPS=${TDIR}/F${$}.dup&lt;BR /&gt;TFILE=${TDIR}/F${$}.tmp&lt;BR /&gt;&lt;BR /&gt;trap 'eval rm -r ${UNIQUES} ${DUPS} ${TFILE}' 0 1 2 3 15&lt;BR /&gt;&lt;BR /&gt;# Copy stdin to a temp file&lt;BR /&gt;rm -f ${TFILE} ${DUPS}&lt;BR /&gt;while read X&lt;BR /&gt;  do&lt;BR /&gt;    echo "${X}" &amp;gt;&amp;gt; ${TFILE}&lt;BR /&gt;  done&lt;BR /&gt;# Sort temp file and find unique words&lt;BR /&gt;sort ${TFILE} | uniq -u &amp;gt; ${UNIQUES}&lt;BR /&gt;echo "\c" &amp;gt; ${DUPS} # null file&lt;BR /&gt;# Now read temp file; if word is unique echo it&lt;BR /&gt;cat ${TFILE} | while read X &lt;BR /&gt;  do&lt;BR /&gt;    grep -q "${X}" ${UNIQUES}&lt;BR /&gt;    STAT=${?}&lt;BR /&gt;    if [[ ${STAT} -eq 0 ]]&lt;BR /&gt;      then&lt;BR /&gt;        echo "${X}" &lt;BR /&gt;      else&lt;BR /&gt;#       not found in Unique file; see if it is in dups&lt;BR /&gt;        grep -q "${X}" ${DUPS}&lt;BR /&gt;        STAT=${?}&lt;BR /&gt;        if [[ ${STAT} -ne 0 ]]&lt;BR /&gt;          then # not already written; echo to stdout and insert in dups file&lt;BR /&gt;            echo "${X}"&lt;BR /&gt;            echo "${X}" &amp;gt;&amp;gt; ${DUPS} &lt;BR /&gt;          fi&lt;BR /&gt;      fi&lt;BR /&gt;  done&lt;BR /&gt;exit 0  &lt;BR /&gt;-----------------------------------------&lt;BR /&gt;&lt;BR /&gt;Use it like this:&lt;BR /&gt;removedups.sh &amp;lt; infile &amp;gt; outfile&lt;BR /&gt;&lt;BR /&gt;What it does is first copy each line of stdin to a temporary file.
Next, that temporary file is sorted and passed to uniq -u to create a second temporary file containing only unique lines. Now we reread the temporary file and use grep -q to determine if the line is unique. If so, we echo it to stdout. If not, we now need to determine if this is the first time that the duplicate word has been echo'ed. We use grep to examine a third temporary file to see if the word is found; if not, we echo the line to stdout and also append it to the third temporary file. When finished, a trap removes all the temporary files; your duplicates have been removed and the original order has been preserved.&lt;BR /&gt;&lt;BR /&gt;NOTE: This still should have been done in Perl.&lt;BR /&gt;</description>
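      <content:encoded><![CDATA[
One caveat on the grep -q lookups described above: grep treats the word as a regular expression and matches it anywhere in a line, so "cat" would be found in a file that only contains "catalog". The sketch below shows a whole-line, fixed-string test using the POSIX grep options -x and -F. The helper name line_in_file is an assumption for illustration and is not part of the original script, and I have not checked these options against every HP-UX release.

```shell
# line_in_file is a hypothetical helper, not part of the original script.
# -F: treat the text as a fixed string, not a regular expression
# -x: succeed only when an entire line equals the text
# -q: print nothing, just set the exit status
line_in_file() {
  grep -q -x -F -e "$1" "$2"
}

tmp=${TMPDIR:-/tmp}/linecheck.$$
printf '%s\n' catalog > "$tmp"

# Plain grep -q reports a hit because "cat" is a substring of "catalog".
grep -q "cat" "$tmp" && echo "grep -q: cat found (substring match)"
# The whole-line test does not.
line_in_file "cat" "$tmp" || echo "grep -q -x -F: cat not found"
line_in_file "catalog" "$tmp" && echo "grep -q -x -F: catalog found"

rm -f "$tmp"
```

For a list of plain words where none is a substring of another, the two forms behave the same; the difference only shows up with overlapping words or regex metacharacters.
]]></content:encoded>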
      <pubDate>Tue, 22 May 2007 14:57:08 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005157#M297948</guid>
      <dc:creator>A. Clay Stephenson</dc:creator>
      <dc:date>2007-05-22T14:57:08Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005158#M297949</link>
      <description>uniq oldfile &amp;gt; newfile&lt;BR /&gt;&lt;BR /&gt;that is it</description>
      <pubDate>Tue, 22 May 2007 15:23:49 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005158#M297949</guid>
      <dc:creator>Hanry Zhou</dc:creator>
      <dc:date>2007-05-22T15:23:49Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005159#M297950</link>
      <description>Hanry:&lt;BR /&gt;&lt;BR /&gt;&amp;gt; uniq oldfile &amp;gt; newfile&lt;BR /&gt;&lt;BR /&gt;that is it&lt;BR /&gt;&lt;BR /&gt;*NO* it's not, unless the input file is sorted.&lt;BR /&gt;&lt;BR /&gt;...JRF...</description>
      <pubDate>Tue, 22 May 2007 15:28:11 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005159#M297950</guid>
      <dc:creator>James R. Ferguson</dc:creator>
      <dc:date>2007-05-22T15:28:11Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005160#M297951</link>
      <description>I don't need these items in the file to be sorted, so uniq command should work.</description>
      <pubDate>Tue, 22 May 2007 15:48:43 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005160#M297951</guid>
      <dc:creator>Hanry Zhou</dc:creator>
      <dc:date>2007-05-22T15:48:43Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005161#M297952</link>
      <description>Hi James,&lt;BR /&gt;&lt;BR /&gt;You are right. I just found out why I cannot use "uniq". Thanks.</description>
      <pubDate>Tue, 22 May 2007 15:50:52 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005161#M297952</guid>
      <dc:creator>Hanry Zhou</dc:creator>
      <dc:date>2007-05-22T15:50:52Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005162#M297953</link>
      <description>For uniq(1) to work, the repeated lines need to be adjacent. Moreover, sorting the file to make them adjacent will not preserve the original order of the items in the input file. See the man page of uniq(1) for details. The awk construct below might work, so give it a try:&lt;BR /&gt;&lt;BR /&gt;# awk '{x[$1]++;if(x[$1]==1) print $1}' inputfile&lt;BR /&gt;&lt;BR /&gt;~cheers</description>
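      <content:encoded><![CDATA[
The awk command above counts how often each first field has been seen and prints it only on the first sighting. A compact variant of the same idea, keyed on the whole line ($0) rather than the first field, might look like the sketch below. The function name dedup_keep_order and the array name seen are assumptions for illustration.

```shell
# dedup_keep_order is a hypothetical name; it reads stdin and writes stdout.
# seen[$0]++ evaluates to 0 (false) the first time a line appears and to a
# positive count afterwards, so !seen[$0]++ is true only for first occurrences.
# A pattern with no action prints the line.
dedup_keep_order() {
  awk '!seen[$0]++'
}

printf '%s\n' pear apple pear fig apple | dedup_keep_order
```

Because lines are printed as they are first read, the original order is kept. The array holds one entry per distinct line, so memory use grows with the number of distinct lines.
]]></content:encoded>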
      <pubDate>Tue, 22 May 2007 15:55:29 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005162#M297953</guid>
      <dc:creator>Sandman!</dc:creator>
      <dc:date>2007-05-22T15:55:29Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005163#M297954</link>
      <description>Though I personally prefer a 1-line perl for such, I was intrigued to discover how easily this could be done in shell.&lt;BR /&gt;&lt;BR /&gt;cat test.words |&lt;BR /&gt;grep -n '.*' |&lt;BR /&gt;sort -u -t: -k2 |&lt;BR /&gt;sort -t: -1n |&lt;BR /&gt;cut -d: -f2- &amp;gt; test.words.sansdupes&lt;BR /&gt;&lt;BR /&gt;1. Prefix a line number and : to each line&lt;BR /&gt;2. Sort by remainder of line and remove dupes.&lt;BR /&gt;3. Sort by line number&lt;BR /&gt;4. Remove line number&lt;BR /&gt;&lt;BR /&gt;Interesting,</description>
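      <content:encoded><![CDATA[
The four steps above can also be sketched with an explicit tie-break. As far as I know, sort -u is only specified to keep one line per key, not necessarily the first one in the input, so this variant sorts by text and then by line number and keeps the first line of each group itself. It is a sketch under those assumptions, not the pipeline from the post; the function name dedup_by_line_number is made up, and a tab is used as the separator instead of a colon.

```shell
# dedup_by_line_number is a hypothetical name; it reads stdin and writes stdout.
dedup_by_line_number() {
  tab=$(printf '\t')
  # 1. Prefix every line with its line number and a tab.
  awk '{ print NR "\t" $0 }' |
    # 2. Group identical text together; within a group, lowest line number first.
    LC_ALL=C sort -t "$tab" -k2 -k1,1n |
    # 3. Keep only the first line of each group (the earliest occurrence).
    awk '{ key = substr($0, index($0, "\t") + 1) }
         NR == 1 || key != prev { print; prev = key }' |
    # 4. Restore the original order by line number.
    LC_ALL=C sort -n |
    # 5. Strip the line number again.
    cut -f2-
}

printf '%s\n' pear apple pear fig apple | dedup_by_line_number
```

Unlike a shell loop that greps a growing file, the cost here is dominated by the two sorts, and nothing has to be held in an awk array.
]]></content:encoded>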
      <pubDate>Wed, 23 May 2007 08:17:22 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005163#M297954</guid>
      <dc:creator>drb_1</dc:creator>
      <dc:date>2007-05-23T08:17:22Z</dc:date>
    </item>
    <item>
      <title>Re: how to take out duplicate ones and keep the sequences in the file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005164#M297955</link>
      <description>&amp;gt;drb: 1. Prefix a line number and : to each line&lt;BR /&gt;&lt;BR /&gt;Yes, that's how I would do it, except you can refine your steps:&lt;BR /&gt;$ nl -ba -s: -nrz test.words | sort -t: -u -k2,2 | sort -t: -n -k1,1 |&lt;BR /&gt;  cut -d: -f2- &amp;gt; test.words.sansdupes&lt;BR /&gt;&lt;BR /&gt;I'm not sure why you had sort -1n. It worked, but you would be hard pressed to prove it was legal from sort(1).&lt;BR /&gt;&lt;BR /&gt;The problem with Ivan's and Clay's solutions is that they will be really slow if there are lots of lines, because they search each line against all the others.&lt;BR /&gt;&lt;BR /&gt;&amp;gt;Clay: # Copy stdin to a temp file&lt;BR /&gt;&lt;BR /&gt;This can be done with cat - &amp;gt; file&lt;BR /&gt;&lt;BR /&gt;&amp;gt;echo "\c" &amp;gt; ${DUPS} # null file&lt;BR /&gt;&lt;BR /&gt;This can be done with just: &amp;gt; ${DUPS}&lt;BR /&gt;&lt;BR /&gt;&amp;gt; grep -q "${X}" ${UNIQUES}&lt;BR /&gt;&lt;BR /&gt;The only advantage over Ivan's is that the uniques file is smaller.&lt;BR /&gt;&lt;BR /&gt;Sandman's solution trades off memory for speed, so it would be good for small files.</description>
      <pubDate>Thu, 24 May 2007 02:09:51 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-take-out-duplicate-ones-and-keep-the-sequences-in-the/m-p/4005164#M297955</guid>
      <dc:creator>Dennis Handly</dc:creator>
      <dc:date>2007-05-24T02:09:51Z</dc:date>
    </item>
  </channel>
</rss>

