<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how to get count of repeated words in a flat file in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761383#M657199</link>
    <description>Raj,&lt;BR /&gt;   That is also a fine solution, but I don't understand why you opted for a pipe. I guess I will never understand the typical Unix thinking involved. I come from VMS land, where for the longest time we did not have pipes. When we got them we understood the costs involved.&lt;BR /&gt;&lt;BR /&gt;Not that it matters for occasional use like here, but why print to a pipe segment and re-count what comes out when you can just count while there and print when done?!&lt;BR /&gt;&lt;BR /&gt;Might I suggest:&lt;BR /&gt;&lt;BR /&gt;$ awk '{for(i=1;i&amp;lt;=NF;++i) if($i~ "^word$") count++} END { print count }' textfile&lt;BR /&gt;&lt;BR /&gt;Of course, due to the simple split by whitespace, that suffers from the same problem as my perl --&amp;gt; array example.&lt;BR /&gt;&lt;BR /&gt;It will not recognize 'word' in *this* example line, due to the quotes.&lt;BR /&gt;&lt;BR /&gt;Using perl you can fix that using \b to split.&lt;BR /&gt;&lt;BR /&gt;$ perl -nle '$w{$_}++ for (split /\b/) }{ for (sort {$w{$b}&amp;lt;=&amp;gt;$w{$a}} keys %w) { print qq($w{$_}\t$_)}' tmp.txt&lt;BR /&gt;&lt;BR /&gt;(but now it counts whitespace as words also)&lt;BR /&gt;&lt;BR /&gt;Hein.&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Sat, 05 Mar 2011 21:10:32 GMT</pubDate>
    <dc:creator>Hein van den Heuvel</dc:creator>
    <dc:date>2011-03-05T21:10:32Z</dc:date>
    <item>
      <title>how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761378#M657194</link>
      <description>I want to know how many times a particular word is repeated in a particular flat file.&lt;BR /&gt;&lt;BR /&gt;I am using the following command:&lt;BR /&gt;&lt;BR /&gt;grep word textfile |wc -l&lt;BR /&gt;&lt;BR /&gt;word is the desired word&lt;BR /&gt;&lt;BR /&gt;textfile is the file I am searching.&lt;BR /&gt;&lt;BR /&gt;But the above command does not give an exact count in some scenarios: if the word is repeated within a line, that line is still counted only once. Please suggest.</description>
      <pubDate>Fri, 04 Mar 2011 12:55:45 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761378#M657194</guid>
      <dc:creator>Gopi Kishore m</dc:creator>
      <dc:date>2011-03-04T12:55:45Z</dc:date>
    </item>
    <item>
      <title>Re: how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761379#M657195</link>
      <description>Hi:&lt;BR /&gt;&lt;BR /&gt;# $ perl -nle '$n++ while m{\bword\b}g;END{print $n}' file&lt;BR /&gt;&lt;BR /&gt;...will look for the string "word" and count every instance in the file argument.  Matches that begin at the start of a line or terminate at the end, as well as matches in the middle of a line, are counted.  If you substituted the string "words", only matches to "words" and not "word" would be found.&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt;&lt;BR /&gt;...JRF...</description>
      <pubDate>Fri, 04 Mar 2011 14:10:39 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761379#M657195</guid>
      <dc:creator>James R. Ferguson</dc:creator>
      <dc:date>2011-03-04T14:10:39Z</dc:date>
    </item>
    <item>
      <title>Re: how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761380#M657196</link>
      <description>You could first start with grep to find the lines then use tr(1) or sed(1) to split up the words into separate lines, then just count that:&lt;BR /&gt;grep word textfile | tr '[:space:]' '\012' | grep -c word</description>
      <pubDate>Sat, 05 Mar 2011 02:46:12 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761380#M657196</guid>
      <dc:creator>Dennis Handly</dc:creator>
      <dc:date>2011-03-05T02:46:12Z</dc:date>
    </item>
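    <!--
    A minimal shell sketch of the tr(1) idea in the post above, under two
    assumptions that are not in the original: a word is taken to be a run of
    letters, digits, or underscores, and the helper name count_word is
    hypothetical. Splitting on every non-word character (not only whitespace)
    and comparing whole lines avoids counting "words" or "sword" as "word",
    which a plain grep -c word would do.

```shell
# count_word WORD FILE: print how many times WORD occurs as a whole word.
count_word() {
    # Squeeze every run of non-word characters into a single newline so
    # that each word ends up on a line of its own.
    tr -cs '[:alnum:]_' '[\n*]' < "$2" |
        # Count the lines that equal WORD exactly. Appending "" forces a
        # string comparison, and n + 0 prints 0 when nothing matched.
        awk -v w="$1" '($0 "") == w { n++ } END { print n + 0 }'
}
```

    Usage: count_word word textfile
    -->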
    <item>
      <title>Re: how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761381#M657197</link>
      <description>I like JRF's solution.&lt;BR /&gt;&lt;BR /&gt;Pay close attention to the usage of the '\b' regular expression component, which takes no space itself but specifies a word boundary. Just what is needed here, it seems.&lt;BR /&gt;&lt;BR /&gt;Applied to the topic text it reports '5' as the count for the word 'word', which obviously needs to be changed or become a variable for real work.&lt;BR /&gt;&lt;BR /&gt;Depending on exactly what problem you are trying to solve, it may be beneficial to just count all words and then address the selected words for further processing.&lt;BR /&gt;&lt;BR /&gt;Here is a 'one-liner' to demonstrate that:&lt;BR /&gt;&lt;BR /&gt;$ perl -nle '$w{$_}++ for (split) }{ for (sort {$w{$b}&amp;lt;=&amp;gt;$w{$a}} keys %w) { print qq($w{$_}\t$_)}' tmp.txt&lt;BR /&gt;5       the&lt;BR /&gt;5       word&lt;BR /&gt;4       a&lt;BR /&gt;4       is&lt;BR /&gt;3       in&lt;BR /&gt;2       file&lt;BR /&gt;2       repeated&lt;BR /&gt;2       textfile&lt;BR /&gt;:&lt;BR /&gt;&lt;BR /&gt;As you see, it also reports 5 for the word 'word'.&lt;BR /&gt;&lt;BR /&gt;Enjoy,&lt;BR /&gt;Hein&lt;BR /&gt;</description>
      <pubDate>Sat, 05 Mar 2011 15:28:47 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761381#M657197</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2011-03-05T15:28:47Z</dc:date>
    </item>
    <item>
      <title>Re: how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761382#M657198</link>
      <description>Gopi,&lt;BR /&gt;&lt;BR /&gt;$ awk '{for(i=1;i&amp;lt;=NF;++i) if($i~ "^word$") print $i}' textfile| wc -l&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have fun!,&lt;BR /&gt;Raj.</description>
      <pubDate>Sat, 05 Mar 2011 20:44:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761382#M657198</guid>
      <dc:creator>Raj D.</dc:creator>
      <dc:date>2011-03-05T20:44:14Z</dc:date>
    </item>
    <item>
      <title>Re: how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761383#M657199</link>
      <description>Raj,&lt;BR /&gt;   That is also a fine solution, but I don't understand why you opted for a pipe. I guess I will never understand the typical Unix thinking involved. I come from VMS land, where for the longest time we did not have pipes. When we got them we understood the costs involved.&lt;BR /&gt;&lt;BR /&gt;Not that it matters for occasional use like here, but why print to a pipe segment and re-count what comes out when you can just count while there and print when done?!&lt;BR /&gt;&lt;BR /&gt;Might I suggest:&lt;BR /&gt;&lt;BR /&gt;$ awk '{for(i=1;i&amp;lt;=NF;++i) if($i~ "^word$") count++} END { print count }' textfile&lt;BR /&gt;&lt;BR /&gt;Of course, due to the simple split by whitespace, that suffers from the same problem as my perl --&amp;gt; array example.&lt;BR /&gt;&lt;BR /&gt;It will not recognize 'word' in *this* example line, due to the quotes.&lt;BR /&gt;&lt;BR /&gt;Using perl you can fix that using \b to split.&lt;BR /&gt;&lt;BR /&gt;$ perl -nle '$w{$_}++ for (split /\b/) }{ for (sort {$w{$b}&amp;lt;=&amp;gt;$w{$a}} keys %w) { print qq($w{$_}\t$_)}' tmp.txt&lt;BR /&gt;&lt;BR /&gt;(but now it counts whitespace as words also)&lt;BR /&gt;&lt;BR /&gt;Hein.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Sat, 05 Mar 2011 21:10:32 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761383#M657199</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2011-03-05T21:10:32Z</dc:date>
    </item>
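    <!--
    A hedged sketch of one way to keep the count inside awk, as the post above
    suggests, while still recognizing a quoted 'word'. It assumes that
    stripping leading and trailing non-word characters from each field is
    acceptable; the helper name count_word_awk is hypothetical. Unlike the
    perl \b approach, a field such as word's is left as it is and is not
    counted.

```shell
# count_word_awk WORD FILE: count whitespace-separated fields equal to WORD
# after stripping surrounding punctuation such as quotes or commas.
count_word_awk() {
    awk -v w="$1" '
        {
            for (i = 1; i <= NF; ++i) {
                f = $i
                # Remove non-word characters at the start and end of the field.
                gsub(/^[^A-Za-z0-9_]+|[^A-Za-z0-9_]+$/, "", f)
                # Appending "" forces a string comparison.
                if ((f "") == w) count++
            }
        }
        # count + 0 prints 0 instead of an empty line when nothing matched.
        END { print count + 0 }
    ' "$2"
}
```

    Usage: count_word_awk word textfile
    -->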
    <item>
      <title>Re: how to get count of repeated words in a flat file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761384#M657200</link>
      <description>Hein,&lt;BR /&gt;&lt;BR /&gt;That's great, thanks for adding the count. The pipe is not required, as the count can be done inside awk. Thanks! And the perl code is nice, especially the whitespace trick.&lt;BR /&gt;&lt;BR /&gt;Rgds,&lt;BR /&gt;Raj.&lt;BR /&gt;&lt;BR /&gt;Gopi,&lt;BR /&gt;please assign points once you are done.</description>
      <pubDate>Sat, 05 Mar 2011 21:34:25 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/how-to-get-count-of-repeated-words-in-a-flat-file/m-p/4761384#M657200</guid>
      <dc:creator>Raj D.</dc:creator>
      <dc:date>2011-03-05T21:34:25Z</dc:date>
    </item>
  </channel>
</rss>

