<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Need help on creating script to split data in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524731#M701329</link>
    <description>perl -pe'BEGIN{open A,"&amp;gt;file1";open B,"&amp;gt;file2"}select(/\b(34521|43521|45123)\b/?B:/\b(12345|25431|54213|32541)\b/?A:STDOUT)' file&lt;BR /&gt;&lt;BR /&gt;TMTOWTDI&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
    <pubDate>Thu, 14 Apr 2005 06:24:35 GMT</pubDate>
    <dc:creator>H.Merijn Brand (procura</dc:creator>
    <dc:date>2005-04-14T06:24:35Z</dc:date>
    <item>
      <title>Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524727#M701325</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I need some help and guidance to create a shell script that splits data into different files.&lt;BR /&gt;&lt;BR /&gt;I have data in one file that looks like this:&lt;BR /&gt;&lt;BR /&gt;file1:&lt;BR /&gt;|A|LR|&lt;BR /&gt;|B|LR|&lt;BR /&gt;|B|FO|&lt;BR /&gt;|C|LR|&lt;BR /&gt;|D|LR|&lt;BR /&gt;|D|FO|&lt;BR /&gt;|E|LR|&lt;BR /&gt;|F|LR|&lt;BR /&gt;|G|LR|&lt;BR /&gt;|G|FO|&lt;BR /&gt;&lt;BR /&gt;I want to split the "double entry" keys B, D and G into one file and the "single entry" keys A, C, E and F into another file.&lt;BR /&gt;&lt;BR /&gt;I would appreciate it if you could help me do so.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Munawwar&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 14 Apr 2005 05:46:21 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524727#M701325</guid>
      <dc:creator>Ahmad Munawwar</dc:creator>
      <dc:date>2005-04-14T05:46:21Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524728#M701326</link>
      <description>grep -E '^\|(A|C|E|F)\|' file1 &amp;gt; file2&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;grep -E '^\|(B|D|G)\|' file1 &amp;gt; file3&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;SEP</description>
      <pubDate>Thu, 14 Apr 2005 05:52:09 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524728#M701326</guid>
      <dc:creator>Steven E. Protter</dc:creator>
      <dc:date>2005-04-14T05:52:09Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524729#M701327</link>
      <description>perl -pe'BEGIN{open A,"&amp;gt;A";open B,"&amp;gt;B"}select(/\b[BDG]\b/?B:/\b[ACEF]\b/?A:STDOUT)' file&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
      <pubDate>Thu, 14 Apr 2005 06:02:37 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524729#M701327</guid>
      <dc:creator>H.Merijn Brand (procura</dc:creator>
      <dc:date>2005-04-14T06:02:37Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524730#M701328</link>
      <description>Hi Steven,&lt;BR /&gt;&lt;BR /&gt;The thing is that I have about 30,000 such records in one file.&lt;BR /&gt;&lt;BR /&gt;Actually, the first field represents a number.&lt;BR /&gt;&lt;BR /&gt;A = 12345&lt;BR /&gt;B = 34521&lt;BR /&gt;C = 25431&lt;BR /&gt;D = 43521&lt;BR /&gt;E = 54213&lt;BR /&gt;F = 32541&lt;BR /&gt;G = 45123&lt;BR /&gt;&lt;BR /&gt;For duplicate data, i.e. B, D and G,&lt;BR /&gt;both records have the same number.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 14 Apr 2005 06:18:45 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524730#M701328</guid>
      <dc:creator>Ahmad Munawwar</dc:creator>
      <dc:date>2005-04-14T06:18:45Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524731#M701329</link>
      <description>perl -pe'BEGIN{open A,"&amp;gt;file1";open B,"&amp;gt;file2"}select(/\b(34521|43521|45123)\b/?B:/\b(12345|25431|54213|32541)\b/?A:STDOUT)' file&lt;BR /&gt;&lt;BR /&gt;TMTOWTDI&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
      <pubDate>Thu, 14 Apr 2005 06:24:35 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524731#M701329</guid>
      <dc:creator>H.Merijn Brand (procura</dc:creator>
      <dc:date>2005-04-14T06:24:35Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524732#M701330</link>
      <description>Munawwar,&lt;BR /&gt;&lt;BR /&gt;#!/usr/bin/sh&lt;BR /&gt;# Extract unique only&lt;BR /&gt;cut -d'|' -f2 datafile.lis | uniq -u &amp;gt; unique.lis&lt;BR /&gt;# Change the format&lt;BR /&gt;sed "1,$ s/^/^|/" unique.lis &amp;gt; unique2.lis&lt;BR /&gt;sed "1,$ s/$/|/" unique2.lis &amp;gt; unique.lis&lt;BR /&gt;rm unique2.lis&lt;BR /&gt;# Extract Uniques&lt;BR /&gt;grep -f unique.lis datafile.lis &amp;gt; unique.data&lt;BR /&gt;# Extract Duplicates&lt;BR /&gt;grep -vf unique.lis datafile.lis &amp;gt; dup.data&lt;BR /&gt;rm unique.lis&lt;BR /&gt;&lt;BR /&gt;datafile.lis is the input filename&lt;BR /&gt;&lt;BR /&gt;Regards&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 14 Apr 2005 07:09:34 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524732#M701330</guid>
      <dc:creator>Peter Godron</dc:creator>
      <dc:date>2005-04-14T07:09:34Z</dc:date>
    </item>
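A note on the uniq step in the script above: uniq only compares adjacent lines, so that step relies on records with the same key sitting next to each other. If that is not guaranteed, sorting the keys first is one option. A minimal sketch, assuming a small made-up sample in which the two B records are deliberately not adjacent (sample.lis, once.keys and repeated.keys are hypothetical names, not from the thread):

```shell
# Made-up sample: the second B record is not adjacent to the first.
printf '%s\n' '|B|LR|' '|A|LR|' '|B|FO|' '|C|LR|' > sample.lis

# Sort the keys so that uniq sees equal keys on adjacent lines.
cut -d'|' -f2 sample.lis | sort | uniq -u > once.keys      # keys that occur exactly once
cut -d'|' -f2 sample.lis | sort | uniq -d > repeated.keys  # keys that occur more than once

cat once.keys
cat repeated.keys
```

Either key list could then be turned into grep patterns the same way the script above does with unique.lis.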
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524733#M701331</link>
      <description>&lt;BR /&gt;I like the uniq method myself.&lt;BR /&gt;Do you know whether the records are in order and have just a single duplicate per key?&lt;BR /&gt;&lt;BR /&gt;Here is some somewhat convoluted awk to do the job:&lt;BR /&gt;&lt;BR /&gt;----- x.awk ---------&lt;BR /&gt;END{if (dup){print last&amp;gt;&amp;gt;"dups"} else {print last}}&lt;BR /&gt;{ if ($2==key) {&lt;BR /&gt;  print last&amp;gt;&amp;gt;"dups";&lt;BR /&gt;  dup=1;&lt;BR /&gt;  } else {&lt;BR /&gt;    if (dup) {&lt;BR /&gt;      print last&amp;gt;&amp;gt;"dups";&lt;BR /&gt;      dup=0;&lt;BR /&gt;      } else {&lt;BR /&gt;      if (NR&amp;gt;1) {print last};&lt;BR /&gt;      }&lt;BR /&gt;    }&lt;BR /&gt;  last=$0;&lt;BR /&gt;  key=$2;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;It processes the previous record based on whether the current key matches the previous one.&lt;BR /&gt;It has to avoid printing an empty line for the first record, and it has to special-case the end for the very last record. Yikes.&lt;BR /&gt;&lt;BR /&gt;Usage with your sample data in file 'x'&lt;BR /&gt;&lt;BR /&gt;# awk -F"|" -f x.awk x&lt;BR /&gt;|A|LR|&lt;BR /&gt;|C|LR|&lt;BR /&gt;|E|LR|&lt;BR /&gt;|F|LR|&lt;BR /&gt;&lt;BR /&gt;# cat dups&lt;BR /&gt;|B|LR|&lt;BR /&gt;|B|FO|&lt;BR /&gt;|D|LR|&lt;BR /&gt;|D|FO|&lt;BR /&gt;|G|LR|&lt;BR /&gt;|G|FO|&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;If you just have 30,000 records or so, then you can readily suck them into perl and spit them back out based on dups or not:&lt;BR /&gt;----- x.pl -----------&lt;BR /&gt;while (&amp;lt;&amp;gt;) {&lt;BR /&gt; $key = (split(/\|/))[1];&lt;BR /&gt; $records{$key} .= $_;&lt;BR /&gt; }&lt;BR /&gt;open (DUPS, "&amp;gt;dups");&lt;BR /&gt;foreach $key (sort keys %records) {&lt;BR /&gt; $_ = $records{$key};&lt;BR /&gt; if (/\n\|/) {print DUPS} else {print};&lt;BR /&gt; }&lt;BR /&gt;-----------------&lt;BR /&gt;So here each record gets concatenated with any prior data for a given key. If there was nothing, it'll be just that new record.
If there was something, it gets added.&lt;BR /&gt;When all is read, retrieve the key, and the data for the key. If there is a newline + bar in the record, it must have been a dup!&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Usage: # perl x.pl x&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;hth,&lt;BR /&gt;Hein.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 14 Apr 2005 08:17:05 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524733#M701331</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2005-04-14T08:17:05Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524734#M701332</link>
      <description>Great,&lt;BR /&gt;&lt;BR /&gt;Thanks for the input... I will try tomorrow and see which one will work :-)&lt;BR /&gt;&lt;BR /&gt;/munawar</description>
      <pubDate>Thu, 14 Apr 2005 10:09:16 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524734#M701332</guid>
      <dc:creator>Ahmad Munawwar</dc:creator>
      <dc:date>2005-04-14T10:09:16Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524735#M701333</link>
      <description>Let's start with being happy that you assign points, but I'd rather see the points assigned *after* you tried, so we can see what worked and what didn't, and maybe more important *why* (not).&lt;BR /&gt;&lt;BR /&gt;We like feedback as well. This way we can also improve ourselves.&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
      <pubDate>Thu, 14 Apr 2005 10:14:19 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524735#M701333</guid>
      <dc:creator>H.Merijn Brand (procura</dc:creator>
      <dc:date>2005-04-14T10:14:19Z</dc:date>
    </item>
    <item>
      <title>Re: Need help on creating script to split data</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524736#M701334</link>
      <description>I should learn to leave well enough alone...&lt;BR /&gt;&lt;BR /&gt;Here is an alternate perl solution, suitable for much larger files. It makes two passes over the input. The first pass just counts occurrences of each key. The second pass prints each record to the right file based on its key.&lt;BR /&gt;&lt;BR /&gt;---- &lt;BR /&gt;&lt;BR /&gt;$file = shift @ARGV or die "please provide file";&lt;BR /&gt;open (IN,"&amp;lt;$file") or die "Could not open $file";&lt;BR /&gt;while (&amp;lt;IN&amp;gt;) {&lt;BR /&gt; $keys{(split(/\|/))[1]}++;&lt;BR /&gt; }&lt;BR /&gt;open (DUPS, "&amp;gt;dups");&lt;BR /&gt;open (IN,"&amp;lt;$file");&lt;BR /&gt;while (&amp;lt;IN&amp;gt;) {&lt;BR /&gt;  if ($keys{(split(/\|/))[1]} &amp;gt; 1) {print DUPS} else {print};&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;-----------------&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;variant second part:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;while (&amp;lt;IN&amp;gt;) {&lt;BR /&gt;  $filehandle = ($keys{(split(/\|/))[1]} &amp;gt; 1) ? DUPS : STDOUT;&lt;BR /&gt;  print $filehandle $_;&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Cheers,&lt;BR /&gt;Hein.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 14 Apr 2005 10:42:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/need-help-on-creating-script-to-split-data/m-p/3524736#M701334</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2005-04-14T10:42:30Z</dc:date>
    </item>
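The two-pass idea in the last post (count every key first, then route each record by its count) can also be sketched in awk for anyone who prefers to stay in the shell. This is only an illustration under assumptions: the file names datafile.lis, dups.out and singles.out are made up here, and the sample is the data from the original question.

```shell
# Sample data from the original question (hypothetical file name).
printf '%s\n' '|A|LR|' '|B|LR|' '|B|FO|' '|C|LR|' '|D|LR|' '|D|FO|' \
  '|E|LR|' '|F|LR|' '|G|LR|' '|G|FO|' > datafile.lis

# The same file is named twice on purpose:
#   pass 1 (NR == FNR) only counts how often each key (field 2) occurs,
#   pass 2 writes each record to a file chosen by that count.
awk -F'|' '
  NR == FNR     { count[$2]++; next }
  count[$2] > 1 { print > "dups.out"; next }
                { print > "singles.out" }
' datafile.lis datafile.lis
```

Because the counts are complete before any record is written, this does not depend on the records being sorted or on there being exactly one duplicate per key. It does keep one counter per distinct key in memory.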
  </channel>
</rss>

