<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic sort file containing null characters in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682969#M611601</link>
    <description>Our application team has a file which contains some null characters. After the file is sorted, the size of the output file is less than the original size.&lt;BR /&gt;&lt;BR /&gt;# sort -k1,1 -o output.data input.data&lt;BR /&gt;# ll&lt;BR /&gt;-rwxr-xr-x   1 vfeng      techserv   18281639 Sep  2 08:32 input.data&lt;BR /&gt;-rw-r--r--   1 vfeng      techserv   16736272 Sep  2 08:42 output.data&lt;BR /&gt;&lt;BR /&gt;I tried this on both our 11iv1 and v2. &lt;BR /&gt;&lt;BR /&gt;I also tried this on Solaris, where the sort works well. &lt;BR /&gt;&lt;BR /&gt;For now, my workaround is to remove the null characters with sed. A couple of years ago, somebody reported the same issue on AIX. Is this a known bug for HP-UX too? &lt;BR /&gt;&lt;BR /&gt;Victor</description>
    <pubDate>Fri, 03 Sep 2010 10:45:30 GMT</pubDate>
    <dc:creator>Victor  Feng</dc:creator>
    <dc:date>2010-09-03T10:45:30Z</dc:date>
    <item>
      <title>sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682969#M611601</link>
      <description>Our application team has a file which contains some null characters. After the file is sorted, the size of the output file is less than the original size.&lt;BR /&gt;&lt;BR /&gt;# sort -k1,1 -o output.data input.data&lt;BR /&gt;# ll&lt;BR /&gt;-rwxr-xr-x   1 vfeng      techserv   18281639 Sep  2 08:32 input.data&lt;BR /&gt;-rw-r--r--   1 vfeng      techserv   16736272 Sep  2 08:42 output.data&lt;BR /&gt;&lt;BR /&gt;I tried this on both our 11iv1 and v2. &lt;BR /&gt;&lt;BR /&gt;I also tried this on Solaris, where the sort works well. &lt;BR /&gt;&lt;BR /&gt;For now, my workaround is to remove the null characters with sed. A couple of years ago, somebody reported the same issue on AIX. Is this a known bug for HP-UX too? &lt;BR /&gt;&lt;BR /&gt;Victor</description>
      <pubDate>Fri, 03 Sep 2010 10:45:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682969#M611601</guid>
      <dc:creator>Victor  Feng</dc:creator>
      <dc:date>2010-09-03T10:45:30Z</dc:date>
    </item>
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682970#M611602</link>
      <description>Actually, it seems to be working. But if you want to get rid of the null characters with sed or Perl (which in my opinion is better than sed),&lt;BR /&gt;there is no need to run sort before applying the workaround.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 03 Sep 2010 11:23:57 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682970#M611602</guid>
      <dc:creator>Hakki Aydin Ucar</dc:creator>
      <dc:date>2010-09-03T11:23:57Z</dc:date>
    </item>
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682971#M611603</link>
      <description>Did you try this?&lt;BR /&gt;&lt;BR /&gt;# sed '/^$/d' input.data &amp;gt; output.data</description>
      <pubDate>Fri, 03 Sep 2010 11:26:07 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682971#M611603</guid>
      <dc:creator>Hakki Aydin Ucar</dc:creator>
      <dc:date>2010-09-03T11:26:07Z</dc:date>
    </item>
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682972#M611604</link>
      <description>Hi Victor:&lt;BR /&gt;&lt;BR /&gt;&amp;gt; Our application team has a file which contains some null characters. After the file is sorted, the size of output file is less than the original size.&lt;BR /&gt;&lt;BR /&gt;A snippet of the first few lines of the input and the output files might be informative.  Use something like 'xd file' so we can see things.&lt;BR /&gt;&lt;BR /&gt;&amp;gt; For now, my workaround is to remove the null characters with sed. &lt;BR /&gt;&lt;BR /&gt;Then what are you trying to do?  You said that "...after the file is sorted, the size of the output file is less than the original..."  Eliminating nulls before the sort would also reduce the file's size.&lt;BR /&gt;&lt;BR /&gt;By the way, constructing a small file with embedded nulls and sorting it doesn't lead to any size change for me (as I would expect).&lt;BR /&gt;&lt;BR /&gt;# cat -etv /tmp/sortme&lt;BR /&gt;ab1^@^@^@def 111$&lt;BR /&gt;ab2^@^@^@def 222$&lt;BR /&gt;ab3^@^@^@def 333$&lt;BR /&gt;&lt;BR /&gt;For example, using a reverse sort for emphasis:&lt;BR /&gt;&lt;BR /&gt;# sort -rk1,1 /tmp/sortme|cat -etv&lt;BR /&gt;ab3^@^@^@def 333$&lt;BR /&gt;ab2^@^@^@def 222$&lt;BR /&gt;ab1^@^@^@def 111$&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt;&lt;BR /&gt;...JRF...&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 03 Sep 2010 12:44:50 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682972#M611604</guid>
      <dc:creator>James R. Ferguson</dc:creator>
      <dc:date>2010-09-03T12:44:50Z</dc:date>
    </item>
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682973#M611605</link>
      <description>&amp;gt;file which contains some null characters.&lt;BR /&gt;&lt;BR /&gt;This is not a text file.  sort(1) has a WARNING:&lt;BR /&gt;For non-text input files, the behaviour is undefined.&lt;BR /&gt;&lt;BR /&gt;&amp;gt;JRF: Eliminating nulls before the sort would also reduce the file's size.&lt;BR /&gt;&lt;BR /&gt;Undefined could mean that any chars in the record after the NUL could be lost.&lt;BR /&gt;But your example doesn't show that.</description>
      <pubDate>Fri, 03 Sep 2010 17:01:20 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682973#M611605</guid>
      <dc:creator>Dennis Handly</dc:creator>
      <dc:date>2010-09-03T17:01:20Z</dc:date>
    </item>
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682974#M611606</link>
      <description>Well, the null characters in this file are different.&lt;BR /&gt;&lt;BR /&gt;Here is how I noticed the nulls. When I open the file with the vi editor, I see the following message:&lt;BR /&gt;"vopx-extract-rn-am.data" 8930 lines, 18277789 characters (3850 nulls)&lt;BR /&gt;&lt;BR /&gt;18277789 + 3850 = 18281639&lt;BR /&gt;&lt;BR /&gt;I can just type w! to save the file, and the nulls will be removed.&lt;BR /&gt;&lt;BR /&gt;-rwx------   1 vfeng      techserv   18277789 Sep  2 09:34 in.txt&lt;BR /&gt;&lt;BR /&gt;Or I can use sed to redirect the input to an output file, and the nulls will be removed too, e.g.:&lt;BR /&gt;sed 's///g' in.txt &amp;gt; out.txt&lt;BR /&gt;sed 's/SOMETHING-NOT-IN-THE-FILE//g' in.txt &amp;gt; out.txt&lt;BR /&gt;sed '/^$/d' in.txt &amp;gt; out.txt&lt;BR /&gt;&lt;BR /&gt;#ll&lt;BR /&gt;-rwx------   1 vfeng      techserv   18281639 Sep  2 09:34 in.txt&lt;BR /&gt;-rw-r-----   1 vfeng      techserv   18277789 Sep  3 14:57 out.txt&lt;BR /&gt;&lt;BR /&gt;Then sort will work well on out.txt.&lt;BR /&gt;&lt;BR /&gt;Here are a few lines of the file:&lt;BR /&gt;AZ010  90001AMEND - POLICY CHANGE               999N KAT      &lt;BR /&gt;AZ010  90002AMEND - POLICY CHANGE               999N KAT                  &lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Victor&lt;BR /&gt;</description>
      <pubDate>Fri, 03 Sep 2010 18:18:35 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682974#M611606</guid>
      <dc:creator>Victor  Feng</dc:creator>
      <dc:date>2010-09-03T18:18:35Z</dc:date>
    </item>
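    <!--
    Editorial sketch, not part of the original thread: the byte arithmetic in the post above (18277789 + 3850 = 18281639) can be checked without relying on vi's message or on sed's side effect. The helper names count_nuls and strip_nuls are hypothetical, and the sample bytes are made up in the style of the small test file shown earlier in the thread.

```python
def count_nuls(data: bytes) -> int:
    """Return the number of NUL (0x00) bytes in data."""
    return data.count(b"\x00")


def strip_nuls(data: bytes) -> bytes:
    """Return data with every NUL byte removed; all other bytes are kept."""
    return data.replace(b"\x00", b"")


# Made-up sample: two lines, each with three embedded NUL bytes.
sample = b"ab1\x00\x00\x00def 111\nab2\x00\x00\x00def 222\n"
cleaned = strip_nuls(sample)

# Size with NULs equals size without NULs plus the NUL count,
# the same relation as 18277789 + 3850 = 18281639 in the post.
assert len(sample) == len(cleaned) + count_nuls(sample)
print(count_nuls(sample), len(sample), len(cleaned))
```

    On a real file, the data would presumably be read in binary mode, for example with open(path, "rb"), so that the NUL bytes reach the helpers unchanged.
    -->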
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682975#M611607</link>
      <description>Hi (again) Victor:&lt;BR /&gt;&lt;BR /&gt;I too can observe that 'vi' and the 'sed' substitution as you used it will eliminate the nulls.  In my hands, either on an 11.11 or an 11.31 machine, the 'sort' *fails* to cause the loss of characters.&lt;BR /&gt;&lt;BR /&gt;While I can accept 'vi' eliminating the null characters (because it warns you that they are present), I do not agree with 'sed's behavior when one does:&lt;BR /&gt;&lt;BR /&gt;# sed -e '/^$/d' &lt;BR /&gt;&lt;BR /&gt;This should eliminate lines consisting only of a newline --- i.e. an "empty" line, in my opinion.  I observe the same behavior you do.&lt;BR /&gt;&lt;BR /&gt;&amp;gt; Here is a few line of files&lt;BR /&gt;&lt;BR /&gt;This isn't helpful.  If you used 'cat -etv' or 'xd' to list the file(s) we could see where null characters occur.  This is why I used it in my examples.&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt;&lt;BR /&gt;...JRF...&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 03 Sep 2010 20:01:35 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682975#M611607</guid>
      <dc:creator>James R. Ferguson</dc:creator>
      <dc:date>2010-09-03T20:01:35Z</dc:date>
    </item>
    <item>
      <title>Re: sort file containing null characters</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682976#M611608</link>
      <description>&amp;gt;After the file is sorted, the size of output file is less than the original size.&lt;BR /&gt;&lt;BR /&gt;From your numbers, it seems it is a lot less.  1.5 M vs 3.8 K&lt;BR /&gt;&lt;BR /&gt;&amp;gt;-rwxr-xr-x 18281639 Sep 2 08:32 input.data&lt;BR /&gt;&lt;BR /&gt;(It isn't a good idea to have data files be executable.)&lt;BR /&gt;&lt;BR /&gt;&amp;gt;my workaround is to remove the null characters with sed.&lt;BR /&gt;&lt;BR /&gt;Can you compare the sorted files you get by using sort directly and then sort on the file where you removed the NULs?  Also use wc(1) on each.&lt;BR /&gt;&lt;BR /&gt;That might indicate whether records are missing, or just parts of lines.</description>
      <pubDate>Sat, 04 Sep 2010 17:35:19 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/sort-file-containing-null-characters/m-p/4682976#M611608</guid>
      <dc:creator>Dennis Handly</dc:creator>
      <dc:date>2010-09-04T17:35:19Z</dc:date>
    </item>
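    <!--
    Editorial sketch, not part of the original thread: the comparison suggested in the post above (sort the raw file, sort the NUL-free file, then run wc on each) could be expressed roughly as follows. The names wc and diagnose and all sample bytes are hypothetical; the thread itself does not establish which outcome occurs on HP-UX.

```python
from typing import Tuple


def wc(data: bytes) -> Tuple[int, int]:
    """Return (newline count, byte count), similar to wc -l and wc -c."""
    return data.count(b"\n"), len(data)


def diagnose(direct: bytes, cleaned: bytes) -> str:
    """Compare sort output of the raw file with sort output of the NUL-free file."""
    direct_lines, direct_bytes = wc(direct)
    cleaned_lines, cleaned_bytes = wc(cleaned)
    if direct_lines < cleaned_lines:
        # Fewer newlines: whole records are absent from the direct sort.
        return "records missing"
    if direct_bytes < cleaned_bytes:
        # Same line count but fewer bytes: parts of lines were dropped.
        return "lines truncated"
    return "no loss detected"


# Hypothetical outputs: every record is cut off where its first NUL was.
cleaned_sorted = b"ab1def 111\nab2def 222\n"
direct_sorted = b"ab1\nab2\n"
print(diagnose(direct_sorted, cleaned_sorted))
```

    A byte count that is lower while the line count is unchanged would point to truncated lines rather than lost records, which is the distinction the post is asking about.
    -->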
  </channel>
</rss>

