<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic REMOVING UNICODE CHARACTERS from file in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493881#M364037</link>
    <description>Does anyone know how to remove Unicode characters from a file in Unix?</description>
    <pubDate>Tue, 08 Sep 2009 18:46:12 GMT</pubDate>
    <dc:creator>MBacc</dc:creator>
    <dc:date>2009-09-08T18:46:12Z</dc:date>
    <item>
      <title>REMOVING UNICODE CHARACTERS from file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493881#M364037</link>
      <description>Does anyone know how to remove Unicode characters from a file in Unix?</description>
      <pubDate>Tue, 08 Sep 2009 18:46:12 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493881#M364037</guid>
      <dc:creator>MBacc</dc:creator>
      <dc:date>2009-09-08T18:46:12Z</dc:date>
    </item>
    <item>
      <title>Re: REMOVING UNICODE CHARACTERS from file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493882#M364038</link>
      <description>Shalom,&lt;BR /&gt;&lt;BR /&gt;It would help to know which characters specifically, and how they got there. Samba? FTP transfer? Email attachment? If so, how was the file transmitted?&lt;BR /&gt;&lt;BR /&gt;dos2unix (called dos2ux on HP-UX)&lt;BR /&gt;&lt;BR /&gt;See the man page, it might help.&lt;BR /&gt;&lt;BR /&gt;SEP</description>
      <pubDate>Tue, 08 Sep 2009 19:30:40 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493882#M364038</guid>
      <dc:creator>Steven E. Protter</dc:creator>
      <dc:date>2009-09-08T19:30:40Z</dc:date>
    </item>
    <item>
      <title>Re: REMOVING UNICODE CHARACTERS from file</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493883#M364039</link>
      <description>Your problem can be rephrased as "remove everything that is not an ASCII control character or an ASCII printable character".&lt;BR /&gt;&lt;BR /&gt;Phrased this way, the problem is easy to solve with the standard "tr" command.&lt;BR /&gt;&lt;BR /&gt;Example: file.utf8 contains UTF-8 encoded Unicode characters, and file.txt will be the stripped version.&lt;BR /&gt;&lt;BR /&gt;export LC_ALL=C &lt;BR /&gt;tr -dc '[:cntrl:][:print:]' &amp;lt; file.utf8 &amp;gt; file.txt&lt;BR /&gt;unset LC_ALL&lt;BR /&gt;&lt;BR /&gt;Setting the environment variable LC_ALL to C for the duration of this command is important: it explicitly switches off Unicode support and tells tr that only ASCII characters count as "printable", so every byte with a value above 0x7F is deleted.&lt;BR /&gt;&lt;BR /&gt;This can be run as a one-liner too:&lt;BR /&gt;&lt;BR /&gt;LC_ALL=C tr -dc '[:cntrl:][:print:]' &amp;lt; file.utf8 &amp;gt; file.txt&lt;BR /&gt;&lt;BR /&gt;MK</description>
      <pubDate>Tue, 08 Sep 2009 19:45:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/removing-unicode-characters-from-file/m-p/4493883#M364039</guid>
      <dc:creator>Matti_Kurkela</dc:creator>
      <dc:date>2009-09-08T19:45:15Z</dc:date>
    </item>
  </channel>
</rss>

