Operating System - OpenVMS

Utility to convert a file (format ASCII) to a file (format UTF-8)

 
GWL_1
Frequent Advisor

Hi,

Is there a utility to convert a file (format ASCII) to a file (format UTF-8) on the VAX/VMS platform?
Is there a utility to convert a file (format ASCII) to a file (format UTF-8) on the AXP/VMS platform?

Thanks in advance,

Geert
7 REPLIES
Alexey Chupahin
Occasional Advisor

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

Please look

$HELP ICONV CONVERT

GWL_1
Frequent Advisor

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

Hi,
And this works for ASCII to UTF-8?
Can you send me the correct (full) command, including the values for /fromcode and /tocode?
Thanks in advance.
Geert
Alexey Chupahin
Occasional Advisor
Solution

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

I don't know the exact codeset names.

Please try
$ local show char

I have no localization installed on my system.
GWL_1
Frequent Advisor

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

Hi,

$ local show char
%LOCALE-W-NOCMAPFND, no character definitions files found

$ show log SYS$I18N_LOCALE
"SYS$I18N_LOCALE" = "SYS$SYSROOT:[SYS$I18N.LOCALES.USER]" (LNM$SYSTEM_TABLE)
= "SYS$SYSROOT:[SYS$I18N.LOCALES.SYSTEM]"

$ dir SYS$I18N_LOCALE
Directory SYS$COMMON:[SYS$I18N.LOCALES.SYSTEM]

EN_US_ISO8859-1.LOCALE;1 UTF8-20.LOCALE;1

Total of 2 files.

How to proceed?

Thanks in advance,

Geert
Hein van den Heuvel
Honored Contributor

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

VMS tools in general deal with data files in terms of records and are blissfully (sic) ignorant of the exact byte contents.

Specifically the CONVERT tool behaves like that.

No tool on VMS really would know what to do with UTF-8 files. So may I assume that your consumer is in the Unix or Windoze space? If so, please consider doing the conversion over there!

If there were a character transformation in a standard tool, it would be in CONV/DOCUMENT or EXCHANGE/NET, but neither does one.

Maybe the TCP/IP suite of tools has something?

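Best I know, ASCII to UTF-8 mapping simply means adding a zero byte to map 8 bits onto 16. [Correction: zero-padding each byte yields UTF-16LE rather than UTF-8; text that is pure 7-bit ASCII is already valid UTF-8 unchanged.]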

http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
http://en.wikipedia.org/wiki/UTF-8
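
The byte-level relationship described at those links can be checked in a few lines. This is only a sketch: Python is not used anywhere in this thread and is an assumption chosen for illustration, and the sample text is simply the three words from the test file further down. It compares the ASCII, UTF-8 and UTF-16LE encodings of the same text.

```python
# Sketch only: Python is an assumption here, not something used in the thread.
text = "aap\nnoot\nmies\n"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16le_bytes = text.encode("utf-16-le")

# Pure 7-bit ASCII is already valid UTF-8, byte for byte.
same_as_utf8 = ascii_bytes == utf8_bytes

# Appending a zero byte to every ASCII byte gives UTF-16 (little-endian),
# which is a different encoding from UTF-8.
padded = b"".join(bytes([b, 0]) for b in ascii_bytes)
padded_is_utf16le = padded == utf16le_bytes

print(same_as_utf8, padded_is_utf16le)
```

Both comparisons hold for any text limited to code points below 128; bytes above 0x7F are where a real conversion becomes necessary.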

If I had to do this on VMS I would brute-force it with a tiny C program or a tiny Perl script, as shown below.

$CREATE e ascii.tmp
aap
noot
mies
$count -e ascii.tmp
ascii.tmp Counted as 3 records, 11 bytes, LRL=4, AVG=3
$perl -pe "s/(.)/\1\000/g" ascii.tmp > utf-8.tmp
$ mcr sys$login:count -e utf-8.tmp
utf-8.tmp Counted as 3 records, 22 bytes, LRL=8, AVG=7
$ type utf-8.tmp
aap
noot
mies
$ dump/reco utf-8.tmp ! Formatted some
0070 00610061 a.a.p........
0074006F 006F006E n.o.o.t......
00730065 0069006D m.i.e.s......
$
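
For comparison, a tiny brute-force program that targets UTF-8 itself might look like the sketch below. It is written in Python purely for illustration (an assumption; the thread only mentions C and Perl), the function name ascii_file_to_utf8 is hypothetical, and it ignores VMS record formats entirely by treating the file as a plain byte stream. It decodes the input strictly as ASCII, so any byte above 0x7F is reported instead of being silently passed through.

```python
import os
import tempfile


def ascii_file_to_utf8(src_path, dst_path):
    """Copy src to dst as UTF-8, failing loudly on any non-ASCII byte."""
    with open(src_path, "rb") as src:
        data = src.read()
    # Strict decode: raises UnicodeDecodeError if a byte above 0x7F appears.
    text = data.decode("ascii")
    with open(dst_path, "wb") as dst:
        dst.write(text.encode("utf-8"))
    return len(data)


def demo():
    """Round-trip the three-word sample file and return the output bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "ascii.tmp")
        dst = os.path.join(tmp, "utf-8.tmp")
        with open(src, "wb") as f:
            f.write(b"aap\nnoot\nmies\n")
        ascii_file_to_utf8(src, dst)
        with open(dst, "rb") as f:
            return f.read()
```

For input that really is 7-bit ASCII the output is byte-identical to the input, so the useful part of such a program is the validation step rather than the copy.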
Hein van den Heuvel
Honored Contributor

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

Argh, can I withdraw my reply? :-)
I just saw the word CONV and, knowing the CONVERT code, thought 'Convert is not going to do this'.
But it read ICONV, and it looks like that will.

Hein.
Craig A Berry
Honored Contributor

Re: Utility to convert a file (format ASCII) to a file (format UTF-8)

If you have any moderately recent version of Perl installed, you can use the piconv utility that comes with it.

$ piconv :== @perl_root:[utils]piconv.com
$ piconv
piconv.com;1 [-f from_encoding] [-t to_encoding] [-s string] [files...]
piconv.com;1 -l
piconv.com;1 -r encoding_alias
  -l,--list
      lists all available encodings
  -r,--resolve encoding_alias
      resolve encoding to its (Encode) canonical name
  -f,--from from_encoding
      when omitted, the current locale will be used
  -t,--to to_encoding
      when omitted, the current locale will be used
  -s,--string string
      "string" will be the input instead of STDIN or files
The following are mainly of interest to Encode hackers:
  -D,--debug          show debug information
  -C N | -c | -p      check the validity of the input
  -S,--scheme scheme  use the scheme for conversion
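
To make the -f, -t and -r options above concrete, here is a rough analogue of what they do, sketched in Python rather than Perl (an assumption made for illustration; piconv itself is built on Perl's Encode module, and its canonical encoding names differ from Python's). The helper names resolve and transcode are hypothetical.

```python
import codecs


def resolve(alias):
    """Rough analogue of -r: map an encoding alias to a canonical name."""
    return codecs.lookup(alias).name


def transcode(data, from_encoding, to_encoding):
    """Rough analogue of -f/-t: decode bytes from one encoding, re-encode in another."""
    return data.decode(from_encoding).encode(to_encoding)


# ASCII input comes out unchanged; an ISO 8859-1 byte above 0x7F
# becomes a two-byte UTF-8 sequence.
plain = transcode(b"aap", "ascii", "utf-8")
accented = transcode(b"caf\xe9", "iso8859-1", "utf-8")
```

The second call is the case that matters in practice: if the source files turn out to be ISO 8859-1 rather than strict ASCII, naming the correct source encoding is what makes the conversion come out right.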