
Polish unicode file

 
SOLVED
Phil Smith_6
New Member

Polish unicode file

I am trying to transfer a file from Windows to HP-UX that contains Polish characters żźŻŹłł. The environment on the HP machine is set up to use the pl_PL.iso88592 locale, and my PuTTY session is set up with the same translation settings.

However, I believe the file is being corrupted when I FTP it, perhaps because the transfer is 7-bit rather than 8-bit?

Any ideas on either a) how to create the file on the UNIX machine (including how to type the characters), b) settings to transfer the file correctly, or c) any settings I have omitted on the HP side?

Thank you

Phil
8 REPLIES
Steve Steel
Honored Contributor
Solution

Re: Polish unicode file

Hi


1) These are extended characters, so use 8-bit

2) Use a utf8 locale


You may need a patch like

Patch Name: PHCO_30241

Patch Description: s700_800 11.11 Eastern Europe utf8 locales patch

Creation Date: 04/01/21

Post Date: 04/02/26

Hardware Platforms - OS Releases:
s700: 11.11
s800: 11.11

Products: N/A

Filesets:
International.BULGARIAN,fr=B.11.11,fa=HP-UX_B.11.11_32/64,v=HP
International.CZECH,fr=B.11.11,fa=HP-UX_B.11.11_32/64,v=HP
International.HUNGARIAN,fr=B.11.11,fa=HP-UX_B.11.11_32/64,v=HP
International.POLISH,fr=B.11.11,fa=HP-UX_B.11.11_32/64,v=HP
International.RUMANIAN,fr=B.11.11,fa=HP-UX_B.11.11_32/64,v=HP
International.RUSSIAN,fr=B.11.11,fa=HP-UX_B.11.11_32/64,v=HP


Make sure the terminal stty settings include "cs8", "-istrip" and "-parenb".



Steve Steel
If you want truly to understand something, try to change it. (Kurt Lewin)
Wim Rombauts
Honored Contributor

Re: Polish unicode file

It could indeed be that you have received a file based on the utf8 character set.
Since this character set uses different binary codes to represent characters compared to the iso8859 character set, you can see gibberish if you open the file with an iso8859 locale.
However, most of the standard (English) characters should look OK. If you only see gibberish, something else is going wrong.
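
To make the difference concrete, the sketch below prints the bytes of one Polish character (ż, U+017C) in both encodings. It assumes an iconv that accepts the codeset names UTF-8 and ISO-8859-2; HP-UX iconv may use different names, so treat those as placeholders.

```shell
#!/bin/sh
# Sketch: show how one character is stored in each encoding.
# Octal escapes are used so the result does not depend on how
# this script itself is encoded.

# z-with-dot-above (U+017C) in UTF-8 is two bytes: 0xC5 0xBC.
utf8_hex=$(printf '\305\274' | od -An -tx1 | tr -d ' \n')

# The same character converted to ISO-8859-2 is one byte: 0xBF.
# ASSUMPTION: the codeset names below are the GNU-style ones.
latin2_hex=$(printf '\305\274' | iconv -f UTF-8 -t ISO-8859-2 | od -An -tx1 | tr -d ' \n')

echo "UTF-8:      $utf8_hex"
echo "ISO-8859-2: $latin2_hex"
```

A terminal set to iso88592 shows the two UTF-8 bytes as two unrelated characters, which is the gibberish described above.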
Bill Hassell
Honored Contributor

Re: Polish unicode file

Always transfer special files as binary in ftp. This makes an exact copy and the content of the file is unmodified. The character codes should be correct but the end-of-line terminator may be the Windows CR/LF pair which will show up as extra ^M characters (the CR) in some applications. You can use dos2ux to remove the extra ^M characters.
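
A minimal sketch of the clean-up step, assuming a single-byte encoding such as ISO-8859-2. strip_cr is a hypothetical helper that uses tr as a portable stand-in; on HP-UX, "dos2ux infile > outfile" does the same job. In the ftp client itself, type "binary" before "put".

```shell
#!/bin/sh
# Sketch: remove the CR of each CR/LF pair after a binary transfer.
# strip_cr is a hypothetical helper name, not an HP-UX command.

strip_cr() {
    # Delete every carriage return (0x0D) read from stdin.
    tr -d '\r'
}

# Demonstration: two short lines with Windows line endings.
# Hex dump of the cleaned result: 'A' LF 'B' LF.
cleaned_hex=$(printf 'A\r\nB\r\n' | strip_cr | od -An -tx1 | tr -d ' \n')
echo "$cleaned_hex"
```

This byte-wise approach is not safe for a UTF-16 file, where every character, including CR, occupies two bytes.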


Bill Hassell, sysadmin
Phil Smith_6
New Member

Re: Polish unicode file

Thanks to all so far.

Update:
I have installed the patch.
Transferred in binary (using a few choice ftp clients).
Confirmed $LANG=pl_PL.iso88592

The file is still "corrupted":

# cat abox.txt
Ë Å£START -e - POLTEST - [[BB||zzyyóóDD
SW_QPARAM1,[[BB||zzyyóóDD
STOP

The first characters appear to be the Unicode header, and the English text is fine, but the special characters are incorrect.

Any further suggestions would be appreciated.
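
If the header is indeed a UTF-16 byte-order mark, one way to check it and convert the file is sketched below. The file names are hypothetical stand-ins built for the demonstration, and the codeset names are assumptions that may differ in HP-UX iconv.

```shell
#!/bin/sh
# Sketch: detect a UTF-16 byte-order mark (BOM) and convert to ISO-8859-2.
# abox_sample.txt and abox_latin2.txt are hypothetical file names.

# Build a sample: BOM (FF FE), then l-with-stroke (U+0142) and a
# newline, both stored as UTF-16 little-endian.
printf '\377\376\102\001\012\000' > abox_sample.txt

# Read the first two bytes of the file as hex.
bom=$(dd if=abox_sample.txt bs=1 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')

case "$bom" in
    fffe) src=UTF-16LE ;;
    feff) src=UTF-16BE ;;
    *)    src= ;;
esac

converted_hex=
if [ -n "$src" ]; then
    # Skip the 2-byte BOM, then convert the rest.
    # ASSUMPTION: GNU-style codeset names.
    tail -c +3 abox_sample.txt | iconv -f "$src" -t ISO-8859-2 > abox_latin2.txt
    converted_hex=$(od -An -tx1 abox_latin2.txt | tr -d ' \n')
fi

echo "BOM: $bom  converted bytes: $converted_hex"
rm -f abox_sample.txt abox_latin2.txt
```

In the sample, l-with-stroke becomes the single byte 0xB3, which is its ISO-8859-2 code.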
Steve Steel
Honored Contributor

Re: Polish unicode file

Hi


Did you apply the stty settings I recommended?


Steve Steel
If you want truly to understand something, try to change it. (Kurt Lewin)
Phil Smith_6
New Member

Re: Polish unicode file

Sorry, yes, I did the following:

# stty cs8 -istrip -parenb
# stty -a
speed 9600 baud; line = 0;
rows = 24; columns = 113
min = 4; time = 0;
intr = ^C; quit = ^\; erase = DEL; kill = ^U
eof = ^D; eol = ^@; eol2 ; swtch
stop = ^S; start = ^Q; susp ; dsusp
werase ; lnext
-parenb -parodd cs8 -cstopb hupcl -cread -clocal -loblk -crts
-ignbrk brkint ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl -iuclc
ixon -ixany ixoff -imaxbel -rtsxoff -ctsxon -ienqak
isig icanon -iexten -xcase echo echoe echok -echonl -noflsh
-echoctl -echoprt -echoke -flusho -pendin
opost -olcuc onlcr -ocrnl -onocr -onlret -ofill -ofdel -tostop


H.Merijn Brand (procura
Honored Contributor

Re: Polish unicode file

Just for the record: UTF-8 and UTF-16 (Unicode) are NOT the same as iso8859-1 .. iso8859-15.

In a Unicode environment you do not need a special locale for each country; utf-8 covers them all. I'm using iso10646-1 myself, which covers the whole of Europe. Since Unicode is far more portable than any of the limited iso8859 standards, it might be the better option in the long run.

Enjoy, Have FUN! H.Merijn
Wim Rombauts
Honored Contributor

Re: Polish unicode file

Do this to switch your environment to Unicode:

$ export LC_ALL=univ.utf8

Try to open the file now.