Operating System - Linux

Ravi S. Banda
Regular Advisor

Special /French characters in a Linux file

We have a file that has French characters in it - a .csv file exported from Excel.

We used sftp and ftp to transfer the file to the Linux server (Red Hat Enterprise Linux 5 Update 4).
Is launching something like x-term the only way to read the file or to check its contents?

A regular telnet or ssh session doesn't display the right contents when I 'cat' or 'view' the file. Is there any other way? I hate to launch x-term every time to verify the contents. Can we change certain settings?

If x-term is showing the right characters, then, can we count on the file being 'understood' by the server properly - for us to proceed with the file - like loading in a database etc.?

LANG=en_US.UTF-8
TERM=vt100

Thanks!
Ravi.
2 REPLIES
Matti_Kurkela
Honored Contributor

Re: Special /French characters in a Linux file

Apparently your regular telnet or ssh client is not configured to expect UTF-8 characters from the remote side. Once you fix this, it should be possible to view those .csv files normally. (NOTE: this is something you must change on your workstation, not on the Linux server.)

Most telnet and SSH clients in Windows workstations will expect ISO-8859-1 characters by default. There is no way for the telnet or SSH protocol to communicate the remote character set to the local side, for various reasons. (For one, the LANG environment variable is usually set by login scripts, which start running only after the telnet or SSH session has already been set up.)
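Until the client can be reconfigured, one possible stopgap is to convert the file on the fly to the character set the terminal expects. The sketch below assumes the file really is UTF-8 and the client really expects ISO-8859-1; the function name to_latin1 and the file name data.csv are made up for illustration.

```shell
#!/bin/sh
# to_latin1 [FILE...]: read UTF-8 from FILE (or stdin) and write
# ISO-8859-1 to stdout. Hypothetical helper; iconv does the real work.
to_latin1() {
    iconv -f UTF-8 -t ISO-8859-1 "$@"
}

# Demo: "café" as UTF-8 is 63 61 66 c3 a9; after conversion the
# accented letter is the single byte e9.
printf 'caf\303\251\n' | to_latin1 | od -An -tx1

# Typical use in the session (data.csv is a placeholder):
#   to_latin1 data.csv
```

This only changes what is sent to the screen; the file on disk stays untouched. iconv will stop with an error if the file contains a character that has no ISO-8859-1 equivalent.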

> If x-term is showing the right characters, then, can we count on the file being 'understood' by the server properly - for us to proceed with the file - like loading in a database etc.?

If you start an x-term in RHEL 5.4 with LANG set to en_US.UTF-8 and don't use any special options, you'll get an x-term that understands and uses UTF-8 characters. As you say it displays the .csv file correctly, I assume the file contains UTF-8 characters.
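Rather than relying on how a terminal renders the file, the UTF-8 assumption can be checked directly. A minimal sketch, assuming iconv is installed (it ships with glibc); the helper name is_utf8 and the sample files are hypothetical.

```shell
#!/bin/sh
# is_utf8 FILE: succeed if FILE decodes cleanly as UTF-8.
# iconv exits non-zero when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Demo with two throwaway files holding the same text in two encodings.
tmpdir=$(mktemp -d)
printf 'caf\303\251;1\n' > "$tmpdir/utf8.csv"     # "café" encoded as UTF-8
printf 'caf\351;1\n'     > "$tmpdir/latin1.csv"   # "café" encoded as ISO-8859-1

for f in "$tmpdir/utf8.csv" "$tmpdir/latin1.csv"; do
    if is_utf8 "$f"; then
        echo "$(basename "$f"): valid UTF-8"
    else
        echo "$(basename "$f"): not valid UTF-8"
    fi
done
rm -rf "$tmpdir"
```

Note that a pure ASCII file also passes this check, since ASCII is a subset of UTF-8. A pass means the bytes are well-formed UTF-8, not that they are necessarily the text you intended.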

If the program you'll be using to process the file honors the locale settings, the program should understand the file correctly. However, some database tools may have separate character set settings: read their documentation before importing.

Many databases can automatically convert data from one character set to another. One character set is chosen as the database's "native" character set. When importing data to the database, the import tool will tell the database engine which character set the data is currently in (based on the information the user has given to the tool). If the character set is not the same as the database's native character set, the database engine will convert the characters before storing them.

Any program that uses the database must also declare which character set it wants to use when communicating with the database. Again, if the chosen character set is not the same as the database's native character set, the database engine will convert the query strings and the results on the fly.

Of course, all these conversions will produce very wrong results if you specify an incorrect character set when importing the data to the database (i.e. if you tell the system the data is in character set A when it really is in character set B).
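The effect of declaring the wrong character set can be reproduced with iconv alone, with no database involved. This is only an illustration of the failure mode, not of any particular import tool; the helper name to_utf8_from is hypothetical.

```shell
#!/bin/sh
# to_utf8_from CHARSET: convert stdin to UTF-8, trusting the caller's
# claim that the input is in CHARSET.
to_utf8_from() {
    iconv -f "$1" -t UTF-8
}

# Correct declaration: ISO-8859-1 bytes, declared as ISO-8859-1.
# "café" becomes the UTF-8 bytes 63 61 66 c3 a9.
printf 'caf\351' | to_utf8_from ISO-8859-1 | od -An -tx1

# Wrong declaration: the data is already UTF-8 but is declared as
# ISO-8859-1. Each byte of the two-byte "é" is converted separately,
# giving 63 61 66 c3 83 c2 a9, which displays as "cafÃ©".
printf 'caf\303\251' | to_utf8_from ISO-8859-1 | od -An -tx1
```

The second case raises no error, because every byte is a valid ISO-8859-1 character. That is why this kind of mistake tends to be noticed only later, when the stored data is displayed.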

MK
Rob Leadbeater
Honored Contributor

Re: Special /French characters in a Linux file

Hi Ravi,

Does "cat -v" help?

Cheers,

Rob
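For reference, a small sketch of what "cat -v" shows for an accented character. It prints each byte above 127 in "M-" notation instead of sending it raw to the terminal, so the output looks the same whatever character set the terminal expects. The sample strings are made up; the notation shown is that of GNU cat.

```shell
#!/bin/sh
# The same word, "café", in two encodings, passed through cat -v.
printf 'caf\303\251\n' | cat -v   # UTF-8 input prints:      cafM-CM-)
printf 'caf\351\n'     | cat -v   # ISO-8859-1 input prints: cafM-i
```

Two "M-" groups for one accented letter suggest UTF-8, while a single group suggests a single-byte character set such as ISO-8859-1.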