Operating System - HP-UX

Difference between C.utf8 and en_us.utf8? (points!)

 
Christian Deutsch_1
Esteemed Contributor

Difference between C.utf8 and en_us.utf8? (points!)

Hi folks,

Does anybody know whether there are any differences, and if so which ones, between the C.utf8 and en_US.utf8 locales on HP-UX 11.31?

The first three truly informative answers will be generously rewarded with points!

Thanks, Christian
Yeshua loves you!
Ganesan R
Honored Contributor

Re: Difference between C.utf8 and en_us.utf8? (points!)

Hi,

The C.utf8 locale supports "computer English", whereas the en_US.utf8 locale supports United States English.

Read this document for more details.

https://internal.support.hpe.com/hpesc/docDisplay?cc=us&docId=emr_na-c02722594&lang=en-us

Best wishes,

Ganesh.
Christian Deutsch_1
Esteemed Contributor

Re: Difference between C.utf8 and en_us.utf8? (points!)

Thanks, Ganesan, for the quick answer and the interesting document. However, the document is very high-level, and I could not find in it what the differences (if any) are between C.utf8 and en_US.utf8.

Kind regards, Christian
Yeshua loves you!
Matti_Kurkela
Honored Contributor
Solution

Re: Difference between C.utf8 and en_us.utf8? (points!)

Based on a quick inspection of locale source files in /usr/lib/nls/loc/src:

C = POSIX standards-compliant default locale. Only strict ASCII characters are valid.

C.utf8 = POSIX standards-compliant locale, extended to allow the basic use of UTF-8. No character upper-lower case relationships and collation orders defined beyond ASCII.

(In other words: this sorts non-ASCII characters strictly according to their Unicode character encoding value. It does not understand that upper and lower case "A with diaeresis" are two versions of the same character and should be sorted near each other. For non-Latin alphabets, your guess is as good as mine.)
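A minimal sketch of what "sorted strictly by encoding value" looks like in practice. It uses the plain C locale rather than C.utf8, on the assumption that C is available everywhere; sort then compares bytes, and for UTF-8 text byte order is the same as Unicode code point order. The sample words are made up for illustration.

```shell
# Sort a few sample words in the C locale, where comparison is by byte value.
# For UTF-8 input this is the same as Unicode code point order, so every
# ASCII word comes before any word starting with a non-ASCII letter.
sorted_c=$(printf '%s\n' 'äpfel' 'Zebra' 'apple' 'Äpfel' 'Apple' | LC_ALL=C sort)
printf '%s\n' "$sorted_c"
```

Running the same pipeline with LC_ALL=en_US.utf8 (if that locale is installed) should show the upper and lower case variants grouped together instead.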

For all C.* locales, the default currency symbol is undefined -> POSIX default "$" is used. Thousands separators are not used in large numbers.

en_US.utf8 = American English UTF-8 locale.
It "knows" which non-ASCII Unicode characters are upper/lower case pairs, and sorts them together, upper case immediately before lower case. It also has default sorting orders defined for various non-Latin alphabets.

The currency symbol is "$" and the international version is explicitly defined as "USD ".
A comma is used as a thousands separator.
12-hour time presentation is preferred.
The answer to a Y/N question may also be written out as "yes" or "no" (case insensitive).
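A minimal sketch of how a shell script might honor the locale's own definition of an affirmative answer instead of hard-coding "y". It assumes that "locale yesexpr" prints an extended regular expression, as POSIX specifies; is_yes is a hypothetical helper name, and the fallback pattern is the one POSIX defines for the POSIX locale.

```shell
# is_yes (hypothetical helper): succeed if the argument is an affirmative
# answer according to the current locale.
is_yes() {
    # Ask the current locale for its affirmative-answer pattern; fall back
    # to the POSIX-locale pattern if the lookup fails or returns nothing.
    pattern=$(locale yesexpr 2>/dev/null)
    [ -n "$pattern" ] || pattern='^[yY]'
    # yesexpr is an extended regular expression, hence grep -E.
    printf '%s\n' "$1" | grep -Eq -- "$pattern"
}

# Example use:
#   if is_yes "$reply"; then echo "proceeding"; fi
```

Which answers are accepted depends on the locale in effect when is_yes runs, so the same reply can be treated differently under C.utf8 and en_US.utf8.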

MK
Dennis Handly
Acclaimed Contributor

Re: Difference between C.utf8 and en_us.utf8? (points!)

You can use locale(1) and localedef(1m) to see the differences.

The C locales are for the American nerd.
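Before comparing two locales with locale(1), it helps to confirm that both are installed. A minimal sketch, assuming "locale -a" prints one locale name per line; has_locale is a hypothetical helper name, and the exact spelling of a name (for example utf8 versus UTF-8) can differ between systems.

```shell
# has_locale (hypothetical helper): read a list of locale names on standard
# input, one per line as printed by "locale -a", and succeed only if the
# name given as the argument appears as a complete line.
has_locale() {
    grep -Fxq -- "$1"
}

# Typical use on a live system:
#   locale -a | has_locale en_US.utf8 && echo "en_US.utf8 is installed"
```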
Christian Deutsch_1
Esteemed Contributor

Re: Difference between C.utf8 and en_us.utf8? (points!)

Thanks for your comments, Dennis. I did find them only a little bit useful. Matti's answer was much more helpful for me.

Kind regards, Christian
Yeshua loves you!
Dennis Handly
Acclaimed Contributor

Re: Difference between C.utf8 and en_us.utf8? (points!)

>I did find them only a little bit useful.

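Here is a script fragment to do the comparison:
# Categories to dump. $CATS is left unquoted below so that it splits into separate arguments.
CATS="LC_CTYPE LC_COLLATE LC_MONETARY LC_NUMERIC LC_TIME LC_MESSAGES"
echo "Doing C.utf8:"
# LC_ALL is set rather than LANG so that LC_* variables already in the environment cannot override the choice.
LC_ALL=C.utf8 locale -k $CATS > C.utf8.out

echo "Doing en_US.utf8:"
LC_ALL=en_US.utf8 locale -k $CATS > en_US.utf8.out

# In the diff output, "<" lines come from C.utf8 and ">" lines from en_US.utf8.
echo "< C.utf8"
echo "> en_US.utf8"
diff C.utf8.out en_US.utf8.out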

The differences are in:
int_curr_symbol
currency_symbol
mon_decimal_point
mon_thousands_sep
mon_grouping
negative_sign
int_frac_digits
frac_digits
p_cs_precedes
p_sep_by_space
n_cs_precedes
n_sep_by_space
p_sign_posn
n_sign_posn
crncystr
thousands_sep
grouping
d_t_fmt
d_fmt
t_fmt
yesexpr
noexpr

Instead of localedef(1m), I should have said nl_langinfo(3C).
Christian Deutsch_1
Esteemed Contributor

Re: Difference between C.utf8 and en_us.utf8? (points!)

Thanks, Dennis, that was much more helpful!

I think I now have all the answers and details that I need for this question.

Kind regards, Christian
Yeshua loves you!