* setup2.sgml (setup-locale): Mention three character codes per
ISO 639-3.
* setup2.sgml (setup-locale): Adapt description to the C using ASCII
change in 1.7.2.
parent d24015235c
commit 0b8e38dd8b
@@ -1,3 +1,14 @@
+2010-01-17 Corinna Vinschen <corinna@vinschen.de>
+
+* setup2.sgml (setup-locale): Mention three character codes per
+ISO 639-3.
+
+2010-01-17 Corinna Vinschen <corinna@vinschen.de>
+Andy Koppe <andy.koppe@gmail.com>
+
+* setup2.sgml (setup-locale): Adapt description to the C using ASCII
+change in 1.7.2.
+
 2010-01-16 Christopher Faylor <me+cygwin@cgf.cx>
 
 * setup-net.sgml: Remove obsolete assertion.
@@ -183,8 +183,11 @@ specifier is</para>
 language[[_TERRITORY][.charset][@modifier]]
 </screen>
 
-<para>"language" is a lowercase two character string per ISO 639-1,
-"TERRITORY" is an uppercase two character string per ISO 3166, charset is
+<para>"language" is a lowercase two character string per ISO 639-1, or,
+if there is no ISO 639-1 code for the language (for instance, "Lower Sorbian"),
+a three character string per ISO 639-3.</para>
+
+<para>"TERRITORY" is an uppercase two character string per ISO 3166, charset is
 one of a list of supported character sets, and the modifier doesn't matter
 here (though it might for some applications). If you're interested in the
 exact description, you can find it in the online publication of the POSIX
@@ -197,21 +200,23 @@ manual pages on the homepage of the
 "de_CH" language = German, territory = Switzerland, default charset
 "fr_FR.UTF-8" language = french, territory = France, charset = UTF-8
 "ko_KR.eucKR" language = korean, territory = South Korea, charset = eucKR
+"syr_SY" language = Syriac, territory = Syria, default charset
 </screen>
 
 <para>
 At application startup, the application's locale is set to the default
-"C" or "POSIX" locale. Under Cygwin, this locale defaults to the UTF-8
-character set. If you want to stick to the "C" locale and only change to
-another charset, you can define this by setting one of the locale environment
-variables to "C.charset". For instance</para>
+"C" or "POSIX" locale. Under Cygwin 1.7.2 and later, this locale defaults
+to the ASCII character set on the application level. If you want to stick
+to the "C" locale and only change to another charset, you can define this
+by setting one of the locale environment variables to "C.charset". For
+instance</para>
 
 <screen>
 "C.ISO-8859-1"
 </screen>
 
-<para>The default locale in the absence of the aforementioned locale
-environment variables is "C.UTF-8".</para>
+<note><para>The default locale in the absence of the aforementioned locale
+environment variables is "C.UTF-8".</para></note>
 
 <para>Windows uses the UTF-16 charset exclusively to store the names
 of any object used by the Operating System. This is especially important
@@ -232,8 +237,8 @@ process.</para>
 However, even if one of the locale environment variables is set to
 some other value than "C", this does <emphasis>only</emphasis> affect
 how Cygwin itself converts filenames. As the POSIX standard requires,
-it's the applications responsibility to activate that locale for its
-own purpose, typically by using the call</para>
+it's the application's responsibility to activate that locale for its
+own purposes, typically by using the call</para>
 
 <screen>
 setlocale (LC_ALL, "");
@@ -244,6 +249,18 @@ lost: If the application calls setlocale as above, and there is none
 of the important locale variables set in the environment, the locale
 is set to the default locale, which is "C.UTF-8".</para>
 
+<para>But what about applications which are not locale-aware? Per POSIX,
+they are running in the "C" or "POSIX" locale, which implies the ASCII
+charset. The Cygwin DLL itself, however, will nevertheless use the locale
+set in the environment (or the "C.UTF-8" default locale) for converting
+filenames etc.</para>
+
+<para>When the locale set in the environment specifies an ASCII charset,
+for example "C" or "en_US.ASCII", Cygwin will still use UTF-8
+under the hood to translate filenames. This allows for easier
+interoperability with applications running in the default "C.UTF-8" locale.
+</para>
+
 <para>
 Right now the language and territory, as well as the modifier, are not
 important to Cygwin, except to fix a single problem. There's a class of
@@ -274,11 +291,6 @@ How does that work?</para>
 
 <itemizedlist mark="bullet">
 
-<listitem><para>
-The default locale is the "C" or "POSIX" locale. Under Cygwin this locale
-defaults to the UTF-8 character set.</para>
-</listitem>
-
 <listitem><para>
 Assume that you've set one of the aforementioned environment variables to some
 valid POSIX locale value, other than "C" and "POSIX". Assume further that
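Review note (illustration only, not part of the commit): the revised text says an application starts in the "C" locale and must itself call setlocale (LC_ALL, "") to pick up the locale from the environment. A minimal C sketch of that behaviour follows. The helper names `current_codeset` and `activate_env_locale` are hypothetical and introduced only for this example; `setlocale` and `nl_langinfo(CODESET)` are the standard C/POSIX calls. The codeset names reported are platform-dependent, so no specific output is claimed.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the codeset name of the currently active
   LC_CTYPE locale (as reported by nl_langinfo) into buf. */
static void current_codeset(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s", nl_langinfo(CODESET));
}

/* Hypothetical helper: activate the locale named by the LC_ALL, LC_xxx
   or LANG environment variables. Returns 0 on success, or -1 if the
   environment names a locale the C library does not know, in which
   case the previously active locale stays in effect. */
static int activate_env_locale(void)
{
    return setlocale(LC_ALL, "") != NULL ? 0 : -1;
}
```

A program would typically call `current_codeset` once before and once after `activate_env_locale` to compare the startup "C" locale's codeset with the one selected through the environment.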