* setup2.sgml (setup-locale): Mention three character codes per ISO 639-3.
* setup2.sgml (setup-locale): Adapt description to the C using ASCII change in 1.7.2.
		| @@ -1,3 +1,14 @@ | |||||||
|  | 2010-01-17  Corinna Vinschen  <corinna@vinschen.de> | ||||||
|  |  | ||||||
|  | 	* setup2.sgml (setup-locale): Mention three character codes per | ||||||
|  | 	ISO 639-3. | ||||||
|  |  | ||||||
|  | 2010-01-17  Corinna Vinschen  <corinna@vinschen.de> | ||||||
|  | 	    Andy Koppe <andy.koppe@gmail.com> | ||||||
|  |  | ||||||
|  | 	* setup2.sgml (setup-locale): Adapt description to the C using ASCII | ||||||
|  | 	change in 1.7.2. | ||||||
|  |  | ||||||
| 2010-01-16  Christopher Faylor  <me+cygwin@cgf.cx> | 2010-01-16  Christopher Faylor  <me+cygwin@cgf.cx> | ||||||
|  |  | ||||||
| 	* setup-net.sgml: Remove obsolete assertion. | 	* setup-net.sgml: Remove obsolete assertion. | ||||||
|   | |||||||
| @@ -183,8 +183,11 @@ specifier is</para> | |||||||
|   language[[_TERRITORY][.charset][@modifier]] |   language[[_TERRITORY][.charset][@modifier]] | ||||||
| </screen> | </screen> | ||||||
|  |  | ||||||
| <para>"language" is a lowercase two character string per ISO 639-1, | <para>"language" is a lowercase two character string per ISO 639-1, or, | ||||||
| "TERRITORY" is an uppercase two character string per ISO 3166, charset is | if there is no ISO 639-1 code for the language (for instance, "Lower Sorbian"), | ||||||
|  | a three character string per ISO 639-3.</para> | ||||||
|  |  | ||||||
|  | <para>"TERRITORY" is an uppercase two character string per ISO 3166, charset is | ||||||
| one of a list of supported character sets, and the modifier doesn't matter | one of a list of supported character sets, and the modifier doesn't matter | ||||||
| here (though it might for some applications).  If you're interested in the | here (though it might for some applications).  If you're interested in the | ||||||
| exact description, you can find it in the online publication of the POSIX | exact description, you can find it in the online publication of the POSIX | ||||||
| @@ -197,21 +200,23 @@ manual pages on the homepage of the | |||||||
|   "de_CH"	   language = German, territory = Switzerland, default charset |   "de_CH"	   language = German, territory = Switzerland, default charset | ||||||
|   "fr_FR.UTF-8"    language = french, territory = France, charset = UTF-8 |   "fr_FR.UTF-8"    language = french, territory = France, charset = UTF-8 | ||||||
|   "ko_KR.eucKR"    language = korean, territory = South Korea, charset = eucKR |   "ko_KR.eucKR"    language = korean, territory = South Korea, charset = eucKR | ||||||
|  |   "syr_SY"         language = Syriac, territory = Syria, default charset | ||||||
| </screen> | </screen> | ||||||
|  |  | ||||||
| <para> | <para> | ||||||
| At application startup, the application's locale is set to the default | At application startup, the application's locale is set to the default | ||||||
| "C" or "POSIX" locale.  Under Cygwin, this locale defaults to the UTF-8 | "C" or "POSIX" locale.  Under Cygwin 1.7.2 and later, this locale defaults | ||||||
| character set.  If you want to stick to the "C" locale and only change to | to the ASCII character set on the application level.  If you want to stick | ||||||
| another charset, you can define this by setting one of the locale environment | to the "C" locale and only change to another charset, you can define this | ||||||
| variables to "C.charset".  For instance</para> | by setting one of the locale environment variables to "C.charset".  For | ||||||
|  | instance</para> | ||||||
|  |  | ||||||
| <screen> | <screen> | ||||||
|   "C.ISO-8859-1" |   "C.ISO-8859-1" | ||||||
| </screen> | </screen> | ||||||
|  |  | ||||||
| <para>The default locale in the absence of the aforementioned locale | <note><para>The default locale in the absence of the aforementioned locale | ||||||
| environment variables is "C.UTF-8".</para> | environment variables is "C.UTF-8".</para></note> | ||||||
|  |  | ||||||
| <para>Windows uses the UTF-16 charset exclusively to store the names | <para>Windows uses the UTF-16 charset exclusively to store the names | ||||||
| of any object used by the Operating System.  This is especially important | of any object used by the Operating System.  This is especially important | ||||||
| @@ -232,8 +237,8 @@ process.</para> | |||||||
| However, even if one of the locale environment variables is set to | However, even if one of the locale environment variables is set to | ||||||
| some other value than "C", this does <emphasis>only</emphasis> affect | some other value than "C", this does <emphasis>only</emphasis> affect | ||||||
| how Cygwin itself converts filenames.  As the POSIX standard requires, | how Cygwin itself converts filenames.  As the POSIX standard requires, | ||||||
| it's the applications responsibility to activate that locale for its | it's the application's responsibility to activate that locale for its | ||||||
| own purpose, typically by using the call</para> | own purposes, typically by using the call</para> | ||||||
|  |  | ||||||
| <screen> | <screen> | ||||||
|   setlocale (LC_ALL, ""); |   setlocale (LC_ALL, ""); | ||||||
| @@ -244,6 +249,18 @@ lost:  If the application calls setlocale as above, and there is none | |||||||
| of the important locale variables set in the environment, the locale | of the important locale variables set in the environment, the locale | ||||||
| is set to the default locale, which is "C.UTF-8".</para> | is set to the default locale, which is "C.UTF-8".</para> | ||||||
|  |  | ||||||
|  | <para>But what about applications which are not locale-aware?  Per POSIX, | ||||||
|  | they are running in the "C" or "POSIX" locale, which implies the ASCII | ||||||
|  | charset.  The Cygwin DLL itself, however, will nevertheless use the locale | ||||||
|  | set in the environment (or the "C.UTF-8" default locale) for converting | ||||||
|  | filenames etc.</para> | ||||||
|  |  | ||||||
|  | <para>When the locale set in the environment specifies an ASCII charset, | ||||||
|  | for example "C" or "en_US.ASCII", Cygwin will still use UTF-8 | ||||||
|  | under the hood to translate filenames.  This allows for easier | ||||||
|  | interoperability with applications running in the default "C.UTF-8" locale. | ||||||
|  | </para> | ||||||
|  |  | ||||||
| <para> | <para> | ||||||
| Right now the language and territory, as well as the modifier, are not | Right now the language and territory, as well as the modifier, are not | ||||||
| important to Cygwin, except to fix a single problem.  There's a class of | important to Cygwin, except to fix a single problem.  There's a class of | ||||||
| @@ -274,11 +291,6 @@ How does that work?</para> | |||||||
|  |  | ||||||
| <itemizedlist mark="bullet"> | <itemizedlist mark="bullet"> | ||||||
|  |  | ||||||
| <listitem><para> |  | ||||||
| The default locale is the "C" or "POSIX" locale.  Under Cygwin this locale |  | ||||||
| defaults to the UTF-8 character set.</para> |  | ||||||
| </listitem> |  | ||||||
|  |  | ||||||
| <listitem><para> | <listitem><para> | ||||||
| Assume that you've set one of the aforementioned environment variables to some | Assume that you've set one of the aforementioned environment variables to some | ||||||
| valid POSIX locale value, other than "C" and "POSIX".  Assume further that | valid POSIX locale value, other than "C" and "POSIX".  Assume further that | ||||||
|   | |||||||