* cygwinenv.sgml: Move "codepage:xxx" to the removed options section.
Change text accordingly.
* new-features.sgml: Try to explain new way to define character sets.
		| @@ -1,3 +1,9 @@ | ||||
| 2009-03-24  Corinna Vinschen  <corinna@vinschen.de> | ||||
|  | ||||
| 	* cygwinenv.sgml: Move "codepage:xxx" to the removed options section. | ||||
| 	Change text accordingly. | ||||
| 	* new-features.sgml: Try to explain new way to define character sets. | ||||
|  | ||||
| 2009-03-18  Corinna Vinschen  <corinna@vinschen.de> | ||||
|  | ||||
| 	* cygwin-ug-net.in.sgml: Update date. | ||||
|   | ||||
| @@ -11,29 +11,6 @@ by prefixing with <literal>no</literal>.</para> | ||||
|  | ||||
| <itemizedlist mark="bullet"> | ||||
|  | ||||
| <listitem> | ||||
| <para><envar>codepage:[ansi|oem|utf8]</envar> - This option controls | ||||
| which single- or multibyte character set is used for file and console | ||||
| operations.  Windows uses UTF-16 characters internally and this | ||||
| option specifies how 8-bit character sets are converted to UTF-16 and | ||||
| vice versa.  The default setting is <envar>ansi</envar>, which means | ||||
| conversion is based on the current ANSI codepage, typically 1252 in | ||||
| many Western language versions of Windows.  The name originates from the | ||||
| ANSI Latin1 (ISO 8859-1) standard, used in Windows 1.0, though the | ||||
| character sets have since diverged from any standard.  The second | ||||
| setting selects an older, DOS-based character set, containing various | ||||
| line drawing and special characters.  It is called <envar>oem</envar> | ||||
| since it was originally encoded in the firmware of IBM PCs by original | ||||
| equipment manufacturers (OEMs).</para> | ||||
| <para>If you find that some characters (especially non-US or 'graphical' ones) | ||||
| do not display correctly in Cygwin, you can use this option to select an | ||||
| appropriate codepage.  Finally, <envar>utf8</envar> treats all file names | ||||
| and console characters as UTF-8 chars.  Please note that, for correct | ||||
| operation, you have to set the environment variable LANG or LC_ALL to | ||||
| something like "en_US.UTF-8", otherwise many applications will not be | ||||
| able to recognize UTF-8 strings correctly.</para> | ||||
| </listitem> | ||||
|  | ||||
| <listitem> | ||||
| <para><envar>(no)dosfilewarning</envar> - If set, Cygwin will warn the | ||||
| first time a user uses an "MS-DOS" style path name rather than a POSIX-style | ||||
| @@ -194,6 +171,16 @@ information, read the documentation in <xref linkend="mount-table"></xref> and | ||||
| <xref linkend="pathnames-casesensitive"></xref>.</para> | ||||
| </listitem> | ||||
|  | ||||
| <listitem> | ||||
| <para><envar>codepage:[ansi|oem]</envar> - This option controlled | ||||
| which character set was used for file and console operations.  Since Cygwin | ||||
| now does all character conversion by itself, depending on the | ||||
| application's call to the <function>setlocale()</function> function, and in | ||||
| turn on the setting of the environment variables <envar>$LANG</envar>, | ||||
| <envar>$LC_ALL</envar>, or <envar>$LC_CTYPE</envar>, this setting | ||||
| has become useless.</para> | ||||
| </listitem> | ||||
|  | ||||
| <listitem> | ||||
| <para><envar>(no)ntea</envar> -  This option has been removed since it | ||||
| only fakes security which is considered dangerous and useless.  It also | ||||
|   | ||||
| @@ -17,14 +17,19 @@ | ||||
|   are only local to the current session and disappear when the last | ||||
|   Cygwin process in the session exits. | ||||
|  | ||||
| - If a character in a filename cannot be represented in the current | ||||
|   character set, it will be converted to the sequence Ctrl-N + UTF-8 | ||||
|   representation of the character.  This makes it possible to access all | ||||
|   files, even those whose filename has no valid representation in the | ||||
|   current character set (codepage).  To always have a valid string, use | ||||
|   the UTF-8 charset by setting the environment variable $LANG, $LC_ALL, | ||||
|   or $LC_CTYPE to a valid POSIX value, for instance in Cygwin.bat like this: | ||||
|  | ||||
|     set LC_CTYPE=en_US.UTF-8 | ||||
|  | ||||
| - PATH_MAX is now 4096.  Internally, path names can be as long as the | ||||
|   underlying OS can handle (32K). | ||||
|  | ||||
| - UTF-8 filenames are supported now.  So far, this requires setting | ||||
|   the environment variable CYGWIN to contain "codepage:utf8", but this | ||||
|   will likely disappear at some point.  The setting of $LANG or $LC_CTYPE | ||||
|   will be used instead. | ||||
|  | ||||
| - struct dirent now supports d_type, filled out with DT_REG or DT_DIR. | ||||
|   All other file types return as DT_UNKNOWN for performance reasons. | ||||
|  | ||||
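The Ctrl-N escape described in the hunk above can be illustrated with a small sketch. This is not Cygwin's conversion code (which is C code inside the Cygwin DLL); the helper name `encode_filename` and the use of Python's codecs are assumptions made purely for illustration of the scheme as the text states it.

```python
SO = b"\x0e"  # Ctrl-N, the escape byte named in the text


def encode_filename(name: str, charset: str) -> bytes:
    """Encode `name` in `charset`, escaping unrepresentable characters.

    Hypothetical helper: every character the target charset cannot
    represent is written as Ctrl-N followed by its UTF-8 bytes.
    """
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # Fall back to the escape sequence for this one character.
            out += SO + ch.encode("utf-8")
    return bytes(out)


if __name__ == "__main__":
    for charset in ("iso-8859-1", "ascii", "utf-8"):
        print(charset, encode_filename("caf\u00e9", charset))
```

With a UTF-8 charset the fallback branch is never taken, which matches the text's advice that UTF-8 always yields a valid string.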
| @@ -176,6 +181,19 @@ | ||||
| <sect2 id="ov-new1.7-posix"><title>Other POSIX related changes</title> | ||||
|  | ||||
| <screen> | ||||
| - A lot of character sets are supported now via a call to setlocale(). | ||||
|   The setting of the environment variables $LANG, $LC_ALL or $LC_CTYPE will | ||||
|   be used.  For instance, setting $LANG to "de_DE.ISO-8859-15" before | ||||
|   starting a Cygwin session will use the ISO-8859-15 character set in | ||||
|   the entire session.  UTF-8 is supported as well, as in "en_US.UTF-8". | ||||
|  | ||||
|   The full list of supported character sets: "ASCII", "ISO-8859-x" with x | ||||
|   in 1-16, except 12, "UTF-8", Windows codepages "CPxxx", with xxx in | ||||
|   (437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 1125, | ||||
|   1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258), "JIS", "SJIS", | ||||
|   "eucJP", "Big5".  The leading language and territory part (en_US) is not | ||||
|   used by Cygwin yet, but is required for POSIX compatibility. | ||||
|  | ||||
| - Allow multiple concurrent read locks per thread for pthread_rwlock_t. | ||||
|  | ||||
| - Implement pthread_kill(thread, 0) as per POSIX. | ||||
|   | ||||
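As a rough illustration of the new character set selection, the sketch below picks the effective locale string from the environment and checks its charset against the list given in new-features.sgml above. The precedence order (LC_ALL, then LC_CTYPE, then LANG) is the standard POSIX one. The function names and the case-insensitive comparison are assumptions for this example, not Cygwin's implementation.

```python
import os

# Charsets from the list in the text above (compared case-insensitively,
# which is an assumption of this sketch).
SUPPORTED = (
    {"ASCII", "UTF-8", "JIS", "SJIS", "EUCJP", "BIG5"}
    | {f"ISO-8859-{x}" for x in range(1, 17) if x != 12}
    | {f"CP{n}" for n in (437, 720, 737, 775, 850, 852, 855, 857, 858,
                          862, 866, 874, 1125, 1250, 1251, 1252, 1253,
                          1254, 1255, 1256, 1257, 1258)}
)


def effective_ctype(env=None):
    """Return the locale string that governs LC_CTYPE under POSIX rules."""
    env = os.environ if env is None else env
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset and empty are treated alike
            return value
    return "C"


def charset_of(locale_name):
    """Return the charset part of 'lang_TERRITORY.charset', or None."""
    _, dot, rest = locale_name.partition(".")
    if not dot:
        return None
    return rest.split("@", 1)[0]  # drop an optional @modifier


def is_supported(locale_name):
    """True if the charset part appears in the list from the text."""
    charset = charset_of(locale_name)
    return charset is not None and charset.upper() in SUPPORTED


if __name__ == "__main__":
    name = effective_ctype({"LANG": "de_DE.ISO-8859-15"})
    print(name, is_supported(name))
```

Only the part after the dot is checked here, in line with the note that the language and territory part is required for POSIX compatibility but not yet used.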