console character set setting. * faq: Throughout, change references to User's Guide to references to 1.7 User's Guide temporarily. * faq-setup.html (faq.using.unicode): Rephrase slightly. (faq.using.weirdchars): New FAQ entry for console charset problems.
		
			
				
	
	
		
			534 lines
		
	
	
		
			21 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			534 lines
		
	
	
		
			21 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
<sect1 id="setup-env"><title>Environment Variables</title>
 | 
						|
 | 
						|
<para>
 | 
						|
You may wish to specify settings of several important environment
 | 
						|
variables that affect Cygwin's operation.  Some of these settings need
 | 
						|
to be in effect prior to launching the initial Cygwin session (before
 | 
						|
starting your bash shell, for instance), and are, consequentially, best
 | 
						|
placed in a .bat file.  An initial file is named Cygwin.bat and is created
 | 
						|
in the Cygwin root directory that you specified during setup.  Note that
 | 
						|
the "Cygwin" option of the Start Menu points to Cygwin.bat.  Edit
 | 
						|
Cygwin.bat to your liking or create your own .bat files to start
 | 
						|
Cygwin processes.</para>
 | 
						|
 | 
						|
<para>
 | 
						|
The <envar>CYGWIN</envar> variable is used to configure many global
 | 
						|
settings for the Cygwin runtime system.  Initially you can leave
 | 
						|
<envar>CYGWIN</envar> unset or set it to <literal>tty</literal> (e.g.
 | 
						|
to support job control with ^Z etc...) using a syntax like this in the
 | 
						|
DOS shell, before launching bash.</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
<prompt>C:\></prompt> <userinput>set CYGWIN=tty notitle glob</userinput>
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>
 | 
						|
Locale support is controlled by the <envar>LANG</envar> and
 | 
						|
<envar>LC_xxx</envar> environment variables.  You can set all of them
 | 
						|
but Cygwin itself only honors the variables <envar>LC_ALL</envar>,
 | 
						|
<envar>LC_CTYPE</envar>, and <envar>LANG</envar>, in this order, according
 | 
						|
to the POSIX standard.  The first one found rules.  For a more detailed
 | 
						|
description see <xref linkend="setup-locale"></xref>.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
The <envar>PATH</envar> environment variable is used by Cygwin
 | 
						|
applications as a list of directories to search for executable files
 | 
						|
to run.  This environment variable is converted from Windows format
 | 
						|
(e.g. <filename>C:\Windows\system32;C:\Windows</filename>) to UNIX format
 | 
						|
(e.g., <filename>/cygdrive/c/Windows/system32:/cygdrive/c/Windows</filename>)
 | 
						|
when a Cygwin process first starts.
 | 
						|
Set it so that it contains at least the <filename>x:\cygwin\bin</filename>
 | 
						|
directory where "<filename>x:\cygwin</filename> is the "root" of your
 | 
						|
cygwin installation if you wish to use cygwin tools outside of bash.
 | 
						|
This is usually done by the batch file you're starting your shell with.
 | 
						|
</para>
 | 
						|
 | 
						|
<para> 
 | 
						|
The <envar>HOME</envar> environment variable is used by many programs to
 | 
						|
determine the location of your home directory and we recommend that it be
 | 
						|
defined.  This environment variable is also converted from Windows format
 | 
						|
when a Cygwin process first starts.  It's usually set in the shell
 | 
						|
profile scripts in the /etc directory.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
The <envar>TERM</envar> environment variable specifies your terminal
 | 
						|
type.  It is automatically set to <literal>cygwin</literal> if you have
 | 
						|
not set it to something else.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>The <envar>LD_LIBRARY_PATH</envar> environment variable is used by
 | 
						|
the Cygwin function <function>dlopen ()</function> as a list of
 | 
						|
directories to search for .dll files to load.  This environment variable
 | 
						|
is converted from Windows format to UNIX format when a Cygwin process
 | 
						|
first starts.  Most Cygwin applications do not make use of the
 | 
						|
<function>dlopen ()</function> call and do not need this variable.
 | 
						|
</para>
 | 
						|
 | 
						|
</sect1>
 | 
						|
 | 
						|
<sect1 id="setup-maxmem"><title>Changing Cygwin's Maximum Memory</title>
 | 
						|
 | 
						|
<para>
 | 
						|
Cygwin's heap is extensible.  However, it does start out at a fixed size
 | 
						|
and attempts to extend it may run into memory which has been previously
 | 
						|
allocated by Windows.  In some cases, this problem can be solved by
 | 
						|
adding an entry in the either the <literal>HKEY_LOCAL_MACHINE</literal>
 | 
						|
(to change the limit for all users) or
 | 
						|
<literal>HKEY_CURRENT_USER</literal> (for just the current user) section
 | 
						|
of the registry.  </para>
 | 
						|
 | 
						|
<para>
 | 
						|
Add the <literal>DWORD</literal> value <literal>heap_chunk_in_mb</literal> 
 | 
						|
and set it to the desired memory limit in decimal MB. It is preferred to do 
 | 
						|
this in Cygwin using the <command>regtool</command> program included in the 
 | 
						|
Cygwin package.
 | 
						|
(For more information about <command>regtool</command> or the other Cygwin 
 | 
						|
utilities, see <xref linkend="using-utils"></xref> or use the
 | 
						|
<literal>--help</literal> option of each util.)  You should always be careful 
 | 
						|
when using <command>regtool</command> since damaging your system registry can 
 | 
						|
result in an unusable system.  This example sets memory limit to 1024 MB:
 | 
						|
 | 
						|
<screen>
 | 
						|
regtool -i set /HKLM/Software/Cygwin/heap_chunk_in_mb 1024
 | 
						|
regtool -v list /HKLM/Software/Cygwin
 | 
						|
</screen>
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
Exit all running Cygwin processes and restart them. Memory can be allocated up 
 | 
						|
to the size of the system swap space minus any the size of any running 
 | 
						|
processes. The system swap should be at least as large as the physically 
 | 
						|
installed RAM and can be modified under the System category of the 
 | 
						|
Control Panel.  
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
Here is a small program written by DJ Delorie that tests the 
 | 
						|
memory allocation limit on your system:
 | 
						|
 | 
						|
<screen>
 | 
						|
main()
 | 
						|
{
 | 
						|
  unsigned int bit=0x40000000, sum=0;
 | 
						|
  char *x;
 | 
						|
  
 | 
						|
  while (bit > 4096) 
 | 
						|
  {
 | 
						|
    x = malloc(bit);
 | 
						|
    if (x)
 | 
						|
    sum += bit;
 | 
						|
    bit >>= 1;
 | 
						|
  }
 | 
						|
  printf("%08x bytes (%.1fMb)\n", sum, sum/1024.0/1024.0);
 | 
						|
  return 0;
 | 
						|
}
 | 
						|
</screen>
 | 
						|
 | 
						|
You can compile this program using:
 | 
						|
<screen>
 | 
						|
gcc max_memory.c -o max_memory.exe
 | 
						|
</screen>
 | 
						|
 | 
						|
Run the program and it will output the maximum amount of allocatable memory.
 | 
						|
</para>
 | 
						|
 | 
						|
</sect1>
 | 
						|
 | 
						|
<sect1 id="setup-locale"><title>Internationalization</title>
 | 
						|
 | 
						|
<sect2 id="setup-locale-ov"><title>Overview</title>
 | 
						|
 | 
						|
<para>
 | 
						|
Internationalization support is controlled by the <envar>LANG</envar> and
 | 
						|
<envar>LC_xxx</envar> environment variables.  You can set all of them
 | 
						|
but Cygwin itself only honors the variables <envar>LC_ALL</envar>,
 | 
						|
<envar>LC_CTYPE</envar>, and <envar>LANG</envar>, in this order, according
 | 
						|
to the POSIX standard.  The content of these variables should follow the
 | 
						|
POSIX standard for a locale specifier.  The correct form of a locale
 | 
						|
specifier is</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  language[[_TERRITORY][.charset][@modifier]]
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>"language" is a lowercase two character string per ISO 639-1,
 | 
						|
"TERRITORY" is an uppercase two character string per ISO 3166, charset is
 | 
						|
one of a list of supported character sets, and the modifier doesn't matter
 | 
						|
here (though it might for some applications).  If you're interested in the
 | 
						|
exact description, you can find it in the online publication of the POSIX
 | 
						|
manual pages on the homepage of the
 | 
						|
<ulink url="http://www.opengroup.org/">Open Group</ulink>.</para>
 | 
						|
 | 
						|
<para>Typical locale specifiers are</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  "de_CH"	   language = German, territory = Switzerland, default charset
 | 
						|
  "fr_FR.UTF-8"    language = french, territory = France, charset = UTF-8
 | 
						|
  "ko_KR.eucKR"    language = korean, territory = South Korea, charset = eucKR
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>
 | 
						|
And let's not forget the default locale called "C" or "POSIX"
 | 
						|
which basically only supports plain ASCII code.  If the aforementioned
 | 
						|
environment variables are not set, or set to "C" or "POSIX", you get the
 | 
						|
default ASCII-only behaviour.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
Right now the language and territory content is not evaluated by Cygwin any
 | 
						|
further.  The only important part so far is the character set.  How does that
 | 
						|
work?
 | 
						|
</para>
 | 
						|
 | 
						|
</sect2>
 | 
						|
 | 
						|
<sect2 id="setup-locale-how"><title>How to set the locale</title>
 | 
						|
 | 
						|
<itemizedlist mark="bullet">
 | 
						|
 | 
						|
<listitem><para>
 | 
						|
The default locale is the "C" or "POSIX" locale.  In this locale, basically
 | 
						|
only ASCII characters are supported.  Even if one of the aforementioned
 | 
						|
environment variables are set to something else, it's the application's
 | 
						|
responsibility to call the function <function>setlocale</function>,
 | 
						|
typically like this</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  setlocale (LC_ALL, "");
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>to switch to another locale according to the settings of the
 | 
						|
internationalization environment variables.
 | 
						|
</para></listitem>
 | 
						|
 | 
						|
<listitem><para>
 | 
						|
Assume that you've set one of the aforementioned environment variables to some
 | 
						|
valid POSIX locale value, other than "C" and "POSIX", and assume that you
 | 
						|
call an application which calls <function>setlocale</function> as above.</para>
 | 
						|
 | 
						|
<para>Assume further that you're living in Japan.  You might want to use
 | 
						|
the language code "ja" and the territory "JP", thus setting, say,
 | 
						|
<envar>LANG</envar> to "ja_JP".  You didn't set a character set, so
 | 
						|
what will Cygwin use now?  Easy!  It will use the default Windows ANSI
 | 
						|
codepage of your system, if it's supported by Cygwin.  Hopefully Cygwin
 | 
						|
supports all relevant default ANSI codepages...</para>
 | 
						|
 | 
						|
<note><para>For a list of supported character sets, see
 | 
						|
<xref linkend="setup-locale-charsetlist"></xref>
 | 
						|
</para></note>
 | 
						|
</listitem>
 | 
						|
 | 
						|
<listitem><para>
 | 
						|
You don't want to use the default Windows codepage as character set?
 | 
						|
In that case you have to specify the charset explicitely.  For instance,
 | 
						|
assume you're from Italy and don't want to use the default Windows codepage
 | 
						|
1252, but the more portable ISO-8859-15 character set.  What you can do is
 | 
						|
to set the <envar>LANG</envar> variable in the
 | 
						|
<filename>C:\cygwin\Cygwin.bat</filename> file which is the batch file
 | 
						|
to start a Cygwin session from the "Cygwin" desktop shortcut.</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  @echo off
 | 
						|
 | 
						|
  C:
 | 
						|
  chdir C:\cygwin\bin
 | 
						|
  set LANG=it_IT.ISO-8859-15
 | 
						|
  bash --login -i
 | 
						|
</screen>
 | 
						|
</listitem>
 | 
						|
 | 
						|
<listitem><para>
 | 
						|
Most singlebyte or doublebyte charsets have a disadvantage.  Windows
 | 
						|
filesystems use the Unicode character set in the UTF-16 encoding to store filename information.  Not all characters
 | 
						|
from the Unicode character set are available in a singlebyte or doublebyte
 | 
						|
charset.  While Cygwin has a workaround to access files with unusual
 | 
						|
characters (see <xref linkend="pathnames-unusual"></xref>), a better
 | 
						|
workaround is to use always the UTF-8 character set.  UTF-8 is the only
 | 
						|
multibyte character set which can represent <emphasis>every</emphasis>
 | 
						|
Unicode character.</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  set LANG=es_MX.UTF-8
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>For a description of the Unicode standard, see the homepage of the
 | 
						|
<ulink url="http://www.unicode.org/">Unicode Consortium</ulink>.
 | 
						|
</para></listitem>
 | 
						|
 | 
						|
</itemizedlist>
 | 
						|
 | 
						|
</sect2>
 | 
						|
 | 
						|
<sect2 id="setup-locale-console"><title>The Windows Console character set</title>
 | 
						|
 | 
						|
<para>Most of the time the Windows console is used to run Cygwin applications.
 | 
						|
While terminal emulations like <command>xterm</command> or
 | 
						|
<command>mintty</command> have a distinct way to set the character set
 | 
						|
used for in- and output, the Windows console hasn't such a way, since it's
 | 
						|
not an application in its own right.</para>
 | 
						|
 | 
						|
<para>This problem is solved in Cygwin as follows.  When the first Cygwin
 | 
						|
process is started in a Windows console (either explicitely from cmd.exe,
 | 
						|
or implicitly by, for instance, clicking on the Cygwin desktop icon, or
 | 
						|
running the Cygwin.bat file), the Console character set is determined by the
 | 
						|
setting of the aforementioned internationalization environment variables,
 | 
						|
the same way as described in <xref linkend="setup-locale-how"></xref>.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>However, in contrast to the application's character set, which is
 | 
						|
determined by the <function>setlocale</function> call, the console
 | 
						|
character set stays fixed for all subsequent Cygwin processes started
 | 
						|
from this first Cygwin process in the console.  So, for instance, if
 | 
						|
<envar>LANG</envar> was set to "en_US.UTF-8" when the first Cygwin process
 | 
						|
started, the console is a UTF-8 terminal for the entire Cygwin process
 | 
						|
tree started from this first Cygwin process.</para>
 | 
						|
 | 
						|
<para>You're asking "What is that good for?  Why not switch the console
 | 
						|
character set with the applications requirements?  After all, the
 | 
						|
application knows if it uses localization or not."  That's true, but
 | 
						|
what if the non-localized application calls a remote application which
 | 
						|
itself is localized?  This can happen with <command>ssh</command> or
 | 
						|
<command>rlogin</command>.  Both commands don't have and don't need
 | 
						|
localization and they never call <function>setlocale</function>.  This
 | 
						|
would have the unfortunate effect, that the console would run with the
 | 
						|
ASCII character set alone.  Native characters printed from the remote
 | 
						|
application would not show up correctly on your local console.</para>
 | 
						|
 | 
						|
</sect2>
 | 
						|
 | 
						|
<sect2 id="setup-locale-problems"><title>Potential Problems when using Locales</title>
 | 
						|
 | 
						|
<para>
 | 
						|
You can set the above internationalization variables not only in
 | 
						|
<filename>Cygwin.bat</filename> or in the Windows environment, but also
 | 
						|
in your Cygwin shell on the fly, even switch to yet another character
 | 
						|
set, and yet another.  In bash for instance:</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  <prompt>bash$</prompt> export LC_CTYPE="nl_BE.UTF-8"
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>However, here's a problem.  At the start of the first Cygwin process
 | 
						|
in a session, the Windows environment has to be converted from UTF-16 to
 | 
						|
some singlebyte or multibyte charset.  If the internationalization environment
 | 
						|
variable hasn't been set <emphasis>before</emphasis> starting this process,
 | 
						|
Cygwin has to make an educated guess which charset to use to convert
 | 
						|
the environment itself.  The only reproducible way to do that in the absence
 | 
						|
of <envar>LC_ALL</envar>, <envar>LC_CTYPE</envar>, or <envar>LANG</envar>,
 | 
						|
is to use the current Windows ANSI codepage.</para>
 | 
						|
 | 
						|
<para>As long as the environment only contains ASCII characters, this is
 | 
						|
no problem.  But if it contains native characters, and you're planning
 | 
						|
to use, say, UTF-8, the environment will result in invalid characters in
 | 
						|
the UTF-8 charset.  This would be especially a problem in variables like
 | 
						|
<envar>PATH</envar>.</para>
 | 
						|
 | 
						|
<note><para>Per POSIX, the name of an environment variable should only
 | 
						|
consist of valid ASCII characters, and only of uppercase letters, digits, and
 | 
						|
the underscore for maximum portablilty.</para></note>
 | 
						|
 | 
						|
<para>Symbolic links, too, may pose a problem when switching charsets on
 | 
						|
the fly.  A symbolic link contains the filename of the target file the
 | 
						|
symlink points to.  When a symlink had been created with older versions
 | 
						|
of Cygwin, the current ANSI or OEM character set had been used to store
 | 
						|
the target filename, dependent on the old <envar>CYGWIN</envar>
 | 
						|
environment variable setting <envar>codepage</envar> (see <xref
 | 
						|
linkend="cygwinenv-removed-options"></xref>.  If the target filename
 | 
						|
contains non-ASCII characters and you use another character set than
 | 
						|
your default ANSI/OEM charset, the target filename of the symlink is now
 | 
						|
potentially an invalid character sequence in the new character set.
 | 
						|
This behaviour is not different from the behaviour in other Operating
 | 
						|
Systems.  So, if you suddenly can't access a symlink anymore which
 | 
						|
worked all these years before, maybe it's because you switched to
 | 
						|
another character set.  This doesn't occur with symlinks created with
 | 
						|
Cygwin 1.7 or later.  </para>
 | 
						|
 | 
						|
</sect2>
 | 
						|
 | 
						|
<sect2 id="setup-locale-missing"><title>What does not work?</title>
 | 
						|
 | 
						|
<para>
 | 
						|
Except for <envar>LC_ALL</envar>, <envar>LC_CTYPE</envar>,
 | 
						|
and <envar>LANG</envar>, all other LC_xxx environment variables,
 | 
						|
<envar>LC_COLLATE</envar>, <envar>LC_MESSAGES</envar>,
 | 
						|
<envar>LC_MONETARY</envar>, <envar>LC_NUMERIC</envar>,
 | 
						|
and <envar>LC_TIME</envar>, are ignored right now.  This means, while Cygwin
 | 
						|
supports different character sets, it does <emphasis>not</emphasis> support
 | 
						|
real localization so far.  There's no support for locale-specific monetary
 | 
						|
symbols, for a decimalpoint other than '.', no support for native time
 | 
						|
formats, and no support for native language sorting orders.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>Cygwin's internationalization support is work in progress and we would
 | 
						|
be glad for coding help in this area.</para>
 | 
						|
 | 
						|
</sect2>
 | 
						|
 | 
						|
<sect2 id="setup-locale-charsetlist"><title>List of supported character sets</title>
 | 
						|
 | 
						|
<para>Last but not least, here's the list of currently supported character
 | 
						|
sets.  The left-hand expression is the name of the charset, as you would use
 | 
						|
it in the internationalization environment variables as outlined above.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>The right-hand side is the number of the equivalent Windows
 | 
						|
codepage as well as the Windows name of the codepage.  They are only
 | 
						|
noted here for reference.  Don't try to use the bare codepage number or
 | 
						|
the Windows name of the codepage as charset in locale specifiers, unless
 | 
						|
they happen to be identical with the left-hand side.  Especially in case
 | 
						|
oif the "CPxxx" style charsets, always use them with the trailing "CP".</para>
 | 
						|
 | 
						|
<para>This works:</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  set LC_ALL=en_US.CP437
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>This does <emphasis>not</emphasis> work:</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
  set LC_ALL=en_US.437
 | 
						|
</screen>
 | 
						|
 | 
						|
<para>You can find a full list of Windows codepages on the Microsoft MSDN page
 | 
						|
<ulink url="http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx">Code Page Identifiers</ulink>.</para>
 | 
						|
 | 
						|
<screen>
 | 
						|
    Charset               Codepage
 | 
						|
 | 
						|
    CP437                   437 (OEM United States)
 | 
						|
    CP720                   720 (DOS Arabic)
 | 
						|
    CP737                   737 (OEM Greek)
 | 
						|
    CP775                   775 (OEM Baltic)
 | 
						|
    CP850                   850 (OEM Latin 1, Western European)
 | 
						|
    CP852                   852 (OEM Latin 2, Central European)
 | 
						|
    CP855                   855 (OEM Cyrillic)
 | 
						|
    CP857                   857 (OEM Turkish)
 | 
						|
    CP858                   858 (OEM Latin 1 + Euro Symbol)
 | 
						|
    CP862                   862 (OEM Hebrew)
 | 
						|
    CP866                   866 (OEM Russian)
 | 
						|
    CP874                   874 (ANSI/OEM Thai)
 | 
						|
    CP1125                 1125 (OEM Ukraine)
 | 
						|
    CP1250                 1250 (ANSI Central European)
 | 
						|
    CP1251                 1251 (ANSI Cyrillic)
 | 
						|
    CP1252                 1252 (ANSI Latin 1, Western European)
 | 
						|
    CP1253                 1253 (ANSI Greek)
 | 
						|
    CP1254                 1254 (ANSI Turkish)
 | 
						|
    CP1255                 1255 (ANSI Hebrew)
 | 
						|
    CP1256                 1256 (ANSI Arabic)
 | 
						|
    CP1257                 1257 (ANSI Baltic)
 | 
						|
    CP1258                 1258 (ANSI/OEM Vietnamese)
 | 
						|
 | 
						|
    ISO-8859-1            28591 (ISO-8859-1)
 | 
						|
    ISO-8859-2            28592 (ISO-8859-2)
 | 
						|
    ISO-8859-3            28593 (ISO-8859-3)
 | 
						|
    ISO-8859-4            28594 (ISO-8859-4)
 | 
						|
    ISO-8859-5            28595 (ISO-8859-5)
 | 
						|
    ISO-8859-6            28596 (ISO-8859-6)
 | 
						|
    ISO-8859-7            28597 (ISO-8859-7)
 | 
						|
    ISO-8859-8            28598 (ISO-8859-8)
 | 
						|
    ISO-8859-9            28599 (ISO-8859-9)
 | 
						|
    ISO-8859-10             -   (not available)
 | 
						|
    ISO-8859-11             -   (not available)
 | 
						|
    ISO-8859-13           28563 (ISO-8859-13)
 | 
						|
    ISO-8859-14             -   (not available)
 | 
						|
    ISO-8859-15           28565 (ISO-8859-15)
 | 
						|
    ISO-8859-16             -   (not available)
 | 
						|
 | 
						|
    SJIS                    932 (ANSI/OEM Japanese)
 | 
						|
    GBK                     936 (ANSI/OEM Simplified Chinese)
 | 
						|
    Big5                    950 (ANSI/OEM Traditional Chinese)
 | 
						|
    eucJP                 20932 (EUC Japanese)
 | 
						|
    eucKR                   949 (EUC Korean)
 | 
						|
 | 
						|
    UTF-8                 65001 (UTF-8)
 | 
						|
</screen>
 | 
						|
 | 
						|
</sect2>
 | 
						|
 | 
						|
</sect1>
 | 
						|
 | 
						|
<sect1 id="setup-files"><title>Customizing bash</title>
 | 
						|
 | 
						|
<para>
 | 
						|
To set up bash so that cut and paste work properly, click on the
 | 
						|
"Properties" button of the window, then on the "Misc" tab.  Make sure
 | 
						|
that "QuickEdit mode" and "Insert mode" are checked.  These settings
 | 
						|
will be remembered next time you run bash from that shortcut. Similarly
 | 
						|
you can set the working directory inside the "Program" tab. The entry
 | 
						|
"%HOME%" is valid, but requires that you set <envar>HOME</envar> in
 | 
						|
the Windows environment.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
Your home directory should contain three initialization files
 | 
						|
that control the behavior of bash.  They are
 | 
						|
<filename>.profile</filename>, <filename>.bashrc</filename> and
 | 
						|
<filename>.inputrc</filename>.  The Cygwin base installation creates
 | 
						|
stub files when you start bash for the first time.</para>
 | 
						|
 | 
						|
<para>
 | 
						|
<filename>.profile</filename> (other names are also valid, see the bash man
 | 
						|
page) contains bash commands.  It is executed when bash is started as login
 | 
						|
shell, e.g. from the command <command>bash --login</command>.
 | 
						|
This is a useful place to define and
 | 
						|
export environment variables and bash functions that will be used by bash
 | 
						|
and the programs invoked by bash.  It is a good place to redefine
 | 
						|
<envar>PATH</envar> if needed.  We recommend adding a ":." to the end of
 | 
						|
<envar>PATH</envar> to also search the current working directory (contrary
 | 
						|
to DOS, the local directory is not searched by default).  Also to avoid
 | 
						|
delays you should either <command>unset</command> <envar>MAILCHECK</envar> 
 | 
						|
or define <envar>MAILPATH</envar> to point to your existing mail inbox.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
<filename>.bashrc</filename> is similar to
 | 
						|
<filename>.profile</filename> but is executed each time an interactive
 | 
						|
bash shell is launched.  It serves to define elements that are not
 | 
						|
inherited through the environment, such as aliases. If you do not use
 | 
						|
login shells, you may want to put the contents of
 | 
						|
<filename>.profile</filename> as discussed above in this file
 | 
						|
instead.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
<screen>
 | 
						|
shopt -s nocaseglob
 | 
						|
</screen>
 | 
						|
will allow bash to glob filenames in a case-insensitive manner.
 | 
						|
Note that <filename>.bashrc</filename> is not called automatically for login 
 | 
						|
shells. You can source it from <filename>.profile</filename>.
 | 
						|
</para>
 | 
						|
 | 
						|
<para>
 | 
						|
<filename>.inputrc</filename> controls how programs using the readline
 | 
						|
library (including <command>bash</command>) behave.  It is loaded
 | 
						|
automatically.  For full details see the <literal>Function and Variable
 | 
						|
Index</literal> section of the GNU <systemitem>readline</systemitem> manual.
 | 
						|
Consider the following settings:
 | 
						|
<screen>
 | 
						|
# Ignore case while completing
 | 
						|
set completion-ignore-case on
 | 
						|
# Make Bash 8bit clean
 | 
						|
set meta-flag on
 | 
						|
set convert-meta off
 | 
						|
set output-meta on
 | 
						|
</screen>
 | 
						|
The first command makes filename completion case insensitive, which can
 | 
						|
be convenient in a Windows environment.  The next three commands allow
 | 
						|
<command>bash</command> to display 8-bit characters, useful for
 | 
						|
languages with accented characters.  Note that tools that do not use
 | 
						|
<systemitem>readline</systemitem> for display, such as
 | 
						|
<command>less</command> and <command>ls</command>, require additional
 | 
						|
settings, which could be put in your <filename>.bashrc</filename>:
 | 
						|
<screen>
 | 
						|
alias less='/bin/less -r'
 | 
						|
alias ls='/bin/ls -F --color=tty --show-control-chars'
 | 
						|
</screen>
 | 
						|
</para>
 | 
						|
 | 
						|
</sect1>
 | 
						|
 |