newlib/winsup/cygwin/how-cygheap-works.txt

Copyright 2001 Red Hat Inc., Christopher Faylor

Cygwin has recently adopted something called the "cygwin heap".  This is
an internal heap that is inherited by forked/execed children.  It
consists of process specific information that should be inherited.  So
things like the file descriptor table, the current working directory,
and the chroot value live there.

The cygheap is also used to pass argv information to a child process.
There is a problem here, though.  If you allocate space for argv on the
heap and then exec a process the child process (1) will happily use the
space in the heap.  But what happens when that process execs another
process (2)?  The space used by child process (1) still is being used in
child process (2) but it is basically just a memory leak.

To rectify this problem, memory used by child process 1 is tagged in
such a way that child process 2 will know to delete it.  This is in
cygheap_fixup_in_child.

The cygheap memory allocation functions are adapted from memory
allocators developed by DJ Delorie.  They are similar to early BSD
malloc and are intended to be relatively lightweight and relatively
fast.

How is the cygheap propagated to the child?

Well, it depends if you are running on Windows 9x or Windows NT.

On NT and 9x, just before CreateProcess is about to be called in
fork or exec, a shared memory region is prepared for copying of the
cygwin heap.  This is in cygheap_setup_for_child.  The handle to this
shared memory region is passed to the new process in the 'child_info'
structure.

If there are no handles that need "fixing up" prior to starting another
process, cygheap_setup_for_child will also copy the contents of the
cygwin heap to the shared memory region.

If there are any handles that need "fixing up" prior to invoking
another process (i.e., sockets) then the creation of the shared
memory region and copying of the current cygwin heap is a two
step process.

First the shared memory region is created and the process is started
in a "CREATE_SUSPENDED" state, inheriting the handle.  After the
process is created, the fixup_before_*() functions are called.  These
set information in the heap and duplicate handles in the child, essentially
ensuring that the child's fd table is correct.

(Note that it is vital that the cygwin heap should not grow during this
process.  Currently, there is no guard against this happening so this
operation is not thread safe.)

Meanwhile, back in fork_parent, the function
cygheap_setup_for_child_cleanup is called.  In the simple "one step"
case above, all that happens is that the shared memory is ummapped and
the handle referring to it is closed.

In the two step process, the cygheap is now copied to the shared memory
region, complete with new fdtab info (the child process will see the
updated information as soon as it starts).  Then the memory is unmapped,
the handle is closed, and upon return the child process is started.

It is in the child process that the difference between Windows 9x and
Windows NT becomes evident.

Under Windows NT, the operation is simple.  The shared memory handle is
used to map the information that the parent has set up into the cygheap
location in the child.  This means that the child has a copy of the
cygwin heap existing in "shared memory" but the only process with a view
to this "shared memory" is the child.

Under Windows 9x, due to address limitations, we can't just map the
shared memory region into the cygheap position.  So, instead, the memory
is mapped whereever Windows wants to put it, a new heap region is
allocated at the same place as in the parent, the contents of the shared
memory is *copied* to the new heap, and the shared memory is unmapped.
Simple, huh?

Why do we go to these contortions?  Previous versions (<1.3.3) of cygwin
used to block when creating a child so that the child could copy the
parent's cygheap.  The problem with this was that when a cygwin process
invoked a non-cygwin child, it would block forever waiting for the child
to let it know that it was done copying the heap.  That caused
understandable complaints from people who wanted to run non-cygwin
applications "in the background".

In Cygwin 1.3.3 (and presumably beyond) the location of the cygwin heap
has been fixed to be at the end of the cygwin1.dll address space.
Previously, we let the "OS" choose where to allocate the cygwin heap in
the initial cygwin process and attempted to use this same location in
subsequent cygwin processes started from this parent.

This was basically done to accomodate Windows XP, although there were
sporadic complaints of cygwin heap failures in other pathological
situations with both NT and 9x.  In Windows XP, Microsoft made the
allocation of memory less deterministic.  This is certainly their right.
Cygwin was previously relying on undocumented and "iffy" behavior before.

We're not exactly on completely firm ground now, though.  We are assuming
that there is sufficient space after the cygwin DLL for the allocation
of the cygwin heap.  So far this assumption has proved workable but there
is no guarantee that newer versions of Windows won't break this assumption
too.
add copyrights. 2001-09-14 18:57:32 +02:00			`Copyright 2001 Red Hat Inc., Christopher Faylor`
more words 2001-09-14 18:13:00 +02:00
Another in the how-it-works series. 2001-09-06 18:53:48 +02:00			`Cygwin has recently adopted something called the "cygwin heap". This is`
			`an internal heap that is inherited by forked/execed children. It`
			`consists of process specific information that should be inherited. So`
			`things like the file descriptor table, the current working directory,`
			`and the chroot value live there.`

			`The cygheap is also used to pass argv information to a child process.`
			`There is a problem here, though. If you allocate space for argv on the`
			`heap and then exec a process the child process (1) will happily use the`
			`space in the heap. But what happens when that process execs another`
			`process (2)? The space used by child process (1) still is being used in`
			`child process (2) but it is basically just a memory leak.`

			`To rectify this problem, memory used by child process 1 is tagged in`
			`such a way that child process 2 will know to delete it. This is in`
			`cygheap_fixup_in_child.`

			`The cygheap memory allocation functions are adapted from memory`
			`allocators developed by DJ Delorie. They are similar to early BSD`
			`malloc and are intended to be relatively lightweight and relatively`
			`fast.`
more words 2001-09-14 18:13:00 +02:00
			`How is the cygheap propagated to the child?`

			`Well, it depends if you are running on Windows 9x or Windows NT.`

			`On NT and 9x, just before CreateProcess is about to be called in`
			`fork or exec, a shared memory region is prepared for copying of the`
			`cygwin heap. This is in cygheap_setup_for_child. The handle to this`
			`shared memory region is passed to the new process in the 'child_info'`
			`structure.`

			`If there are no handles that need "fixing up" prior to starting another`
			`process, cygheap_setup_for_child will also copy the contents of the`
			`cygwin heap to the shared memory region.`

			`If there are any handles that need "fixing up" prior to invoking`
			`another process (i.e., sockets) then the creation of the shared`
			`memory region and copying of the current cygwin heap is a two`
			`step process.`

			`First the shared memory region is created and the process is started`
			`in a "CREATE_SUSPENDED" state, inheriting the handle. After the`
			`process is created, the fixup_before_*() functions are called. These`
			`set information in the heap and duplicate handles in the child, essentially`
			`ensuring that the child's fd table is correct.`

			`(Note that it is vital that the cygwin heap should not grow during this`
			`process. Currently, there is no guard against this happening so this`
			`operation is not thread safe.)`

			`Meanwhile, back in fork_parent, the function`
			`cygheap_setup_for_child_cleanup is called. In the simple "one step"`
			`case above, all that happens is that the shared memory is ummapped and`
			`the handle referring to it is closed.`

			`In the two step process, the cygheap is now copied to the shared memory`
			`region, complete with new fdtab info (the child process will see the`
			`updated information as soon as it starts). Then the memory is unmapped,`
			`the handle is closed, and upon return the child process is started.`

			`It is in the child process that the difference between Windows 9x and`
			`Windows NT becomes evident.`

			`Under Windows NT, the operation is simple. The shared memory handle is`
			`used to map the information that the parent has set up into the cygheap`
			`location in the child. This means that the child has a copy of the`
			`cygwin heap existing in "shared memory" but the only process with a view`
			`to this "shared memory" is the child.`

			`Under Windows 9x, due to address limitations, we can't just map the`
			`shared memory region into the cygheap position. So, instead, the memory`
			`is mapped whereever Windows wants to put it, a new heap region is`
			`allocated at the same place as in the parent, the contents of the shared`
			`memory is copied to the new heap, and the shared memory is unmapped.`
			`Simple, huh?`

			`Why do we go to these contortions? Previous versions (<1.3.3) of cygwin`
			`used to block when creating a child so that the child could copy the`
			`parent's cygheap. The problem with this was that when a cygwin process`
			`invoked a non-cygwin child, it would block forever waiting for the child`
			`to let it know that it was done copying the heap. That caused`
			`understandable complaints from people who wanted to run non-cygwin`
			`applications "in the background".`

			`In Cygwin 1.3.3 (and presumably beyond) the location of the cygwin heap`
			`has been fixed to be at the end of the cygwin1.dll address space.`
			`Previously, we let the "OS" choose where to allocate the cygwin heap in`
			`the initial cygwin process and attempted to use this same location in`
			`subsequent cygwin processes started from this parent.`

			`This was basically done to accomodate Windows XP, although there were`
			`sporadic complaints of cygwin heap failures in other pathological`
			`situations with both NT and 9x. In Windows XP, Microsoft made the`
			`allocation of memory less deterministic. This is certainly their right.`
			`Cygwin was previously relying on undocumented and "iffy" behavior before.`

			`We're not exactly on completely firm ground now, though. We are assuming`
			`that there is sufficient space after the cygwin DLL for the allocation`
			`of the cygwin heap. So far this assumption has proved workable but there`
			`is no guarantee that newer versions of Windows won't break this assumption`
			`too.`