newlib/winsup/cygwin/how-cygheap-works.txt

105 lines
5.0 KiB
Plaintext

Copyright 2001 Red Hat Inc., Christopher Faylor
Cygwin has recently adopted something called the "cygwin heap". This is
an internal heap that is inherited by forked/execed children. It
consists of process specific information that should be inherited. So
things like the file descriptor table, the current working directory,
and the chroot value live there.
The cygheap is also used to pass argv information to a child process.
There is a problem here, though. If you allocate space for argv on the
heap and then exec a process the child process (1) will happily use the
space in the heap. But what happens when that process execs another
process (2)? The space used by child process (1) still is being used in
child process (2) but it is basically just a memory leak.
To rectify this problem, memory used by child process 1 is tagged in
such a way that child process 2 will know to delete it. This is in
cygheap_fixup_in_child.
The cygheap memory allocation functions are adapted from memory
allocators developed by DJ Delorie. They are similar to early BSD
malloc and are intended to be relatively lightweight and relatively
fast.
How is the cygheap propagated to the child?
Well, it depends if you are running on Windows 9x or Windows NT.
On NT and 9x, just before CreateProcess is about to be called in
fork or exec, a shared memory region is prepared for copying of the
cygwin heap. This is in cygheap_setup_for_child. The handle to this
shared memory region is passed to the new process in the 'child_info'
structure.
If there are no handles that need "fixing up" prior to starting another
process, cygheap_setup_for_child will also copy the contents of the
cygwin heap to the shared memory region.
If there are any handles that need "fixing up" prior to invoking
another process (i.e., sockets) then the creation of the shared
memory region and copying of the current cygwin heap is a two
step process.
First the shared memory region is created and the process is started
in a "CREATE_SUSPENDED" state, inheriting the handle. After the
process is created, the fixup_before_*() functions are called. These
set information in the heap and duplicate handles in the child, essentially
ensuring that the child's fd table is correct.
(Note that it is vital that the cygwin heap should not grow during this
process. Currently, there is no guard against this happening so this
operation is not thread safe.)
Meanwhile, back in fork_parent, the function
cygheap_setup_for_child_cleanup is called. In the simple "one step"
case above, all that happens is that the shared memory is ummapped and
the handle referring to it is closed.
In the two step process, the cygheap is now copied to the shared memory
region, complete with new fdtab info (the child process will see the
updated information as soon as it starts). Then the memory is unmapped,
the handle is closed, and upon return the child process is started.
It is in the child process that the difference between Windows 9x and
Windows NT becomes evident.
Under Windows NT, the operation is simple. The shared memory handle is
used to map the information that the parent has set up into the cygheap
location in the child. This means that the child has a copy of the
cygwin heap existing in "shared memory" but the only process with a view
to this "shared memory" is the child.
Under Windows 9x, due to address limitations, we can't just map the
shared memory region into the cygheap position. So, instead, the memory
is mapped whereever Windows wants to put it, a new heap region is
allocated at the same place as in the parent, the contents of the shared
memory is *copied* to the new heap, and the shared memory is unmapped.
Simple, huh?
Why do we go to these contortions? Previous versions (<1.3.3) of cygwin
used to block when creating a child so that the child could copy the
parent's cygheap. The problem with this was that when a cygwin process
invoked a non-cygwin child, it would block forever waiting for the child
to let it know that it was done copying the heap. That caused
understandable complaints from people who wanted to run non-cygwin
applications "in the background".
In Cygwin 1.3.3 (and presumably beyond) the location of the cygwin heap
has been fixed to be at the end of the cygwin1.dll address space.
Previously, we let the "OS" choose where to allocate the cygwin heap in
the initial cygwin process and attempted to use this same location in
subsequent cygwin processes started from this parent.
This was basically done to accomodate Windows XP, although there were
sporadic complaints of cygwin heap failures in other pathological
situations with both NT and 9x. In Windows XP, Microsoft made the
allocation of memory less deterministic. This is certainly their right.
Cygwin was previously relying on undocumented and "iffy" behavior before.
We're not exactly on completely firm ground now, though. We are assuming
that there is sufficient space after the cygwin DLL for the allocation
of the cygwin heap. So far this assumption has proved workable but there
is no guarantee that newer versions of Windows won't break this assumption
too.