* DevNotes: Add entry cgf-000010.

* select.cc (set_handle_or_return_if_not_open): Remove unneeded final backslash
from definition.
(cygwin_select): Reorganize to incorporate outer retry loop.  Move remaining
time recalculation here for retry case.  Use select_stuff::wait_states for loop
control.
(select_stuff::cleanup): Avoid unneeded initialization.
(select_stuff::wait): Modify definition to return select_stuff::wait_states.
Eliminate is_cancelable.  Don't inspect element 1 of an array if it is a cancel handle.
Remove loop.  Rely on being called from enclosing loop in cygwin_select.
Remove time recalculation when restarting.  Try harder to always return from
the bottom.
* select.h (select_stuff::wait_states): New enum.
(select_stuff::wait): Modify declaration to return select_stuff::wait_states.
Christopher Faylor
2012-06-03 02:59:20 +00:00
parent faab45455a
commit 45b61a88be
5 changed files with 219 additions and 152 deletions

@@ -1,3 +1,45 @@
2012-06-02 cgf-000010
<1.7.16>
- Fix emacs problem which exposed an issue with Cygwin's select() function.
If a signal arrives while select is blocking and the program longjmps
out of the signal handler then threads and memory may be left hanging.
Fixes: http://cygwin.com/ml/cygwin/2012-05/threads.html#00275
</1.7.16>
This was try #4 or #5 to get select() signal handling working right.
It's still not there but it should now at least not leak memory or
threads.
I mucked with the interface between cygwin_select and select_stuff::wait
so that the "new" loop in select_stuff::wait() was essentially moved
into the caller. cygwin_select now uses various enum states to decide
what to do. It builds the select linked list at the beginning of the
loop, allowing wait() to tear everything down and restart. This is
necessary before calling a signal handler because the signal handler may
longjmp away.
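
A minimal C++ sketch of the control flow described above, assuming a caller-side retry loop driven by an enum result. The names wait_state, select_sketch, and select_like are hypothetical stand-ins, not the identifiers or values used in select.cc or select.h. The sketch also assumes that every interrupted wait is restarted, which the real code need not do.

```cpp
#include <chrono>

// Hypothetical stand-in for select_stuff::wait_states (illustrative names only).
enum class wait_state { ok, error, signalled };

using clock_type = std::chrono::steady_clock;
using ms = std::chrono::milliseconds;

// Hypothetical stand-in for select_stuff: it only counts setup and teardown.
struct select_sketch {
  int builds = 0;
  int teardowns = 0;
  int signals_pending;  // number of wait attempts that will be "interrupted"

  explicit select_sketch(int n) : signals_pending(n) {}

  void build() { ++builds; }       // would build the select linked list
  void cleanup() { ++teardowns; }  // would release threads and memory

  // A single wait attempt with no inner loop. Everything is torn down before
  // returning, so a handler that longjmps away cannot leave state behind.
  wait_state wait(ms /*remaining*/) {
    cleanup();
    if (signals_pending > 0) {
      --signals_pending;
      return wait_state::signalled;
    }
    return wait_state::ok;
  }
};

// Outer retry loop: rebuild on every pass and recompute the remaining time
// only when retrying.
int select_like(select_sketch &sel, ms timeout) {
  const auto start = clock_type::now();
  ms remaining = timeout;
  for (;;) {
    sel.build();
    const wait_state st = sel.wait(remaining);
    if (st == wait_state::ok)
      return 0;
    if (st == wait_state::error)
      return -1;
    // st == signalled: the signal handler would run here, after teardown.
    const auto elapsed =
        std::chrono::duration_cast<ms>(clock_type::now() - start);
    if (elapsed >= timeout)
      return 0;  // timed out while retrying
    remaining = timeout - elapsed;
  }
}
```

Constructing select_sketch with two pending signals and calling select_like with a generous timeout performs three build/cleanup pairs. Each retry pays for a full rebuild, which is the cost the note accepts in exchange for not leaking state.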
I initially had this all coded up to use a special signal_cleanup
callback which could be called when a longjmp is called in a signal
handler. And cygwin_select() set up and tore down this callback. Once
I got everything compiling, it of course dawned on me that just because
you call a longjmp in a signal handler it doesn't mean that you are
jumping *out* of the signal handler. So, if the signal handler invokes
the callback and returns it will be very bad for select(). Hence, this
slower, but hopefully more correct implementation.
(I still wonder if some sort of signal cleanup callback might be
useful in the future.)
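
The pitfall with a longjmp-triggered callback can be illustrated with a small C++ sketch. It assumes a handler that uses setjmp/longjmp purely for its own internal error handling; all names here are hypothetical and unrelated to Cygwin's code. The handler calls longjmp, yet it still returns normally to whatever it interrupted, so cleanup tied to "longjmp was called" would have torn down state that the interrupted code still needs.

```cpp
#include <csetjmp>
#include <csignal>

// Jump target that lives *inside* the handler (hypothetical example state).
static std::jmp_buf inner_env;
static volatile std::sig_atomic_t handler_returned = 0;
static volatile std::sig_atomic_t longjmp_calls = 0;

// Helper that bails out with longjmp, but only as far as the handler itself.
static void fail_inner() {
  longjmp_calls = longjmp_calls + 1;
  std::longjmp(inner_env, 1);
}

extern "C" void on_signal(int) {
  if (setjmp(inner_env) == 0)
    fail_inner();  // does not return normally; control resumes at setjmp
  // Reached via longjmp: we are still inside the handler's own frame.
  handler_returned = 1;  // and the handler now returns to the interrupted code
}

// Installs the handler, raises the signal synchronously, and reports whether
// the handler both called longjmp and returned normally.
bool handler_returns_after_longjmp() {
  handler_returned = 0;
  std::signal(SIGINT, on_signal);
  std::raise(SIGINT);  // the handler runs before raise() returns
  std::signal(SIGINT, SIG_DFL);
  return handler_returned == 1 && longjmp_calls > 0;
}
```

The signal is raised synchronously with raise() to keep the example deterministic. A real asynchronous signal would add further restrictions on what the handler may safely do.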
TODO: I need to do an audit of other places where this problem could be
occurring.
As alluded to above, select's signal handling is still not right. It
still acts as if it could call a signal handler from something other
than the main thread but, AFAICT, from my STC, this doesn't seem to be
the case. It might be worthwhile to extend cygwait to just magically
figure this out and not even bother using w4[0] for scenarios like this.
2012-05-16 cgf-000009
<1.7.16>