* cygheap.h (init_cygheap::pid_handle): Delete.
* dcrt0.cc (child_info_spawn::handle_spawn): Keep parent open if we have
execed.
* pinfo.cc (pinfo::thisproc): Remove pid_handle manipulations.
(pinfo::init): Don't consider a reaped process to be available.
* spawn.cc (child_info_spawn::worker): Remove pid_handle manipulations.
Make wr_proc_pipe and parent noninheritable when starting a program which
doesn't use the Cygwin DLL.  Conditionally reset wr_proc_pipe to
inheritable if CreateProcess fails.  Inject wr_proc_pipe handle into
non-Cygwin process.  Consider a non-cygwin process to be 'synced'.
	
2012-05-03  cgf-000003

<1.7.15>
Don't make Cygwin wait for all children of a non-cygwin child program.
Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00063.html,
       http://cygwin.com/ml/cygwin/2012-05/msg00075.html
</1.7.15>

This problem is due to a recent change which added some robustness and
speed to Cygwin's exec/spawn handling by not trying to force inheritance
every time a process is started.  See ChangeLog entries starting on
2012-03-20, and multiple on 2012-03-21.

Making the wr_proc_pipe handle inheritable meant that, as usual, there
were problems with non-Cygwin processes.  When Cygwin "execs" a
non-Cygwin process N, all of its N + 1, N + 2, ...  children will also
inherit the handle.  That means that Cygwin will wait until all
subprocesses have exited before it returns.

I was willing to make this a restriction of starting non-Cygwin
processes but the problem with allowing that is that it can cause the
creation of a "limbo" pid when N exits and N + 1 and friends are still
around.  In this scenario, Cygwin dutifully notices that process N has
died and sets the exit code to indicate that but N's parent will wait on
rd_proc_pipe and will only return when every N + ...  Windows process
has exited.

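The waiting behavior described above can be illustrated with a toy
model.  This is only a sketch, not Cygwin code: proc_pipe_model and
start_tree are hypothetical names, and process ids are plain integers.
It assumes the reader of the pipe is released only once every holder of
the write end has exited.

```cpp
#include <cassert>
#include <set>

// Toy model (not Cygwin code): a pipe whose reader is released only once
// every process holding a copy of the write end has exited.
struct proc_pipe_model
{
  std::set<int> writers;   // hypothetical ids of processes holding the write end

  void inherit (int id) { writers.insert (id); }
  void exit_process (int id) { writers.erase (id); }

  // The waiting parent returns only when no writer remains.
  bool reader_sees_eof () const { return writers.empty (); }
};

// Start process N (id 0) plus `descendants` further processes.  If the
// write end is inheritable, every descendant gets a copy; otherwise only
// N holds it.
inline proc_pipe_model
start_tree (bool inheritable, int descendants)
{
  proc_pipe_model p;
  p.inherit (0);                       // N itself
  if (inheritable)
    for (int i = 1; i <= descendants; i++)
      p.inherit (i);                   // N + 1, N + 2, ...
  return p;
}
```

With start_tree (true, 2), calling exit_process (0) leaves
reader_sees_eof () false, which is the "limbo" case.  With
start_tree (false, 2) the same call releases the reader as soon as N
exits.
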
The removal of cygheap::pid_handle was not related to the initial
problem that I set out to fix.  The change came from the realization
that we were duping the current process handle into the child twice and
only needed to do it once.  The current process handle is used by exec
to keep the Windows pid "alive" so that it will not be reused.  So, now
we just close parent in child_info_spawn::handle_spawn iff we're not
execing.

In debugging this it bothered me that 'ps' identified a nonactive pid as
active.  Part of the reason for this was the 'parent' handle in
child_info was opened in non-Cygwin processes, keeping the pid alive.
That has been kluged around (more changes after 1.7.15) but that didn't
fix the problem.  On further investigation, this seems to be caused by
the fact that the shared memory region pid handles were still being
passed to non-cygwin children, keeping the pid alive in a limbo-like
fashion.  This was easily fixed by having pinfo::init() consider a
memory region with PID_REAPED as not available.

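A minimal sketch of the PID_REAPED rule, under stated assumptions: the
flag value, the pid_region struct and both function names below are
made up for illustration and are not taken from the Cygwin sources.  It
only shows the idea that a region which still exists but is marked as
reaped should not count as a live pid.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical value for illustration only; the real PID_REAPED constant
// is defined in Cygwin's headers and may differ.
constexpr std::uint32_t PID_REAPED_SKETCH = 0x1000;

// Hypothetical stand-in for the per-pid shared memory region.
struct pid_region
{
  int pid;
  std::uint32_t process_state;
};

// A region marked as reaped is treated as not available, even though the
// region itself may be kept alive by handles held elsewhere.
inline bool
region_available (const pid_region& r)
{
  return (r.process_state & PID_REAPED_SKETCH) == 0;
}

// What a 'ps'-like listing would show under that rule.
inline std::vector<int>
listed_pids (const std::vector<pid_region>& regions)
{
  std::vector<int> out;
  for (const auto& r : regions)
    if (region_available (r))
      out.push_back (r.pid);
  return out;
}
```

Given regions for pids 100, 101 and 102 where only 101 carries the
reaped flag, listed_pids returns 100 and 102.
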
This fixed the problem where a pid showed up in the 'ps' list after a
user does something like: "bash$ cmd /c start notepad" but, for some
reason, it does not fix the same problem after "bash$ setsid cmd /c
start notepad".  That bears investigation after 1.7.15 is released but
it is not a regression and so is not a blocker for the release.

2012-05-03  cgf-000002

<1.7.15>
Fix problem where too much input was attempted to be read from a
pty slave.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00049.html
</1.7.15>

My change on 2012/04/05 reintroduced the problem first described by:
http://cygwin.com/ml/cygwin/2011-10/threads.html#00445

The problem then was, IIRC, due to the fact that bytes sent to the pty
pipe were not written as records.  Changing pipe to PIPE_TYPE_MESSAGE in
pipe.cc fixed the problem since writing lines to one side of the pipe
caused exactly that number of characters to be read on the other even
if there were more characters in the pipe.

To debug this, I first replaced fhandler_tty.cc with the 1.258,
2012/04/05 version.  The test case started working when I did that.

So, then, I replaced individual functions, one at a time, in
fhandler_tty.cc with their previous versions.  I'd expected this to be a
problem with fhandler_pty_master::process_slave_output since that had
seen the most changes but was surprised to see that the culprit was
fhandler_pty_slave::read().

The reason was that I really needed the bytes_available() function to
return the number of bytes which would be read in the next operation
rather than the number of bytes available in the pipe.  That's because
there may be a number of lines available to be read but the number of
bytes which will be read by ReadFile should reflect the mode of the pty
and, if there is a line to read, only the number of bytes in the line
should be seen as available for the next read.

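The difference between the two counts can be shown with a toy model of
a message-mode pipe.  This is a sketch only: message_pipe_model and its
members are hypothetical and this is not the fhandler_tty.cc code.  It
assumes line mode, where each written line is one record.  (On Windows,
PeekNamedPipe reports the same two quantities separately, as total
bytes available and bytes left in the current message; whether the fix
relies on that is not stated here.)

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Toy model (not Cygwin code) of a message-mode pipe: each write is kept
// as a separate record, as with PIPE_TYPE_MESSAGE.
struct message_pipe_model
{
  std::deque<std::string> records;

  void write (const std::string& line) { records.push_back (line); }

  // Total bytes queued in the pipe across all records.
  std::size_t total_bytes () const
  {
    std::size_t n = 0;
    for (const auto& r : records)
      n += r.size ();
    return n;
  }

  // Bytes the next read would return: only the first pending record.
  std::size_t bytes_available () const
  {
    return records.empty () ? 0 : records.front ().size ();
  }

  // Read one record, mirroring what a message-mode read hands back.
  std::string read ()
  {
    assert (!records.empty ());
    std::string r = records.front ();
    records.pop_front ();
    return r;
  }
};
```

After writing "ls\n" and then "echo hi\n", total_bytes () covers both
lines while bytes_available () covers only the first one, so a caller
that sizes its read from bytes_available () never asks for more than
the next line.
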
Having bytes_available() return the number of bytes which would be read
seemed to fix the problem but it could subtly change the behavior of
other callers of this function.  However, I actually think this is
probably a good thing since they probably should have been seeing the
line behavior.

2012-05-02  cgf-000001

<1.7.15>
Fix problem setting parent pid to 1 when process with children execs
itself.  Fixes: http://cygwin.com/ml/cygwin/2012-05/msg00009.html
</1.7.15>

Investigating this problem with strace showed that ssh-agent was
checking the parent pid and getting a 1 when it shouldn't have.  Other
stuff looked ok so I chose to consider this a smoking gun.

Going back to the version that the OP said did not have the problem, I
worked forward until I found where the problem first occurred -
somewhere around 2012-03-19.  And, indeed, the getppid call returned the
correct value in the working version.  That means that this stopped
working when I redid the way the process pipe was inherited around
this time period.

It isn't clear why (and I suspect I may have to debug this further at
some point) this hasn't always been a problem but I made the obvious
fix.  We shouldn't have been setting ppid = 1 when we're about to pass
off to an execed process.

As I was writing this, I realized that it was necessary to add some
additional checks.  Just checking for "have_execed" isn't enough.  If
we've execed a non-cygwin process then it won't know how to deal with
any inherited children.  So, always set ppid = 1 if we've execed a
non-cygwin process.

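The resulting rule can be written down as a small predicate.  This is a
sketch of the decision only, assuming two boolean inputs; the function
name and the execed_is_cygwin parameter are hypothetical and do not
come from the Cygwin sources.

```cpp
#include <cassert>

// Sketch of the decision described above (hypothetical names, not the
// actual Cygwin code): when a process goes away, should its children
// have their ppid set to 1?
constexpr bool
should_set_children_ppid_to_1 (bool have_execed, bool execed_is_cygwin)
{
  // An execed Cygwin program takes over the children, so their ppid is
  // left alone.  In every other case (plain exit, or exec of a
  // non-Cygwin program which can't deal with inherited children) the
  // children get ppid = 1.
  return !(have_execed && execed_is_cygwin);
}

// Compile-time checks of the three cases discussed in the notes.
static_assert (!should_set_children_ppid_to_1 (true, true),
               "exec of a Cygwin program keeps the children");
static_assert (should_set_children_ppid_to_1 (true, false),
               "exec of a non-Cygwin program orphans the children");
static_assert (should_set_children_ppid_to_1 (false, false),
               "plain exit orphans the children");
```

Checking "have_execed" alone would correspond to dropping the second
parameter, which is the case the last paragraph above rules out.
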