Posted to dev@subversion.apache.org by Kyle McKay <ma...@gmail.com> on 2009/03/28 05:13:25 UTC

Re: zombie ssh processes WORKAROUND

As a workaround to the problem (see history below), you can do the  
following:

1. Revert change 35533 (or use a prior version such as 1.5.6 that
doesn't have change 35533) so that apr_pool_note_subprocess is being
called with APR_KILL_ALWAYS.  APR_KILL_ALWAYS is the fastest option
(it never waits) and is guaranteed not to leave any zombies behind or
allow ugly SIGTERM-related error messages to be displayed.

2. Create a /usr/local/bin/svn_ssh file with these contents:

#!/bin/sh
exec 3<&0
/bin/sh -c 'exec 0<&3 3<&-; exec "$0" "$@"' ssh "$@" &
exit 0

# NOTE: This depends on the shell having this behavior:
# -c string If the -c option is present, then commands are read
#           from string.  If there are arguments after the string,
#           they are assigned to the positional parameters,
#           starting with $0.
# Standards compliant shells do this.  See
# http://www.opengroup.org/onlinepubs/009695399/utilities/sh.html

3. Make sure the file is readable and executable (chmod a+rx).  You can
place it somewhere else if you like; just adjust the path in the next
step.

4. Edit your ~/.subversion/config and in the [tunnels] section do this:

[tunnels]
ssh = $SVN_SSH /usr/local/bin/svn_ssh

5. Finally edit your ~/.ssh/config and add something like this:

ControlMaster auto
ControlPath /tmp/sshpool-%l-%r@%h:%p

That's it.  No zombies left behind and connection pooling works fine  
-- Subversion can even launch the ssh master connection without problem.
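
The svn_ssh script in step 2 depends on the "-c string" behavior quoted
in its NOTE: the first argument after the string becomes $0 and the
rest become "$@".  A minimal sketch of that behavior follows.  The
show_args helper and the example.invalid host are made up for
illustration, and no ssh process is actually started:

```shell
#!/bin/sh
# Sketch: how "sh -c string arg0 arg1 ..." fills $0 and "$@".
# show_args is a hypothetical helper; example.invalid is a placeholder host.
show_args() {
    # The inner shell prints $0 and each positional parameter,
    # each followed by a "|" so the word boundaries are visible.
    /bin/sh -c 'printf "%s|" "$0" "$@"' "$@"
}

args_seen=$(show_args ssh -q example.invalid svnserve -t)
echo "$args_seen"
```

With a standards-compliant /bin/sh, "ssh" should show up as $0 and the
remaining words as separate positional parameters, which is what lets
the script finish with exec "$0" "$@".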

There is an apr_procattr_detach_set function that would cause the  
tunnel process to be detached much like the svn_ssh script above does,  
but unfortunately apr_proc_detach (which is what gets called in the  
child to do the detach) freopens stdin, stdout and stderr to /dev/null  
thereby stepping on the stdin/stdout pipes that Subversion actually  
requires to use the tunnel (apr_proc_detach is called after the pipes  
are dup2'd into 0, 1 and 2).
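
The shell has a similar pitfall, which is why the svn_ssh script above
parks stdin on descriptor 3: in a non-interactive shell (job control
off), a command started with "&" gets /dev/null as its stdin unless it
is redirected explicitly.  A rough sketch, with cat standing in for the
tunnel program (the "wait" is only there to keep the demo tidy; the
real script exits instead):

```shell
#!/bin/sh
# cat stands in for the tunnel program; it echoes whatever stdin it gets.

# Plain "&": the background command's stdin is /dev/null, so the
# "hello" written to the pipe never reaches cat.
plain=$(printf 'hello\n' | /bin/sh -c 'cat & wait')

# Save stdin on fd 3 first, then restore it inside the background shell
# (the same trick the svn_ssh script uses).
saved=$(printf 'hello\n' |
    /bin/sh -c 'exec 3<&0; /bin/sh -c "exec 0<&3 3<&-; exec cat" & wait')

echo "plain: [$plain]"
echo "saved: [$saved]"
```

On a POSIX shell the first capture should come back empty and the
second should contain "hello".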

As a workaround for the lack of a suitable APR option to detach without
redirecting stdin/stdout/stderr to /dev/null, the find_tunnel_agent
function in client.c could be enhanced to add the following three
arguments to the FRONT of the argv array it generates (shown here as
comma-separated C strings):

"/bin/sh",
"-c",
"exec 3<&0; /bin/sh -c 'exec 0<&3 3<&-; " /* split line for email */
   "exec \"$0\" \"$@\"' \"$0\" \"$@\"& exit 0"

This will cause any tunnel program to be run detached as though set up
with a script similar to svn_ssh above.  (Only when run on a system
with a standards-compliant /bin/sh, though.)  Steps #2, #3 and #4 above
are no longer necessary if this change is made.  (Alternatively,
Subversion could start calling fork/exec directly, or get an option
added to apr_procattr_detach_set/apr_proc_detach to not redirect
stdin/stdout/stderr to /dev/null and start using that.)
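
To try the three-argument prefix without patching client.c, the C
strings above can be unescaped into a single shell string and run by
hand.  This is only a sketch: "tr a-z A-Z" is an arbitrary stand-in for
the real tunnel command, and prefix_cmd and reply are names made up for
the demo:

```shell
#!/bin/sh
# The third prefix argument, unescaped from the C strings.
# Each '\'' sequence embeds one literal single quote in the value.
prefix_cmd='exec 3<&0; /bin/sh -c '\''exec 0<&3 3<&-; exec "$0" "$@"'\'' "$0" "$@"& exit 0'

# "tr a-z A-Z" plays the tunnel program: tr becomes $0, and
# a-z A-Z become "$@".  The outer shell exits at once; the detached tr
# still reads the original stdin and writes to the original stdout.
reply=$(printf 'ping\n' | /bin/sh -c "$prefix_cmd" tr a-z A-Z)
echo "$reply"
```

If the prefix behaves as intended, the stand-in still sees the piped
input even though the /bin/sh that launched it has already exited.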

Kyle

P.S. It ought to be possible to cram it all into a single "SVN_SSH"  
variable or "[tunnels] ssh =" setting but the proper combination of  
quoting to make apr_tokenize_to_argv AND the nested sh callouts happy  
is eluding me at the moment.

On Mar 27, 2009, at 11:30, Clark S. Cox III wrote:
> On Mar 27, 2009, at 9:51 AM, Hyrum K. Wright wrote:
>
>> On Mar 27, 2009, at 11:21 AM, Kyle McKay wrote:
>>
>>> From the xcode-users mailing list:
>>>
>>>> From: Chris Espinosa <cd...@apple.com>
>>>> Date: March 26, 2009 15:39:14 PDT
>>>> To: Xcode Users <xc...@lists.apple.com>
>>>> Subject: Re: Xcode 3.1.2 and Subversion 1.6
>>>>
>>>> On Mar 25, 2009, at 3:19 PM, Chris Espinosa wrote:
>>>>> On Mar 25, 2009, at 3:07 PM, Rob Lockstone wrote:
>>>>>
>>>>>> Has anyone tried using Xcode 3.1.2 with the subversion 1.6.0
>>>>>> client? I think I recall (but may be wrong) that newer versions
>>>>>> of Xcode don't make assumptions about the version of subversion
>>>>>> that's installed and simply use whatever version it finds.
>>>>>
>>>>> We have not yet qualified any version of Xcode with Subversion
>>>>> 1.6.0 and don't recommend replacing existing Subversion library or
>>>>> client code with 1.6 until we've given it the green light.
>>>>
>>>> We've discovered in internal testing that this patch in Subversion
>>>> 1.6:
>>>>
>>>> http://svn.collab.net/viewvc/svn?view=revision&revision=35533
>>>>
>>>> can cause Subversion 1.6 to leave behind zombie ssh processes every
>>>> time you save a file in Xcode, and eventually exhaust your ability
>>>> to spawn new processes.  We don't recommend using Subversion 1.6
>>>> with Xcode 3.1.x at this time.
>>>>
>>>> Chris
>>>
>>> From:
>>>
>>> http://svn.apache.org/repos/asf/apr/apr/tags/1.0.0/include/apr_thread_proc.h
>>>
>>> APR_KILL_NEVER         // process is never sent any signals
>>> APR_KILL_ALWAYS        // process is sent SIGKILL on apr_pool_t cleanup
>>> APR_KILL_AFTER_TIMEOUT // SIGTERM, wait 3 seconds, SIGKILL
>>> APR_JUST_WAIT          // wait forever for the process to complete
>>> APR_KILL_ONLY_ONCE     // send SIGTERM and then wait
>>>
>>> Restoring the apr_pool_note_subprocess and using APR_KILL_NEVER
>>> would allow the children to be reaped provided they exit before
>>> pool cleanup.
>>>
>>> However, that would likely not eliminate the zombie problem in Xcode
>>> as pool cleanup probably happens faster than ssh cleanup and exit in
>>> some cases.  How about using APR_KILL_AFTER_TIMEOUT or
>>> APR_KILL_ONLY_ONCE (or even APR_JUST_WAIT) ?
>>
>> How would this interact with ssh connection pooling?  The case which
>> drove r35533 was a user who uses ssh connection pooling for svn
>> connections.  Having svn kill the ssh connection is obviously
>> hazardous to such a scheme, how would using the other APR_KILL_*
>> conditions behave there (and would they fix the problem with XCode)?
>
>
> I've built with each of the above options passed to  
> apr_pool_note_subprocess. For each case, I:
>
> 1) Started an 'svn co' over svn+ssh (which launched the ssh control  
> master).
> 2) ssh-ed to the same server over the shared connection
> 3) waited for the co to complete
>
> I also:
> Launched Xcode, and did the GUI equivalent to an svn co
>
>
> - APR_KILL_{ALWAYS,AFTER_TIMEOUT,ONLY_ONCE} are inappropriate in the
> connection sharing case. When the svn co completed, the shared
> connection was severed, and the second ssh session was disconnected,
> however with those options, the ssh subprocesses of the
> library-using clients (such as Xcode) were properly reaped.
>
> - APR_KILL_NEVER is inappropriate in the GUI client case (ssh  
> subprocesses of Xcode were not reaped until Xcode itself was  
> terminated).
>
> - APR_JUST_WAIT is inappropriate, as the process (whether it be svn  
> or Xcode) that started the master ssh connection blocks until any  
> other ssh's using the same shared connection are terminated.
>
> It seems that OpenSSH's connection sharing and svn's use of ssh are  
> fundamentally incompatible. At the very least, it doesn't seem that  
> there is a simple fix for both issues.
>
> -- 
> Clark S. Cox III
> clark.cox@apple.com

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1452200