Posted to dev@subversion.apache.org by Josh Pieper <jj...@pobox.com> on 2004/07/02 12:42:09 UTC

Issue #1586: Cancellation during gethostbyname() and connect()

Part of issue #1586 is that interrupting Subversion during DNS lookups
(gethostbyname) or when initiating a network connection (connect) is
difficult.  This can manifest itself, for instance, if network
connectivity is lost in mid operation.

Since this can't be solved by adding more cancellation checks, I am
feeling out how acceptable other solutions would be.  I have a few
possible alternatives that I have considered:

  a) Do not install the interrupt handler for any networked read only
     operation.  We must have the interrupt handler for file:///
     operations, or else BDB gets mad.  Using this option, the OS will
     just kill svn right at the interrupt point.  We aren't modifying
     the working copy or a local repository so nothing much should be
     hurt.  However, it may be difficult for the initial command line
     part to tell what is a network operation and what isn't.

  b) Temporarily disable the interrupt handler during possibly hanging
     network operations.  It could be tricky, because it's hard to tell
     what could be allocated in pools that won't be cleaned up.  This
     also seems like a very naughty thing for a library to do.

  c) Make subversion multi-threaded and push off all possibly hanging
     network operations into other threads.  Very complicated, but I'm
     including it for completeness.

  c) Do nothing... a user may need to use kill -9 in order to
     interrupt svn if a network operation hangs.

Are there any other options and/or opinions?  I'm leaning towards the
last option, but I want to see if there are any other ideas or
objections.

-Josh

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Julian Foad <ju...@btopenworld.com>.
Josh Pieper wrote:
> Part of issue #1586 is that interrupting Subversion during DNS lookups
> (gethostbyname) or when initiating a network connection (connect) is
> difficult.  This can manifest itself, for instance, if network
> connectivity is lost in mid operation.
> 
> Since this can't be solved by adding more cancellation checks, I am
> feeling out how acceptable other solutions would be.  I have a few
> possible alternatives that I have considered:
> 
>   a) Do not install the interrupt handler for any networked read only
>      operation.  We must have the interrupt handler for file:///
>      operations, or else BDB gets mad.  Using this option, the OS will
>      just kill svn right at the interrupt point.  We aren't modifying
>      the working copy or a local repository so nothing much should be
>      hurt.  However, it may be difficult for the initial command line
>      part to tell what is a network operation and what isn't.
> 
>   b) Temporarily disable the interrupt handler during possibly hanging
>      network operations.  It could be tricky, because it's hard to tell
>      what could be allocated in pools that won't be cleaned up.  This
>      also seems like a very naughty thing for a library to do.
> 
>   c) Make subversion multi-threaded and push off all possibly hanging
>      network operations into other threads.  Very complicated, but I'm
>      including it for completeness.
> 
>   c) Do nothing... a user may need to use kill -9 in order to
>      interrupt svn if a network operation hangs.
> 
> Are there any other options and/or opinions?  I'm leaning towards the
> last option, but I want to see if there are any other ideas or
> objections.

 e) Can't our interrupt handler choose to terminate the application immediately if it wants to, by raising an Abort (or similar) signal?  If so, we could introduce something like this behaviour: If your Ctrl-C hasn't taken effect, then you can issue another Ctrl-C which will abort the application immediately.  The Ctrl-C handler would do something like the following:

   + If cancellation has already been requested at least once, then abort the application (by raising an Abort signal or something similar).

   + Otherwise, just note that cancellation has been requested (as it does now).

Would this solution actually work during the type of network wait in question?

(It could easily be modified to require that you wait at least (say) 10 seconds before the second Ctrl-C, or other variations.)
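
A minimal sketch of the two-stage handler described above, assuming a POSIX system.  This is not Subversion's actual handler: the names cancel_requests, sigint_handler, install_sigint_handler and cancellation_requested are hypothetical, and abort() stands in for "raising an Abort signal or something similar".

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>

/* Set by the handler; polled by the ordinary cancellation checks. */
static volatile sig_atomic_t cancel_requests = 0;

/* Hypothetical two-stage handler: the first SIGINT only records the
   request, a later SIGINT gives up on graceful cancellation. */
static void sigint_handler(int signum)
{
  (void)signum;
  if (cancel_requests > 0)
    abort();               /* abort() is async-signal-safe in POSIX */
  cancel_requests = 1;
}

/* Install the handler for SIGINT; returns 0 on success, -1 on error. */
static int install_sigint_handler(void)
{
  struct sigaction sa;
  sa.sa_handler = sigint_handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;         /* no SA_RESTART: blocked calls may see EINTR */
  return sigaction(SIGINT, &sa, NULL);
}

/* What a cancellation check would ask. */
static int cancellation_requested(void)
{
  return cancel_requests != 0;
}
```

In this sketch the normal code path keeps polling cancellation_requested() as before; only a second Ctrl-C skips pool and working-copy cleanup, so it carries the same risks as an unhandled interrupt.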

- Julian



Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Mark Benedetto King <mb...@lowlatency.com>.
On Sat, Jul 03, 2004 at 02:05:03AM +0200, Branko Čibej wrote:
> It's not thread-safe, so you can't do it in the library.
> 

The callback approach that Karl and I suggested could at least
theoretically be made thread safe (the burden is on the callback
provider).

--ben



Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sun, 2004-07-04 at 14:44, Josh Pieper wrote:
> Following up on the comments to my four suggestions.

Well, some more wacko ideas:

  * Add a client context callback to set up signal handlers.  (The real
meaning of the callback is "we're about to do something which requires
cleanup, so try to ensure that it gets done".)  Now libsvn_client can
help decide whether cleanup is necessary, and network-only operations
can be interrupted.

  * APR is screwing us on connect() by intercepting EINTR.  Perhaps we
could add an interruptibility flag to APR files and/or sockets.  We'd
then have to perform our own retry loops, but we could check the
cancellation callback each time.  (This would also help with our
prompting problem.)

  * I don't think WC cleanup ever involves making a gethostbyname()
call, although it can involve doing a variety of other things.  So it
should be theoretically sort-of possible to actually perform cleanup
from within the signal handler.  This idea is especially wacko, though.

  * There's this non-portable function res_send_setqhook() which is
supposed to set a hook to be called between trials of a hostname
lookup.  The actual trials themselves are not interruptible, but we
could significantly reduce the time before interrupt if we could make a
query hook check the cancellation flag.  Unfortunately, in my
experiments on a Red Hat box, it didn't work; I think glibc may be
contriving to use a separate _res for gethostbyname's libresolv calls. 
So in addition to being nonportable, it wouldn't help on our most
popular platform.
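
The retry-loop idea in the second bullet can be sketched with plain POSIX read(), assuming the SIGINT handler was installed without SA_RESTART so that a blocked call returns EINTR.  This does not use APR (the interruptibility flag proposed above is only an idea here); read_interruptibly, cancel_func_t and flag_is_set are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical cancellation callback: returns nonzero once the caller
   should give up. */
typedef int (*cancel_func_t)(void *baton);

/* Example callback: the baton points at a flag set by a signal handler. */
static int flag_is_set(void *baton)
{
  return *(volatile sig_atomic_t *)baton != 0;
}

/* Like read(), but consults CANCEL before every attempt, including each
   retry after EINTR.  Returns the byte count, or -1 with errno set;
   errno is ECANCELED when the callback asked us to stop. */
static ssize_t read_interruptibly(int fd, void *buf, size_t len,
                                  cancel_func_t cancel, void *baton)
{
  for (;;)
    {
      ssize_t n;

      if (cancel && cancel(baton))
        {
          errno = ECANCELED;
          return -1;
        }

      n = read(fd, buf, len);
      if (n >= 0 || errno != EINTR)
        return n;          /* data, EOF, or a real error */
      /* EINTR: a signal arrived; loop so the callback is checked. */
    }
}
```

connect() needs different handling: POSIX specifies that a connect() interrupted by a signal keeps establishing the connection asynchronously, so it cannot simply be called again in the same kind of loop.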



Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Josh Pieper <jj...@pobox.com>.
Following up on the comments to my four suggestions.

On option a:

Mark Benedetto King wrote:
>>   a) Do not install the interrupt handler for any networked read only
>>      operation.  We must have the interrupt handler for file:///
>>      operations, or else BDB gets mad.  Using this option, the OS will
>>      just kill svn right at the interrupt point.  We aren't modifying
>>      the working copy or a local repository so nothing much should be
>>      hurt.  However, it may be difficult for the initial command line
>>      part to tell what is a network operation and what isn't.
>
>I think this is a non-starter, since some operations may or may not
>access the network.


On option b:

Branko Čibej wrote:
> kfogel@collab.net wrote:
> 
> >Josh Pieper <jj...@pobox.com> writes:
> > 
> >
> >> b) Temporarily disable the interrupt handler during possibly hanging
> >>    network operations.  It could be tricky, because it's hard to tell
> >>    what could be allocated in pools that won't be cleaned up.  This
> >>    also seems like a very naughty thing for a library to do.
> >
> >I think (b) may be feasible, actually.
> > 
> >
> It's not thread-safe, so you can't do it in the library.


On option c:

Mark Benedetto King wrote:
>>   c) Make subversion multi-threaded and push off all possibly hanging
>>      network operations into other threads.  Very complicated, but I'm
>>      including it for completeness.
>
>
>I think the cmdline utility should remain single threaded if at
>all possible.  Threading is evil.


Is the verdict on b final?  If so, it seems that our only choice is to
ignore this and take the (misnamed) option d.

Josh Pieper wrote:
>   c) Do nothing... a user may need to use kill -9 in order to
>      interrupt svn if a network operation hangs.
>


-Josh


Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Branko Čibej <br...@xbc.nu>.
kfogel@collab.net wrote:

>Josh Pieper <jj...@pobox.com> writes:
>  
>
>>  b) Temporarily disable the interrupt handler during possibly hanging
>>     network operations.  It could be tricky, because it's hard to tell
>>     what could be allocated in pools that won't be cleaned up.  This
>>     also seems like a very naughty thing for a library to do.
>>
>>  c) Make subversion multi-threaded and push off all possibly hanging
>>     network operations into other threads.  Very complicated, but I'm
>>     including it for completeness.
>>
>>  c) Do nothing... a user may need to use kill -9 in order to
>>     interrupt svn if a network operation hangs.
>>
>>Are there any other options and/or opinions?  I'm leaning towards the
>>last option, but I want to see if there are any other ideas or
>>objections.
>>    
>>
>
>I think (b) may be feasible, actually.
>  
>
It's not thread-safe, so you can't do it in the library.

-- Brane



Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by kf...@collab.net.
Josh Pieper <jj...@pobox.com> writes:
>   b) Temporarily disable the interrupt handler during possibly hanging
>      network operations.  It could be tricky, because it's hard to tell
>      what could be allocated in pools that won't be cleaned up.  This
>      also seems like a very naughty thing for a library to do.
> 
>   c) Make subversion multi-threaded and push off all possibly hanging
>      network operations into other threads.  Very complicated, but I'm
>      including it for completeness.
> 
>   c) Do nothing... a user may need to use kill -9 in order to
>      interrupt svn if a network operation hangs.
> 
> Are there any other options and/or opinions?  I'm leaning towards the
> last option, but I want to see if there are any other ideas or
> objections.

I think (b) may be feasible, actually.

What if Subversion offered an API (a callback, perhaps) to the RA
layers, such that an RA layer had a way to save, disable, and restore
the client's special interrupt handling.  The RA layer knows when it's
going to go into some possibly hangy network operation.  It could
'save(&baton)' and 'disable()' the interrupt handler right before
entering that code, then 'restore(baton)' right after.

I'm speaking casually here, without fully designing the callback
interface, but I'll bet potential pool lifetime problems could be
solved with the right baton discipline...
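
A rough sketch of what such a callback pair might look like, assuming POSIX sigaction().  Nothing here is an existing Subversion API: interrupt_funcs_t, sigint_baton_t, client_save_and_disable, client_restore and with_default_sigint are hypothetical names, and the separate save and disable steps are merged into one call for brevity.  Signal dispositions are process-wide, so this sketch does nothing about the thread-safety concern.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical vtable the client would hand to the RA layer. */
typedef struct interrupt_funcs_t
{
  int (*save_and_disable)(void *baton);  /* 0 on success */
  int (*restore)(void *baton);           /* 0 on success */
  void *baton;
} interrupt_funcs_t;

/* Client-side baton: remembers the handler that was in place. */
typedef struct sigint_baton_t
{
  struct sigaction saved;
  int have_saved;
} sigint_baton_t;

/* Save the current SIGINT action and fall back to the default one. */
static int client_save_and_disable(void *baton)
{
  sigint_baton_t *b = baton;
  struct sigaction dfl;

  dfl.sa_handler = SIG_DFL;
  sigemptyset(&dfl.sa_mask);
  dfl.sa_flags = 0;
  if (sigaction(SIGINT, &dfl, &b->saved) != 0)
    return -1;
  b->have_saved = 1;
  return 0;
}

/* Put the saved SIGINT action back, if there is one. */
static int client_restore(void *baton)
{
  sigint_baton_t *b = baton;

  if (!b->have_saved)
    return 0;
  if (sigaction(SIGINT, &b->saved, NULL) != 0)
    return -1;
  b->have_saved = 0;
  return 0;
}

/* Stand-in for a blocking operation: reports whether SIGINT currently
   has its default action (1), a custom one (0), or -1 on error. */
static int sigint_is_default(void *unused)
{
  struct sigaction cur;

  (void)unused;
  if (sigaction(SIGINT, NULL, &cur) != 0)
    return -1;
  return cur.sa_handler == SIG_DFL;
}

/* RA-layer side: run OP with the client's handler disabled.  Returns
   OP's result, or -1 if saving or restoring the handler failed. */
static int with_default_sigint(const interrupt_funcs_t *funcs,
                               int (*op)(void *), void *op_baton)
{
  int result;

  if (funcs && funcs->save_and_disable(funcs->baton) != 0)
    return -1;
  result = op(op_baton);
  if (funcs && funcs->restore(funcs->baton) != 0)
    return -1;
  return result;
}
```

Keeping the saved state in a caller-owned baton, rather than in pool-allocated memory, is one way to sidestep the pool lifetime question while the handler is disabled.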

Thoughts?

-Karl


Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Friday, July 2, 2004 1:09 PM -0400 Mark Benedetto King 
<mb...@lowlatency.com> wrote:

> The same holds for our blocking network reads and writes.
>
> I haven't looked at Neon to see how much work it would be to
> implement this, but it should be (reasonably) straightforward.
> Except for the DNS (unless we go with ares/adns/etc).

neon isn't async and likely can't be easily retrofitted to do so.  In fact, 
someone just posted to the neon list asking about implementing async-DNS in 
neon, and, AFAICT, there really wasn't a way to do that without hooking at the 
system level.  (I could, of course, be mis-interpreting that thread though.)

*shrug*  -- justin


Re: Issue #1586: Cancellation during gethostbyname() and connect()

Posted by Mark Benedetto King <mb...@lowlatency.com>.
On Fri, Jul 02, 2004 at 08:42:09AM -0400, Josh Pieper wrote:
> Part of issue #1586 is that interrupting Subversion during DNS lookups
> (gethostbyname) or when initiating a network connection (connect) is
> difficult.  This can manifest itself, for instance, if network
> connectivity is lost in mid operation.

Well, connect() can be made non-blocking, and there are a few robust
asynchronous DNS implementations (one of them even written by Greg
Hudson).

The same holds for our blocking network reads and writes.

I haven't looked at Neon to see how much work it would be to
implement this, but it should be (reasonably) straightforward.
Except for the DNS (unless we go with ares/adns/etc).

> 
> Since this can't be solved by adding more cancellation checks, I am
> feeling out how acceptable other solutions would be.  I have a few
> possible alternatives that I have considered:
> 
>   a) Do not install the interrupt handler for any networked read only
>      operation.  We must have the interrupt handler for file:///
>      operations, or else BDB gets mad.  Using this option, the OS will
>      just kill svn right at the interrupt point.  We aren't modifying
>      the working copy or a local repository so nothing much should be
>      hurt.  However, it may be difficult for the initial command line
>      part to tell what is a network operation and what isn't.

I think this is a non-starter, since some operations may or may not
access the network.

> 
>   b) Temporarily disable the interrupt handler during possibly hanging
>      network operations.  It could be tricky, because it's hard to tell
>      what could be allocated in pools that won't be cleaned up.  This
>      also seems like a very naughty thing for a library to do.

It is an evil thing for a library to do, but the library *could* notify
the client "starting blocking thing, you might want to remove any
interrupt handlers" and "done with blocking thing, you might want to
reinstall any interrupt handlers".   This may be the easiest approach.

> 
>   c) Make subversion multi-threaded and push off all possibly hanging
>      network operations into other threads.  Very complicated, but I'm
>      including it for completeness.
> 

I think the cmdline utility should remain single threaded if at
all possible.  Threading is evil.

>   c) Do nothing... a user may need to use kill -9 in order to
>      interrupt svn if a network operation hangs.
> 
> Are there any other options and/or opinions?  I'm leaning towards the
> last option, but I want to see if there are any other ideas or
> objections.
> 

Well, I'd lean towards (b) or patching neon to do non-blocking things.

--ben

