You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Garrett Rooney <ro...@electricjellyfish.net> on 2006/07/03 20:10:17 UTC

Re: BBD Pool Cleanup Ordering Bugs?

On 6/30/06, Branko Čibej <br...@xbc.nu> wrote:

> Sorry I sort of ignored your last mail, I've been quite busy lately. But
> it turns out that you guessed what worries me.

Ok, good to know I'm at least on the right track ;-)

> > Another question that jumps to mind, how does this stuff even work
> > with thread local variables like this?  Does it imply that __close
> > needs to be called from the same thread that was using the environment
> > (and thus that generated the errors) so that it can free the errors?
> > If that's the case, can cleanup_env even potentially destroy those
> > errors, how would it get at them?  I feel like I'm missing something
> > here...
> The idea is that if you create an svn_fs_t in one thread, you must
> always use it in that thread. I don't think we've spelled that out
> anywhere, but then, we've not really defined our thread-safety
> guarantees in general. Certainly a threaded svnserve uses svn_fs_t's in
> the way that code expects them to be used.

Yeah...  This kind of worries me, but it's not like we can really do
much better.

> I'm worried that it's possible to order svn_fs_t creation in such a way
> that the gathered svn_error_t's are destroyed before the last svn_fs_t
> that might want to use them. At this point, I don't even know if this
> scenario is possible, much less how to recover from it.

I think we're basically stuck in some of the potential situations.

Here's my current thinking, and my current patch.  Let me know if this
makes any sense at all, because currently I'm not quite sure ;-)

We actually do need a destructor callback for
apr_threadkey_private_create.  If we don't pass one then any thread
that messes with a bdb env but doesn't correctly clean it up will leak
memory.  This won't be called in the normal case, since the variable
will have been set to NULL in the __close function, but for the
abnormal case it is still required.

Side note here...  It looks like this callback will only ever be
called on Unix, Win32 doesn't even do anything with it.  Still, it's
not like we're relying on it for anything other than avoiding leaks
with poorly behaved users of the API, so this bug in APR doesn't
necessarily hurt us too badly.

There may also be a potential for races during shutdown where
cleanup_env happens before svn_fs_bdb__close finishes up.  I have no
idea how this could happen, but I'm willing to accept on faith that it
can.  It sounds like Branko also thinks this is possible, but can't
prove that fact.  To avoid this, I think the best bet is to go with
Branko's idea of just not doing the cleanup in the case where
bdb->pool is NULL.

Note that if that race can occur, then we're kind of stuck leaking
something.  The destruction of the thread keys in cleanup_env means
that any access to them at all is pretty much out of the question,
even if they were allocated in this thread.  Anything that was
allocated in other threads is already lost anyway, since we can't get
at those thread local variables anymore since the threads are gone.
On unix at least the cleanup function has a chance to help us with
those...

So, we check for the NULL pool in __close, and that keeps us from
theoretically messing with a threadkey that's no longer valid.

Note that there are almost certainly still error conditions that we
can't do much about.  Shutting down while other threads are still
running, for example, seems fraught with danger, and I'm not sure
there's really much we can do about it.

Anyway, here's the patch + log, let me know if it seems sane...

-garrett

[[[
Fix some potential issues when cleaning up BDB environments.

* subversion/libsvn_fs_base/bdb/env.c
  (cleanup_error_info): Cleanup callback for use with thread local
   variables.
  (create_env): Pass cleanup_error_info into apr_threadkey_private_create.
  (svn_fs_bdb__close): Don't try to clean anything up if bdb->pool is NULL,
   it implies that we've hit some strange pool cleanup ordering issue.
]]]

Re: BBD Pool Cleanup Ordering Bugs?

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 7/4/06, Branko Čibej <br...@xbc.nu> wrote:

> Haven't tested this, but it looks fine. Thanks for picking up on this,
> Garrett.

Ok, committed to trunk.  Will start testing the merge of this and your
other change (using malloc to manage that memory) into 1.4 and if all
goes well I'll propose it for backport.

-garrett

Re: BBD Pool Cleanup Ordering Bugs?

Posted by Branko Čibej <br...@xbc.nu>.
Garrett Rooney wrote:
[...]
> We actually do need a destructor callback for
> apr_threadkey_private_create.  If we don't pass one then any thread
> that messes with a bdb env but doesn't correctly clean it up will leak
> memory.  This won't be called in the normal case, since the variable
> will have been set to NULL in the __close function, but for the
> abnormal case it is still required.
>
> Side note here...  It looks like this callback will only ever be
> called on Unix, Win32 doesn't even do anything with it.  Still, it's
> not like we're relying on it for anything other than avoiding leaks
> with poorly behaved users of the API, so this bug in APR doesn't
> necessarily hurt us too badly.
Ah, you noticed. :)

> There may also be a potential for races during shutdown where
> cleanup_env happens before svn_fs_bdb__close finishes up.  I have no
> idea how this could happen, but I'm willing to accept on faith that it
> can.  It sounds like Branko also thinks this is possible, but can't
> prove that fact.  To avoid this, I think the best bet is to go with
> Branko's idea of just not doing the cleanup in the case where
> bdb->pool is NULL.
Actually I hope it's not possible ...

> Note that if that race can occur, then we're kind of stuck leaking
> something.  The destruction of the thread keys in cleanup_env means
> that any access to them at all is pretty much out of the question,
> even if they were allocated in this thread.  Anything that was
> allocated in other threads is already lost anyway, since we can't get
> at those thread local variables anymore since the threads are gone.
> On unix at least the cleanup function has a chance to help us with
> those...
... but as far as I can determine, it can only potentially happen during
apr_terminate, and a bit of a leak then isn't such a horrible thing.


> So, we check for the NULL pool in __close, and that keeps us from
> theoretically messing with a threadkey that's no longer valid.
>
> Note that there are almost certainly still error conditions that we
> can't do much about.  Shutting down while other threads are still
> running, for example, seems fraught with danger, and I'm not sure
> there's really much we can do about it.
Shutting down while threads are still active is _always_ a problem, not
just in this case.

(On that note, I think I just solved the halting problem: if it doesn't
halt by itself, pull the plug ... ah, never mind ...)

> Anyway, here's the patch + log, let me know if it seems sane...
>
> -garrett
>
> [[[
> Fix some potential issues when cleaning up BDB environments.
>
> * subversion/libsvn_fs_base/bdb/env.c
>  (cleanup_error_info): Cleanup callback for use with thread local
>   variables.
>  (create_env): Pass cleanup_error_info into apr_threadkey_private_create.
>  (svn_fs_bdb__close): Don't try to clean anything up if bdb->pool is
> NULL,
>   it implies that we've hit some strange pool cleanup ordering issue.
> ]]]
Haven't tested this, but it looks fine. Thanks for picking up on this,
Garrett.

-- Brane

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org