You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Justin Erenkrantz <je...@apache.org> on 2002/07/12 06:25:00 UTC

Problem with svn_wc_process_committed

Prior to r2470, we used to unlock the adm_access 'baton' before
calling svn_wc_process_committed in svn_client_commit().  In r2470,
we now delay the unlock until after svn_wc_process_committed is
called.  That doesn't work because svn_wc_process_committed doesn't
take the hash table of locked directories.  Therefore, it tries to
open a lock on a directory that was already held in locked_dirs.
And, we now fail commit test #8 with my caching patch because we
already have a lock on the subdirs.

Oops.  Either the locked_dirs hash needs to get passed to
svn_wc_process_committed or we need to go back to unlocking the
dirs before svn_wc_process_committed is called.  -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Alpha moved to Thursday, to make time for code review

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On Sun, Jul 14, 2002 at 04:37:28PM -0500, Karl Fogel wrote:

> Alpha is going to get noticed, so we want to release as stable a
> Subversion as possible.  Taking three days for a little risk-control
> is well worth it.

+1.  it would be /really/ bad to have a critical bug in the alpha
release, especially if we can avoid it by being patient.

-garrett 

-- 
garrett rooney                    Remember, any design flaw you're 
rooneg@electricjellyfish.net      sufficiently snide about becomes  
http://electricjellyfish.net/     a feature.       -- Dan Sugalski

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Alpha moved to Thursday, to make time for code review

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Ben Collins-Sussman <su...@collab.net> writes:
> This sounds like a more specific validation of my hypothesis; we've
> actually described exactly how, in our code, my hypothesis can
> happen.  And all of pilchie's symptoms seem to match.
> 
> In essence, the bug is that our commit code wasn't "playing ball" with
> other code in terms of honoring locks, and thus an unfinished log was
> never run.  Bang, inconsistent wc state.
> 
> I say:  we should review and apply the fix!

+1 to that!

I hope my fellow developers (in particular the CollabNet folk, who are
most involved in milestone scheduling) will forgive me this unilateral
move, but it's a weekend so timely discussions are harder to have:

I'd like to move Alpha from Monday (tomorrow) to Thursday.  This will
give us time to *carefully* review the entire locking strategy in
libsvn_wc, from the ground up, including the aforementioned
patch... And to test what we commit :-).

Alpha is going to get noticed, so we want to release as stable a
Subversion as possible.  Taking three days for a little risk-control
is well worth it.

I've updated the status page to the new date.

Comments welcome,
-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Ben Collins-Sussman <su...@collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:

> Karl Fogel <kf...@newton.ch.collab.net> writes:
> 
> > Philip Martin <ph...@codematters.co.uk> writes:
> > > I have a patch that implements this. It's quite straightforward and I
> > > believe it works as well as the current implementation.  Given our
> > > current corruption bug, should I commit it?
> > 
> > I hate to ask you to wait... But, I think it would be best to wait.
> > Until we know more, we should avoid further code churn in libsvn_wc,
> > except for outright bug fixes.
> > 
> > How about this: post the patch here first.  At least that way it can
> > be reviewed while we're working on the bug.
> 
> I believe the problem has been identified.  Kevin interrupted a
> previous commit, which would leave some directories locked.
> harvest_committables calls lock_dir to lock directories but doesn't
> check that it succeeds.  So there should have been an "already locked"
> error before starting the commit, but that did not occur, the commit
> went ahead and then the NULL lock structure from the failed lock
> attempt caused a core dump.

This sounds like a more specific validation of my hypothesis; we've
actually described exactly how, in our code, my hypothesis can
happen.  And all of pilchie's symptoms seem to match.

In essence, the bug is that our commit code wasn't "playing ball" with
other code in terms of honoring locks, and thus an unfinished log was
never run.  Bang, inconsistent wc state.

I say:  we should review and apply the fix!



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Philip Martin <ph...@codematters.co.uk> writes:
> > I have a patch that implements this. It's quite straightforward and I
> > believe it works as well as the current implementation.  Given our
> > current corruption bug, should I commit it?
> 
> I hate to ask you to wait... But, I think it would be best to wait.
> Until we know more, we should avoid further code churn in libsvn_wc,
> except for outright bug fixes.
> 
> How about this: post the patch here first.  At least that way it can
> be reviewed while we're working on the bug.

I believe the problem has been identified.  Kevin interrupted a
previous commit, which would leave some directories locked.
harvest_committables calls lock_dir to lock directories but doesn't
check that it succeeds.  So there should have been an "already locked"
error before starting the commit, but that did not occur, the commit
went ahead and then the NULL lock structure from the failed lock
attempt caused a core dump.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:
> I have a patch that implements this. It's quite straightforward and I
> believe it works as well as the current implementation.  Given our
> current corruption bug, should I commit it?

I hate to ask you to wait... But, I think it would be best to wait.
Until we know more, we should avoid further code churn in libsvn_wc,
except for outright bug fixes.

How about this: post the patch here first.  At least that way it can
be reviewed while we're working on the bug.

-K

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

>    svn_error_t *
>    svn_wc_adm_open(svn_wc_adm_access_t **adm_access,
>                    svn_wc_adm_access_t *parent_access,
>                    const char *path,
>                    svn_boolean_t write_lock,
>                    apr_pool_t *pool);
> 
> Where parent_access can be NULL. When parent_access is not NULL the
> first svn_wc_adm_open for a given path stores the new child baton in
> an internal hash. Subsequent calls for the same path return the baton
> out of the cache.

I have a patch that implements this. It's quite straightforward and I
believe it works as well as the current implementation.  Given our
current corruption bug, should I commit it?

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:
> Compare
> 
> svn_error_t *
> svn_wc_adm_open (svn_wc_adm_access_t **adm_access,
>                  svn_wc_adm_access_t *parent_access,
>                  ...);
> 
> with the current pool code
> 
> apr_pool_t *
> svn_pool_create (apr_pool_t *pool);
> 
> The pool case could be recast as
> 
> svn_error_t *
> svn_pool_create (apr_pool_t **sub_pool,
>                  apr_pool_t *pool);

Oh, I understand now, yes, quite agree.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Philip Martin <ph...@codematters.co.uk> writes:
> > I don't think it's that bad, it's rather like creating subpools.
> 
> Child edit batons, rather, like file_baton while inside the "scope" of
> a dir_baton.  (I don't think we do this with pools -- that is, pass
> two pools to a function.  Or if we do, it's very rare.)

Compare

svn_error_t *
svn_wc_adm_open (svn_wc_adm_access_t **adm_access,
                 svn_wc_adm_access_t *parent_access,
                 ...);

with the current pool code

apr_pool_t *
svn_pool_create (apr_pool_t *pool);

The pool case could be recast as

svn_error_t *
svn_pool_create (apr_pool_t **sub_pool,
                 apr_pool_t *pool);

Not much difference!


-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:
> I don't think it's that bad, it's rather like creating subpools.

Child edit batons, rather, like file_baton while inside the "scope" of
a dir_baton.  (I don't think we do this with pools -- that is, pass
two pools to a function.  Or if we do, it's very rare.)

Yes, I see what you mean.

Maybe it's not SO bad... :-)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Philip Martin <ph...@codematters.co.uk> writes:
> > 
> >    svn_error_t *
> >    svn_wc_adm_open(svn_wc_adm_access_t **adm_access,
> >                    svn_wc_adm_access_t *parent_access,
> >                    const char *path,
> >                    svn_boolean_t write_lock,
> >                    apr_pool_t *pool);
> 
> Yick.
> 
> If we need a hierarchy, can it at least be hidden inside the access_t
> structures?  Passing two of them in the public interfaces seems
> awkward...

I don't think it's that bad, it's rather like creating subpools.  It
gets used like this

    some_function (svn_wc_adm_access_t *adm_access, ... )
    {
      /* Recursively loop over all children. */
      for (hi = apr_hash_first (pool, entries); hi; hi = apr_hash_next (hi))
        {
          apr_hash_this (hi, &key, NULL, &val);
          name = key;
          current_entry = val;
        
          if (current_entry->kind == svn_node_dir)
            {
               /* Lock subdirs */
               svn_wc_adm_open (&child_access, adm_access, ... );
               some_function (child_access, ...);
            }
        }
     }

This is much like the present code except svn_wc_adm_open call doesn't
take the second baton, and there is generally an svn_wc_adm_close call
after some_function.  The close call may or may not get removed,
dependes whether we want to retain the cache.  If it's not closed
explicitly I envisage that it gets closed automatically when the
parent gets closed.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Justin Erenkrantz <je...@apache.org>.
On Fri, Jul 12, 2002 at 10:31:15AM -0500, Karl Fogel wrote:
> Philip Martin <ph...@codematters.co.uk> writes:
> > I have been thinking about putting the svn_wc_adm_access_t structures
> > into a hierarchy, with parents keeping a list of children, and
> > children a pointer to the parent. Then the initial commit processing
> > could lock the entire directory tree being committed before starting
> > the commit, and svn_wc_process_committed would not need to take out
> > any new locks.  This might help with the cacheing as well, giving
> > parents access to children and vice-versa.
> > 
> >    svn_error_t *
> >    svn_wc_adm_open(svn_wc_adm_access_t **adm_access,
> >                    svn_wc_adm_access_t *parent_access,
> >                    const char *path,
> >                    svn_boolean_t write_lock,
> >                    apr_pool_t *pool);
> 
> Yick.
> 
> If we need a hierarchy, can it at least be hidden inside the access_t
> structures?  Passing two of them in the public interfaces seems
> awkward...

In short, I eventually came to the same conclusion as Philip.  We
need to handle the hierarchy inside of the structure.

My only variation could be that we could use a static list of
locks that we have.  I'm not sure how much I like that idea
though.  But, that would avoid having to pass the parent_access
baton around.  -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:
> I have been thinking about putting the svn_wc_adm_access_t structures
> into a hierarchy, with parents keeping a list of children, and
> children a pointer to the parent. Then the initial commit processing
> could lock the entire directory tree being committed before starting
> the commit, and svn_wc_process_committed would not need to take out
> any new locks.  This might help with the cacheing as well, giving
> parents access to children and vice-versa.
> 
>    svn_error_t *
>    svn_wc_adm_open(svn_wc_adm_access_t **adm_access,
>                    svn_wc_adm_access_t *parent_access,
>                    const char *path,
>                    svn_boolean_t write_lock,
>                    apr_pool_t *pool);

Yick.

If we need a hierarchy, can it at least be hidden inside the access_t
structures?  Passing two of them in the public interfaces seems
awkward...

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Justin Erenkrantz <je...@apache.org> writes:

> Oops.  Either the locked_dirs hash needs to get passed to
> svn_wc_process_committed or we need to go back to unlocking the
> dirs before svn_wc_process_committed is called.  -- justin

Unlocking and then relocking doesn't look right to me, not only will
it kill any cacheing, but dropping locks in what should be an atomic
process is a bit odd.

I have been thinking about putting the svn_wc_adm_access_t structures
into a hierarchy, with parents keeping a list of children, and
children a pointer to the parent. Then the initial commit processing
could lock the entire directory tree being committed before starting
the commit, and svn_wc_process_committed would not need to take out
any new locks.  This might help with the cacheing as well, giving
parents access to children and vice-versa.

   svn_error_t *
   svn_wc_adm_open(svn_wc_adm_access_t **adm_access,
                   svn_wc_adm_access_t *parent_access,
                   const char *path,
                   svn_boolean_t write_lock,
                   apr_pool_t *pool);

Where parent_access can be NULL. When parent_access is not NULL the
first svn_wc_adm_open for a given path stores the new child baton in
an internal hash. Subsequent calls for the same path return the baton
out of the cache.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Justin Erenkrantz <je...@apache.org> writes:

> On Fri, Jul 12, 2002 at 07:16:34PM +0100, Philip Martin wrote:
> > But svn_path_condense_targets is called before harvest_committables
> > and that is supposed to remove the ancestor/descendant relationship.
> 
> No, but harvest_committables *needs* to walk down the tree and read
> every entries file for its descendants.  So, it'd have to lock the
> administrative directories as it determines what to commit.
> 
> > > Yes, that is what commit test #8 does.  I don't believe the commit
> > > processing strips out children.  -- justin
> > 
> > I'm still confused, as it doesn't fail when I run the test.
> 
> Because harvest_committables isn't locking anything now.  I said
> before that you need to apply my caching patch.  I wanted to check
> in the code that would pass the baton around and I can't do that
> because svn_wc_process_committed is broken.  -- justin

Ah, I see, I didn't twig that you were putting more locks into the
code, sorry.

Broken is a bit strong, I said I would introduce the new locks in
stages. I'll just claim that your code violates the current
pre-conditions for calling svn_wc_process_committed :-)

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Justin Erenkrantz <je...@apache.org>.
On Fri, Jul 12, 2002 at 07:16:34PM +0100, Philip Martin wrote:
> But svn_path_condense_targets is called before harvest_committables
> and that is supposed to remove the ancestor/descendant relationship.

No, but harvest_committables *needs* to walk down the tree and read
every entries file for its descendants.  So, it'd have to lock the
administrative directories as it determines what to commit.

> > Yes, that is what commit test #8 does.  I don't believe the commit
> > processing strips out children.  -- justin
> 
> I'm still confused, as it doesn't fail when I run the test.

Because harvest_committables isn't locking anything now.  I said
before that you need to apply my caching patch.  I wanted to check
in the code that would pass the baton around and I can't do that
because svn_wc_process_committed is broken.  -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Justin Erenkrantz <je...@apache.org> writes:

> On Fri, Jul 12, 2002 at 11:40:44AM +0100, Philip Martin wrote:
> > Does your cacheing patch change the locking in some way?  I don't see
> > a test failure with unmodified r2478.
> 
> harvest_committables will lock the children as it descends down
> the tree.  Since it doesn't unlock, when it exits, it still holds
> onto the lock (and is stored in *locked_dirs).
> svn_wc_process_committed will then attempt to re-acquire a lock we
> already have.

But svn_path_condense_targets is called before harvest_committables
and that is supposed to remove the ancestor/descendant relationship.

> 
> > svn_wc_process_committed only locks children so the failure you are
> > seeing appears to imply that commit is trying to commit items that are
> > in an ancestor-child relationship.  I though the commit processing
> > stripped out the children in this case, prior to taking out locks.
> 
> Yes, that is what commit test #8 does.  I don't believe the commit
> processing strips out children.  -- justin

I'm still confused, as it doesn't fail when I run the test.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Justin Erenkrantz <je...@apache.org>.
On Fri, Jul 12, 2002 at 11:40:44AM +0100, Philip Martin wrote:
> Does your cacheing patch change the locking in some way?  I don't see
> a test failure with unmodified r2478.

harvest_committables will lock the children as it descends down
the tree.  Since it doesn't unlock, when it exits, it still holds
onto the lock (and is stored in *locked_dirs).
svn_wc_process_committed will then attempt to re-acquire a lock we
already have.

> svn_wc_process_committed only locks children so the failure you are
> seeing appears to imply that commit is trying to commit items that are
> in an ancestor-child relationship.  I though the commit processing
> stripped out the children in this case, prior to taking out locks.

Yes, that is what commit test #8 does.  I don't believe the commit
processing strips out children.  -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Problem with svn_wc_process_committed

Posted by Philip Martin <ph...@codematters.co.uk>.
Justin Erenkrantz <je...@apache.org> writes:

> Prior to r2470, we used to unlock the adm_access 'baton' before
> calling svn_wc_process_committed in svn_client_commit().  In r2470,
> we now delay the unlock until after svn_wc_process_committed is
> called.  That doesn't work because svn_wc_process_committed doesn't
> take the hash table of locked directories.  Therefore, it tries to
> open a lock on a directory that was already held in locked_dirs.
> And, we now fail commit test #8 with my caching patch because we
> already have a lock on the subdirs.

Does your cacheing patch change the locking in some way?  I don't see
a test failure with unmodified r2478.

svn_wc_process_committed only locks children so the failure you are
seeing appears to imply that commit is trying to commit items that are
in an ancestor-child relationship.  I though the commit processing
stripped out the children in this case, prior to taking out locks.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org