You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Paul Burba <pt...@gmail.com> on 2008/10/29 13:57:09 UTC

Re: Why does "update" treat an obstruction differently if adding without history?

On Wed, Oct 29, 2008 at 8:21 AM, Julian Foad <ju...@btopenworld.com> wrote:
> Can anyone explain this special handling of items scheduled for addition
> *without history*, in "svn update"'s "add file" code path?
>
> It seems to me that if a file to be added is obstructed by a file on
> disk, it wouldn't matter whether it's being added with or without
> history.
>
> subversion/libsvn_wc/update_editor.c
>
> ('kind' is the on-disk node kind.)
>
> [[[
> @@ -2541,8 +2540,6 @@ add_file(const char *path,
>  /* When adding, there should be nothing with this name unless unversioned
>     obstructions are permitted or the obstruction is scheduled for addition
>     (or replacement) without history. */
>  /* ### " or the obstruction is scheduled for addition
>     without history." ??? */
>  if (kind != svn_node_none)
>    {
>      if (eb->allow_unver_obstructions
>          || (entry && ((entry->schedule == svn_wc_schedule_add
>                         || entry->schedule == svn_wc_schedule_replace))))
>        {
>          /* ### now detected as a tree conflict instead.
>          if (entry && entry->copied)
>            {
>              return svn_error_createf(SVN_ERR_WC_OBSTRUCTED_UPDATE,
>                                       NULL,
>                                       _("Failed to add file '%s': a "
>                                         "file of the same name is "
>                                         "already scheduled for addition "
>                                         "with history"),
>                                       svn_path_local_style(fb->path,
>                                                            pool));
>            }
>          */
> ...
> ]]]
>
> - Julian

Julian,

That was related to a follow-up to the "takeover" functionality, and
was filed as issue #2593.

See also (the threads are many and a bit broken, but these are the key points):

Mark describes the problem: http://svn.haxx.se/dev/archive-2006-08/0336.shtml
Your thoughts: http://svn.haxx.se/dev/archive-2006-08/0718.shtml
My proposed patch: http://svn.haxx.se/dev/archive-2006-09/0362.shtml

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Why does "update" treat an obstruction differently if adding without history?

Posted by Julian Foad <ju...@btopenworld.com>.
On Wed, 2008-10-29 at 12:09 -0700, Greg Stein wrote:
> On Wed, Oct 29, 2008 at 9:46 AM, Stephen Butler <sb...@elego.de> wrote:
> > Quoting Greg Stein <gs...@gmail.com>:
> >
> >> On Wed, Oct 29, 2008 at 8:03 AM, Stephen Butler <sb...@elego.de> wrote:
> >>>
> >>> ...
> >>> I'm hacking away at the skipping problem, too.  For the
> >>> allow-existing-add
> >>> feature, I simply gave check_tree_conflicts() a ptr-to-bool arg.  Here's
> >>> the signature for that function in my incomplete patch:
> >>
> >> How big and ugly is that function? update_editor.c has a ton of these
> >> huge nasty functions. The code complexity is really bad.
> >
> > Maybe I've been working in the update-editor mine too long and lost
> > perspective...
> 
> Um... is it? or not? If it is a straightforward function, then ignore
> me. But given the lengthy parameter list, I'm guessing "long", so that
> was my question to you.
> 
> >> I'm assuming this is similar, and so in that vein:
> >>
> >>> /* First, check whether a significant conflict has already been raised
> >>>  * on FULL_PATH.  For any incoming change ACTION, check for a tree
> >>>  * conflict.  When adding an item, text and prop conflicts may also be
> >>>  * checked.  If an existing conflict is found, return NULL in
> >>>  * *PCONFLICT and set *SKIP_P to TRUE.
> >>
> >> Can this first check be broken out into a function, say,
> >> covered_by_existing_conflict(...) ?
> >
> > Yes.  Actually, it already is.  Doh!
> 
> Okay. So it *is*. Does that mean you're going to keep it that way, or
> are you proposing to mush the two together? Not sure what your answer
> meant.
> 
> >>>  * Second, check whether the incoming change ACTION on FULL_PATH would
> >>>  * conflict with FULL_PATH's scheduled change.  If so, then FULL_PATH
> >>>  * is a new tree conflict victim.  Raise a persistent tree conflict by
> >>>  * appending log actions to LOG_ACCUM.  Return a conflict description
> >>>  * in *PCONFLICT and set *SKIP_P to TRUE.
> >>
> >> Rather than taking LOG_ACCUM, can this function instead rely on the
> >> caller to perform log actions. IOW, only *check* for a conflict. Don't
> >> take any actions on it.
> >
> > We could pull out the code that writes the tree conflict data to the
> > parent dir's entry.  But if we find a new tree conflict, we always
> > write it.
> 
> Hunh? I mean the code that puts stuff into LOG_ACCUM. Can we move that
> code out of this function, thereby not need to take a LOG_ACCUM
> parameter? And thus also making this file a *check* rather than a
> *modify*.
> 
> > On a related note, we expect the caller to flush and run the log
> > (for update and switch).
> 
> Assumed so, yeah.
> 
> >>>  * If no conflicts are found, return NULL in *PCONFLICT and set
> >>>  * *SKIP_P to FALSE.
> >>>  *
> >>>  * When adding a file or directory, the KIND is the kind of the item
> >>>  * to be added, and *ADD_EXISTED_P will be set to TRUE if the item is
> >>>  * already scheduled for add without history (which is not a
> >>>  * conflict).  For other actions, KIND and ADD_EXISTED_P are ignored.
> >>
> >> Can this be left to the caller? This doesn't seem to be related to
> >> tree conflicts.
> >
> > Actually it is.  It's a special case where Subversion automatically
> > resolves a potential tree conflict without raising it.
> 
> Alrighty. And I presume the caller needs to know this for some reason.
> 
> Would the argument better be named something like ADD_CONFLICT_CLEARED
> ? Or whatever. It seems to suggest that a conflict should be
> *cleared*. Sure, there was an add, but what is the *actual* semantic
> of what happened or should happen?
> 
> >>>  * The edit baton EB gives information including whether the operation is
> >>>  * an update or a switch.
> >>
> >> How many pieces of EB are accessed? Having "the whole EB" can give
> >> this function ugly powers. If you only need one or two parameters,
> >> then I'd suggest passing those directly.
> >
> > Let's see, from the edit baton we need these members for the tree
> > conflict check:
> >
> >  allow_unver_obstructions
> >  cancel_func
> >  cancel_baton
> >
> > And one more for the tree conflict persistence:
> >
> >  switch_url (just a boolean if-null check)
> >
> > Not so many after all.
> 
> Then I might suggest a couple booleans, and a cancel func/baton pair.
> 
> Part of the problem in update_editor is the prevelent use of the
> batons as a pseudo-global variable space. As a result, data flow and
> coupling is near-impossible to track down. I just spent a great number
> of days unwinding some of that around fb->new_text_base_path. After
> unwinding all of that, plus some previous work, I turned *four* passes
> over a file into *one*. We're pretty inefficient :-(
> 
> >>>  * PARENT_ADM_ACCESS is the admin access baton of FULL_PATH's parent
> >>>  * directory.
> >>
> >> Parent? Why not pass the corresponding access baton, and if/when this
> >> function needs it, then it can retrieve it itself. Maybe this is
> >> nomenclature. Consider:
> >>
> >> If kind==file, then is this baton, the directory that the file resides
> >> in? Or the parent of *that* directory?
> >>
> >> If kind==dir, then is this baton related to self, or is this the
> >> parent of the directory?
> >>
> >> Please clarify.
> >
> > It's always the parent dir of the potential tree conflict victim.
> 
> Okay. In pseudo-code: get_access_baton(get_dirname(full_path)).
> 
> > Should each function retrieve this dir's access baton on its own?
> > Or is it better to pass around the same baton, which the caller
> > can use later (if there's no conflict and the update proceeds)?
> 
> Doesn't really matter. The access batons can quickly translate between
> all affected directories. So if you have one, then you have all of
> them.

Steve, the stuff Greg says makes sense to me too :-)  I'm not saying you
necessarily have to do all of it.

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Why does "update" treat an obstruction differently if adding without history?

Posted by Greg Stein <gs...@gmail.com>.
On Wed, Oct 29, 2008 at 9:46 AM, Stephen Butler <sb...@elego.de> wrote:
> Quoting Greg Stein <gs...@gmail.com>:
>
>> On Wed, Oct 29, 2008 at 8:03 AM, Stephen Butler <sb...@elego.de> wrote:
>>>
>>> ...
>>> I'm hacking away at the skipping problem, too.  For the
>>> allow-existing-add
>>> feature, I simply gave check_tree_conflicts() a ptr-to-bool arg.  Here's
>>> the signature for that function in my incomplete patch:
>>
>> How big and ugly is that function? update_editor.c has a ton of these
>> huge nasty functions. The code complexity is really bad.
>
> Maybe I've been working in the update-editor mine too long and lost
> perspective...

Um... is it? or not? If it is a straightforward function, then ignore
me. But given the lengthy parameter list, I'm guessing "long", so that
was my question to you.

>> I'm assuming this is similar, and so in that vein:
>>
>>> /* First, check whether a significant conflict has already been raised
>>>  * on FULL_PATH.  For any incoming change ACTION, check for a tree
>>>  * conflict.  When adding an item, text and prop conflicts may also be
>>>  * checked.  If an existing conflict is found, return NULL in
>>>  * *PCONFLICT and set *SKIP_P to TRUE.
>>
>> Can this first check be broken out into a function, say,
>> covered_by_existing_conflict(...) ?
>
> Yes.  Actually, it already is.  Doh!

Okay. So it *is*. Does that mean you're going to keep it that way, or
are you proposing to mush the two together? Not sure what your answer
meant.

>>>  * Second, check whether the incoming change ACTION on FULL_PATH would
>>>  * conflict with FULL_PATH's scheduled change.  If so, then FULL_PATH
>>>  * is a new tree conflict victim.  Raise a persistent tree conflict by
>>>  * appending log actions to LOG_ACCUM.  Return a conflict description
>>>  * in *PCONFLICT and set *SKIP_P to TRUE.
>>
>> Rather than taking LOG_ACCUM, can this function instead rely on the
>> caller to perform log actions. IOW, only *check* for a conflict. Don't
>> take any actions on it.
>
> We could pull out the code that writes the tree conflict data to the
> parent dir's entry.  But if we find a new tree conflict, we always
> write it.

Hunh? I mean the code that puts stuff into LOG_ACCUM. Can we move that
code out of this function, thereby not need to take a LOG_ACCUM
parameter? And thus also making this file a *check* rather than a
*modify*.

> On a related note, we expect the caller to flush and run the log
> (for update and switch).

Assumed so, yeah.

>>>  * If no conflicts are found, return NULL in *PCONFLICT and set
>>>  * *SKIP_P to FALSE.
>>>  *
>>>  * When adding a file or directory, the KIND is the kind of the item
>>>  * to be added, and *ADD_EXISTED_P will be set to TRUE if the item is
>>>  * already scheduled for add without history (which is not a
>>>  * conflict).  For other actions, KIND and ADD_EXISTED_P are ignored.
>>
>> Can this be left to the caller? This doesn't seem to be related to
>> tree conflicts.
>
> Actually it is.  It's a special case where Subversion automatically
> resolves a potential tree conflict without raising it.

Alrighty. And I presume the caller needs to know this for some reason.

Would the argument better be named something like ADD_CONFLICT_CLEARED
? Or whatever. It seems to suggest that a conflict should be
*cleared*. Sure, there was an add, but what is the *actual* semantic
of what happened or should happen?

>>>  * The edit baton EB gives information including whether the operation is
>>>  * an update or a switch.
>>
>> How many pieces of EB are accessed? Having "the whole EB" can give
>> this function ugly powers. If you only need one or two parameters,
>> then I'd suggest passing those directly.
>
> Let's see, from the edit baton we need these members for the tree
> conflict check:
>
>  allow_unver_obstructions
>  cancel_func
>  cancel_baton
>
> And one more for the tree conflict persistence:
>
>  switch_url (just a boolean if-null check)
>
> Not so many after all.

Then I might suggest a couple booleans, and a cancel func/baton pair.

Part of the problem in update_editor is the prevelent use of the
batons as a pseudo-global variable space. As a result, data flow and
coupling is near-impossible to track down. I just spent a great number
of days unwinding some of that around fb->new_text_base_path. After
unwinding all of that, plus some previous work, I turned *four* passes
over a file into *one*. We're pretty inefficient :-(

>>>  * PARENT_ADM_ACCESS is the admin access baton of FULL_PATH's parent
>>>  * directory.
>>
>> Parent? Why not pass the corresponding access baton, and if/when this
>> function needs it, then it can retrieve it itself. Maybe this is
>> nomenclature. Consider:
>>
>> If kind==file, then is this baton, the directory that the file resides
>> in? Or the parent of *that* directory?
>>
>> If kind==dir, then is this baton related to self, or is this the
>> parent of the directory?
>>
>> Please clarify.
>
> It's always the parent dir of the potential tree conflict victim.

Okay. In pseudo-code: get_access_baton(get_dirname(full_path)).

> Should each function retrieve this dir's access baton on its own?
> Or is it better to pass around the same baton, which the caller
> can use later (if there's no conflict and the update proceeds)?

Doesn't really matter. The access batons can quickly translate between
all affected directories. So if you have one, then you have all of
them.

Cheers,
-g

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Why does "update" treat an obstruction differently if adding without history?

Posted by Stephen Butler <sb...@elego.de>.
Quoting Greg Stein <gs...@gmail.com>:

> On Wed, Oct 29, 2008 at 8:03 AM, Stephen Butler <sb...@elego.de> wrote:
>> ...
>> I'm hacking away at the skipping problem, too.  For the allow-existing-add
>> feature, I simply gave check_tree_conflicts() a ptr-to-bool arg.  Here's
>> the signature for that function in my incomplete patch:
>
> How big and ugly is that function? update_editor.c has a ton of these
> huge nasty functions. The code complexity is really bad.

Maybe I've been working in the update-editor mine too long and lost
perspective...

>
> I'm assuming this is similar, and so in that vein:
>
>> /* First, check whether a significant conflict has already been raised
>>  * on FULL_PATH.  For any incoming change ACTION, check for a tree
>>  * conflict.  When adding an item, text and prop conflicts may also be
>>  * checked.  If an existing conflict is found, return NULL in
>>  * *PCONFLICT and set *SKIP_P to TRUE.
>
> Can this first check be broken out into a function, say,
> covered_by_existing_conflict(...) ?

Yes.  Actually, it already is.  Doh!

>
>>  *
>>  * Second, check whether the incoming change ACTION on FULL_PATH would
>>  * conflict with FULL_PATH's scheduled change.  If so, then FULL_PATH
>>  * is a new tree conflict victim.  Raise a persistent tree conflict by
>>  * appending log actions to LOG_ACCUM.  Return a conflict description
>>  * in *PCONFLICT and set *SKIP_P to TRUE.
>
> Rather than taking LOG_ACCUM, can this function instead rely on the
> caller to perform log actions. IOW, only *check* for a conflict. Don't
> take any actions on it.

We could pull out the code that writes the tree conflict data to the
parent dir's entry.  But if we find a new tree conflict, we always
write it.

On a related note, we expect the caller to flush and run the log
(for update and switch).

>
>>  *
>>  * If no conflicts are found, return NULL in *PCONFLICT and set
>>  * *SKIP_P to FALSE.
>>  *
>>  * When adding a file or directory, the KIND is the kind of the item
>>  * to be added, and *ADD_EXISTED_P will be set to TRUE if the item is
>>  * already scheduled for add without history (which is not a
>>  * conflict).  For other actions, KIND and ADD_EXISTED_P are ignored.
>
> Can this be left to the caller? This doesn't seem to be related to
> tree conflicts.

Actually it is.  It's a special case where Subversion automatically
resolves a potential tree conflict without raising it.

>
>>  *
>>  * The edit baton EB gives information including whether the operation is
>>  * an update or a switch.
>
> How many pieces of EB are accessed? Having "the whole EB" can give
> this function ugly powers. If you only need one or two parameters,
> then I'd suggest passing those directly.

Let's see, from the edit baton we need these members for the tree
conflict check:

   allow_unver_obstructions
   cancel_func
   cancel_baton

And one more for the tree conflict persistence:

   switch_url (just a boolean if-null check)

Not so many after all.

>
>>  *
>>  * PARENT_ADM_ACCESS is the admin access baton of FULL_PATH's parent
>>  * directory.
>
> Parent? Why not pass the corresponding access baton, and if/when this
> function needs it, then it can retrieve it itself. Maybe this is
> nomenclature. Consider:
>
> If kind==file, then is this baton, the directory that the file resides
> in? Or the parent of *that* directory?
>
> If kind==dir, then is this baton related to self, or is this the
> parent of the directory?
>
> Please clarify.

It's always the parent dir of the potential tree conflict victim.

Should each function retrieve this dir's access baton on its own?
Or is it better to pass around the same baton, which the caller
can use later (if there's no conflict and the update proceeds)?

Thanks for the feedback.
Steve

-- 
Stephen Butler | Software Developer
elego Software Solutions GmbH
Gustav-Meyer-Allee 25 | 13355 Berlin | Germany
fon: +49 30 2345 8696 | mobile: +49 163 25 45 015
fax: +49 30 2345 8695 | http://www.elegosoft.com
Geschäftsführer: Olaf Wagner | Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: Why does "update" treat an obstruction differently if adding without history?

Posted by Greg Stein <gs...@gmail.com>.
On Wed, Oct 29, 2008 at 8:03 AM, Stephen Butler <sb...@elego.de> wrote:
>...
> I'm hacking away at the skipping problem, too.  For the allow-existing-add
> feature, I simply gave check_tree_conflicts() a ptr-to-bool arg.  Here's
> the signature for that function in my incomplete patch:

How big and ugly is that function? update_editor.c has a ton of these
huge nasty functions. The code complexity is really bad.

I'm assuming this is similar, and so in that vein:

> /* First, check whether a significant conflict has already been raised
>  * on FULL_PATH.  For any incoming change ACTION, check for a tree
>  * conflict.  When adding an item, text and prop conflicts may also be
>  * checked.  If an existing conflict is found, return NULL in
>  * *PCONFLICT and set *SKIP_P to TRUE.

Can this first check be broken out into a function, say,
covered_by_existing_conflict(...) ?

>  *
>  * Second, check whether the incoming change ACTION on FULL_PATH would
>  * conflict with FULL_PATH's scheduled change.  If so, then FULL_PATH
>  * is a new tree conflict victim.  Raise a persistent tree conflict by
>  * appending log actions to LOG_ACCUM.  Return a conflict description
>  * in *PCONFLICT and set *SKIP_P to TRUE.

Rather than taking LOG_ACCUM, can this function instead rely on the
caller to perform log actions. IOW, only *check* for a conflict. Don't
take any actions on it.

>  *
>  * If no conflicts are found, return NULL in *PCONFLICT and set
>  * *SKIP_P to FALSE.
>  *
>  * When adding a file or directory, the KIND is the kind of the item
>  * to be added, and *ADD_EXISTED_P will be set to TRUE if the item is
>  * already scheduled for add without history (which is not a
>  * conflict).  For other actions, KIND and ADD_EXISTED_P are ignored.

Can this be left to the caller? This doesn't seem to be related to
tree conflicts.

>  *
>  * The edit baton EB gives information including whether the operation is
>  * an update or a switch.

How many pieces of EB are accessed? Having "the whole EB" can give
this function ugly powers. If you only need one or two parameters,
then I'd suggest passing those directly.

>  *
>  * PARENT_ADM_ACCESS is the admin access baton of FULL_PATH's parent
>  * directory.

Parent? Why not pass the corresponding access baton, and if/when this
function needs it, then it can retrieve it itself. Maybe this is
nomenclature. Consider:

If kind==file, then is this baton, the directory that the file resides
in? Or the parent of *that* directory?

If kind==dir, then is this baton related to self, or is this the
parent of the directory?

Please clarify.

Cheers,
-g

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Why does "update" treat an obstruction differently if adding without history?

Posted by Stephen Butler <sb...@elego.de>.
Quoting Paul Burba <pt...@gmail.com>:

> On Wed, Oct 29, 2008 at 8:21 AM, Julian Foad   
> <ju...@btopenworld.com> wrote:
>> Can anyone explain this special handling of items scheduled for addition
>> *without history*, in "svn update"'s "add file" code path?

[...]

> Julian,
>
> That was related to a follow-up to the "takeover" functionality, and
> was filed as issue #2593.
>
> See also (the threads are many and a bit broken, but these are the   
> key points):
>
> Mark describes the problem: http://svn.haxx.se/dev/archive-2006-08/0336.shtml
> Your thoughts: http://svn.haxx.se/dev/archive-2006-08/0718.shtml

Julian:  Hoist by your own petard!  ;-)

> My proposed patch: http://svn.haxx.se/dev/archive-2006-09/0362.shtml
>
> Paul

Tree conflict fans:

I'm hacking away at the skipping problem, too.  For the allow-existing-add
feature, I simply gave check_tree_conflicts() a ptr-to-bool arg.  Here's
the signature for that function in my incomplete patch:

/* First, check whether a significant conflict has already been raised
  * on FULL_PATH.  For any incoming change ACTION, check for a tree
  * conflict.  When adding an item, text and prop conflicts may also be
  * checked.  If an existing conflict is found, return NULL in
  * *PCONFLICT and set *SKIP_P to TRUE.
  *
  * Second, check whether the incoming change ACTION on FULL_PATH would
  * conflict with FULL_PATH's scheduled change.  If so, then FULL_PATH
  * is a new tree conflict victim.  Raise a persistent tree conflict by
  * appending log actions to LOG_ACCUM.  Return a conflict description
  * in *PCONFLICT and set *SKIP_P to TRUE.
  *
  * If no conflicts are found, return NULL in *PCONFLICT and set
  * *SKIP_P to FALSE.
  *
  * When adding a file or directory, the KIND is the kind of the item
  * to be added, and *ADD_EXISTED_P will be set to TRUE if the item is
  * already scheduled for add without history (which is not a
  * conflict).  For other actions, KIND and ADD_EXISTED_P are ignored.
  *
  * The edit baton EB gives information including whether the operation is
  * an update or a switch.
  *
  * PARENT_ADM_ACCESS is the admin access baton of FULL_PATH's parent
  * directory.
  */
static svn_error_t *
check_tree_conflict(svn_wc_conflict_description_t **pconflict,
                     svn_boolean_t *skip_p,
                     svn_boolean_t *add_existed_p,
                     struct edit_baton *eb,
                     svn_stringbuf_t *log_accum,
                     const char *full_path,
                     svn_wc_adm_access_t *parent_adm_access,
                     svn_wc_conflict_action_t action,
                     svn_node_kind_t kind,
                     apr_pool_t *pool)

To quote Alan Kay, I believe, "If you write a function with ten
parameters, you've forgotten one."  Uh oh.

Cheers,
Steve


-- 
Stephen Butler | Software Developer
elego Software Solutions GmbH
Gustav-Meyer-Allee 25 | 13355 Berlin | Germany
fon: +49 30 2345 8696 | mobile: +49 163 25 45 015
fax: +49 30 2345 8695 | http://www.elegosoft.com
Geschäftsführer: Olaf Wagner | Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org