You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Neels Janosch Hofmeyr <ne...@elego.de> on 2009/08/11 18:51:15 UTC

Re: Defining atomic "replace"

Hi Julian and everyone else,

I'd like to summarize a little about atomic replace operations in the new
editor interface (subversion/include/svn_editor.h,
subversion/libsvn_delta/editor.c) and have a conclusion.

My intent is to further tweak svn_editor.h shortly, so feedback is welcome.



The argument started off like this:

If a file or folder at a certain path is deleted, and some other file or
folder is added at the same path, that is a replace.

Sometimes it is nice and/or necessary to be able to relate the delete() and
the add(), for example to print
 R  foo
instead of
 D  foo
 M  other_stuff
 A  foo
, and for detecting tree-conflicts.

Currently, wherever that happens, the *receiving* side of the editor
callbacks has to figure out whether a delete coincides with an add at the
same path. That involves caching deleted nodes == bloat and complication.

Fact is, there is at least one *driver* of the editor callbacks that has
already made the correlation of a delete followed by an add, but splits it
up into a delete followed by an add, following callback design.

My crusade is to move the work for combining delete() + add() into a single
"replace" into the *driver* of the new callbacks, considering that that
information may already be lying around for free in most cases.

Even if it isn't, I think it is better to make the *driver* do the
combination work "once", and not burden "all those receivers" with the work
of correlating deletes with adds.

The problem I ran into is this:

To effectively take the burden off the receivers, we need to *enforce* that
the driver actually combines delete + add = replace. If the driver had a
choice to combine or not combine "replace" calls, all those receivers have
to be able to handle *both* cases, and the whole plan backfires, adding more
code instead of simplifying.

I will now try to theoretically home in on a Good API definition around
atomic replace.


Aspects:


(*) Receivers should not have to combine "replace" operations.

--> We need to *disallow* issuing a delete() followed by an add() operation
on the same path. The drivers all need to use the atomic-replace channel
instead, which is to issue an add() operation with a REPLACES_REV arg.


(*) Two-sided vs. detailed history
In current merge (and all other operations AFAIK), if a certain path was
deleted and added multiple times in a given range, all the intermediate
steps are dropped and the only things that matter are
 - the initial state and
 - the final state.
e.g. (D, A, M, D, A) becomes (D, A), the three in the center are dropped.
However, there are rumors that at some point all the intermediate steps will
be necessary (keyword "operational merge").

--> We need to allow multiple atomic replaces in a given range, so that each
sub-sequence of (D, A) becomes a (R):
  (D, A, M, D, A) becomes (R, M, R) in an operational merge.


(*) Atomic move
In the new editor callbacks, there is a separate move() call to represent
atomic moves. It is thus possible to issue a delete() followed by a move()
to the same path that was deleted. Which is, essentially, a replace!

--> We need to also add a REPLACES_REV argument to the move() callback, and
we need to disallow issuing a delete followed by a move to the same path.
Actually, has this case ever been considered? Currently, a move() becomes an
"add-with-history" plus a delete(). Our API doesn't indicate a move
replacing a path. We could need another update/status letter for this.


(*) Copy
There also is a new copy() call instead of the "add(copyfrom_revision)". I
guess it's a bikeshed whether the copy() callback should be separate instead
of adding a "copyfrom_rev" to the add() callback. In short, copy == add+.
Like with moves, it is possible to send a delete() followed by a copy() to
the same path, which amounts to yet another "replace" variant.

--> We need to also add a REPLACES_REV argument to the copy() callback, and
disallow issuing a delete followed by a copy to the same path.


(*) Different meanings
It is conceivable that one single "replace" call may have a different
meaning from separate delete() and add() calls. That would be in the way
that the one type represents a closer relation than the other.

--> My conclusion is that this is not the case / does not matter. An atomic
replace is simply a new node taking the path of another node, the two of
them being unrelated. There *are* no less-related cases than a "replace"
operation (on the same path). There are more-related cases than a "replace",
which then become a move() or a copy() (a.k.a. add-with-history). (Thus I
suggest adding the REPLACES_REV arg to move() and copy() as well, above. If
the added node is related to the node that is being replaced, heavens
forbid, it should become a copy() operation with SRC_RELPATH == DST_RELPATH
and REPLACES_REV == SRC_REVISION.)


Thanks for feedback,

~Neels


Julian Foad wrote:
> On Tue, 2009-07-21, Neels J Hofmeyr wrote:
>> Julian Foad wrote:
>>>> The driver is required to use add() with REPLACES_REV instead of a separate
>>>> delete() call before this add(). Receivers of this callback in general are
>>>> not required to support a separate delete() call for a 'replace'.
>>> I think it would be clearer to state the editor requirements separately
>>> from the usage recommendations. (This last paragraph and the following
>>> one seem to mix the two.)
>>>
>>>> If you intend to implement a separate delete() call when driving this
>>>> editor, make sure you really have to and put up "big red signs" in the
>>>> documentation of your driver. Be aware that, in general, the driver should
>>>> do all of the (possibly hard) work of combining a delete() with an add() to
>>>> form an atomic replace, taking that burden off the various receivers. So if
>>>> you write a driver that doesn't do that, you have to make sure that all your
>>>> receivers (preferably very few) can handle a non-atomic replace.
>>>> Furthermore, a delete() call should occur always before, never after an
>>>> add() call for the same node.
>>>> ]]]
>> Yes, you're right.
>> Where is the line between editor requirements and usage recommendations?
>>
>> Maybe you can help to choose a side... Or are these sides chosen by
>> individual implementations anyway?
> 
> It is important to decide and clearly draw the line somewhere. The whole
> point of having a standard "interface" is so that multiple producers and
> consumers can be written independently and someone else can plug them
> together and know that they will work together.
> 
>> I am more for forcing an atomic replace. I would rather remove the loophole
>> that says "well, but if you must, just issue separate calls". I'd personally
>> not mind if some parts of the code send separate calls, but I want e.g. the
>> merge callback receivers to be certain that they will *not* receive separate
>> calls *ever*.
> 
> OK, we can make that a requirement. We must ensure that all of the
> current producers obey it.
> 
> 
>> How about we choose the stricter comment, but we let people use it
>> differently if they must without saying so now?
> 
> No, let's not invite trouble.
> 
> 
> [...]
>> - is it ok to disallow multiple add() and delete() calls on the same path,
>>   in principle?
>>    e.g. during merge, there are only the two sides, merge-left
>>    and merge-right. A replace means that both exist but are "unrelated". So
>>    a merge will never need to send more than one (delete, add) tuple.
>>    Usually it's either *one* delete or *one* add. The (add, delete) tuple
>>    simply cancels out, nothing being sent.
> 
> Yes, it's absolutely fine to specify that the editor must be driven with
> a single change for each path.
> 
> Let's imagine that the user does "svn rm X; svn add X; svn rm X; svn add
> X" in his/her WC. Let's imagine that the WC code has an "undo history"
> that recorded all the separate actions in this sequence. When the user
> makes a commit (or a diff or an update or whatever), then it's the
> calling code's job to ensure that the editor is driven with a single
> change (replace X(old) with X(new)).
> 
>>   But are there cases where we also send all the steps in-between? Then we
>>   could allow this: add(), add(REPLACES_REV), add(REPLACES_REV), delete().
>>   But *are there*?
> 
> I don't know whether there are currently cases where we do this, other
> than a single delete followed by a single add. But we are defining the
> interface. We must define what we think is best and then the callers
> must do what we say.
> 
>> - is it necessary to disallow multiple add() and delete() calls on the same
>>   path, to make the atomic replace useful?
>>    e.g. during merge, the receiver needs to cache all delete()s and match
>>    with add()s, carrying out the leftovers during dir_close(). And merge
>>    is not the only one in need of this quirky code construct.
>>   I want to eradicate this madness -- I'd love to force atomic replaces
>>   on everyone, so the receivers are entirely relieved of this quirkiness.
>>   Will allowing both separate and atomic replace calls defy my plan by
>>   actually adding more code to the receivers?
> 
> Basically, yes, I think it is necessary.
> 
> If a receiver wants to handle a replacement as a single operation and
> the editor API does not guarantee that it will get a single call, then
> it can achieve the same end result by batching up the delete requests
> and executing them just before closing the directory. But if it
> implements that, then there would be no benefit in having the single
> "replace" call available.
> 
>> - is there a case where separate calls have a different meaning than an
>>   atomic replace?
>>    e.g. for merge, it doesn't matter whether the nodes are "related".
>>   I can't think of any case where we'd want to highlight that the node was
>>   deleted and later re-added with the same history, or something. Right?
> 
> This is the question of critical importance: Is there any semantic
> difference (difference in meaning) between
> 
>   replace X with (something)
> 
> and
> 
>   delete X; add (something) at path "X"
> 
> ?
> 
> Before I try to answer that, first let's review the history of the
> semantics of "svn move" (also know as "svn rename"). Initially we
> decided there was no semantic difference between
> 
>   move X to Y
> 
> and
> 
>   delete X; add Y as a copy of X
> 
> That is true as far as it goes, but unfortunately in the implementation
> we lost the implicit requirement that the "X" being deleted in the first
> half is the very same "X" that's copied in the second half. It got
> translated to
> 
>   delete X
>   add Y as a copy of (X's absolute path at a specific recent-ish
> revision)
> 
> That works just about well enough within a single branch, but when
> merging such a change to another branch the connection between the two
> X's is completely broken and the result is not what we meant at all.
> 
> We usually want a move in the merge source branch, such as
> 
>   delete source/X
>   add source/Y as a copy of (source/X at its latest revision)
> 
> to be merged into the target branch as
> 
>   delete target/X
>   add target/Y as a copy of (target/X at its latest revision)
> 
> So, working back to how we could better have defined the semantics of
> "move", we could say "move X to Y" means
> 
>   remove entry X from X's directory
>     (but don't destroy its content, because it's just being moved)
>   add entry Y into Y's directory,
>     taking its content from the X that's being removed
> 
> Note that the two parts are intimately connected together, and that the
> "delete" half is semantically different from a plain "delete".
> 
> I think that's where we need to be going with the "true renames" issue
> <http://subversion.tigris.org/issues/show_bug.cgi?id=898>.
> 
> That's not quite the end of the story with "move". One side issue is
> that I think the semantics of "copy" should be updated in the same way,
> so when "copy source/X to source/Y" is merged it should add target/Y as
> a copy of (target/X at latest revision), not of (source/X at some
> specific old revision). Another is that I've hand-waved on how the path
> "source/X" is mapped to the path "target/X"; I think it involves
> treating them both as relative to the respective merge root paths, and
> adjusting for any directories moved within the merge. The merge should
> throw an error if the move source is outside the merged tree.
> 
> Now, what does that teach us about how we want "replace" to behave? Are
> we making some assumption when we "replace X with a copy of Y" that is
> subtly stronger than "delete X; add a new X that's a copy of Y"? Does it
> make a difference if instead of "Y" we say "a previous revision of X"?
> 
> - Julian
> 
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2376254

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2382628

Re: Defining atomic "replace"

Posted by Bill Tutt <bi...@gmail.com>.

On Thu, Aug 13, 2009 at 7:24 PM, Julian Foad <ju...@btopenworld.com>wrote:

> On Thu, 2009-08-13, Bill Tutt wrote: > Move, you mentioned move. You're
> giving me a headache flashback.
>
>
> Hi Bill. Thanks for the thought input...
>  > i.e. The directory heirarchy changing dependant case:
> > A\a.cs A\B\C\c.cs
> >
> > mv A\B B
> > mv A B\A
> >
> > This ends up with B\A\a.cs and B\C\c.cs.
> >
> > Regardless of whether not you can submit such a thing in one
> > changeset, a non-operational merge would need to deal with the
> > problem. (i.e. not allowing it in the general changeset case doesn't
> > prevent it from the non-operational merge case)
>
>
> Yes, OK...
>  > or the substantially less annoying and fairly simple dependant rename
> > case:
> >
> > mv a.cs temp.cs
> > mv b.cs a.cs
> > mv temp.cs b.cs
>
>
> which swaps a.cs with b.cs, and is equivalent to (and let's say the current
> revision is 3)
>
> del a.cs
> del b.cs
> copy <URL of a.cs>@3 b.cs
> copy <URL of b.cs>@3 a.cs
>
>
> > Both of the above cases also suffer from the problem of needing to
> > disallow not including all of the pending moves in a submission.
>
>
> <parsing double-negatives>... <done>
>

To rephrase and clarify:

Dependant rename set: A set of rename operations that require the operations
to execute in a specific order in the WC so that all file contents end up
with the correct path and contents. There are three possible cases here:

File dependant rename example:
mv a.cs -> b.cs
mv b.cs -> a.cs

Directory dependant rename example:
mv A -> B
mv B -> A

Directory hierarchy changing rename example:
A\B\C are directories that contain files:
mv A\B -> B
mv A -> B\A

If Subversion did not allow the replace mechanism with these types of moves:
All of the pending moves in a dependent rename set that was NOT "Directory
hierarchy changing" would need to be submitted at the same time.

"Directory hierarchy changing" renames do not have to be submitted
simultaneously, however allowing them seems like a very bad idea when
balanced with a typical version control end user concept of: A given
changeset should always be buildalbe. You would be submitting a tree
structure that never actually occurred in the working copy. *shivers*

The code necessary to unwind only part of a "directory hierarchy changing"
rename when you have other pending renames in the same submission that
interact with the "directory hirarchy changing" rename isn't particulary
pretty or cheap. (It's that fun two pass rename algorthim dreamt up by the
author of arch.) I'd recommend only implementing that code for 'svn
revert'/''svn merge' if you ever write that code at all.

> In Subversion, currently, as you know, you are allowed to commit just
> one half of the moves. Let's examine what happens if you do. Note that
> the selection of what to include in the commit is done by path, not by
> "object identity". If in your latter case we choose to commit just
> "a.cs", the result would be:
>
> r4:
>  a.cs is replaced with a copy of b.cs@3
>  b.cs is unaffected
>
> I agree that there is a usability benefit in disallowing this partial
> commit, but I don't see that it causes any particularly bad or
> surprising result. You can still commit the other half (b.cs) later and
> get the intended end result:
>
 r5:
>  a.cs is unaffected
>  b.cs is replaced with a copy of a.cs@3
>
> The WC remembered that b.cs should be a copy of a.cs@3. If instead it
> had remembered it as "a copy of head revision of a.cs", this second
> commit would be effectively a no-op because a.cs@HEAD was at this time
> already the same as b.cs, and that would not be the intended end result.
>
>

You're correct with replace only doing a paritial commit of the file
dependant rename set is fairly harmless. Odd and wrong, but fairly harmless.

>  As part of "true renames" support, we will probably want to change the
> semantics of moves and copies to say "a copy of head revision of ..."
> rather than a specific revision number. Then we will have to watch out
> for this split commit, and probably disallow it.
>

Why would you ever want to say "a copy of head revision of ..."? You want to
generate conflicts appropriately when you make your submission. If
the file/directory was renamed behind your back you need to reverify the
user's desire to maintain the same destination path.

>
>  > (Esp. of directories, but disallowing it entirely always seemed like a
> > good idea to me.) If you allow partial pending move submissions then
> > you're submitting a tree state you haven't built yet.
>
>
> Yes, but that's true of any kind of partial commit where you make WC modifications
> in one order and then commit the pieces in another order.
> You feel this is particularly bad in the case of splitting a move. I
> think we all agree it's worse and not a good idea, but I don't
> understand whether you have some special viewpoint that makes you feel
> it's so terrible or such a headache.
>
>

My big fear is users accidentally submitting a tree structure change that
wasn't the shape of their WC (for each dependant rename set) at time of
submission. That seems like a black hole of badness and its not particulary
hard to check for. i.e. If something is always a bad idea why not prevent
the user from accidently hanging themselves?

Not submitting all renames in a file dependant rename set is certainly odd,
but substantially less scary to me than not submitting all renames in a
directory dependant rename set.

Bill

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383695

Re: Defining atomic "replace"

Posted by Julian Foad <ju...@btopenworld.com>.

On Thu, 2009-08-13, Bill Tutt wrote:
> Move, you mentioned move. You're giving me a headache flashback.

Hi Bill. Thanks for the thought input...

> i.e. The directory heirarchy changing dependant case:
> A\a.cs A\B\C\c.cs
>  
> mv A\B B
> mv A B\A
>  
> This ends up with B\A\a.cs and B\C\c.cs.
>  
> Regardless of whether not you can submit such a thing in one
> changeset, a non-operational merge would need to deal with the
> problem. (i.e. not allowing it in the general changeset case doesn't
> prevent it from the non-operational merge case)

Yes, OK...

> or the substantially less annoying and fairly simple dependant rename
> case:
>  
> mv a.cs temp.cs
> mv b.cs a.cs
> mv temp.cs b.cs

which swaps a.cs with b.cs, and is equivalent to (and let's say the
current revision is 3)

del a.cs
del b.cs
copy <URL of a.cs>@3 b.cs
copy <URL of b.cs>@3 a.cs

> Both of the above cases also suffer from the problem of needing to
> disallow not including all of the pending moves in a submission.

<parsing double-negatives>... <done>

In Subversion, currently, as you know, you are allowed to commit just
one half of the moves. Let's examine what happens if you do. Note that
the selection of what to include in the commit is done by path, not by
"object identity". If in your latter case we choose to commit just
"a.cs", the result would be:

r4:
  a.cs is replaced with a copy of b.cs@3
  b.cs is unaffected

I agree that there is a usability benefit in disallowing this partial
commit, but I don't see that it causes any particularly bad or
surprising result. You can still commit the other half (b.cs) later and
get the intended end result:

r5:
  a.cs is unaffected
  b.cs is replaced with a copy of a.cs@3

The WC remembered that b.cs should be a copy of a.cs@3. If instead it
had remembered it as "a copy of head revision of a.cs", this second
commit would be effectively a no-op because a.cs@HEAD was at this time
already the same as b.cs, and that would not be the intended end result.

As part of "true renames" support, we will probably want to change the
semantics of moves and copies to say "a copy of head revision of ..."
rather than a specific revision number. Then we will have to watch out
for this split commit, and probably disallow it.

> (Esp. of directories, but disallowing it entirely always seemed like a
> good idea to me.) If you allow partial pending move submissions then
> you're submitting a tree state you haven't built yet. 

Yes, but that's true of any kind of partial commit where you make WC
modifications in one order and then commit the pieces in another order.
You feel this is particularly bad in the case of splitting a move. I
think we all agree it's worse and not a good idea, but I don't
understand whether you have some special viewpoint that makes you feel
it's so terrible or such a headache.

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383472

Re: Defining atomic "replace"

Posted by Bill Tutt <bi...@gmail.com>.

On Tue, Aug 11, 2009 at 4:04 PM, Greg Stein <gs...@gmail.com> wrote:

> > (*) Atomic move > In the new editor callbacks, there is a separate
> move() call to represent
> > atomic moves. It is thus possible to issue a delete() followed by a
> move()
> > to the same path that was deleted. Which is, essentially, a replace!
> >
> > --> We need to also add a REPLACES_REV argument to the move() callback,
> and
> > we need to disallow issuing a delete followed by a move to the same path.
>
> Agreed.
>  > Actually, has this case ever been considered? Currently, a move()
> becomes an
> > "add-with-history" plus a delete(). Our API doesn't indicate a move
> > replacing a path. We could need another update/status letter for this.
>
>
> There are LOTS of APIs in our system that do not understand moves. The new
> wc_db can record moves, but (in 1.7) nothing is going to *tell* it
> to do that. This new editor API can describe moves, but nothing will
> drive it that way. Or maybe it will, and the receiver will break it
> down to a delete/add to deal with legacy APIs that do not have a move
> operation.
>
> I expect that, over time, we will start propagating a "move" concept
> further and further through our codebase. Only at legacy points (eg.
> speaking to an old server) will they be broken into a delete/add pair.
>

Apologies for the slightly off topic direction shift:
Move, you mentioned move. You're giving me a headache flashback.

i.e. The directory heirarchy changing dependant case:
A\a.cs A\B\C\c.cs

mv A\B B
mv A B\A

This ends up with B\A\a.cs and B\C\c.cs.

Regardless of whether not you can submit such a thing in one changeset, a
non-operational merge would need to deal with the problem. (i.e. not
allowing it in the general changeset case doesn't prevent it from the
non-operational merge case)

or the substantially less annoying and fairly simple dependant rename case:

mv a.cs temp.cs
mv b.cs a.cs
mv temp.cs b.cs

Both of the above cases also suffer from the problem of needing to disallow
not including all of the pending moves in a submission.
(Esp. of directories, but disallowing it entirely always seemed like a good
idea to me.) If you allow partial pending move submissions then you're
submitting a tree state you haven't built yet.

Ugh. Ok, the headache flashback is over now, back to your normal subversion
work. :)

Bill

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383306

Re: Defining atomic "replace"

Posted by Greg Stein <gs...@gmail.com>.

In short: sounds great.

On Tue, Aug 11, 2009 at 20:51, Neels J Hofmeyr<ne...@elego.de> wrote:
>...
> Currently, wherever that happens, the *receiving* side of the editor
> callbacks has to figure out whether a delete coincides with an add at the
> same path. That involves caching deleted nodes == bloat and complication.

Agreed.

And I also agree that we can define a requirement upon drivers'
pattern of operation on the API.

I've also speculated on having the editor.c code implement a debug
mode that verifies all constraints. Since *all* calls on the editor v2
must pass through those functions, we can implement all the semantic
checks. For example, we can put a hash table into the editor structure
and record all the deletes, then verify that an add/copy/move never
arrives for the same path.

> Fact is, there is at least one *driver* of the editor callbacks that has
> already made the correlation of a delete followed by an add, but splits it
> up into a delete followed by an add, following callback design.
>
> My crusade is to move the work for combining delete() + add() into a single
> "replace" into the *driver* of the new callbacks, considering that that
> information may already be lying around for free in most cases.

Right. This is also why I changed the model to "present all children
at directory creation time". This is always known, as far as I'm
aware. And specifying it up front has a number of advantages (as
stated in editor-v2.txt).

> Even if it isn't, I think it is better to make the *driver* do the
> combination work "once", and not burden "all those receivers" with the work
> of correlating deletes with adds.
>
> The problem I ran into is this:
>
> To effectively take the burden off the receivers, we need to *enforce* that
> the driver actually combines delete + add = replace. If the driver had a
> choice to combine or not combine "replace" calls, all those receivers have
> to be able to handle *both* cases, and the whole plan backfires, adding more
> code instead of simplifying.

Yup. And as I said above: we can have a debug mode to validate proper
behaviors from all drivers.

> I will now try to theoretically home in on a Good API definition around
> atomic replace.
>
>
> Aspects:
>
>
> (*) Receivers should not have to combine "replace" operations.
>
> --> We need to *disallow* issuing a delete() followed by an add() operation
> on the same path. The drivers all need to use the atomic-replace channel
> instead, which is to issue an add() operation with a REPLACES_REV arg.

Agreed.

> (*) Two-sided vs. detailed history
> In current merge (and all other operations AFAIK), if a certain path was
> deleted and added multiple times in a given range, all the intermediate
> steps are dropped and the only things that matter are
>  - the initial state and
>  - the final state.
> e.g. (D, A, M, D, A) becomes (D, A), the three in the center are dropped.
> However, there are rumors that at some point all the intermediate steps will
> be necessary (keyword "operational merge").
>
> --> We need to allow multiple atomic replaces in a given range, so that each
> sub-sequence of (D, A) becomes a (R):
>  (D, A, M, D, A) becomes (R, M, R) in an operational merge.

I'm not sure that I understand this, but agree with the restriction of
"must combine".

But this does seem to raise a new question: you're talking about an
extra operation here. Some kind of "merge" operation. Does that need
to be present in the new editor API?

Note: if the answer is "unsure; we'll know later", then I consider
that fine, too. This v2 API is much more extensible, so I believe that
we can add more "callbacks" without a much problem.

> (*) Atomic move
> In the new editor callbacks, there is a separate move() call to represent
> atomic moves. It is thus possible to issue a delete() followed by a move()
> to the same path that was deleted. Which is, essentially, a replace!
>
> --> We need to also add a REPLACES_REV argument to the move() callback, and
> we need to disallow issuing a delete followed by a move to the same path.

Agreed.

> Actually, has this case ever been considered? Currently, a move() becomes an
> "add-with-history" plus a delete(). Our API doesn't indicate a move
> replacing a path. We could need another update/status letter for this.

There are LOTS of APIs in our system that do not understand moves. The
new wc_db can record moves, but (in 1.7) nothing is going to *tell* it
to do that. This new editor API can describe moves, but nothing will
drive it that way. Or maybe it will, and the receiver will break it
down to a delete/add to deal with legacy APIs that do not have a move
operation.

I expect that, over time, we will start propagating a "move" concept
further and further through our codebase. Only at legacy points (eg.
speaking to an old server) will they be broken into a delete/add pair.

> (*) Copy
> There also is a new copy() call instead of the "add(copyfrom_revision)". I
> guess it's a bikeshed whether the copy() callback should be separate instead
> of adding a "copyfrom_rev" to the add() callback. In short, copy == add+.

It isn't a bikeshed. There are *four* add variants, based on the kind
of the node you're adding. There is only *one* copy callback because
the node's kind does not magically change during the copy process.
Adding a copy revision/path pair of arguments to all four "add"
variants increases complexity, I think. I also believe that newcomers
have a harder time dealing with the "copy == add with history"
concept.

> Like with moves, it is possible to send a delete() followed by a copy() to
> the same path, which amounts to yet another "replace" variant.
>
> --> We need to also add a REPLACES_REV argument to the copy() callback, and
> disallow issuing a delete followed by a copy to the same path.

Agreed.

> (*) Different meanings
> It is conceivable that one single "replace" call may have a different
> meaning from separate delete() and add() calls. That would be in the way
> that the one type represents a closer relation than the other.
>
> --> My conclusion is that this is not the case / does not matter. An atomic
> replace is simply a new node taking the path of another node, the two of
> them being unrelated. There *are* no less-related cases than a "replace"
> operation (on the same path). There are more-related cases than a "replace",
> which then become a move() or a copy() (a.k.a. add-with-history). (Thus I

Agreed.

> suggest adding the REPLACES_REV arg to move() and copy() as well, above. If
> the added node is related to the node that is being replaced, heavens
> forbid, it should become a copy() operation with SRC_RELPATH == DST_RELPATH
> and REPLACES_REV == SRC_REVISION.)

Eh? Wouldn't this just become a no-op?

>...

Cheers,
-g

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2382655

Re: Defining atomic "replace"

Posted by Neels Janosch Hofmeyr <ne...@elego.de>.

Julian Foad wrote:
> On Tue, 2009-08-11 at 20:51 +0200, Neels J Hofmeyr wrote:
>> Hi Julian and everyone else,
>>
>> I'd like to summarize a little about atomic replace operations in the new
>> editor interface (subversion/include/svn_editor.h,
>> subversion/libsvn_delta/editor.c) and have a conclusion.
>>
>> My intent is to further tweak svn_editor.h shortly, so feedback is welcome.
>>
>>
>>
>> The argument started off like this:
>>
>> If a file or folder at a certain path is deleted, and some other file or
>> folder is added at the same path, that is a replace.
>>
>> Sometimes it is nice and/or necessary to be able to relate the delete() and
>> the add(), for example to print
>>  R  foo
>> instead of
>>  D  foo
>>  M  other_stuff
>>  A  foo
>> , and for detecting tree-conflicts.
>>
>> Currently, wherever that happens, the *receiving* side of the editor
>> callbacks has to figure out whether a delete coincides with an add at the
>> same path. That involves caching deleted nodes == bloat and complication.
>>
>> Fact is, there is at least one *driver* of the editor callbacks that has
>> already made the correlation of a delete followed by an add, but splits it
>> up into a delete followed by an add, following callback design.
>>
>> My crusade is to move the work for combining delete() + add() into a single
>> "replace" into the *driver* of the new callbacks, considering that that
>> information may already be lying around for free in most cases.
>>
>> Even if it isn't, I think it is better to make the *driver* do the
>> combination work "once", and not burden "all those receivers" with the work
>> of correlating deletes with adds.
>>
>> The problem I ran into is this:
>>
>> To effectively take the burden off the receivers, we need to *enforce* that
>> the driver actually combines delete + add = replace. If the driver had a
>> choice to combine or not combine "replace" calls, all those receivers have
>> to be able to handle *both* cases, and the whole plan backfires, adding more
>> code instead of simplifying.
>>
>> I will now try to theoretically home in on a Good API definition around
>> atomic replace.
>>
>>
>> Aspects:
>>
>>
>> (*) Receivers should not have to combine "replace" operations.
>>
>> --> We need to *disallow* issuing a delete() followed by an add() operation
>> on the same path. The drivers all need to use the atomic-replace channel
>> instead, which is to issue an add() operation with a REPLACES_REV arg.
>>
>>
>> (*) Two-sided vs. detailed history
>> In current merge (and all other operations AFAIK), if a certain path was
>> deleted and added multiple times in a given range, all the intermediate
>> steps are dropped and the only things that matter are
>>  - the initial state and
>>  - the final state.
>> e.g. (D, A, M, D, A) becomes (D, A), the three in the center are dropped.
>> However, there are rumors that at some point all the intermediate steps will
>> be necessary (keyword "operational merge").
> 
> (Greg: Neels is talking about a hypothetical new "operation-based" merge
> editor that is a consumer of the editor API. It would want to be driven
> not simply with the difference between merge-left and merge-right
> sources, but rather with the full sequence of changes that were
> originally made to get from merge-left to merge-right.)
(yep.)

>> --> We need to allow multiple atomic replaces in a given range, so that each
>> sub-sequence of (D, A) becomes a (R):
>>   (D, A, M, D, A) becomes (R, M, R) in an operational merge.
> 
> Even if we decide to go towards "operation-based" merging, I think we
> have no idea yet what it would look like, and we don't know whether we
> would want to represent that (R, M, R) sequence. So I don't think there
> is any point in considering that now.

Well, I was just hypothetically testing the constraint of "must combine" for
the "operational merge" concept. I didn't want to introduce a constraint
that would cause design pains soon after. So, in the docs, I'll say
something like "you must combine D,A to R, but you could have multiple Rs in
a row to represent multiple steps". Legal: A, R, R, M, R, D

> For now, it can recreate the existing indication(s) on multiple lines.
> Or rather
>
> D  move-source-path
> R+ replaced-path (copied from move-source-path)

Yea, I was thinking the same.

> 
> D  move-source-path
> D  replaced-path
> A+ replaced-path (copied from move-source-path)
> 
> 
>> (*) Copy
>> There also is a new copy() call instead of the "add(copyfrom_revision)". I
>> guess it's a bikeshed whether the copy() callback should be separate instead
>> of adding a "copyfrom_rev" to the add() callback. In short, copy == add+.
>> Like with moves, it is possible to send a delete() followed by a copy() to
>> the same path, which amounts to yet another "replace" variant.
>>
>> --> We need to also add a REPLACES_REV argument to the copy() callback, and
>> disallow issuing a delete followed by a copy to the same path.
> 
> Sounds good.
> 
> 
>> (*) Different meanings
>> It is conceivable that one single "replace" call may have a different
>> meaning from separate delete() and add() calls. That would be in the way
>> that the one type represents a closer relation than the other.
>>
>> --> My conclusion is that this is not the case / does not matter. An atomic
>> replace is simply a new node taking the path of another node, the two of
>> them being unrelated. There *are* no less-related cases than a "replace"
>> operation (on the same path).
> 
> This all sounds good to me.
> 
>>  There are more-related cases than a "replace",
> 
> You talk about copies and moves that replace an existing node. I don't
> think these operations involve a "more related" kind of replacement,
> they just involve the same kind of replacement within a different
> operation.
> 
> I can't think of any more or less related kinds of replacement. In other
> words, I can't think of any reason why a user might consider the
> "replace" operation to mean something different from the combination of
> "delete" and "add".
> 
> 
>> which then become a move() or a copy() (a.k.a. add-with-history). (Thus I
>> suggest adding the REPLACES_REV arg to move() and copy() as well, above. If
>> the added node is related to the node that is being replaced, heavens
> 
> "related to"? Do you mean "a copy of"?

I was thinking specifically of a resurrection:

rev op
1   A  foo.c
2   M  foo.c
3   D  foo.c
4   A  foo.c  <-- an unrelated node
5   R+ foo.c (from foo.c@2)

hm, what was I thinking again?

>> forbid, it should become a copy() operation with SRC_RELPATH == DST_RELPATH
>> and REPLACES_REV == SRC_REVISION.)

Ah, ok, that thinking was flawed, actually REPLACES_REV > SRC_REVISION (4>2)
-- sorry.

> So the result is: file@10 is copied from file@9. I don't know if a
> Subversion repository can tell the difference between that and a no-op
> (or a simple content modification).

But that case wouldn't modify any content. It could result in a log message
with maybe props modified, but no code modified at all. Heh, we should
probably forbid/skip that.

> So: +1 to the plan.

Greg, Julian, t'was a great pleasure to read your responses. Nice to see I'm
on the right track.

~Neels

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2382964

Re: no-op-copy -- was: Re: Defining atomic "replace"

Posted by Neels Janosch Hofmeyr <ne...@elego.de>.

Julian Foad wrote:
> At the moment I don't mind that it works this way. Not sure what you
> mean by "worth testing for".

Yes, that's exactly what I meant ;)
"it's not worth the trouble to test for this case and change this behavior
at the moment".

~Neels

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383092

Re: no-op-copy -- was: Re: Defining atomic "replace"

Posted by Julian Foad <ju...@btopenworld.com>.

Neels J Hofmeyr wrote:
> + svn rm file
> D         file
> + svn cp '^/file@1' file
> A         file
> + svn st
> R  +    file
> + svn ci -m noop
> Replacing      file
> 
> Committed revision 2.
> + svn log file
> ------------------------------------------------------------------------
> r2 | neels | 2009-08-12 20:43:07 +0200 (Wed, 12 Aug 2009) | 1 line
> 
> noop
> ------------------------------------------------------------------------
> r1 | neels | 2009-08-12 20:43:05 +0200 (Wed, 12 Aug 2009) | 1 line
> 
> add
> ------------------------------------------------------------------------
> + svn diff -c 2
> [no diff output]
> ]]]
> 
> So yes, svn currently permits this no-op.
> But I don't really understand the question. What do you mean by "can svn
> tell the difference"?

To answer my question you need to run "svn log -v". What I mean is: does
Subversion treat "replaced with SELF@(REV-1)" differently from a no-op?

Answer:
[[[
$ svn16 log -v --stop-on-copy wc/file
------------------------------------------------------------------------
r2 | julianfoad | 2009-08-13 00:37:37 +0100 (Thu, 13 Aug 2009) | 1 line
Changed paths:
   R /file (from /file:1)

noop
------------------------------------------------------------------------
]]]

and the --stop-on-copy stops there, which is very different from how a
completely no-op change would appear (or rather not appear) in history.

So Subversion CAN tell the difference between that and a no-op, even
though the diff is a no-op.


> Does anyone think this case is worth the trouble testing for? It would make
> an awful lot of sense to cancel this to void, but in practice ... the manual
> says "revert" anyway, right?

At the moment I don't mind that it works this way. Not sure what you
mean by "worth testing for".

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383080

no-op-copy -- was: Re: Defining atomic "replace"

Posted by Neels Janosch Hofmeyr <ne...@elego.de>.

Julian Foad wrote:
> So the result is: file@10 is copied from file@9. I don't know if a
> Subversion repository can tell the difference between that and a no-op
> (or a simple content modification). I would be interested to know
> whether it can.

With Subversion 1.6.4, the attached script shows that this no-op is allowed.
I get this output:

[[[
[...]
+ cd wc
+ echo one
+ svn add file
A         file
+ svn ci -m add
Adding         file
Transmitting file data .
Committed revision 1.
+ svn rm file
D         file
+ svn cp '^/file@1' file
A         file
+ svn st
R  +    file
+ svn ci -m noop
Replacing      file

Committed revision 2.
+ svn log file
------------------------------------------------------------------------
r2 | neels | 2009-08-12 20:43:07 +0200 (Wed, 12 Aug 2009) | 1 line

noop
------------------------------------------------------------------------
r1 | neels | 2009-08-12 20:43:05 +0200 (Wed, 12 Aug 2009) | 1 line

add
------------------------------------------------------------------------
+ svn diff -c 2
[no diff output]
]]]

So yes, svn currently permits this no-op.
But I don't really understand the question. What do you mean by "can svn
tell the difference"?

Does anyone think this case is worth the trouble testing for? It would make
an awful lot of sense to cancel this to void, but in practice ... the manual
says "revert" anyway, right?

~Neels

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2382983

Re: Defining atomic "replace"

Posted by Julian Foad <ju...@btopenworld.com>.

On Wed, 2009-08-12, Julian Foad wrote:
> > (*) Atomic move
> > In the new editor callbacks, there is a separate move() call to represent
> > atomic moves. It is thus possible to issue a delete() followed by a move()
> > to the same path that was deleted. Which is, essentially, a replace!
> > 
> > --> We need to also add a REPLACES_REV argument to the move() callback, and
> > we need to disallow issuing a delete followed by a move to the same path.
> > Actually, has this case ever been considered? Currently, a move() becomes an
> > "add-with-history" plus a delete(). Our API doesn't indicate a move
> > replacing a path.
> 
> I think the current representation of this would be two deletes (delete
> the move-source and delete the node being replaced) and then an
> add-with-history.
> 
> >  We could need another update/status letter for this.
> 
> For now, it can recreate the existing indication(s) on multiple lines.
> 
> D  move-source-path
> D  replaced-path
> A+ replaced-path (copied from move-source-path)

Or rather

D  move-source-path
R+ replaced-path (copied from move-source-path)

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2382731

Re: Defining atomic "replace"

Posted by Julian Foad <ju...@btopenworld.com>.

On Tue, 2009-08-11 at 20:51 +0200, Neels J Hofmeyr wrote:
> Hi Julian and everyone else,
> 
> I'd like to summarize a little about atomic replace operations in the new
> editor interface (subversion/include/svn_editor.h,
> subversion/libsvn_delta/editor.c) and have a conclusion.
> 
> My intent is to further tweak svn_editor.h shortly, so feedback is welcome.
> 
> 
> 
> The argument started off like this:
> 
> If a file or folder at a certain path is deleted, and some other file or
> folder is added at the same path, that is a replace.
> 
> Sometimes it is nice and/or necessary to be able to relate the delete() and
> the add(), for example to print
>  R  foo
> instead of
>  D  foo
>  M  other_stuff
>  A  foo
> , and for detecting tree-conflicts.
> 
> Currently, wherever that happens, the *receiving* side of the editor
> callbacks has to figure out whether a delete coincides with an add at the
> same path. That involves caching deleted nodes == bloat and complication.
> 
> Fact is, there is at least one *driver* of the editor callbacks that has
> already made the correlation of a delete followed by an add, but splits it
> up into a delete followed by an add, following callback design.
> 
> My crusade is to move the work for combining delete() + add() into a single
> "replace" into the *driver* of the new callbacks, considering that that
> information may already be lying around for free in most cases.
> 
> Even if it isn't, I think it is better to make the *driver* do the
> combination work "once", and not burden "all those receivers" with the work
> of correlating deletes with adds.
> 
> The problem I ran into is this:
> 
> To effectively take the burden off the receivers, we need to *enforce* that
> the driver actually combines delete + add = replace. If the driver had a
> choice to combine or not combine "replace" calls, all those receivers have
> to be able to handle *both* cases, and the whole plan backfires, adding more
> code instead of simplifying.
> 
> I will now try to theoretically home in on a Good API definition around
> atomic replace.
> 
> 
> Aspects:
> 
> 
> (*) Receivers should not have to combine "replace" operations.
> 
> --> We need to *disallow* issuing a delete() followed by an add() operation
> on the same path. The drivers all need to use the atomic-replace channel
> instead, which is to issue an add() operation with a REPLACES_REV arg.
> 
> 
> (*) Two-sided vs. detailed history
> In current merge (and all other operations AFAIK), if a certain path was
> deleted and added multiple times in a given range, all the intermediate
> steps are dropped and the only things that matter are
>  - the initial state and
>  - the final state.
> e.g. (D, A, M, D, A) becomes (D, A), the three in the center are dropped.
> However, there are rumors that at some point all the intermediate steps will
> be necessary (keyword "operational merge").

(Greg: Neels is talking about a hypothetical new "operation-based" merge
editor that is a consumer of the editor API. It would want to be driven
not simply with the difference between merge-left and merge-right
sources, but rather with the full sequence of changes that were
originally made to get from merge-left to merge-right.)

> --> We need to allow multiple atomic replaces in a given range, so that each
> sub-sequence of (D, A) becomes a (R):
>   (D, A, M, D, A) becomes (R, M, R) in an operational merge.

Even if we decide to go towards "operation-based" merging, I think we
have no idea yet what it would look like, and we don't know whether we
would want to represent that (R, M, R) sequence. So I don't think there
is any point in considering that now.


> (*) Atomic move
> In the new editor callbacks, there is a separate move() call to represent
> atomic moves. It is thus possible to issue a delete() followed by a move()
> to the same path that was deleted. Which is, essentially, a replace!
> 
> --> We need to also add a REPLACES_REV argument to the move() callback, and
> we need to disallow issuing a delete followed by a move to the same path.
> Actually, has this case ever been considered? Currently, a move() becomes an
> "add-with-history" plus a delete(). Our API doesn't indicate a move
> replacing a path.

I think the current representation of this would be two deletes (delete
the move-source and delete the node being replaced) and then an
add-with-history.

>  We could need another update/status letter for this.

For now, it can recreate the existing indication(s) on multiple lines.

D  move-source-path
D  replaced-path
A+ replaced-path (copied from move-source-path)


> (*) Copy
> There also is a new copy() call instead of the "add(copyfrom_revision)". I
> guess it's a bikeshed whether the copy() callback should be separate instead
> of adding a "copyfrom_rev" to the add() callback. In short, copy == add+.
> Like with moves, it is possible to send a delete() followed by a copy() to
> the same path, which amounts to yet another "replace" variant.
> 
> --> We need to also add a REPLACES_REV argument to the copy() callback, and
> disallow issuing a delete followed by a copy to the same path.

Sounds good.


> (*) Different meanings
> It is conceivable that one single "replace" call may have a different
> meaning from separate delete() and add() calls. That would be in the way
> that the one type represents a closer relation than the other.
> 
> --> My conclusion is that this is not the case / does not matter. An atomic
> replace is simply a new node taking the path of another node, the two of
> them being unrelated. There *are* no less-related cases than a "replace"
> operation (on the same path).

This all sounds good to me.

>  There are more-related cases than a "replace",

You talk about copies and moves that replace an existing node. I don't
think these operations involve a "more related" kind of replacement,
they just involve the same kind of replacement within a different
operation.

I can't think of any more or less related kinds of replacement. In other
words, I can't think of any reason why a user might consider the
"replace" operation to mean something different from the combination of
"delete" and "add".


> which then become a move() or a copy() (a.k.a. add-with-history). (Thus I
> suggest adding the REPLACES_REV arg to move() and copy() as well, above. If
> the added node is related to the node that is being replaced, heavens

"related to"? Do you mean "a copy of"?

> forbid, it should become a copy() operation with SRC_RELPATH == DST_RELPATH
> and REPLACES_REV == SRC_REVISION.)

So the result is: file@10 is copied from file@9. I don't know if a
Subversion repository can tell the difference between that and a no-op
(or a simple content modification). I would be interested to know
whether it can. Either way, I don't think it affects this proposal.

So: +1 to the plan.

- Julian


> Thanks for feedback,
> 
> ~Neels
> 
> 
> Julian Foad wrote:
> > On Tue, 2009-07-21, Neels J Hofmeyr wrote:
> >> Julian Foad wrote:
> >>>> The driver is required to use add() with REPLACES_REV instead of a separate
> >>>> delete() call before this add(). Receivers of this callback in general are
> >>>> not required to support a separate delete() call for a 'replace'.
> >>> I think it would be clearer to state the editor requirements separately
> >>> from the usage recommendations. (This last paragraph and the following
> >>> one seem to mix the two.)
> >>>
> >>>> If you intend to implement a separate delete() call when driving this
> >>>> editor, make sure you really have to and put up "big red signs" in the
> >>>> documentation of your driver. Be aware that, in general, the driver should
> >>>> do all of the (possibly hard) work of combining a delete() with an add() to
> >>>> form an atomic replace, taking that burden off the various receivers. So if
> >>>> you write a driver that doesn't do that, you have to make sure that all your
> >>>> receivers (preferably very few) can handle a non-atomic replace.
> >>>> Furthermore, a delete() call should occur always before, never after an
> >>>> add() call for the same node.
> >>>> ]]]
> >> Yes, you're right.
> >> Where is the line between editor requirements and usage recommendations?
> >>
> >> Maybe you can help to choose a side... Or are these sides chosen by
> >> individual implementations anyway?
> > 
> > It is important to decide and clearly draw the line somewhere. The whole
> > point of having a standard "interface" is so that multiple producers and
> > consumers can be written independently and someone else can plug them
> > together and know that they will work together.
> > 
> >> I am more for forcing an atomic replace. I would rather remove the loophole
> >> that says "well, but if you must, just issue separate calls". I'd personally
> >> not mind if some parts of the code send separate calls, but I want e.g. the
> >> merge callback receivers to be certain that they will *not* receive separate
> >> calls *ever*.
> > 
> > OK, we can make that a requirement. We must ensure that all of the
> > current producers obey it.
> > 
> > 
> >> How about we choose the stricter comment, but we let people use it
> >> differently if they must without saying so now?
> > 
> > No, let's not invite trouble.
> > 
> > 
> > [...]
> >> - is it ok to disallow multiple add() and delete() calls on the same path,
> >>   in principle?
> >>    e.g. during merge, there are only the two sides, merge-left
> >>    and merge-right. A replace means that both exist but are "unrelated". So
> >>    a merge will never need to send more than one (delete, add) tuple.
> >>    Usually it's either *one* delete or *one* add. The (add, delete) tuple
> >>    simply cancels out, nothing being sent.
> > 
> > Yes, it's absolutely fine to specify that the editor must be driven with
> > a single change for each path.
> > 
> > Let's imagine that the user does "svn rm X; svn add X; svn rm X; svn add
> > X" in his/her WC. Let's imagine that the WC code has an "undo history"
> > that recorded all the separate actions in this sequence. When the user
> > makes a commit (or a diff or an update or whatever), then it's the
> > calling code's job to ensure that the editor is driven with a single
> > change (replace X(old) with X(new)).
> > 
> >>   But are there cases where we also send all the steps in-between? Then we
> >>   could allow this: add(), add(REPLACES_REV), add(REPLACES_REV), delete().
> >>   But *are there*?
> > 
> > I don't know whether there are currently cases where we do this, other
> > than a single delete followed by a single add. But we are defining the
> > interface. We must define what we think is best and then the callers
> > must do what we say.
> > 
> >> - is it necessary to disallow multiple add() and delete() calls on the same
> >>   path, to make the atomic replace useful?
> >>    e.g. during merge, the receiver needs to cache all delete()s and match
> >>    with add()s, carrying out the leftovers during dir_close(). And merge
> >>    is not the only one in need of this quirky code construct.
> >>   I want to eradicate this madness -- I'd love to force atomic replaces
> >>   on everyone, so the receivers are entirely relieved of this quirkiness.
> >>   Will allowing both separate and atomic replace calls defy my plan by
> >>   actually adding more code to the receivers?
> > 
> > Basically, yes, I think it is necessary.
> > 
> > If a receiver wants to handle a replacement as a single operation and
> > the editor API does not guarantee that it will get a single call, then
> > it can achieve the same end result by batching up the delete requests
> > and executing them just before closing the directory. But if it
> > implements that, then there would be no benefit in having the single
> > "replace" call available.
> > 
> >> - is there a case where separate calls have a different meaning than an
> >>   atomic replace?
> >>    e.g. for merge, it doesn't matter whether the nodes are "related".
> >>   I can't think of any case where we'd want to highlight that the node was
> >>   deleted and later re-added with the same history, or something. Right?
> > 
> > This is the question of critical importance: Is there any semantic
> > difference (difference in meaning) between
> > 
> >   replace X with (something)
> > 
> > and
> > 
> >   delete X; add (something) at path "X"
> > 
> > ?
> > 
> > Before I try to answer that, first let's review the history of the
> > semantics of "svn move" (also know as "svn rename"). Initially we
> > decided there was no semantic difference between
> > 
> >   move X to Y
> > 
> > and
> > 
> >   delete X; add Y as a copy of X
> > 
> > That is true as far as it goes, but unfortunately in the implementation
> > we lost the implicit requirement that the "X" being deleted in the first
> > half is the very same "X" that's copied in the second half. It got
> > translated to
> > 
> >   delete X
> >   add Y as a copy of (X's absolute path at a specific recent-ish
> > revision)
> > 
> > That works just about well enough within a single branch, but when
> > merging such a change to another branch the connection between the two
> > X's is completely broken and the result is not what we meant at all.
> > 
> > We usually want a move in the merge source branch, such as
> > 
> >   delete source/X
> >   add source/Y as a copy of (source/X at its latest revision)
> > 
> > to be merged into the target branch as
> > 
> >   delete target/X
> >   add target/Y as a copy of (target/X at its latest revision)
> > 
> > So, working back to how we could better have defined the semantics of
> > "move", we could say "move X to Y" means
> > 
> >   remove entry X from X's directory
> >     (but don't destroy its content, because it's just being moved)
> >   add entry Y into Y's directory,
> >     taking its content from the X that's being removed
> > 
> > Note that the two parts are intimately connected together, and that the
> > "delete" half is semantically different from a plain "delete".
> > 
> > I think that's where we need to be going with the "true renames" issue
> > <http://subversion.tigris.org/issues/show_bug.cgi?id=898>.
> > 
> > That's not quite the end of the story with "move". One side issue is
> > that I think the semantics of "copy" should be updated in the same way,
> > so when "copy source/X to source/Y" is merged it should add target/Y as
> > a copy of (target/X at latest revision), not of (source/X at some
> > specific old revision). Another is that I've hand-waved on how the path
> > "source/X" is mapped to the path "target/X"; I think it involves
> > treating them both as relative to the respective merge root paths, and
> > adjusting for any directories moved within the merge. The merge should
> > throw an error if the move source is outside the merged tree.
> > 
> > Now, what does that teach us about how we want "replace" to behave? Are
> > we making some assumption when we "replace X with a copy of Y" that is
> > subtly stronger than "delete X; add a new X that's a copy of Y"? Does it
> > make a difference if instead of "Y" we say "a previous revision of X"?
> > 
> > - Julian
> > 
> > ------------------------------------------------------
> > http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2376254
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2382730