You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Ben Collins-Sussman <su...@collab.net> on 2004/01/15 17:25:20 UTC

RFC: change 'svn merge' default behavior.

Gentlefolk, 

Karl and I want to propose a very simple change to 'svn merge' default
behavior in svn 0.37.  Allow me to explain.


1.  We have a problem

Early this week I merged all of the book changes from trunk to the 1.0
branch.  The merge seemed simple enough -- a bunch of files got
patched, but one had a conflict.  I couldn't figure out why there'd be
a conflict... the book never changed on the branch.  After some
investigation, I discovered that totally bogus hunks of text had been
applied to the wrong files!  Most the local mods from 'svn merge' were
bogus.

What happened?  My trunk changes involved renames; I moved ch06->ch07,
ch07->ch08, and ch08->ch09.  'svn merge', just like 'svn diff',
ignores file ancestry (copy history) by default.  So it was applying
ch07 changes from trunk to a similarly named (but totally unrelated!)
file on the branch.  After reverting the bogus merge, I ran 'svn merge
--notice-ancestry' and got the correct set of changes: a bunch of
deletes and adds of identically-name files.  Everything was fine.


2.  The history of this issue

Once upon a time, diff and merge used to both notice ancestry by
default.  But this angered users of 'svn diff'.  When running 'svn
diff URL1 URL2', people would be annoyed that diff was being "too
smart".  It would notice that two identical file paths were unrelated,
and show a delete of one file (all minuses) and the add of the other
file (all pluses).  But when running 'svn diff', the user usually
doesn't give a darn about whether identically-named files are related
or not: they just want to see the diff between them!  (The diff format
doesn't understand tree changes anyway, so why include that
information?)

So, we did the right thing: we made 'svn diff URL1 URL2' ignore
ancestry by default, and "dumbly" diff identical file paths.  But in
the interest of consistency, folks decided that diff and merge should
behave the same.  So both commands now ignore ancestry by default, and
require a long option to notice ancestry.


3.  The proposal

Very simple: allow 'svn diff' continue to ignore ancestry by default,
but change 'svn merge' to notice ancestry by default.

(The patch for this involves no API changes; internally, the diff and
merge client functions already have the ability to ignore ancestry or
not.  The patch is a trivial change to cmdline client's default
behaviors only.  A long option can always be used to override either
subcommand's default behavior.)

Yes, I understand that this breaks our nice consistency between diff
and merge.  But Karl and I think the risk for producing munged local
mods here is just too high.

99% of the time, when people are merging changesets from one branch to
another, people want ancestry to be noticed and copies to be honored.
It's dangerous and foolish to default to the 1% use-case.  Keep in
mind, I wrote the *book* chapter about branching and merging, and I
still got burned by merge's default behavior.  It's not always so
obvious to detect the munged local mods... if I hadn't received a
conflict, I might have accidentally committed garbage.

So yes, it's "nice" to have consistent default behaviors in diff and
merge, but this route leads to either (1) 'svn diff' not showing diffs
by default (very user-unfriendly), or alternately (2) a significant
risk of 'svn merge' creating bogus local mods by default.   In my
mind, neither evil is worth the "niceness" of consistency.  It's hard
to enumerate the concrete benefits of consistency, except,
well... "it's consistent."


I've already talked about this proposal  with kfogel, ghudson, and epg.
But I wanted other developers to see the proposal and have a chance to
react.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Mike Mason <mg...@thoughtworks.net>.
Ben Collins-Sussman wrote:

>I've already talked about this proposal  with kfogel, ghudson, and epg.
>But I wanted other developers to see the proposal and have a chance to
>react.
>  
>

I'm not a developer, but please do make the change. I can see myself 
getting bitten by this, and would categorise it as a serious detriment 
to a "strategy of least suprise". Subversion should by default do the 
sensible thing.

Cheers,
Mike.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Ben Collins-Sussman <su...@collab.net>.
(posted for amusement)

<ghudson> The diff/merge thing reminds me of the "marriage penalty" in
tax law, where it all depends on what you think is "fair."

<sussman> yah, we have a 3 way tension.

<sussman> it's most convenient to have diff act one way, merge act
another way

<sussman> and it's also convenient to have them act the same way!

<ghudson> Clearly users should be required to type in a paragraph about
what they're really trying to do before they can run any command.  Then
we can pick the most intuitive behavior for them. :)

<sussman> exactly!


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2004-01-15 at 12:42, Sander Striker wrote:
> > (The diff format
> > doesn't understand tree changes anyway,
> 
> yet..

Well, I don't think the format accepted by "patch" can really be
extended to understand tree changes.  So, if we ever have a variant of
diff which uses a format which understands tree changes, it will have to
be an option or (less likely) a separate command.  I would say that
specifying the option should flip the default, such that a tree-diff
normally pays attention to ancestry.

> I still am not supporting this decission.  svn diff TARGET1 TARGET2
> should indeed ignore history.  However, svn diff -r X:Y TARGET should
> notice history.  Hmpf, we need that peg revision proposal implemented.

I'm not sure I buy that these two forms of diff are so fundamentally
different.  You're right that svn diff -r X:Y TARGET necessarily diffs
between two items with a historical relationship, but given the output
format, I don't think it really makes sense to pay attention to ancestry
when performing a diff.

(Yet another choice is to always ignore ancestry at the top level of a
diff--right now we always do so if the anchor is the target, but not
otherwise--and pay attention to ancestry by default below that.  But I
don't support that choice, at least for diff, because of the output
format issue.)

> > Very simple: allow 'svn diff' continue to ignore ancestry by default,
> > but change 'svn merge' to notice ancestry by default.
> 
> Does this mean that svn diff -r X TARGET compares the WC against the
> URL@X instead of the file in the WC at version X?

Ignoring ancestry has nothing to do with picking a target.  Until
cmpilato's peg-rev proposal is implemented, we erroneously compare
against URL@X.

> The alternative is... a config option for both diff and merge,
> defaulting to ignore-ancestry for diff and notice-ancestry for merge.
> People can then set the default to their preference.  Although I must
> admit I don't really want to add yet another config option.

I don't see the value in adding a config option, and it certainly
doesn't save us from having to choose sane defaults.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Branko Čibej <br...@xbc.nu>.
Sander Striker wrote:

>On Thu, 2004-01-15 at 18:25, Ben Collins-Sussman wrote:
>  
>
>> so why include that
>>information?)
>>
>>So, we did the right thing: we made 'svn diff URL1 URL2' ignore
>>ancestry by default, and "dumbly" diff identical file paths.
>>    
>>
>
>I still am not supporting this decission.  svn diff TARGET1 TARGET2
>should indeed ignore history.  However, svn diff -r X:Y TARGET should
>notice history.
>
+1, emphatically.

>  Hmpf, we need that peg revision proposal implemented.
>  
>
Yah, post-1.0 it seems, infortunately... I think this ommission will be
one of the more embarrasing sore thumbs in 1.0.

>  
>
>>  But in
>>the interest of consistency, folks decided that diff and merge should
>>behave the same.  So both commands now ignore ancestry by default, and
>>require a long option to notice ancestry.
>>    
>>
>
>Merging should always notice ancestry by default.  The only time it
>is interesting to not notice ancestry is vendor branch merging.  This
>should be uncommon enough to warrant --ignore-history.
>  
>
Ditto +1


To the proposal: It totally wrong for either diff or merge to ignore
ancestry by default. This goes against the intent and spirit of
directory versioning. yes, diff output should note the different file names.

We have the 2-argument variants if you want to see the difference
between two unrelated files that happen to have the same name in
different revisions.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Branko Čibej <br...@xbc.nu>.
Eric Gillespie wrote:

>My primary point all along has been this:
>
>svn export URL1 1
>svn export URL2 2
>diff -ru 1 2
>
>should do the same as
>
>svn diff URL1 URL2
>  
>
I disagree. There is a fundamental difference between the two: In the
first case, you're comparing two unversioned trees; even if the diff
tool knew how to use history information, there isn't any.

In the second case, you're comparing two parts (branches) of a versioned
repository, and all history information is available. This diff should
tell you how one branch is related to the other, and ignoring history
defeats this purpose.

>It shouldn't matter whether the URLs represent files or
>directories or vendor branches or whatever.  In other words, Just
>Diff Them Dammit! :)
>  
>
Yes, this is a valid use case -- but should not be the default.

>If svn diff is to behave differently than diff(1) by default, we
>should rename it.  I don't like that option though...
>  
>
That's nonsense. "svn mv", "svn cp", "svn rm" all behave differently
than their shell equivalents -- in subtle ways, of course. Diff is the same.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Ben Collins-Sussman <su...@collab.net> writes:

> Mike Pilato points out that if diff and merge have different
> default behaviors, it makes it impossible to run diff as a "dry
> run" for a merge.

As Greg Hudson points out, it is not impossible.

> It's likely that you'll see very different results.  That's
> really the Really Big Argument for keeping diff/merge's default
> behaviors consistent.

I don't think it's such a big argument.  This is not the common
case for diff.

> Further, Sander points out that the main time 'svn diff' needs
> to ignore history is when you're comparing unrelated vendor
> import trees.  This isn't all that common a use-case... not
> nearly as common as using 'svn diff' for a merge dry-run.

I don't think that's it.

My primary point all along has been this:

svn export URL1 1
svn export URL2 2
diff -ru 1 2

should do the same as

svn diff URL1 URL2

It shouldn't matter whether the URLs represent files or
directories or vendor branches or whatever.  In other words, Just
Diff Them Dammit! :)

If svn diff is to behave differently than diff(1) by default, we
should rename it.  I don't like that option though...

> So here's a better proposal, one that I/kfogel/cmpilato like:
> 
>     * make diff and merge both default to noticing ancestry.
>     * rename the existing long option to '--ignore-ancestry'
>     * recommend this flag for vendor branching use-cases.

I don't like it.

--  
Eric Gillespie <*> epg@pretzelnet.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2004-01-15 at 13:12, Ben Collins-Sussman wrote:
> Mike Pilato points out that if diff and merge have different default
> behaviors, it makes it impossible to run diff as a "dry run" for a
> merge.

No, it merely means you have to supply an option to diff to use it as a
"dry run" for a merge, which is not the common case of performing a
diff.

Right now, the closest approximation to that option is
--notice-ancestry; an option to show a real tree diff (in a
yet-to-be-determined format) would of course be better.

> Further, Sander points out that the main time 'svn diff' needs to ignore
> history is when you're comparing unrelated vendor import trees.  This
> isn't all that common a use-case... not nearly as common as using 'svn
> diff' for a merge dry-run.

As I recall, one of the big complaints was that "svn diff url-of-file1
url-of-file2" would show a delete-then-add instead of a diff between the
two file contents.  So, I would say that at the very least, it's pretty
anti-intuitive to heed ancestry on the top-level target of a diff.  (But
that's not the case for a merge.)

> So here's a better proposal, one that I/kfogel/cmpilato like:
> 
>     * make diff and merge both default to noticing ancestry.
>     * rename the existing long option to '--ignore-ancestry'
>     * recommend this flag for vendor branching use-cases.

I don't like it, primarily for reasons I expressed in my reply to
Sander.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Ben Collins-Sussman <su...@collab.net>.
On Thu, 2004-01-15 at 11:42, Sander Striker wrote:

> > So, we did the right thing: we made 'svn diff URL1 URL2' ignore
> > ancestry by default, and "dumbly" diff identical file paths.
> 
> I still am not supporting this decission.  svn diff TARGET1 TARGET2
> should indeed ignore history.  However, svn diff -r X:Y TARGET should
> notice history.  Hmpf, we need that peg revision proposal implemented.
> 

See my comments below.  I think we have a new proposal.


> Merging should always notice ancestry by default.  The only time it
> is interesting to not notice ancestry is vendor branch merging.  This
> should be uncommon enough to warrant --ignore-history.

I'm glad you agree.  So far, you, I, cmpilato, kfogel, ghudson and epg
all agree that 'svn merge' should default to noticing history.  That's
the real "core issue" here.  The only question is:  what's the best
proposal which includes this change?

Mike Pilato points out that if diff and merge have different default
behaviors, it makes it impossible to run diff as a "dry run" for a
merge.  It's likely that you'll see very different results.  That's
really the Really Big Argument for keeping diff/merge's default
behaviors consistent.

Further, Sander points out that the main time 'svn diff' needs to ignore
history is when you're comparing unrelated vendor import trees.  This
isn't all that common a use-case... not nearly as common as using 'svn
diff' for a merge dry-run.

Looking back at history, all of the complaints about 'svn diff' came
from the fact that at that time, diff *could not* ignore history.  The
option simply wasn't available.  Mike's repsonse was to add the feature,
and also make it the default.

So here's a better proposal, one that I/kfogel/cmpilato like:

    * make diff and merge both default to noticing ancestry.
    * rename the existing long option to '--ignore-ancestry'
    * recommend this flag for vendor branching use-cases.

This proposal keeps 'merge' safe, and keeps it consistent with diff for
dry-run use.  And it still allows people to make diff 'dumb' when
comparing vendor trees.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: change 'svn merge' default behavior.

Posted by Sander Striker <st...@apache.org>.
On Thu, 2004-01-15 at 18:25, Ben Collins-Sussman wrote:
> Gentlefolk, 
> 
> Karl and I want to propose a very simple change to 'svn merge' default
> behavior in svn 0.37.  Allow me to explain.
> 
> 
> 1.  We have a problem
> 
> Early this week I merged all of the book changes from trunk to the 1.0
> branch.  The merge seemed simple enough -- a bunch of files got
> patched, but one had a conflict.  I couldn't figure out why there'd be
> a conflict... the book never changed on the branch.  After some
> investigation, I discovered that totally bogus hunks of text had been
> applied to the wrong files!  Most the local mods from 'svn merge' were
> bogus.

Ouch!

> What happened?  My trunk changes involved renames; I moved ch06->ch07,
> ch07->ch08, and ch08->ch09.  'svn merge', just like 'svn diff',
> ignores file ancestry (copy history) by default.  So it was applying
> ch07 changes from trunk to a similarly named (but totally unrelated!)
> file on the branch.  After reverting the bogus merge, I ran 'svn merge
> --notice-ancestry' and got the correct set of changes: a bunch of
> deletes and adds of identically-name files.  Everything was fine.
> 
> 
> 2.  The history of this issue
> 
> Once upon a time, diff and merge used to both notice ancestry by
> default.  But this angered users of 'svn diff'.

I still wonder how many people we are talking about here.

>   When running 'svn
> diff URL1 URL2', people would be annoyed that diff was being "too
> smart".  It would notice that two identical file paths were unrelated,
> and show a delete of one file (all minuses) and the add of the other
> file (all pluses).  But when running 'svn diff', the user usually
> doesn't give a darn about whether identically-named files are related
> or not: they just want to see the diff between them!  (The diff format
> doesn't understand tree changes anyway,

yet..

>  so why include that
> information?)
> 
> So, we did the right thing: we made 'svn diff URL1 URL2' ignore
> ancestry by default, and "dumbly" diff identical file paths.

I still am not supporting this decission.  svn diff TARGET1 TARGET2
should indeed ignore history.  However, svn diff -r X:Y TARGET should
notice history.  Hmpf, we need that peg revision proposal implemented.

>   But in
> the interest of consistency, folks decided that diff and merge should
> behave the same.  So both commands now ignore ancestry by default, and
> require a long option to notice ancestry.

Merging should always notice ancestry by default.  The only time it
is interesting to not notice ancestry is vendor branch merging.  This
should be uncommon enough to warrant --ignore-history.

> 
> 3.  The proposal
> 
> Very simple: allow 'svn diff' continue to ignore ancestry by default,
> but change 'svn merge' to notice ancestry by default.

Does this mean that svn diff -r X TARGET compares the WC against the
URL@X instead of the file in the WC at version X?

> (The patch for this involves no API changes; internally, the diff and
> merge client functions already have the ability to ignore ancestry or
> not.  The patch is a trivial change to cmdline client's default
> behaviors only.  A long option can always be used to override either
> subcommand's default behavior.)
> 
> Yes, I understand that this breaks our nice consistency between diff
> and merge.  But Karl and I think the risk for producing munged local
> mods here is just too high.

+1 on that assessment.

> 99% of the time, when people are merging changesets from one branch to
> another, people want ancestry to be noticed and copies to be honored.
> It's dangerous and foolish to default to the 1% use-case.  Keep in
> mind, I wrote the *book* chapter about branching and merging, and I
> still got burned by merge's default behavior.  It's not always so
> obvious to detect the munged local mods... if I hadn't received a
> conflict, I might have accidentally committed garbage.

Which says something about our merge code.  It sometimes is a bit too
smart (in not taking context into account).  The number of lines of
context to take into account when deciding something is a conflict
or not should be configurable.  And probably default to 3 as CVS does.
Anyway, that's a different topic.

> So yes, it's "nice" to have consistent default behaviors in diff and
> merge, but this route leads to either (1) 'svn diff' not showing diffs
> by default (very user-unfriendly), or alternately (2) a significant
> risk of 'svn merge' creating bogus local mods by default.   In my
> mind, neither evil is worth the "niceness" of consistency.  It's hard
> to enumerate the concrete benefits of consistency, except,
> well... "it's consistent."

The alternative is... a config option for both diff and merge,
defaulting to ignore-ancestry for diff and notice-ancestry for merge.
People can then set the default to their preference.  Although I must
admit I don't really want to add yet another config option.


Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org