Posted to dev@subversion.apache.org by Eric Gillespie <ep...@pretzelnet.org> on 2002/12/07 02:36:56 UTC

svn diff, svn merge, and vendor branches (long)

I have been sitting on this problem for months.  From time to
time someone would bring up something closely related on the IRC
channel and i would begin talking about this.  But i have never
explained it in full, and today Sussman finally triggered me to
do it.  Now watch out, i address many issues in this mail, but
they're so closely related i don't divide them up into sections.

Try this:

% echo hi > a
% echo hi > b
% echo bye >> b
% svn add a b
% svn ci a b

Do not svn cp the files.  I created two such files a couple of
months ago, the last time this came up on IRC: http://pretzelnet.org/svn/{a,b}.

Try it:

% svn diff http://pretzelnet.org/svn/{a,b}
Index: a
===================================================================
--- a	(revision 1532)
+++ a	(revision 1532)
@@ -1 +0,0 @@
-hi
Index: b
===================================================================
--- b	(revision 1532)
+++ b	(revision 1532)
@@ -0,0 +1,2 @@
+hi
+bye

Yikes.  That output is wrong for a number of reasons.  But the
first thing Sussman said when i first showed this output to him
back in October was that i was using diff incorrectly and needed
to use the URL@REV form.  Nope; just add @1532 to the end of
each of those URLs and observe the same output.  Furthermore, try

svn diff \
http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4/gnuserv.c \
http://pretzelnet.org/svn/misc/gnuserv/gnuserv.c

Notice that even w/out the URL@REV format you get a meaningful diff.

Now let's look at what happened here.  In the a/b case, the two
files do not share ancestry (where ancestry is defined in svn
terms), while in the gnuserv case they *do*.

OK, with that established, let's look at the output from diffing a
and b.  I can't even begin to analyze just WTF we're looking at
here.  When i showed this to Sussman in October (after
commenting about URL@REV), he said that did look funny.  Quite.
When i first began thinking about this problem (probably back in
April), this was not the output i got from diffing two similar
files that did not share ancestry.  Instead what i got was this
(simulated):

--- a	(revision 1532)
+++ b	(revision 1532)
@@ -1 +1,2 @@
-hi
+hi
+bye

Now that makes more sense.  The earlier output, i don't even
understand.  This i do, though i do not agree with svn behaving
that way.  I'm going to assume (based on Sussman's comment that
the earlier output didn't look right) that the earlier output is
just a bug, and what i just reconstructed above is the intended
behavior.  So that's what i'll be talking about now.

Way back when i first noticed this problem, i was first
experimenting with vendor branches with gnuserv.  I foolishly
assumed that svn import was intended to be an analogue to cvs
import and imported gnuserv-3.12.3 and gnuserv-3.12.4 as two
separate import commands; i.e. they did not share ancestry.  So a
diff on gnuserv.c got me a huge pile of - lines (one for every
line in the first gnuserv.c) followed by a huge pile of + lines
(one for every line in the second gnuserv.c).

Now, from svn's point of view, this output makes sense.  a and b
are not related, so the proper diff is to remove all a's lines
and then add all b's lines.

But this is svn trying to be too smart.  Tools that try to be
too smart inevitably screw it up, because the user knows so much
more than the tool.  Why *doesn't* svn diff work on unrelated
files?  Let's say a and b really weren't related: a is a copy of
fstab and b a copy of printcap.  Both the current too-smart
behavior and my suggested just-give-it-a-try behavior result in
a useless diff.  But, if a and b *are* related, just not in
svn's opinion, i get a much more useful output.

So that is what i suggest.  I can pass any random two files (or
directories) to diff(1) and get useful output if i know what i'm
doing or garbage output if i don't.  svn diff ought to work the
same way.
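
As a rough illustration of the just-give-it-a-try behavior, here is a
minimal Python sketch that diffs the contents of a and b purely as
text, with no notion of ancestry.  It uses the standard difflib module
rather than anything from svn, so it only shows the kind of output
being asked for, not how svn would produce it.

```python
import difflib

# Contents of the two unrelated files from the example at the top of this mail.
a_lines = ["hi\n"]
b_lines = ["hi\n", "bye\n"]

# A content-only comparison: no ancestry is consulted, just the text.
diff_lines = list(
    difflib.unified_diff(a_lines, b_lines, fromfile="a", tofile="b")
)
print("".join(diff_lines), end="")
```

The result is a single hunk that keeps "hi" as context and adds "bye",
which is what diff(1) would show for the same two files.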

Finally, this brings us to the present "difficulty" in maintaining
vendor branches in svn.  First let's review cvs:

cvs import misc/gnuserv GNUSERV gnuserv-3_12_3
# time passes, changes are made on HEAD
cvs import misc/gnuserv GNUSERV gnuserv-3_12_4
cvs co -kk -j gnuserv-3_12_3 -j gnuserv-3_12_4 misc/gnuserv
# resolve conflicts
cvs ci

So, my first instinct was to do this in svn:

svn import http://pretzelnet.org/svn/imports/gnuserv/ . gnuserv-3.12.3
# cvs does this next step for us, but i wouldn't want svn to
svn cp http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \
       http://pretzelnet.org/svn/misc/gnuserv/
# time passes, changes are made on HEAD
svn import http://pretzelnet.org/svn/imports/gnuserv/ . gnuserv-3.12.4
svn co http://pretzelnet.org/svn/misc/gnuserv/
cd gnuserv
svn merge http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \
          http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4/

That doesn't work, and we all know why (the two directories in
imports/gnuserv do not share ancestry).  Should it?  Absolutely.
But there are two ways to make it work: 1) my above arguments for
changing svn diff apply here also, so just drop the shared
ancestry requirement for merge; 2) fix import.

I think 2) fix import should be done no matter what.  Should 1)
also be done?  Maybe.  I won't advocate that it be made to work,
though i confess i don't see the harm.  But fixing import would
satisfy my objections, so i'll be ignoring 1) for the rest of
this message.

In the absence of either 1) or 2), here is the solution i've been
using personally and at work for the last few months (i submitted
this to the list and it was included in the handbook; it can
still be seen at
http://svnbook.red-bean.com/book.html#svn-ch-6-sect-4).  Note
that with this procedure, the initial import is quite different
from subsequent imports.

###
### initial import
###
svn import http://pretzelnet.org/svn/imports/gnuserv/ . base
# tag release
svn cp http://pretzelnet.org/svn/imports/gnuserv/base/ \
       http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3
# cvs does this next step for us, but i wouldn't want svn to
svn cp http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \
       http://pretzelnet.org/svn/misc/gnuserv/
###
### subsequent imports
###
svn co http://pretzelnet.org/svn/imports/gnuserv/base/
# Copy contents of new release over this 'base' dir; handle adds
# and deletes, then commit.  Then:
svn cp http://pretzelnet.org/svn/imports/gnuserv/base/ \
       http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4
svn co http://pretzelnet.org/svn/misc/gnuserv/
cd gnuserv
svn merge http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \
          http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4/

Not bad.  All we need to change is the part where you have to
checkout the 'base' dir, copy over the new version, and handle
adds and deletes.  svn import should handle that for us.  It
looks like that's not saving much, but it really is.
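
The manual step (copy the new release over 'base', then handle adds
and deletes) might be sketched in Python like this.  It is only a
sketch under assumptions: the names list_files and plan_vendor_drop
are made up for illustration, it compares file names only (not
contents, not empty directories), and it runs no svn commands; it
just reports which paths would need an svn add or an svn rm.

```python
import os

def list_files(root):
    """Return the set of file paths under root, relative to root.

    Working-copy admin directories (.svn) are skipped.
    """
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .svn in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found

def plan_vendor_drop(base_wc, new_release):
    """Return (to_add, to_delete) for copying new_release over base_wc."""
    old = list_files(base_wc)
    new = list_files(new_release)
    to_add = sorted(new - old)      # present only in the new release
    to_delete = sorted(old - new)   # present only in the checked-out base
    return to_add, to_delete
```

Files present in both trees need nothing special; copying the new
release over them simply shows up as a modification at commit time.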

Summary (issues raised, in order of appearance)

    1. svn diff usage unclear (URL vs. URL@REV)

    2. diffing two unrelated files gives bizarre output; at some
       point in the past the output it gave was different but at
       least made sense to me (though i disagree with its
       usefulness)

    3. svn ought to just go ahead and diff the files

    4. perhaps merge should be changed similarly to diff, not
       requiring shared ancestry

    5. svn import has little relationship to cvs import; to
       users of vendor branches it is useless

One final note.  I don't consider the import/merge issues to be
urgent or necessarily 1.0 items.

-- 
Eric Gillespie, Jr. <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn diff, svn merge, and vendor branches (long)

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Eric Gillespie <ep...@pretzelnet.org> writes:
> That isn't enough.  I should be able to svn diff gnuserv-3.12.3
> and gnuserv-3.12.5 even though the names aren't the same.

Oh, yah -- of course.  Ben's talking about names *underneath* the
given targets ("anchors", in SVN-speak), I think.

> I'm confused that you see this as a separate issue.  Maybe it's
> just because i don't understand the implementation.  From my
> perspective, what is discussed above (your "First") covers
> this.  And doing that will address this "Second"--won't it?

Sometimes the single-file-diff case is implemented differently from
tree deltas.  Check the code; I think this is one of those cases.


Re: svn diff, svn merge, and vendor branches (long)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Mon, 2002-12-09 at 14:33, Eric Gillespie wrote:
> > Second -- Eric's original bug.  He ran 'svn diff' on two unrelated
> > file URLs.

> I'm confused that you see this as a separate issue.  Maybe it's
> just because i don't understand the implementation.

It's an artifact of the implementation.

For most svn operations, if you ask it to operate on a file, svn will
decompose that into an "anchor" of the parent directory and a "target"
of the filename.  This flows from the management of working directories;
for instance, if you ask svn to check out a file, it doesn't just create
the file; it creates a working directory with the file inside. 
Similarly, if you say "svn diff path/to/file", svn needs to look in
path/to/.svn for the text-base and other relevant information.

For a diff between two file-URLs, this decomposition doesn't really make
sense (there's no working-copy admin information to look at), so special
handling is required.
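
To make the anchor/target split concrete, here is a small Python
sketch.  The function name split_anchor_target is hypothetical and
this is not Subversion's actual code; it only mirrors the
decomposition described above, where "path/to/file" becomes the
anchor "path/to" (whose .svn area would be consulted) and the target
"file".

```python
import posixpath

def split_anchor_target(path):
    """Split a path into (anchor, target).

    The anchor is the parent directory; the target is the final component.
    A bare filename is anchored on the current directory.
    """
    anchor, target = posixpath.split(path.rstrip("/"))
    return (anchor or "."), target

anchor, target = split_anchor_target("path/to/file")
# The admin area that a working-copy diff would need to read.
admin_dir = posixpath.join(anchor, ".svn")
print(anchor, target, admin_dir)
```

For two bare file URLs there is no such admin directory next to the
target, which is why that case ends up needing separate handling.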



Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Ben Collins-Sussman <su...@collab.net> writes:

> It will, don't worry.  If you type 'svn diff http://.../gnuserv-3.12.3
> http://.../gnuserv-3.12.5', it will still work as you expect.
> dir_delta will "anchor" its comparison on these two directories, no
> matter what their names.

Even when they don't share ancestry?  OK.

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Ben Collins-Sussman <su...@collab.net>.
Eric Gillespie <ep...@pretzelnet.org> writes:

> Ben Collins-Sussman <su...@collab.net> writes:
> 
> > So I propose an enhancement: give dir_delta a new "mode" which
> > desensitizes it to ancestry.  If it sees two files or dirs have the
> > same name -- that's enough.  It just won't check for relatedness.
> 
> That isn't enough.  I should be able to svn diff gnuserv-3.12.3
> and gnuserv-3.12.5 even though the names aren't the same.

It will, don't worry.  If you type 'svn diff http://.../gnuserv-3.12.3
http://.../gnuserv-3.12.5', it will still work as you expect.
dir_delta will "anchor" its comparison on these two directories, no
matter what their names.

> 
> > Once we've implemented this, it's simply a matter of making the two
> > modes available to 'svn diff' and 'svn merge'.  I don't really care
> > which behavior is the default; we can add options to toggle
> > ancestry-sensitivity on or off.
> 
> For merge, i'm not sure.  But for diff i argue that the default
> should be not to worry about ancestry.  Principle of least
> surprise and all that.  If you'd rather wait on the discussion
> about defaults, that's fine with me.

I think I agree.

> 
> > Second -- Eric's original bug.  He ran 'svn diff' on two unrelated
> > file URLs.  Consistent with dir_delta, he was shown two complete diffs
> > representing a delete and an add.  In Chicago, we're all three in
> > agreement that this should never happen.  We don't care about how it's
> > implemented, or whether it "makes sense" given the implementation --
> > it's just wrong output.  None of us could come up with a *single*
> > use-case where a user would ever want or expect this behavior, even
> > if they knew svn's internals inside and out.  This should be filed
> > as a bug.
> 
> I'm confused that you see this as a separate issue.  Maybe it's
> just because i don't understand the implementation.  From my
> perspective, what is discussed above (your "First") covers
> this.  And doing that will address this "Second"--won't it?

I thought someone mentioned that we have special-case code for diffing
a single file, since dir_delta() won't work on a lone file.  My
"first" proposal may or may not fix this problem, I'm not sure till I
investigate implementations.  I just wanted to point out that this bug
has a life of its own, no matter how it ends up fixed.


Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Ben Collins-Sussman <su...@collab.net> writes:

> So I propose an enhancement: give dir_delta a new "mode" which
> desensitizes it to ancestry.  If it sees two files or dirs have the
> same name -- that's enough.  It just won't check for relatedness.

That isn't enough.  I should be able to svn diff gnuserv-3.12.3
and gnuserv-3.12.5 even though the names aren't the same.

> Once we've implemented this, it's simply a matter of making the two
> modes available to 'svn diff' and 'svn merge'.  I don't really care
> which behavior is the default; we can add options to toggle
> ancestry-sensitivity on or off.

For merge, i'm not sure.  But for diff i argue that the default
should be not to worry about ancestry.  Principle of least
surprise and all that.  If you'd rather wait on the discussion
about defaults, that's fine with me.

> Second -- Eric's original bug.  He ran 'svn diff' on two unrelated
> file URLs.  Consistent with dir_delta, he was shown two complete diffs
> representing a delete and an add.  In Chicago, we're all three in
> agreement that this should never happen.  We don't care about how it's
> implemented, or whether it "makes sense" given the implementation --
> it's just wrong output.  None of us could come up with a *single*
> use-case where a user would ever want or expect this behavior, even
> if they knew svn's internals inside and out.  This should be filed
> as a bug.

I'm confused that you see this as a separate issue.  Maybe it's
just because i don't understand the implementation.  From my
perspective, what is discussed above (your "First") covers
this.  And doing that will address this "Second"--won't it?

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Eric Gillespie <ep...@pretzelnet.org> writes:
> > Eric, do you have time to file the bug for the original issue (Ben's
> > "Second -- " heading below)?  If not, that's fine, just let us know so
> > someone else can file it.
> 
> http://subversion.tigris.org/issues/show_bug.cgi?id=1034

Thanks!


Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Eric, do you have time to file the bug for the original issue (Ben's
> "Second -- " heading below)?  If not, that's fine, just let us know so
> someone else can file it.

http://subversion.tigris.org/issues/show_bug.cgi?id=1034

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
+1, both of these seem very reasonable to me.

Eric, do you have time to file the bug for the original issue (Ben's
"Second -- " heading below)?  If not, that's fine, just let us know so
someone else can file it.

Thanks,
-Karl

Ben Collins-Sussman <su...@collab.net> writes:
> I'd like to summarize this thread and make a proposal.  I've thought
> about it a bit, and had discussions with kfogel and cmpilato.
> 
> First -- the behavior of dir_delta.  Cmpilato wrote this routine to be
> sensitive to ancestry, and now it seems that this causes a whole lot
> of unintuitive behavior for users, especially when running 'svn diff'
> and 'svn merge'.  Philip has correctly pointed out that if users
> actually attempted to preserve relationships, svn would behave well
> (that's the whole point of svn_load_dirs.pl).   However, if a user
> ignorantly compares two unrelated trees, the tool should still be
> useful.  At the moment, svn is punishing people by printing huge
> numbers of deletes and adds.  svn *should* be able to compare
> unrelated trees and show reasonable output.
> 
> So I propose an enhancement: give dir_delta a new "mode" which
> desensitizes it to ancestry.  If it sees two files or dirs have the
> same name -- that's enough.  It just won't check for relatedness.
> If two files have the same name, it will send a patch (instead of
> deleting/adding), regardless of relatedness.  If two dirs have the
> same name, it will recurse into the dirs (instead of adding/deleting),
> regardless of relatedness.
> 
> Once we've implemented this, it's simply a matter of making the two
> modes available to 'svn diff' and 'svn merge'.  I don't really care
> which behavior is the default; we can add options to toggle
> ancestry-sensitivity on or off.
> 
> Second -- Eric's original bug.  He ran 'svn diff' on two unrelated
> file URLs.  Consistent with dir_delta, he was shown two complete diffs
> representing a delete and an add.  In Chicago, we're all three in
> agreement that this should never happen.  We don't care about how it's
> implemented, or whether it "makes sense" given the implementation --
> it's just wrong output.  None of us could come up with a *single*
> use-case where a user would ever want or expect this behavior, even
> if they knew svn's internals inside and out.  This should be filed
> as a bug.


Re: svn diff, svn merge, and vendor branches (long)

Posted by cm...@collab.net.
Ben Collins-Sussman <su...@collab.net> writes:

> Greg Hudson <gh...@MIT.EDU> writes:
> 
> > I wonder if the right answer isn't to just punt the (distance == -1)
> > check from delta_dirs().  Is there a real penalty for expressing a file
> > as a delta against something unrelated?
> 
> Well, I was pondering this question too.  We use dir_delta all over
> the place... svn diff, svn merge, svn up, svn switch.  Is there ever a
> time where we definitely *should* see a delete+add instead of a patch?
> I think cmpilato had an example.

Actually, that was you that had the example. :-)

Simply, if a user replaces a file altogether, folks that run 'svn up'
will receive this as only a textual modification to the file.  Instead of:

   D  foo.txt
   A  foo.txt

they see only:

   M  foo.txt

Even more insanely, consider a directory that was replaced.  And
consider that the new directory might have even had some files with
the same name as those in the old directory.  The user would see what
looks like a really weird diff that doesn't actually represent the
semantic change that occurred:

   D  dir/foo.txt
   M  dir/README
   A  dir/README.win32
   D  dir/subdir/A.txt
   D  dir/subdir/B.txt
   M  dir/subdir/INSTALL
   A  dir/another_dir
   A  dir/another_dir/bar.txt 
   ...
   
Is this a bad thing?  I can't say (and apparently Ben now can't either).
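
A listing like the one above is roughly what a purely name-based tree
comparison would report.  Here is a minimal Python sketch of such a
comparison, assuming trees are modelled as nested dicts (a string
value is file contents, a dict is a directory).  The function
name_based_delta and this data model are invented for illustration;
they are not the real dir_delta.

```python
def _subtree(node, path, status):
    """List a whole file or directory subtree under one status letter."""
    entries = [(status, path)]
    if isinstance(node, dict):
        for name in sorted(node):
            entries.extend(_subtree(node[name], path + "/" + name, status))
    return entries

def name_based_delta(old, new, prefix=""):
    """Compare two trees by entry name only; ancestry is never consulted."""
    changes = []
    for name in sorted(set(old) | set(new)):
        path = prefix + name
        if name not in new:
            changes.extend(_subtree(old[name], path, "D"))
        elif name not in old:
            changes.extend(_subtree(new[name], path, "A"))
        else:
            a, b = old[name], new[name]
            if isinstance(a, dict) and isinstance(b, dict):
                # Same-named dirs: recurse instead of delete + add.
                changes.extend(name_based_delta(a, b, path + "/"))
            elif isinstance(a, dict) or isinstance(b, dict):
                # File became dir (or vice versa): no patch is possible.
                changes.extend(_subtree(a, path, "D"))
                changes.extend(_subtree(b, path, "A"))
            elif a != b:
                # Same-named files: report a modification, related or not.
                changes.append(("M", path))
    return changes

old_dir = {"foo.txt": "x", "README": "old",
           "subdir": {"A.txt": "a", "B.txt": "b", "INSTALL": "v1"}}
new_dir = {"README": "new", "README.win32": "w",
           "subdir": {"INSTALL": "v2"}, "another_dir": {"bar.txt": "bar"}}
for status, path in name_based_delta(old_dir, new_dir, "dir/"):
    print(status, "", path)
```

Nothing in the result says that dir itself was replaced; the
replacement is only visible as a scattering of per-file changes.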


Re: svn diff, svn merge, and vendor branches (long)

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Ben Collins-Sussman <su...@collab.net> writes:
> Mmm, yah.  Crud.  If someone really replaced a file, and you're
> merging that change into your working copy, then this is the
> difference between ending up with an 'M' versus an 'A +'.  This is
> about the only case where the working copy actually tracks ancestry.

The other example where detecting relatedness is nice was with diff
(not merge):

If two files are not related, then showing a diff between their
contents is a bit misleading.  The contents are unrelated, so the diff
between them will be random.  It's more useful to the user to show all
of one file, then all of the other, than to show a random diff.

Now, you might think that if the contents of A and B are truly
unrelated, then if you run "diff A B", you'll automatically get the
desired result: all the contents of one file, followed by all the
contents of the other (never mind which is prefixed by "-" and which
by "+"; the user can mentally filter that part out).

To test this theory, I ran both

   $ diff A B
   $ diff -u A B

where A is the contents of this email message up to the end of the
next line, and B is the United States Constitution.

Turns out the diff hunks are interleaved; you don't get all of one
file followed by all of the other.
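
The interleaving is easy to reproduce on a small scale.  The
following Python sketch uses the standard difflib module on two
made-up "unrelated" files that happen to share one blank line; the
file contents are invented for the example and are not the inputs
used in the experiment above.

```python
import difflib

# Two unrelated texts that share nothing but a blank line.
file_a = ["alpha\n", "\n", "beta\n"]
file_b = ["one\n", "\n", "two\n"]

diff_lines = list(difflib.unified_diff(file_a, file_b, "A", "B"))
body = diff_lines[3:]  # skip the ---, +++ and @@ header lines

# First character of each body line: '-' removed, '+' added, ' ' context.
markers = [line[0] for line in body]
print("".join(body), end="")
```

The shared blank line becomes context, so the removals and additions
alternate around it instead of forming one block of "-" lines
followed by one block of "+" lines.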

On the other hand, in conversation just now with Ben and Mike, they
seemed pretty certain that this is an edge case, plus there's the "if
you're diffing two unrelated files, you get what you deserve" factor
:-).

I dunno.  I guess I think "adding more switches and knobs" is actually
a good solution here.  Specifically, this switch is the one you flip
when some data is missing from your repository -- data about shared
ancestry.  As Philip pointed out, if the vendor branch imports had
been done right in the first place, dir_delta() would behave fine.
So this proposal is really about not punishing people whose vendor
branch imports, for whatever reason, were *not* done right.

-K


Re: svn diff, svn merge, and vendor branches (long)

Posted by Ben Collins-Sussman <su...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> I guess the relevant question is in merge.  If we are merging "delete
> foo and replace foo with unrelated file" with "modify the contents of
> foo", then we should get a directory-level conflict; my proposed change
> would produce a file-level merge.  So I guess we can throw out my idea.

Mmm, yah.  Crud.  If someone really replaced a file, and you're
merging that change into your working copy, then this is the
difference between ending up with an 'M' versus an 'A +'.  This is
about the only case where the working copy actually tracks ancestry.



Re: svn diff, svn merge, and vendor branches (long)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Mon, 2002-12-09 at 13:41, Ben Collins-Sussman wrote:
> Well, I was pondering this question too.  We use dir_delta all over
> the place... svn diff, svn merge, svn up, svn switch.  Is there ever a
> time where we definitely *should* see a delete+add instead of a patch?
> I think cmpilato had an example.

When you're doing a commit, there is a single correct tree delta (a
reflection of the operations performed in the wc) and it is important
that it is transmitted and recorded accurately.  But of course
svn_repos_dir_delta() is not used during a commit.

When you're talking to a working copy, the only thing that matters is
the end state.  For an update or switch, the end state is easily defined
and any delta which gets you there is fine.

I guess the relevant question is in merge.  If we are merging "delete
foo and replace foo with unrelated file" with "modify the contents of
foo", then we should get a directory-level conflict; my proposed change
would produce a file-level merge.  So I guess we can throw out my idea.
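
The difference can be sketched in a few lines of Python.  Everything
here is a toy model under stated assumptions: node_id is a
hypothetical stand-in for whatever svn uses to decide relatedness,
and classify and merge_outcome are invented names, not Subversion
functions.

```python
from typing import NamedTuple

class Node(NamedTuple):
    node_id: int  # hypothetical identity: equal ids mean shared ancestry
    text: str

def classify(old, new, ancestry_sensitive=True):
    """Describe the change from old to new as replace, modify or unchanged."""
    if ancestry_sensitive and old.node_id != new.node_id:
        return "replace"  # delete + add of an unrelated file
    return "modify" if old.text != new.text else "unchanged"

def merge_outcome(incoming, local):
    """What merging an incoming change into a locally changed file gives."""
    if incoming == "replace" and local == "modify":
        return "directory-level conflict"
    if incoming == "modify" and local == "modify":
        return "file-level merge"
    return "clean"

old_foo = Node(1, "original contents\n")
new_foo = Node(2, "unrelated contents\n")  # foo deleted and replaced

with_ancestry = merge_outcome(classify(old_foo, new_foo), "modify")
without_ancestry = merge_outcome(classify(old_foo, new_foo, False), "modify")
print(with_ancestry, "/", without_ancestry)
```

With the relatedness check in place the replacement surfaces as a
conflict; without it the same change is quietly treated as a text
merge.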



Re: svn diff, svn merge, and vendor branches (long)

Posted by Craig Peterein <cr...@peterein.org>.
On Wed, Dec 11, 2002 at 12:23:19PM +0100, Branko Čibej wrote:
>       Obviously ordinary "patch" can't
>       know about Subversion file renames and such.

I was surprised to find out that patch has support for working with ClearCase.
I don't recall if the idea has been discussed before, but what about patching
patch to support Subversion?

> I'm offering these as an input to the discussion, not laying down the
> law, of course.

Craig


Re: svn diff, svn merge, and vendor branches (long)

Posted by Branko Čibej <br...@xbc.nu>.
Alon Ziv wrote:

>From: "Branko Čibej" <br...@xbc.nu>
>
>>>(By the way, what happened to the internal 'diff' library? Using it should
>>>make custom-diff formatting much easier, I believe...)
>>>
>>Ah... it's in my tree, waiting for better days.
>>
>Hmm... What better days? Is there anything one can do to make them come?
>

Magic up 4-6 extra hours in a day? :-)

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: svn diff, svn merge, and vendor branches (long)

Posted by Alon Ziv <al...@nolaviz.org>.
From: "Branko Čibej" <br...@xbc.nu>
> >(By the way, what happened to the internal 'diff' library? Using it should
> >make custom-diff formatting much easier, I believe...)
> >
> >
> Ah... it's in my tree, waiting for better days.
>

Hmm... What better days? Is there anything one can do to make them come?

    -az



Re: svn diff, svn merge, and vendor branches (long)

Posted by Branko Čibej <br...@xbc.nu>.
Alon Ziv wrote:

>(By the way, what happened to the internal 'diff' library? Using it should
>make custom-diff formatting much easier, I believe...)
>
Ah... it's in my tree, waiting for better days.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: svn diff, svn merge, and vendor branches (long)

Posted by Alon Ziv <al...@nolaviz.org>.
From: "Branko Čibej" <br...@xbc.nu>
> >This has already been hashed to death several times...
> >
> Obviously not, since we don't have a design doc -- which should be the
> output of any such hashing-to-death. We don't even have a clear set of
> requirements yet.
>
I do recall several threads about this... Most starring Tom Lord :-)

> >What's needed is to add a list of file-level operations to be done before
> >the diffs are applied.
> >
> Are you sure? Understand me, I'm not saying that's not so, but I'm not
> at all sure myself. This _is_ a can of worms, and the worms have teeth.
>
I'm not sure this is all that is needed, but it's a good start.
And, _assuming_ we are going to use properties for merge-related metadata,
this should also be able to capture history.

> >A possible format (which I'd started working on, long ago, then abandoned)
> >is
> >
> >* Each file-level operation is represented by the appropriate Unix shell
> >command
> >* diff's are prefixed with a "patch" line
> >
> >... so the whole thing looks just like a shell script, and can even be run
> >as one.
> >
> Yikes. On Unix-like boxes, perhaps.
>
Right, on Unix-like boxes. And 'svn patch' will always be able to read it,
even on non-Unixes.
I like the idea of having a 'shell-compatible' format, as it can be parsed
by non-svn-users without any additional tool (it is borrowed, mostly, from
Perl's "makepatch" tool). If we go a different route, the format has to be
simple enough so someone can write a parser for it which is not svn (for
interoperability with other VC tools).

> >Example output will look like:
> >
...
> >
>
> I understand this is an example, but it's fundamentally wrong -- the
> pre-diff operation should be "svn mv", not filesystem cp + rm. But never
> mind.
>
I'm assuming only the current metadata, and we don't have intrinsic 'move'
right now... And consistently guessing the real meaning is beyond any tool.

> > likewise, if a file is copied *into* the
> >diff'd area, the diff _must_ include its complete contents.
> >
> _and_ history.
>
Hmm... Yes, OK. (This allows e.g. to have a branch-merge as an 'external'
patch).

> >I'm tempted to go back to hacking this...
> >
>
> I'd much, much, _much_ rather see a long design discussion happen first.
> Starting with a clear statement of the requirements. Those could be
> something along these lines:
>
>    1. "svn diff" + "svn patch" have the same effect as an equivalent
>       "svn merge". To wit, patching a tree with "svn diff" output should
>       be indistinguishable from merging from a "meta" branch that
>       contains the changes that generated the diff.
>    2. The enhanced "svn diff" output should be similar to "standard"
>       diff output, should be at least as human readable, and plain
>       "patch" should be able to do something sane with it, for a
>       reasonable definition of "sane". Obviously ordinary "patch" can't
>       know about Subversion file renames and such.
Maybe we should make sure a patch that requires moving files _will_ break
ordinary 'patch'?

>    3. The enhanced diff should not in any way depend on any feature of
>       any operating system/shell/whatever.
Sure.

(By the way, what happened to the internal 'diff' library? Using it should
make custom-diff formatting much easier, I believe...)

    -az


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn diff, svn merge, and vendor branches (long)

Posted by Branko Čibej <br...@xbc.nu>.
Alon Ziv wrote:

>This has already been hashed to death several times...
>
Obviously not, since we don't have a design doc -- which should be the
output of any such hashing-to-death. We don't even have a clear set of
requirements yet.

>What's needed is to add a list of file-level operations to be done before
>the diffs are applied.
>
Are you sure? Understand me, I'm not saying that's not so, but I'm not
at all sure myself. This _is_ a can of worms, and the worms have teeth.

>A possible format (which I'd started working on, long ago, then abandoned)
>is
>
>* Each file-level operation is represented by the appropriate Unix shell
>command
>* diff's are prefixed with a "patch" line
>
>... so the whole thing looks just like a shell script, and can even be run
>as one.
>
Yikes. On Unix-like boxes, perhaps.

>Example output will look like:
>
>    set -e
>    cp ./File1.txt ./File2.txt
>    rm ./File1.txt
>    patch -p0 <<'EOF'
>    --- ./File2.txt (old)
>    +++ ./File2.txt (new)
>    @ 8,12
>     Line 8
>     Line 9
>    -Line 10
>    +New Line 10
>     Line 11
>     Line 12
>    EOF
>

I understand this is an example, but it's fundamentally wrong -- the
pre-diff operation should be "svn mv", not filesystem cp + rm. But never
mind.

>The problem with the whole subject, IIRC, was handling of copies outside the
>diff'd area. (There's also a minor problem re ordering of operations, but I
>believe it is solvable using a simple topological sort).
>
"svn merge" solves that.

> For example, if I
>move a file out of the current directory, the 'diff' output will include its
>removal but not the new location;
>
The same problem exists in a merge of a subtree. The solution should be
the same, too.

> likewise, if a file is copied *into* the
>diff'd area, the diff _must_ include its complete contents.
>
_and_ history.

> FWIW, I don't
>believe any of those problems to be really hard. Just annoying :-)
>
Well, that can be said of most, if not all, problems in Subversion. :-)

>I'm tempted to go back to hacking this...
>

I'd much, much, _much_ rather see a long design discussion happen first.
Starting with a clear statement of the requirements. Those could be
something along these lines:

   1. "svn diff" + "svn patch" have the same effect as an equivalent
      "svn merge". To wit, patching a tree with "svn diff" output should
      be indistinguishable from merging from a "meta" branch that
      contains the changes that generated the diff.
   2. The enhanced "svn diff" output should be similar to "standard"
      diff output, should be at least as human readable, and plain
      "patch" should be able to do something sane with it, for a
      reasonable definition of "sane". Obviously ordinary "patch" can't
      know about Subversion file renames and such.
   3. The enhanced diff should not in any way depend on any feature of
      any operating system/shell/whatever.

I'm offering these as an input to the discussion, not laying down the
law, of course.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Alon Ziv <al...@nolaviz.org>.
This has already been hashed to death several times...
What's needed is to add a list of file-level operations to be done before
the diffs are applied.
A possible format (which I'd started working on, long ago, then abandoned)
is

* Each file-level operation is represented by the appropriate Unix shell
command
* diff's are prefixed with a "patch" line

... so the whole thing looks just like a shell script, and can even be run
as one.

Example output will look like:

    set -e
    cp ./File1.txt ./File2.txt
    rm ./File1.txt
    patch -p0 <<'EOF'
    --- ./File2.txt (old)
    +++ ./File2.txt (new)
    @ 8,12
     Line 8
     Line 9
    -Line 10
    +New Line 10
     Line 11
     Line 12
    EOF

The problem with the whole subject, IIRC, was handling of copies outside the
diff'd area. (There's also a minor problem re ordering of operations, but I
believe it is solvable using a simple topological sort). For example, if I
move a file out of the current directory, the 'diff' output will include its
removal but not the new location; likewise, if a file is copied *into* the
diff'd area, the diff _must_ include its complete contents. FWIW, I don't
believe any of those problems to be really hard. Just annoying :-)
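
The ordering problem mentioned above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: operations are
modelled as hypothetical (kind, source, dest) tuples, a move is written as
cp + rm as in the example output, and "rm" is treated as writing to the path
it removes. It needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def order_operations(ops):
    """Order (kind, source, dest) file operations so no source is clobbered.

    Hypothetical model: an operation that writes to (or removes) a path
    must run after every other operation that still needs to read it.
    """
    sorter = TopologicalSorter()
    for i, (_, _, dest) in enumerate(ops):
        sorter.add(i)
        for j, (_, source, _) in enumerate(ops):
            if i != j and source == dest:
                # op i overwrites what op j reads, so j goes first
                sorter.add(i, j)
    return [ops[i] for i in sorter.static_order()]


# A move given in the wrong order: the copy has to come before the removal.
print(order_operations([("rm", None, "./File1.txt"),
                        ("cp", "./File1.txt", "./File2.txt")]))
```

A genuine cycle (two files swapping names, say) makes static_order() raise
graphlib.CycleError; a real tool would have to break it with a temporary
name, which this sketch does not attempt.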

I'm tempted to go back to hacking this...

    -az

----- Original Message -----
From: "Branko Čibej" <br...@xbc.nu>
To: "Ben Collins-Sussman" <su...@collab.net>
Cc: "SVN Dev List" <de...@subversion.tigris.org>
Sent: Wednesday, December 11, 2002 8:33 AM
Subject: Re: svn diff, svn merge, and vendor branches (long)


> Ben Collins-Sussman wrote:
>
> >Greg Hudson <gh...@MIT.EDU> writes:
> >
> >
> >
> >>I wonder if the right answer isn't to just punt the (distance == -1)
> >>check from delta_dirs().  Is there a real penalty for expressing a file
> >>as a delta against something unrelated?
> >>
> >>
> >
> >Well, I was pondering this question too.  We use dir_delta all over
> >the place... svn diff, svn merge, svn up, svn switch.  Is there ever a
> >time where we definitely *should* see a delete+add instead of a patch?
> >I think cmpilato had an example.
> >
> >
>
> There is obviously a big difference between a modification and a
> delete+add.
>
> "svn diff" should, as far as possible, attempt to document repository
> changes. For the moment, let's ignore the fact that there isn't a
> symmetric "svn patch" command that could actually recreate those
> differences. Assuming "svn patch" existed, a "svn diff" + "svn patch"
> sequence should ideally do the same thing as a "svn merge" with
> appropriate arguments.
>
> Now, whether "svn diff" could represent a delete+add in a more compact
> way than it does now is debatable, but I don't think that the current
> diff format can describe the complete removal of a file other than as a
> removal of its contents.
>
> So I see two solutions: the short term is to change (_not_ fix!) "svn
> diff" to produce output that "makes sense" the way Eric wants. The
> not-so-short-term solution is to leave the "svn diff" semantics alone --
> or even make it behave more like svn merge -- and start thinking about
> the diff output instead. Specifically: What are the changes necessary to
> the "standard" -- whatever that means -- diff output, to make it
> possible to represent rename, move, and delete/add new file?
>
> --
> Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/
>
>
> ---------------------------------------------------------------------
>


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Branko Čibej <br...@xbc.nu>.
Ben Collins-Sussman wrote:

>Greg Hudson <gh...@MIT.EDU> writes:
>
>  
>
>>I wonder if the right answer isn't to just punt the (distance == -1)
>>check from delta_dirs().  Is there a real penalty for expressing a file
>>as a delta against something unrelated?
>>    
>>
>
>Well, I was pondering this question too.  We use dir_delta all over
>the place... svn diff, svn merge, svn up, svn switch.  Is there ever a
>time where we definitely *should* see a delete+add instead of a patch?
>I think cmpilato had an example.
>  
>

There is obviously a big difference between a modification and a delete+add.

"svn diff" should, as far as possible, attempt to document repository
changes. For the moment, let's ignore the fact that there isn't a
symmetric "svn patch" command that could actually recreate those
differences. Assuming "svn patch" existed, a "svn diff" + "svn patch"
sequence should ideally do the same thing as a "svn merge" with
appropriate arguments.

Now, whether "svn diff" could represent a delete+add in a more compact
way than it does now is debatable, but I don't think that the current
diff format can describe the complete removal of a file other than as a
removal of its contents.

So I see two solutions: the short term is to change (_not_ fix!) "svn
diff" to produce output that "makes sense" the way Eric wants. The
not-so-short-term solution is to leave the "svn diff" semantics alone --
or even make it behave more like svn merge -- and start thinking about
the diff output instead. Specifically: What are the changes necessary to
the "standard" -- whatever that means -- diff output, to make it
possible to represent rename, move, and delete/add new file?

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Ben Collins-Sussman <su...@collab.net>.
Greg Hudson <gh...@MIT.EDU> writes:

> I wonder if the right answer isn't to just punt the (distance == -1)
> check from delta_dirs().  Is there a real penalty for expressing a file
> as a delta against something unrelated?

Well, I was pondering this question too.  We use dir_delta all over
the place... svn diff, svn merge, svn up, svn switch.  Is there ever a
time where we definitely *should* see a delete+add instead of a patch?
I think cmpilato had an example.

---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Nuutti Kotivuori <na...@iki.fi>.
Greg Hudson wrote:
> On Mon, 2002-12-09 at 12:55, Ben Collins-Sussman wrote:
>> Once we've implemented this, it's simply a matter of making the two
>> modes available to 'svn diff' and 'svn merge'.  I don't really care
>> which behavior is the default; we can add options to toggle
>> ancestry-sensitivity on or off.
> 
> This bugs me because of the axiom:
> 
>   You can't solve usability problems by adding more switches and
>   knobs.
> 
> I wonder if the right answer isn't to just punt the (distance == -1)
> check from delta_dirs().  Is there a real penalty for expressing a
> file as a delta against something unrelated?

I wrote a huge mail and blabbered my mind off on it. I'll leave it at
the end and summarise here what I really had to say:

  In my *mind*, svn diff with history and svn diff by only looking at
  names and contents, are *separate* concepts - for separate uses. I
  would not wish them to be combined together automatically any more
  than I'd like 'svn fetch'.

That's it. Now here comes the long part, you can stop reading now :-)

***

Well, I'll stick my nose in this one, hopefully a bit more as an
outsider to the process.

When I started using Subversion for the first time, one of the things
that got me a warm and fuzzy feeling inside was that Subversion
behaved like I expected it to behave. Just simply reading what was
written on the website frontpage gave me a very good idea of how it
behaved. Then of course I stuck my nose in for real and noticed that
just about everything was still unimplemented :-) Luckily that has
changed since then.

Now, when I hear phrases like "ancestry-sensitive", "preserving
history", "directory versioning", I get an idea in my head about what
it might be about. Then when I see how Subversion behaves with regard
to adding, removing and moving files in a directory for example, it
rhymes with that inner feeling. I see the files branching in two,
joining again somewhere in the future and branching off again. Then
of course I yet again notice that we do not really record ancestry, yet
:-)

Now, when I run 'svn diff' on two unrelated files (I might not know
myself that they are unrelated), the output of one whole file being deleted
and another being created feels... correct. That is what I expected,
since the files are not related. Hey, this is not CVS where only the
file name matters - a file with exactly the same name and the same
contents is not the same file, necessarily. And I honestly believe
that this is the way I thought it to be from the start.

The output of 'svn diff' serves two purposes. The first is the mostly
uninteresting case where the output is used only to be fed to 'patch'
later on. There the only issue I see is the edge case of having local
edits to a file that is entirely removed and added in the diff: if the
diff truly has this operation, then it will all end up as a reject -
if the diff just contains the actual changes, no rejects will appear.

The second is user display. When I say 'svn diff', I want to see the
_changes_. I want to know what has changed. And it matters if the file
was removed and an identical file was added in its place. But there are
also occasions where I want to tell the version system that "Yes, I
know these files are not related, but show me what are the
differences." This is however separate - I know that I specifically
wish to tell the version system to behave differently from what I'd
expect. I would not wish for it to automatically guess for me what I
want to do.

Then of course there is the 'svn merge' case, where you don't see the
differences yourself either. There I think that conceptually 'svn
merge' should always respect ancestry, for real. But since import is
what it is, and people at times have to do things where they are
unable to preserve history - in reality there should be a way to either
fix what is wrong (specify history after commits? weird) or work
around it (check only filenames, good enough).

-- Naked

---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Mon, 2002-12-09 at 12:55, Ben Collins-Sussman wrote:
> Once we've implemented this, it's simply a matter of making the two
> modes available to 'svn diff' and 'svn merge'.  I don't really care
> which behavior is the default; we can add options to toggle
> ancestry-sensitivity on or off.

This bugs me because of the axiom:

  You can't solve usability problems by adding more switches and knobs.

I wonder if the right answer isn't to just punt the (distance == -1)
check from delta_dirs().  Is there a real penalty for expressing a file
as a delta against something unrelated?


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Ben Collins-Sussman <su...@collab.net>.
Eric Gillespie <ep...@pretzelnet.org> writes:

> I have been sitting on this problem for months.

I'd like to summarize this thread and make a proposal.  I've thought
about it a bit, and had discussions with kfogel and cmpilato.

First -- the behavior of dir_delta.  Cmpilato wrote this routine to be
sensitive to ancestry, and now it seems that this causes a whole lot
of unintuitive behavior for users, especially when running 'svn diff'
and 'svn merge'.  Philip has correctly pointed out that if users
actually attempted to preserve relationships, svn would behave well
(that's the whole point of svn_load_dirs.pl).   However, if a user
ignorantly compares two unrelated trees, the tool should still be
useful.  At the moment, svn is punishing people by printing huge
numbers of deletes and adds.  svn *should* be able to compare
unrelated trees and show reasonable output.

So I propose an enhancement: give dir_delta a new "mode" which
desensitizes it to ancestry.  If it sees two files or dirs have the
same name -- that's enough.  It just won't check for relatedness.
If two files have the same name, it will send a patch (instead of
deleting/adding), regardless of relatedness.  If two dirs have the
same name, it will recurse into the dirs (instead of adding/deleting),
regardless of relatedness.
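
To make the two modes concrete, here is a toy Python model of the proposal.
It is emphatically not the real dir_delta: the tree layout (a dict mapping a
name to a (history_id, payload) pair), the ignore_ancestry flag name and the
rule "related means equal history_id" are all assumptions made for this
sketch.

```python
def dir_delta(source, target, ignore_ancestry=False, prefix=""):
    """Toy tree comparison; NOT Subversion's real dir_delta.

    Each tree maps a name to (history_id, payload): payload is a str for
    a file or a nested dict for a directory. Two entries count as related
    when their history_id values match (an assumption of this sketch).
    """
    ops = []
    for name in sorted(set(source) | set(target)):
        path = prefix + name
        if name not in target:
            ops.append(("delete", path))
        elif name not in source:
            ops.append(("add", path))
        else:
            (src_id, src), (tgt_id, tgt) = source[name], target[name]
            same_kind = isinstance(src, dict) == isinstance(tgt, dict)
            related = src_id == tgt_id
            if not same_kind or not (related or ignore_ancestry):
                # different kind, or unrelated in ancestry-sensitive mode
                ops += [("delete", path), ("add", path)]
            elif isinstance(src, dict):
                ops += dir_delta(src, tgt, ignore_ancestry, path + "/")
            elif src != tgt:
                ops.append(("patch", path))
    return ops


# Two imports of the same vendor tree: same names, no shared history.
old = {"gnuserv.c": (1, "v1"), "doc": (2, {"README": (3, "a")})}
new = {"gnuserv.c": (7, "v2"), "doc": (8, {"README": (9, "b")})}
print(dir_delta(old, new))                        # deletes and adds only
print(dir_delta(old, new, ignore_ancestry=True))  # patches, with recursion
```

In the sensitive mode the walk never descends into "doc", which is where the
huge numbers of deletes and adds come from; in the desensitized mode the same
trees yield two small patches.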

Once we've implemented this, it's simply a matter of making the two
modes available to 'svn diff' and 'svn merge'.  I don't really care
which behavior is the default; we can add options to toggle
ancestry-sensitivity on or off.

Second -- Eric's original bug.  He ran 'svn diff' on two unrelated
file URLs.  Consistent with dir_delta, he was shown two complete diffs
representing a delete and an add.  In Chicago, we're all three in
agreement that this should never happen.  We don't care about how it's
implemented, or whether it "makes sense" given the implementation --
it's just wrong output.  None of us could come up with a *single*
use-case where a user would ever want or expect this behavior, even
if they knew svn's internals inside and out.  This should be filed
as a bug.



---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2002-12-07 at 09:53, Philip Martin wrote:
> How do you convert URL1 (http://pretzelnet.org/svn/a) into URL2
> (http://pretzelnet.org/svn/b)?  Well first you delete a

Uh, no.  Deletion or addition should never come into play in a diff
between two files, only in a diff between two directories.


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Branko Čibej <br...@xbc.nu>.
Philip Martin wrote:

>"Bill Tutt" <ra...@lyra.org> writes:
>
>  
>
>>    John commits an update to create /foo as Foo.0.TxnA. 
>>
>>/foo now has an empty ancestry set.
>>
>>    Barney commits a change to /foo as Foo.0.TxnB.
>>
>>/foo now has an ancestry set of {Foo.0.TxnA}
>>
>>    Charley commits a change to /foo as Foo.0.TxnC.
>>
>>/foo now has an ancestry set of {Foo.0.TxnA Foo.0.TxnB}
>>    
>>
>
>We don't have to explicitly store the above do we?  I think it's
>implied by the current node history.
>
How we store it is not the question here; obviously a lot of the
ancestry info can be either compressed or derived. But all the info has
to be accessible somehow.

>Some merges involve reversing a particular ancestry set.  I think we
>will need to store whether a merge added or removed a particular
>ancestry set.
>
Of course. A reverse merge would do that, and in that case the
information derived from node history would be wrong. :-)

>I'm interested as to how we would implement this.  Are the ancestry
>set IDs, the things like Foo.0.TxnA, visible to the client?  Is it
>done entirely in the client using properties, or does the Subversion
>filesystem get involved?
>
Good question. I wish I knew the answer...

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
"Bill Tutt" <ra...@lyra.org> writes:

>     John commits an update to create /foo as Foo.0.TxnA. 
> 
> /foo now has an empty ancestry set.
> 
>     Barney commits a change to /foo as Foo.0.TxnB.
> 
> /foo now has an ancestry set of {Foo.0.TxnA}
> 
>     Charley commits a change to /foo as Foo.0.TxnC.
> 
> /foo now has an ancestry set of {Foo.0.TxnA Foo.0.TxnB}

We don't have to explicitly store the above do we?  I think it's
implied by the current node history.

>     Fred unfortunately uses a filesystem copy to copy /foo to /bar.
>     Fred commits /bar as Bar.0.TxnD.
>     John commits a change to /bar as Bar.0.TxnE.
> 
> /bar now has an ancestry set of {Bar.0.TxnD}
> 
>     Fred merges the changes from /bar into /foo and commits /foo as
> {Foo.0.TxnF}
> 
> 
> /foo now has an ancestry set of {Foo.0.TxnA Foo.0.TxnB Foo.0.TxnC
> Bar.0.TxnD Bar.0.TxnE}
>
> The result of merge operations is the union of all ancestry sets
> involved in the operation. If there's a star merge of /foo /bar and
> /blah then the new /foo is the union of all ancestry sets of /foo /bar
> and /blah.

Some merges involve reversing a particular ancestry set.  I think we
will need to store whether a merge added or removed a particular
ancestry set.

I'm interested as to how we would implement this.  Are the ancestry
set IDs, the things like Foo.0.TxnA, visible to the client?  Is it
done entirely in the client using properties, or does the Subversion
filesystem get involved?

-- 
Philip Martin

---------------------------------------------------------------------

RE: svn diff, svn merge, and vendor branches (long)

Posted by Bill Tutt <ra...@lyra.org>.

> From: Philip Martin [mailto:philip@codematters.co.uk]
> One problem I foresee:
> when we get as far as recording merge history, how are we going to
> record the merge of the difference between two unrelated files?  Will
> subsequent merge-history-aware merges use that information?

If we support merging two unrelated files then the answer is yes.
Storing the data is not an issue. Solving the repeated merge problem
requires us to store ancestry sets. Merges simply add additional data to
ancestry sets.

e.g.:

    John commits an update to create /foo as Foo.0.TxnA. 

/foo now has an empty ancestry set.

    Barney commits a change to /foo as Foo.0.TxnB.

/foo now has an ancestry set of {Foo.0.TxnA}

    Charley commits a change to /foo as Foo.0.TxnC.

/foo now has an ancestry set of {Foo.0.TxnA Foo.0.TxnB}

    Fred unfortunately uses a filesystem copy to copy /foo to /bar.
    Fred commits /bar as Bar.0.TxnD.
    John commits a change to /bar as Bar.0.TxnE.

/bar now has an ancestry set of {Bar.0.TxnD}

    Fred merges the changes from /bar into /foo and commits /foo as
{Foo.0.TxnF}

/foo now has an ancestry set of {Foo.0.TxnA Foo.0.TxnB Foo.0.TxnC
Bar.0.TxnD Bar.0.TxnE}

The result of merge operations is the union of all ancestry sets
involved in the operation. If there's a star merge of /foo /bar and
/blah then the new /foo is the union of all ancestry sets of /foo /bar
and /blah.
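
The bookkeeping in the example above can be written down as a short Python
sketch. The Node class and its method names are hypothetical, and the
already_merged check is only one guess at how the sets might be used to
detect a repeated merge; nothing here reflects an actual Subversion data
structure.

```python
class Node:
    """Toy versioned path; the names here are hypothetical, not svn API."""

    def __init__(self, first_txn):
        self.head = first_txn   # transaction that produced the current text
        self.ancestry = set()   # everything the current text descends from

    def commit(self, txn):
        # The previous head becomes part of the ancestry set.
        self.ancestry.add(self.head)
        self.head = txn

    def merge_from(self, other, txn):
        # Result is the union of all ancestry sets involved in the merge.
        self.ancestry |= other.ancestry | {other.head}
        self.commit(txn)

    def already_merged(self, other):
        # Repeated-merge check (an assumption about how the sets are used).
        return other.ancestry | {other.head} <= self.ancestry


foo = Node("Foo.0.TxnA")
foo.commit("Foo.0.TxnB")
foo.commit("Foo.0.TxnC")
bar = Node("Bar.0.TxnD")        # filesystem copy: no link back to /foo
bar.commit("Bar.0.TxnE")
foo.merge_from(bar, "Foo.0.TxnF")
print(sorted(foo.ancestry))
print(foo.already_merged(bar))
```

After the merge /foo carries the three Foo transactions plus both Bar
transactions, matching the set listed above, and a second merge of the same
/bar would be recognised as already applied.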

FYI, 
Bill



---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
Greg Hudson <gh...@MIT.EDU> writes:

> I would just say that the reduction of the file case to the
> directory case is rather lossy, and (as you seem to have agreed, after
> looking at the merge code) should be replaced with special handling.

Not quite, I said they should be the same, I didn't specify which
should change :)

It is probably fairly simple to reuse the merge code for diff,
although it may involve an extra round trip to the server to determine
whether a given URL is a file or directory.  One problem I foresee:
when we get as far as recording merge history, how are we going to
record the merge of the difference between two unrelated files?  Will
subsequent merge-history-aware merges use that information?

-- 
Philip Martin

---------------------------------------------------------------------

RE: svn diff, svn merge, and vendor branches (long)

Posted by Kean Johnston <jk...@sco.com>.
> (Reducing the file case to the directory case *does* make sense for
> checkouts and updates and commits and statuses.  Just not diffs.)
A further perspective ...

The command name carries with it a 27-year expectation of behaviour.
That's a lot of momentum. When program sub-commands use the same name
as system commands, and do mostly the same thing, then you have a
failure (IMHO). They should do EXACTLY the same thing, if at all
possible. I think it is possible here.

It is my experience that svn diff or cvs diff or cleartool diff or
any other kind of diff, is generally of more use to a human than to
a program. Yes, it gets used for making patches, but most often in
the development cycle, it is used by programmers who expect things
to behave a certain way. Many many times, I want to compare two
files in a VC system, and many of those times the files are unrelated
from the VC perspective but very meaningfully related from the
human perspective. So from that point of view, I want the tool to
be very dumb, because I am very smart, compared to it. Sometimes :)

Kean


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2002-12-07 at 12:09, Philip Martin wrote:
> Eric Gillespie <ep...@pretzelnet.org> writes:
> 
> > > simple.  Making it work the way you would prefer would involve more
> > > work, more special cases and more code.  It would in fact be making
> > > svn diff "smart", the very crime of which you have accused it.
> > 
> > What?  Not at all.  The current problem is svn is doing
> > something itself (i don't know what exactly).  If it just handed
> > the two arguments to diff(1), the diff would come out meaningful.
> 
> Currently 'svn diff' uses the same reporter/editor mechanism
> irrespective of whether the URLs are directories or files.  It appears
> that you, and Greg Hudson, believe that diff should use a different
> algorithm for comparing files and directories. That sounds like
> increasing diff's "smartness", not reducing it.

Sometimes it can be hard to quantify how smart something is.  In this
case, from a user perspective, "svn diff" appears to be being overly
clever; instead of comparing the two files directly, it is pretending
that the two files each live in a directory consisting of only one file,
and then comparing the directories with each other, which might result
in a diff between the file contents if the files share ancestry, or
might just result in a deletion and addition.  From an implementation
perspective, "svn diff" might appear fairly dumb, since it's performing
a simple case of one reduction to another.

I wouldn't try to analyze this situation in terms of smartness or
dumbness; I would just say that the reduction of the file case to the
directory case is rather lossy, and (as you seem to have agreed, after
looking at the merge code) should be replaced with special handling.

(Reducing the file case to the directory case *does* make sense for
checkouts and updates and commits and statuses.  Just not diffs.)
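
For contrast, this is roughly what "comparing the two files directly" looks
like, using the a and b files from the start of the thread. It is a sketch
with Python's difflib standing in for diff(1); plain_file_diff is a
hypothetical name and no svn code is involved.

```python
import difflib


def plain_file_diff(old_text, new_text, old_name, new_name):
    """Compare two texts directly, with no notion of ancestry."""
    return "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    ))


# a contains "hi"; b contains "hi" and "bye".
print(plain_file_diff("hi\n", "hi\nbye\n", "a", "b"), end="")
```

The result is a single hunk that adds the "bye" line, with no deletion or
addition of whole files anywhere in it.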


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Brandon Ehle <az...@yahoo.com>.
> 
>
>>It appears that you, and Greg Hudson, believe that diff should
>>use a different algorithm for comparing files and
>>directories.
>>    
>>
>
>Absolutely not.  Read my earlier messages.  svn diff should
>simply hand off to diff(1) in *all* cases, or at least that
>should be the effect, just like cvs.
>  
>
I agree, the current behaviour is not very useful.


---------------------------------------------------------------------

Re: svn diff, svn merge, and vendor branches (long)

Posted by Florin Iucha <fl...@iucha.net>.
On Sun, Dec 08, 2002 at 05:40:45PM +0000, Philip Martin wrote:
> Eric Gillespie <ep...@pretzelnet.org> writes:
> 
> > Diffing a and b with diff(1) produces a meaningful diff.
> > Diffing the two gnuserv.c versions with diff(1) or diffing with
> > svn when they share ancestry produces the meaningful diff,
> > diffing with svn when they *don't* share ancestry produces a useless
> > diff.
> 
> You keep repeating "useless".  OK, I'll keep repeating "correct".

"One man's constant is another man's variable".

> Your "useless" result shows me that the files are not related?  How is
> that useless?

Then why doesn't it say
   "ERROR 1005: The files are not related."
instead of spitting out both files with +/- on the first column,
scrolling out thirty pages of text?

It is useful to know if one element is the ancestor of the other but
that is not the answer to my question: "What is the difference of the
contents of those two elements?"

Feel free to add a "is-ancestor-of" command.

florin

-- 

"If it's not broken, let's fix it till it is."

41A9 2BDE 8E11 F1C5 87A6  03EE 34B3 E075 3B90 DFE4

Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
Brandon Ehle <az...@yahoo.com> writes:

> Subversion is a tool, it's supposed to make your project easier.  If
> every time we want to merge unrelated files or directories we have to
> checkout 3 working copies and manually do a 3 way merge, Subversion
> isn't being very useful.

As far as I can tell, checking out (or exporting) 2 working copies and
then "handing off to diff" is exactly what Eric wants.  I don't really
see how it works, but it seems to be what he wants.

As far as merge goes: I don't know what the merge equivalent of "hand
off to diff" is, but it would appear to involve checking out, or
exporting, 2 working copies in addition to the one receiving the
merge, and then doing "something".

A simple script doing "export; export; something;" will work just as
well as having Subversion do it.  In fact if you are going to ignore
revision history and play games looking for matching filenames you are
probably better off implementing it in a language like perl or python
than you are writing it in C.

> >I guess the question I really have is: why you don't just ensure that
> >your files have the proper revision history?  If they did, Subversion
> >would do what you want.
> >
> 
> Because the people on this list are not going to be the only people
> who use Subversion.  Inevitably, you are going to run into a case
> where someone has been implementing vendor branches using imports,
> after all the first vendor branch was brought in using import, why
> wouldn't you bring in the second version of the vendor branch the same
> way?
>
> If it's decided that Subversion won't support logical diffs between
> unrelated files, then we should at least make a specialized vendor
> import that automatically generates the ancestry, or the ability to
> create ancestry for files that currently have no ancestry at a later
> time.

Have you looked at tools/client-side/svn_load_dirs.pl?

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn diff, svn merge, and vendor branches (long)

Posted by Brandon Ehle <az...@yahoo.com>.
>> Diffing a and b with diff(1) produces a meaningful diff.
>> Diffing the two gnuserv.c versions with diff(1) or diffing with
>> svn when they share ancestry produces the meaningful diff,
>> diffing with svn when they *don't* share ancestry produces a useless
>> diff.
>
> You keep repeating "useless".  OK, I'll keep repeating "correct".
> Your "useless" result shows me that the files are not related?  How is
> that useless?

Subversion is a tool; it's supposed to make your project easier.  If
every time we want to merge unrelated files or directories we have to
check out 3 working copies and manually do a 3-way merge, Subversion
isn't being very useful.

> I guess the question I really have is: why you don't just ensure that
> your files have the proper revision history?  If they did, Subversion
> would do what you want.

Because the people on this list are not going to be the only people who
use Subversion.  Inevitably, you are going to run into a case where
someone has been implementing vendor branches using imports.  After all,
the first vendor branch was brought in using import, so why wouldn't you
bring in the second version of the vendor branch the same way?

If it's decided that Subversion won't support logical diffs between
unrelated files, then we should at least make a specialized vendor
import that automatically generates the ancestry, or add the ability to
create ancestry at a later time for files that currently have none.




RE: svn diff, svn merge, and vendor branches (long)

Posted by Kean Johnston <jk...@sco.com>.
> That would make 'svn diff' totally different from 'svn merge', I would
> object strongly.
Why? One is a developer-centric comparison aid. The other is a
repository-maintainer-centric filesystem aid. Of course they are
different. I think it is more of an error to think of them as the
same. Similar, granted, but most certainly not the same.


> I guess the question I really have is: why you don't just ensure that
> your files have the proper revision history?  If they did, Subversion
> would do what you want.
Because svn import was not helping him in any way to HAVE proper
history, which was part of the original mail message that spawned
this thread.

Kean



Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
Eric Gillespie <ep...@pretzelnet.org> writes:

> Diffing a and b with diff(1) produces a meaningful diff.
> Diffing the two gnuserv.c versions with diff(1) or diffing with
> svn when they share ancestry produces the meaningful diff,
> diffing with svn when they *don't* share ancestry produces a useless
> diff.

You keep repeating "useless".  OK, I'll keep repeating "correct".
Your "useless" result shows me that the files are not related?  How is
that useless?

[snip]

> Then you aren't listening.  If svn would simply hand off the
> files to diff(1) *regardless* of whether they're files, or
> directories, or widgets, you'd get meaningful results where
> possible or equally useless results where not.

That would make 'svn diff' totally different from 'svn merge', I would
object strongly.  In addition, given your response to the question
below, I don't understand how "handing off to diff" does what you
want.

> >   svn diff http://host/repo/dir1 http://host/repo/dir3/dir4 
> >   svn diff http://host/repo/dir1 http://host/repo/dir3
> > 
> > Would they show different diffs for foo.c?
> 
> It shouldn't.

I don't see how to implement it.  With a repository

  /repo/dir1/foo.c
       /dir3/dir4/foo.c

you want

  svn diff http://host/repo/dir1/foo.c http://host/repo/dir3/dir4/foo.c
  svn diff http://host/repo/dir1       http://host/repo/dir3/dir4
  svn diff http://host/repo/dir1       http://host/repo/dir3

to show the same diff for foo.c.  I don't see how "hand off to diff"
will do that.  Suppose I add

       /dir3/dir5/foo.c

What does 

  svn diff http://host/repo/dir1 http://host/repo/dir3

show now?


I guess the question I really have is: why you don't just ensure that
your files have the proper revision history?  If they did, Subversion
would do what you want.

-- 
Philip Martin


Re: svn diff, svn merge, and vendor branches (long)

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
[Warning: this mail is meta-discussion, not discussion.]

Eric Gillespie <ep...@pretzelnet.org> writes:
> > Or does your request for "diff(1) in all cases" mean that you want the
> > directory diff command to ignore revision history as well?  I don't
> > know how that would be implemented.
> 
> Then you aren't listening.  If svn would simply hand off the
> files to diff(1) *regardless* of whether they're files, or
> directories, or widgets, you'd get meaningful results where
> possible or equally useless results where not.

Eric, can you please be less abrasive?

Accusing people of "not listening" is never constructive, and usually
destructive.  If Philip has misunderstood you, then try "I think you
might have misunderstood what I was saying."  That gets the idea
across without accusing Philip of being a lazy reader.

I found your wording particularly surprising given that Philip is one
of the people most careful about details around here, as everyone knows.
If he did get lost in a long and complicated thread (and I'm not
saying he did), then it's just because he's human, and the right
remedy is friendly help, not abuse.

I'm glad that Philip ignored the barb and responded only to the
technical content of your post, but the tone still affects all of us.

Thank you,
-Karl


Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Branko Čibej <br...@xbc.nu> writes:

> Blech. What you're saying is that we should never implement an internal
> diff/patch library, but should forever use a program that is simply not
> good enough for our needs.

>sigh<.  Please don't read between the lines, because i'm not
saying anything in there.  As i stated in an earlier message,
i'm describing what the *effect* should be.  As a user, i'm not
concerned with how it's implemented and am sure the
implementation will be quite different from CVS's.

> >To prove this to yourself, just do a svn export of any two URLs
> >you are curious about and diff -r the two resulting directories.
> >
> That's a totally different case.

Eh?  No it isn't.  Or, rather, currently svn thinks it's a
totally different case, but my whole point is that it isn't.
But your next paragraph makes me think i don't understand what
you are saying.

> So now I'll hear a shout "but patch doesn't understand those diffs, so
> they're useless!" Quite the contrary -- it's patch that's "useless",
> because -- like diff -- it can't handle renames.

I have no idea what you're talking about.  What do renames have
to do with anything?  I'm not asking for magic; i know that
diff and patch don't handle things like that.  There has been
talk of a new patch format, and new commands to generate and
apply these patches.  Is that what you're talking about?  That
topic is completely orthogonal to this thread.

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Branko Čibej <br...@xbc.nu>.
Eric Gillespie wrote:

>Then you aren't listening.  If svn would simply hand off the
>files to diff(1) *regardless* of whether they're files, or
>directories, or widgets, you'd get meaningful results where
>possible or equally useless results where not.
>

Blech. What you're saying is that we should never implement an internal
diff/patch library, but should forever use a program that is simply not
good enough for our needs.

Yes, diff is _not_ good enough: a) it knows nothing about common
ancestry, b) it knows nothing about tree reorganizations. So we already
have to do a lot of work _before_ calling diff to get output that's even
close to sensible.

>To prove this to yourself, just do a svn export of any two URLs
>you are curious about and diff -r the two resulting directories.
>
That's a totally different case.

So now I'll hear a shout "but patch doesn't understand those diffs, so
they're useless!" Quite the contrary -- it's patch that's "useless",
because -- like diff -- it can't handle renames.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Philip Martin <ph...@codematters.co.uk> writes:

> I'm not sure what "diff(1) in all cases" means.  All 'svn diff' output
> is produced by running diff(1).

Apparently it isn't.  Look at all the examples i've provided.
Diffing a and b with diff(1) produces a meaningful diff.
Diffing the two gnuserv.c versions with diff(1), or diffing with
svn when they share ancestry, produces the meaningful diff;
diffing with svn when they *don't* share ancestry produces a useless
diff.  If, as you claim, it simply passed off the files to
diff(1), that would not be the case.

>   svn diff http://host/repo/dir1/foo.c http://host/repo/dir2/foo.c 
>   svn diff http://host/repo/dir1/foo.c http://host/repo/dir3/dir4/foo.c 
> 
> to ignore revision history and generate direct file-to-file diff. Now
> suppose I run
> 
>   svn diff http://host/repo/dir1 http://host/repo/dir2 
> 
> That will produce a different diff for foo.c than that produced by the
> command run on foo.c directly.  I think it is "obviously" wrong for
> the form of the diff to depend on the URL in this way.  That's also
> why I think the merge behaviour is wrong.
> 
> Or does your request for "diff(1) in all cases" mean that you want the
> directory diff command to ignore revision history as well?  I don't
> know how that would be implemented.

Then you aren't listening.  If svn would simply hand off the
files to diff(1) *regardless* of whether they're files, or
directories, or widgets, you'd get meaningful results where
possible or equally useless results where not.

To prove this to yourself, just do a svn export of any two URLs
you are curious about and diff -r the two resulting directories.

>   svn diff http://host/repo/dir1 http://host/repo/dir3/dir4 
>   svn diff http://host/repo/dir1 http://host/repo/dir3
> 
> Would they show different diffs for foo.c?

It shouldn't.

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Greg Hudson <gh...@MIT.EDU>.
My current thinking is that dir_delta() is not appropriate for "svn diff",
because diff output is about filenames and file contents, not ancestry.
If we ever invent a tree-delta format which contains information about
ancestry (one which can say "copy this old file to this new file and then
apply this diff"), then dir_delta() would become appropriate.

A corollary is that "svn diff" and "svn merge" should not follow the same
algorithm.  merge applies a tree delta to the working copy, not a diff.
(So "svn merge" and "svn hypothetical-tree-delta-diff" would follow the
same algorithm.)

On 8 Dec 2002, Philip Martin wrote:
> /repo/dir1/foo.c
>      /dir2/foo.c
>      /dir3/dir4/foo.c

>   svn diff http://host/repo/dir1 http://host/repo/dir2
>
> That will produce a different diff for foo.c than that produced by the
> command run on foo.c directly.

I would say no.  Because diff output is about names and contents, the two
foo.c's should be compared with each other because they have the same
name, even if they have no common ancestry as far as svn knows.

>   svn diff http://host/repo/dir1 http://host/repo/dir3/dir4
>   svn diff http://host/repo/dir1 http://host/repo/dir3
>
> Would they show different diffs for foo.c?

Yes.  In the second case, foo.c and dir4/foo.c do not have the same name
relative to the base, so you'll see a deletion and addition.

(When you're diffing between two files, the "name relative to the base" is
always the empty string for both files, so is always the same.  So it's
always a diff between the contents.)
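The "name relative to the base" rule can be sketched in a few lines of Python.  This is only an illustration of the pairing rule described above, using Philip's example layout; `relative_names` and `classify` are hypothetical names, and nothing here reflects how dir_delta() or 'svn diff' is actually implemented.

```python
def relative_names(paths, base):
    """Map each repository path at or under base to its name relative to base."""
    base = base.rstrip("/")
    prefix = base + "/"
    names = {}
    for path in paths:
        if path == base:
            # Diffing a single file: its relative name is the empty string.
            names[""] = path
        elif path.startswith(prefix):
            names[path[len(prefix):]] = path
    return names


def classify(paths, old_base, new_base):
    """Pair files by name relative to each base, ignoring ancestry entirely."""
    old = relative_names(paths, old_base)
    new = relative_names(paths, new_base)
    result = {}
    for name in sorted(set(old) | set(new)):
        if name in old and name in new:
            result[name] = "compare contents"
        elif name in old:
            result[name] = "deleted"
        else:
            result[name] = "added"
    return result


# Philip's example repository layout.
REPO = ["/repo/dir1/foo.c", "/repo/dir2/foo.c", "/repo/dir3/dir4/foo.c"]

for old_base, new_base in [
    ("/repo/dir1", "/repo/dir3/dir4"),
    ("/repo/dir1", "/repo/dir3"),
    ("/repo/dir1/foo.c", "/repo/dir3/dir4/foo.c"),
]:
    print(old_base, new_base, classify(REPO, old_base, new_base))
```

Under this rule the first and third pairs compare the contents of the two foo.c files, while the second reports foo.c as deleted and dir4/foo.c as added.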

> I still believe that the current 'svn diff' is correct, and I think
> that 'svn merge' is wrong to allow the merge of the diff between
> arbitrary files without giving at least a warning.

I think users will expect svn diff to obey axioms like "a diff between two
files is always about the relative contents of the files" and "a diff
between two directories whose contents are identical is always empty."

> I think that, in general, if you attempt to merge unrelated files you
> are doing something wrong.

It suggests you're not using the version control system to its full
potential, but it doesn't mean the version control system should fight
you.



Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
Eric Gillespie <ep...@pretzelnet.org> writes:

> > It appears that you, and Greg Hudson, believe that diff should
> > use a different algorithm for comparing files and
> > directories.
> 
> Absolutely not.  Read my earlier messages.  svn diff should
> simply hand off to diff(1) in *all* cases, or at least that
> should be the effect, just like cvs.

I'm not sure what "diff(1) in all cases" means.  All 'svn diff' output
is produced by running diff(1).  When you run diff, the editor receives
a delta for the files that change; it then generates two full texts
and runs diff(1) on those.

Suppose my repository looks like

   /repo/dir1/foo.c
        /dir2/foo.c
        /dir3/dir4/foo.c

where none of dir1/foo.c, dir2/foo.c and dir4/foo.c have a common
ancestor.

Now you want

  svn diff http://host/repo/dir1/foo.c http://host/repo/dir2/foo.c 
  svn diff http://host/repo/dir1/foo.c http://host/repo/dir3/dir4/foo.c 

to ignore revision history and generate direct file-to-file diff. Now
suppose I run

  svn diff http://host/repo/dir1 http://host/repo/dir2 

That will produce a different diff for foo.c than that produced by the
command run on foo.c directly.  I think it is "obviously" wrong for
the form of the diff to depend on the URL in this way.  That's also
why I think the merge behaviour is wrong.

Or does your request for "diff(1) in all cases" mean that you want the
directory diff command to ignore revision history as well?  I don't
know how that would be implemented.  What about the commands

  svn diff http://host/repo/dir1 http://host/repo/dir3/dir4 
  svn diff http://host/repo/dir1 http://host/repo/dir3

Would they show different diffs for foo.c?

I still believe that the current 'svn diff' is correct, and I think
that 'svn merge' is wrong to allow the merge of the diff between
arbitrary files without giving at least a warning.  I think that, in
general, if you attempt to merge unrelated files you are doing
something wrong.  There are perhaps some cases where it is useful; in
such cases having a --unrelated switch would be easier than running
the procedure manually, but I don't think it should be the default
behaviour.

-- 
Philip Martin


Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Philip Martin <ph...@codematters.co.uk> writes:

> Currently 'svn diff' uses the same reporter/editor mechanism
> irrespective of whether the URLs are directories or files.

As i would expect.

> It appears that you, and Greg Hudson, believe that diff should
> use a different algorithm for comparing files and
> directories.

Absolutely not.  Read my earlier messages.  svn diff should
simply hand off to diff(1) in *all* cases, or at least that
should be the effect, just like cvs.

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
Eric Gillespie <ep...@pretzelnet.org> writes:

> > simple.  Making it work the way you would prefer would involve more
> > work, more special cases and more code.  It would in fact be making
> > svn diff "smart", the very crime of which you have accused it.
> 
> What?  Not at all.  The current problem is svn is doing
> something itself (i don't know what exactly).  If it just handed
> the two arguments to diff(1), the diff would come out meaningful.

Currently 'svn diff' uses the same reporter/editor mechanism
irrespective of whether the URLs are directories or files.  It appears
that you, and Greg Hudson, believe that diff should use a different
algorithm for comparing files and directories. That sounds like
increasing diff's "smartness", not reducing it.

Hmm, it appears that 'svn merge' already has special handling for
operating on single files; when that got written, it deviated from
'svn diff'.  I'd certainly accept an argument that diff and merge
should be the same, I'd probably use it myself :)

-- 
Philip Martin


Re: svn diff, svn merge, and vendor branches (long)

Posted by Eric Gillespie <ep...@pretzelnet.org>.
Philip Martin <ph...@codematters.co.uk> writes:

> > --- a	(revision 1532)
> > +++ b	(revision 1532)
> > @@ -0,0 +1,2 @@
> > -hi
> > +hi
> > +bye
> 
> Really?  I was not aware that svn diff ever did this for unrelated
> files.

I was confused.  The output of svn diff has never changed.
Using two different filenames a and b confused me.  Here is what
i actually saw in the past and still see today:

% svn diff http://pretzelnet.org/svn/tmp/{1,2}/foo
Index: foo
===================================================================
--- foo	(revision 1533)
+++ foo	(revision 1533)
@@ -1 +0,0 @@
-hi
Index: foo
===================================================================
--- foo	(revision 1533)
+++ foo	(revision 1533)
@@ -0,0 +1,2 @@
+hi
+bye

Same as today.  And if i apply that patch to a foo file
containing only the single 'hi' line, it does indeed transform
it into the second form (two lines, 'hi' and 'bye').  But my
point still stands.  Try this:

svn diff http://pretzelnet.org/svn/tmp/{1,2}/gnuserv.c
svn diff http://pretzelnet.org/imports/gnuserv/base/gnuserv.c \
         http://pretzelnet.org/misc/gnuserv/gnuserv.c

The resulting diff from each command does the same thing when
fed to patch (i.e. transform gnuserv.c as it exists in the first
form to how it exists in the second form) whether the files
share ancestry or not.  The issue is that one of the diffs is
human-readable while the other is not.
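The contrast can be reproduced without a repository.  The sketch below uses Python's difflib as a stand-in for diff(1) (an assumption made purely for illustration; it is not what svn runs) on the one-line and two-line foo files from the transcript above.  The helper names are hypothetical.

```python
import difflib

OLD = ["hi\n"]
NEW = ["hi\n", "bye\n"]


def content_diff(old, new, name):
    """Line-by-line comparison of the two texts, as for related files."""
    return list(difflib.unified_diff(old, new, fromfile=name, tofile=name))


def delete_then_add(old, new, name):
    """Treat the files as unrelated: remove every old line, then add every new one."""
    removal = list(difflib.unified_diff(old, [], fromfile=name, tofile=name))
    addition = list(difflib.unified_diff([], new, fromfile=name, tofile=name))
    return removal + addition


def changed_lines(diff):
    """Count '+' and '-' body lines, skipping the '---' and '+++' headers.

    Good enough for this example; a body line whose text itself begins
    with '--' or '++' would be miscounted.
    """
    return sum(
        1 for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("---", "+++"))
    )


print("".join(content_diff(OLD, NEW, "foo")))
print("".join(delete_then_add(OLD, NEW, "foo")))
```

Both outputs describe the same transformation, but the first touches one line while the second touches every line of both files, which is what makes it hard to read for anything larger than this toy case.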

> Not good.  If you want to propose changes in behaviour it is better to
> understand the current behaviour.

While i did have a slight misunderstanding, i hope what i say
above illustrates that my point still stands.

> It does work, it just doesn't work the way you assumed it would.

No, i never had any doubt that the resulting diff works when fed
to patch.  The issue is that one diff is a useless pile of
removing all lines of the first form followed by adding all lines
of the second, but the other diff is a meaningful diff.

> simple.  Making it work the way you would prefer would involve more
> work, more special cases and more code.  It would in fact be making
> svn diff "smart", the very crime of which you have accused it.

What?  Not at all.  The current problem is svn is doing
something itself (i don't know what exactly).  If it just handed
the two arguments to diff(1), the diff would come out meaningful.

When i diff two branches in cvs, i get the same diff i'd get if
i checked out both branches and ran diff -r on the two
directories.  My argument is that svn perhaps ought to do the
same thing.

> If you are comparing files that do not have a common ancestor then you
> are doing something wrong as far as a version control system is
> concerned.  While it is legitimate to compare arbitrary files in a
> general, non-version controlled fashion, I'm not sure that we need
> Subversion to do it.

You may have a point here.  It is possible that this issue is
blown up in importance in my mind only because import's
uselessness caused me great pain trying to figure out how to
implement vendor branches.  However, *my* gut reaction is that
svn diff is still trying to be too smart.

--  
Eric Gillespie <*> epg@pretzelnet.org

Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett


Re: svn diff, svn merge, and vendor branches (long)

Posted by Philip Martin <ph...@codematters.co.uk>.
Eric Gillespie <ep...@pretzelnet.org> writes:

> % echo hi > a
> % echo hi > b
> % echo bye >> b
> % svn add a b
> % svn ci a b
> 
> Do not svn cp the files.  I created just such two files a couple
> months ago, last time this came up on IRC: http://pretzelnet.org/svn/{a,b}.
> 
> Try it:
> 
> % svn diff http://pretzelnet.org/svn/{a,b}
> Index: a
> ===================================================================
> --- a	(revision 1532)
> +++ a	(revision 1532)
> @@ -1 +0,0 @@
> -hi
> Index: b
> ===================================================================
> --- b	(revision 1532)
> +++ b	(revision 1532)
> @@ -0,0 +1,2 @@
> +hi
> +bye
> 
> Yikes.  That output is wrong for a number of reasons.

It's working as intended.  The svn diff help states

  2. If the alternate syntax is used, the server compares URL1 and URL2
     at revisions N and M respectively.  If either N or M are ommitted,
     a value of HEAD is assumed.

How do you convert URL1 (http://pretzelnet.org/svn/a) into URL2
(http://pretzelnet.org/svn/b)?  Well first you delete a

Index: a
===================================================================
--- a	(revision 1532)
+++ a	(revision 1532)
@@ -1 +0,0 @@
-hi

and then you add b

Index: b
===================================================================
--- b	(revision 1532)
+++ b	(revision 1532)
@@ -0,0 +1,2 @@
+hi
+bye
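As a toy model of that reasoning (a sketch only: the `Node` class and its `copied_from` link are hypothetical and are not Subversion's actual data structures), ancestry can be pictured as a chain of copies, with the shape of the diff chosen by whether the two chains meet:

```python
class Node:
    """A versioned file; copied_from links it to the node it was copied from."""

    def __init__(self, path, copied_from=None):
        self.path = path
        self.copied_from = copied_from

    def root(self):
        """Follow the copy chain back to the originally added node."""
        node = self
        while node.copied_from is not None:
            node = node.copied_from
        return node


def related(a, b):
    """Two nodes share ancestry if their copy chains end at the same node."""
    return a.root() is b.root()


def plan_diff(a, b):
    """Related files get a content diff; unrelated ones a delete plus an add."""
    if related(a, b):
        return [("modify", a.path, b.path)]
    return [("delete", a.path), ("add", b.path)]


# Added independently with 'svn add': no shared ancestry.
a = Node("a")
b = Node("b")
print(plan_diff(a, b))

# Created with a copy: ancestry is shared.
vendor = Node("imports/gnuserv/gnuserv.c")
local = Node("misc/gnuserv/gnuserv.c", copied_from=vendor)
print(plan_diff(vendor, local))
```

In this model the a/b pair yields a delete followed by an add, matching the two-part output above, while the copied gnuserv.c pair yields a single modification.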

> But the
> first thing Sussman said when i first showed this output to him
> back in October was that i was using diff incorrectly and needed
> to use the URL@REV form.  Nope; just add @1532 to the end of
> each of those URL and observe the same output.

Read the help text - adding @REV makes no difference.

>  Furthermore, try
> 
> svn diff \
> http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4/gnuserv.c \
> http://pretzelnet.org/svn/misc/gnuserv/gnuserv.c

I assume the files have a common ancestor.

> Notice that even w/out the URL@REV format you get a meaningful diff.

Read the help text - adding @REV makes no difference.

> Now let's look at what happened here.  In the tmp case, the two
> files do not share ancestry (where ancestry is defined in svn
> terms), while in the gnuserv case they *do*.
> 
> OK, that established let's look at the output from diffing a and
> b.  I can't even begin to analyze just WTF we're looking at
> here.

See above, it's quite simple.

>  When i showed this to Sussman in October (after
> commenting about URL@REV), he said that did look funny.  Quite.
> When i first began thinking about this problem (probably back in
> April), this was not the output i got from diffing two similar
> files that did not share ancestry.  Instead what i got was this
> (simulated):
> 
> --- a	(revision 1532)
> +++ b	(revision 1532)
> @@ -0,0 +1,2 @@
> -hi
> +hi
> +bye

Really?  I was not aware that svn diff ever did this for unrelated
files.

> Now that makes more sense.  The earlier output, i don't even
> understand.

See above, it's quite simple.

> This i do, though i do not agree with svn behaving
> that way.  I'm going to assume (based on a reasonable assumption
> and on Sussman's comment that the earlier output didn't look
> right) that the earlier output is just a bug, and what i just
> reconstructed above is the intended behavior.

No, the output you see now is the intended output.

>  So that's what
> i'll be talking about now.

Not good.  If you want to propose changes in behaviour it is better to
understand the current behaviour.

> Way back when i first noticed this problem, i was first
> experimenting with vendor branches with gnuserv.  I foolishly
> assumed that svn import was intended to be an analogue to cvs
> import and imported gnuserv-3.12.3 and gnuserv-3.12.4 as two
> separate import commands; i.e. they did not share ancestry.  So a
> diff on gnuserv.c got me a huge pile of - lines (one for every
> line in the first gnuserv.c) followed by a huge pile of + lines
> (one for every line in the second gnuserv.c).
> 
> Now, from svn's point of view, this output makes sense.  a and b
> are not related, so the proper diff is to remove all a's lines
> and then add all b's lines.
>
> But this is svn trying to be too smart.  Tools that try to be
> too smart inevitably screw it up, because the user knows so much
> more than the tool.  Why *doesn't* svn diff work on unrelated
> files?

It does work, it just doesn't work the way you assumed it would.
ClearCase surprised me when I started working with it.  Subversion's
tagging/copying confused me when I first came across it.  Nobody said
it was easy.

Although you say svn diff is currently "too smart", it is really very
simple.  Making it work the way you would prefer would involve more
work, more special cases and more code.  It would in fact be making
svn diff "smart", the very crime of which you have accused it.

>  Let's say a and b really weren't related: a is a copy of
> fstab and b a copy of printcap.  Both the current too-smart
> behavior and my suggested just-give-it-a-try behavior result in
> a useless diff.  But, if a and b *are* related, just not in
> svn's opinion, i get a much more useful output.

That depends on what you want from the output.  If you want to know
whether the files are related your "more useful" output is useless.

Please don't misunderstand me, I am not claiming that the current diff
is perfect or set in stone.  I won't even claim that it works fully.
The implementation, and the interface, can both be changed.  If you
think you have a better system you are free to propose it, discuss it,
implement it, or ask for someone else to implement it.  As yet you
have not convinced me (and that is just me personally) that what you
want is either simpler to understand, or would be a better
implementation.

Perhaps it is because I have a different (not better, not worse, just
different) experience of using version control, but whether files are
related in the version history is of supreme importance in my view.
If you are comparing files that do not have a common ancestor then you
are doing something wrong as far as a version control system is
concerned.  While it is legitimate to compare arbitrary files in a
general, non-version controlled fashion, I'm not sure that we need
Subversion to do it.

-- 
Philip Martin
