You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Malcolm Rowe <ma...@farside.org.uk> on 2006/02/12 22:13:35 UTC

Diffing schedule-deleted files - what's going on?

Could anyone explain the rationale behind our treatment of
schedule-deleted files encountered during 'svn diff'?  (I realise that
using 'rationale' and 'diff' in the same sentence may provoke some humour,
but presumably there is some reason for the current arrangement).

Specifically, we seem to treat such files as if they are empty, rather
than as if they are absent.  This is rather strange, since diffs once
we commit will (of course) treat the file as absent.

A reproduction recipe is below, which should hopefully make things a
little clearer.

I'd like to fix repos-BASE diffs so that they don't have this odd
behaviour (and fix another bug in exactly the same area), but before I do
that, I'd like to understand what the correct behaviour for diffs against
WORKING is supposed to be, since that might affect the way I do this.

[DannyB, I've cc:d you because you committed the original regression test,
and so presumably you've got some idea about this.]

Regards,
Malcolm

Re: Diffing schedule-deleted files - what's going on?

Posted by Julian Foad <ju...@btopenworld.com>.
Malcolm Rowe wrote:
> Could anyone explain the rationale behind our treatment of
> schedule-deleted files encountered during 'svn diff'?  (I realise that
> using 'rationale' and 'diff' in the same sentence may provoke some humour,
> but presumably there is some reason for the current arrangement).

Malcolm, thanks for persisting with this awkward "diff" stuff.

You might find this idea a bit daunting, but I have been thinking for a long 
time that a lot of the trouble and bugs in our implementation arise because the 
logic for determining and traversing the two trees to compare is mixed up with 
the logic for comparing them.  It's partly separated, but not enough.

I feel it might be worth tackling the problem by designing and writing a 
differencing engine (huh? I'm sure my history book said that had already been 
invented :-)) that is driven in the same way for a repository diff, a local 
diff of a repository-against-local diff.

Much of it is probably there already.  Things to do include:

* See how the repos diff, the local diff and the mixed diff are driven, and 
decide which method is the master method and convert the others to use that method.

* Ensure that the "from" and "to" trees are interchangeable, in that there are 
no restrictions on which sorts of trees can be the "from" and which can be the 
"to".

No small task, I know, but a load of goodness will fall out of it.


BTW, for your own testing you might like to try the attached patch which simply 
adds a "WC" revision keyword so that you can try things like comparing "svn 
diff 3:WC" with "svn diff WC:3" and see if there are any asymmetries that 
indicate bugs.  Some combinations, like "-rWC:BASE", aren't supported.

- Julian

Re: Diffing schedule-deleted files - what's going on?

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Mon, Feb 13, 2006 at 03:27:00PM +0000, Julian Foad wrote:
> I don't get the result you describe from this part of your script.  Here's 
> what I get:
> 
> >+ svn diff -r0 wc
> >Index: wc/foo
> >===================================================================
> 
> Do you think that shows "the addition of an empty file"?

Yes, I do.  Consider what the patch format would look like for a new file
that has no changed hunks compared with the empty file.  (Also q.v. 'touch
foo; svn add foo; svn diff').

Also see the comment in libsvn_client/diff.c:diff_file_added() that
suggests that 'patch' should treat it as an added empty file, though I
couldn't seem to get that to work when I tested it.

Howver, it doesn't really matter whether we agree on what it represents,
because:

> I don't think this is a good output; I think it 
> is silly to output a header without some further indication of what 
> happened.
> 

I agree; see r18630 for the result.

> >+ svn diff -r0:BASE wc
> That one is not empty, but is correct, I think.

Yes - it was broken in 1.3.0, which is what I was basing my comments on.
I'd forgotten I'd already fixed it in trunk (silly me!).

> Later:
> >>If you apply the diff that results using patch, it won't DTRT, for
> >>example.
> 
> (Results from what, I'm not sure, but...)
> 
> >If that's the main criteria, just let me know.  I'm just trying to
> >work out if schedule-delete files are intended to be handled in any
> >non-obvious manner.
> 
> Nobody knows what the main criterion is.  We need to balance "patch", 
> readability, backward-compatibility, and, I hope, potential for "svn-patch" 
> programs that understand more from the output than "patch" does.
> 

Sure, though patch(1)-ability is pretty high on the list.  Also note
that almost all of the changes I've been making are at the semantic,
not presentation layer (i.e. in libsvn_wc, not libsvn_client), so they've
been wrong whatever the output format.

Regards,
Malcolm

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Diffing schedule-deleted files - what's going on?

Posted by Julian Foad <ju...@btopenworld.com>.
Malcolm Rowe wrote:
> 
> # Mark the file as schedule-deleted
> svn rm wc/foo
> # Now both of these show the addition of an empty file
> # The repos->BASE one is definitely wrong, since changes in the
> # WORKING tree should not affect diffs against BASE,  but shouldn't the
> # repos->WORKING diff show nothing?  (The repository and WORKING trees
> # are identical, after all.)
> svn diff -r0 wc
> svn diff -r0:BASE wc

I don't get the result you describe from this part of your script.  Here's what 
I get:

> + svn diff -r0 wc
> Index: wc/foo
> ===================================================================

Do you think that shows "the addition of an empty file"?  I don't; I interpret 
it as "The file was touched in some unspecified way but is identical to how it 
was."  I don't think this is a good output; I think it is silly to output a 
header without some further indication of what happened.

I think "patch" will ignore this, which is fine.

> + svn diff -r0:BASE wc
> Index: wc/foo
> ===================================================================
> --- wc/foo      (revision 0)
> +++ wc/foo      (revision 1)
> @@ -0,0 +1 @@
> +xxx

That one is not empty, but is correct, I think: isn't this the way we currently 
represent both "create this file from nonexistent to the specified content" and 
"change this file from empty to the specified content"?  I hope we could do 
better and distinguish the two cases, but I think that's a secondary concern.


Later:
>> If you apply the diff that results using patch, it won't DTRT, for
>> example.

(Results from what, I'm not sure, but...)

> If that's the main criteria, just let me know.  I'm just trying to
> work out if schedule-delete files are intended to be handled in any
> non-obvious manner.

Nobody knows what the main criterion is.  We need to balance "patch", 
readability, backward-compatibility, and, I hope, potential for "svn-patch" 
programs that understand more from the output than "patch" does.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Diffing schedule-deleted files - what's going on?

Posted by Daniel Berlin <db...@dberlin.org>.
On Mon, 2006-02-13 at 14:39 +0000, Malcolm Rowe wrote:
> On Sun, Feb 12, 2006 at 06:14:03PM -0500, Daniel Berlin wrote:
> > The only intention is that schedule delete files show as a diff from the
> > file to empty, which is what should be happening (because the entire
> > file is being deleted).
> > 
> 
> You mean that 'svn diff' by itself (a BASE:WORKING diff) should
> show the file being deleted?

Yes.

>   That's fine; what I'm asking for is
> what a repos:WORKING diff should look like for a file that is marked
> schedule-deleted in the working copy.

That should show the file being deleted (by all of it's lines being
deleted), IMHO.


> I'd argue that if the file exists in the repository, but is
> schedule-delete in the working copy, we treat it the same way as if
> it were absent - that is, show an addition or deletion of the file's
> repository version, depending on direction.

This would be fine, since i believe it meets the "show the file being
deleted".  Unless you meant something different by "show an addition or
deletion of the file's repository version".

> 
> > Why do you believe deleting the file should not show up at all in diff?
> > 
> 
> Because in the reproduction script I quoted, the file doesn't exist in
> either the repository version, and is schedule-delete in the WORKING
> version (so shouldn't be mentioned in the output at all).  The other
> thing I mentioned was repos:BASE diffs, which should ignore the state
> of schedule-delete files in WORKING.
> 
> But note that if I change that, I break diff_test 27, which relies on
> getting an entirely empty 'diff' (that is, header only) for files that are
> non-existant in the repository and schedule-delete in the working copy.

This is different than the case above (where you said "exists in the
repository above"), however, and showing no diff in the case that it is
non-existent now in the repo, and schedule delete in the WC, is fine,
for a repos:WORKING diff.
> 
> > If you apply the diff that results using patch, it won't DTRT, for
> > example.
> > 
> 
> If that's the main criteria, just let me know.  I'm just trying to
> work out if schedule-delete files are intended to be handled in any
> non-obvious manner.

Originally, doing base:WORKING in this case actually used to cause an
error, because we would pass a file to diff that didn't exist :(

Just make sure to try what you are doing with an external diff, since
that tends to be more fragile in this regard.





---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Diffing schedule-deleted files - what's going on?

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Sun, Feb 12, 2006 at 06:14:03PM -0500, Daniel Berlin wrote:
> The only intention is that schedule delete files show as a diff from the
> file to empty, which is what should be happening (because the entire
> file is being deleted).
> 

You mean that 'svn diff' by itself (a BASE:WORKING diff) should
show the file being deleted?  That's fine; what I'm asking for is
what a repos:WORKING diff should look like for a file that is marked
schedule-deleted in the working copy.

I'd argue that if the file exists in the repository, but is
schedule-delete in the working copy, we treat it the same way as if
it were absent - that is, show an addition or deletion of the file's
repository version, depending on direction.

> Why do you believe deleting the file should not show up at all in diff?
> 

Because in the reproduction script I quoted, the file doesn't exist in
either the repository version, and is schedule-delete in the WORKING
version (so shouldn't be mentioned in the output at all).  The other
thing I mentioned was repos:BASE diffs, which should ignore the state
of schedule-delete files in WORKING.

But note that if I change that, I break diff_test 27, which relies on
getting an entirely empty 'diff' (that is, header only) for files that are
non-existant in the repository and schedule-delete in the working copy.

> If you apply the diff that results using patch, it won't DTRT, for
> example.
> 

If that's the main criteria, just let me know.  I'm just trying to
work out if schedule-delete files are intended to be handled in any
non-obvious manner.

Regards,
Malcolm

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Diffing schedule-deleted files - what's going on?

Posted by Daniel Berlin <db...@dberlin.org>.
On Sun, 2006-02-12 at 22:13 +0000, Malcolm Rowe wrote:
> Could anyone explain the rationale behind our treatment of
> schedule-deleted files encountered during 'svn diff'?  (I realise that
> using 'rationale' and 'diff' in the same sentence may provoke some humour,
> but presumably there is some reason for the current arrangement).
> 
> Specifically, we seem to treat such files as if they are empty, rather
> than as if they are absent.  This is rather strange, since diffs once
> we commit will (of course) treat the file as absent.
> 
> A reproduction recipe is below, which should hopefully make things a
> little clearer.
> 
> I'd like to fix repos-BASE diffs so that they don't have this odd
> behaviour (and fix another bug in exactly the same area), but before I do
> that, I'd like to understand what the correct behaviour for diffs against
> WORKING is supposed to be, since that might affect the way I do this.
> 
> [DannyB, I've cc:d you because you committed the original regression test,
> and so presumably you've got some idea about this.]

The only intention is that schedule delete files show as a diff from the
file to empty, which is what should be happening (because the entire
file is being deleted).

This matches expected user behavior (I received a large number of
complaints when it used to treat them as absent not to mention it caused
*errors* from diff).

Why do you believe deleting the file should not show up at all in diff?

If you apply the diff that results using patch, it won't DTRT, for
example.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org