You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2008/10/07 21:43:11 UTC

Re: diff error? url@rev against local mods -- was: Re: diff_repos_wc question

On Sun, 2008-10-05 at 04:17 +0200, Neels J Hofmeyr wrote: 
> > Karl Fogel wrote:
> >> "Neels J. Hofmeyr" <ne...@elego.de> writes:
> >>> Let me rephrase that:
> >>>
> >>> Is it currently possible to do a diff between
> >>>   URL@REV
> >>> and
> >>>   an unrelated[1] working copy with local mods
> >>> , and if yes, how?
> >>> [1] unrelated, as in not of the same URL.
> >>    $ cd svn_trunk_working_copy
> >>    $ echo "This is a local mod." >> README
> >>    $ svn diff --old=http://svn.collab.net/repos/svn/trunk/INSTALL@33019 --new=README
> >>
> >> -Karl
> > 
> > Ah, thanks for that. I was quite unlucky (cough) to have missed that
> > combination.
> > 
> > Alas, it looks like julianf and me found a case where `svn diff' reports
> > wrongly. (It is a crucial diff case for implementing directory comparison
> > for tree-conflicts.)
> > 
> > I've attached test scripts and an output screenshot when using current trunk.
> > 
> > "Create two similar trees, but leave some of the changes uncommitted in
> > the second tree. Then diff the url of the first to the working copy of
> > the second tree."
[...] 
> > When reading the "Results" in the output, note that
[...] 
> > 
> > 2) the other local change (the file addition of `baz') is sort of not seen
> > by diff and reported as if baz was missing in `--new' (wrong).

Sadly, there are several cases of "svn diff" that are just broken. This
appears to be one of them.

> > 3) Also, the diff is sort of ambiguous about reporting the file name:
> > Which is it, quux/baz or foo/baz?
> > "--- quux/baz  (.../file:///arch/elego/svn/test/repos/foo/baz) (revision 2)"

Yes, it's broken there too, though of course these header lines are
unimportant for our purposes.

> > Is this really an error or am I still missing something?

It's really an error.

Can you maybe add a test for it, and/or file an issue, and/or fix it, or
just ignore this case, while still writing the directory compare thing?

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: my cmdline tests are extremely slow

Posted by Neels J Hofmeyr <ne...@elego.de>.
Thanks Mike, that seems to be the reason. I'm using a system-provided APR
AFAIR and it probably uses the blocking /dev/random for security reasons.
Using this

  rngd -r /dev/urandom -o /dev/random -f -t 1

seems to alleviate the problem without recompiling APR.
Found it on this blog:

http://www.chrissearle.org/blog/technical/increase_entropy_26_kernel_linux_box

~Neels

C. Michael Pilato wrote:
> Neels J. Hofmeyr wrote:
>> Stefan Sperling wrote:
>>> On Thu, Oct 09, 2008 at 07:28:43AM +0200, Neels J Hofmeyr wrote:
>>>> Also, I haven't actually checked *all* of the commandline tests. It takes
>>>> ages to complete, with apparent longer breaks of doing nothing at all, when
>>>> watching the CPU and disk activity.
>>>>
>>>> And then I have to run it twice, once for trunk and once for my patch, to be
>>>> able to compare test outcomes. How do you handle this kind of situation?
>>> Use a RAM disk to host temporary test data.
>>> See http://subversion.tigris.org/faq.html#ramdisk-tests
>> So I did that. But still, running merge_tests.py --cleanup in the ramdisk
>> has random pauses of all activity for sometimes as much as 15 seconds.
>> That's ridiculous. Why can't it just race through the tests, idling at most
>> a second at a time? Why is my CPU staying cold?
>>
>> I've also seen that if I run a single test, it goes swiftly. Then, if I run
>> a whole list of them, there appear to be completely random pauses between
>> the tests sometimes, and sometimes not. It's weirding me out...
> 
> I'm guessing it's a lack of entropy?  See if during your pauses things go
> faster when you jiggle your mouse.  (And then try compiling APR with
> --with-devrandom=/dev/urandom so you don't have to do said jiggling all the
> time.)
> 
> 


Re: my cmdline tests are extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Neels J. Hofmeyr wrote on Sat, 1 Nov 2008 at 06:23 +0100:
> > The test passes for me when I change the delay to two seconds -- any 
> > reason not to commit that?
> 
> I guess it's worth committing and seeing if anyone complains about it. If it
> saves everyone of us on average half a minute of waiting on every make
> check, that is nice.
> 

+1 (I thought the same).  r33992.

> > (Theoretically we could do even better, since the '-r {date}' syntax has
> > microsecond resolution.)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: my cmdline tests are extremely slow

Posted by "Neels J. Hofmeyr" <ne...@elego.de>.

Daniel Shahaf wrote:
> Julian Foad wrote on Thu, 30 Oct 2008 at 17:53 -0000:
>> By the way, there is one test in the test suite that purposefully waits
>> until the next minute boundary. I think it's checkout_tests 12.
>>
>> I'd welcome any improvement to that test. I don't think it needs to do
>> the long delay.
>>
> 
> The test passes for me when I change the delay to two seconds -- any 
> reason not to commit that?

I guess it's worth committing and seeing if anyone complains about it. If it
saves everyone of us on average half a minute of waiting on every make
check, that is nice.

~Neels

> 
> (Theoretically we could do even better, since the '-r {date}' syntax has
> microsecond resolution.)
> 
> Daniel
> 
>> - Julian
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
> 


Re: my cmdline tests are extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Julian Foad wrote on Thu, 30 Oct 2008 at 17:53 -0000:
> By the way, there is one test in the test suite that purposefully waits
> until the next minute boundary. I think it's checkout_tests 12.
> 
> I'd welcome any improvement to that test. I don't think it needs to do
> the long delay.
> 

The test passes for me when I change the delay to two seconds -- any 
reason not to commit that?

(Theoretically we could do even better, since the '-r {date}' syntax has
microsecond resolution.)

Daniel

> - Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: my cmdline tests are extremely slow

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-10-30 at 10:49 -0400, C. Michael Pilato wrote:
> Neels J. Hofmeyr wrote:
> > 
> > Stefan Sperling wrote:
> >> On Thu, Oct 09, 2008 at 07:28:43AM +0200, Neels J Hofmeyr wrote:
> >>> Also, I haven't actually checked *all* of the commandline tests. It takes
> >>> ages to complete, with apparent longer breaks of doing nothing at all, when
> >>> watching the CPU and disk activity.
> >>>
> >>> And then I have to run it twice, once for trunk and once for my patch, to be
> >>> able to compare test outcomes. How do you handle this kind of situation?
> >> Use a RAM disk to host temporary test data.
> >> See http://subversion.tigris.org/faq.html#ramdisk-tests
> > 
> > So I did that. But still, running merge_tests.py --cleanup in the ramdisk
> > has random pauses of all activity for sometimes as much as 15 seconds.
> > That's ridiculous. Why can't it just race through the tests, idling at most
> > a second at a time? Why is my CPU staying cold?
> >
> > I've also seen that if I run a single test, it goes swiftly. Then, if I run
> > a whole list of them, there appear to be completely random pauses between
> > the tests sometimes, and sometimes not. It's weirding me out...
> 
> I'm guessing it's a lack of entropy?  See if during your pauses things go
> faster when you jiggle your mouse.  (And then try compiling APR with
> --with-devrandom=/dev/urandom so you don't have to do said jiggling all the
> time.)

By the way, there is one test in the test suite that purposefully waits
until the next minute boundary. I think it's checkout_tests 12.

I'd welcome any improvement to that test. I don't think it needs to do
the long delay.

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: my cmdline tests are extremely slow

Posted by "C. Michael Pilato" <cm...@collab.net>.
Neels J. Hofmeyr wrote:
> 
> Stefan Sperling wrote:
>> On Thu, Oct 09, 2008 at 07:28:43AM +0200, Neels J Hofmeyr wrote:
>>> Also, I haven't actually checked *all* of the commandline tests. It takes
>>> ages to complete, with apparent longer breaks of doing nothing at all, when
>>> watching the CPU and disk activity.
>>>
>>> And then I have to run it twice, once for trunk and once for my patch, to be
>>> able to compare test outcomes. How do you handle this kind of situation?
>> Use a RAM disk to host temporary test data.
>> See http://subversion.tigris.org/faq.html#ramdisk-tests
> 
> So I did that. But still, running merge_tests.py --cleanup in the ramdisk
> has random pauses of all activity for sometimes as much as 15 seconds.
> That's ridiculous. Why can't it just race through the tests, idling at most
> a second at a time? Why is my CPU staying cold?
>
> I've also seen that if I run a single test, it goes swiftly. Then, if I run
> a whole list of them, there appear to be completely random pauses between
> the tests sometimes, and sometimes not. It's weirding me out...

I'm guessing it's a lack of entropy?  See if during your pauses things go
faster when you jiggle your mouse.  (And then try compiling APR with
--with-devrandom=/dev/urandom so you don't have to do said jiggling all the
time.)


-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


my cmdline tests are extremely slow

Posted by "Neels J. Hofmeyr" <ne...@elego.de>.

Stefan Sperling wrote:
> On Thu, Oct 09, 2008 at 07:28:43AM +0200, Neels J Hofmeyr wrote:
>> Also, I haven't actually checked *all* of the commandline tests. It takes
>> ages to complete, with apparent longer breaks of doing nothing at all, when
>> watching the CPU and disk activity.
>>
>> And then I have to run it twice, once for trunk and once for my patch, to be
>> able to compare test outcomes. How do you handle this kind of situation?
> 
> Use a RAM disk to host temporary test data.
> See http://subversion.tigris.org/faq.html#ramdisk-tests

So I did that. But still, running merge_tests.py --cleanup in the ramdisk
has random pauses of all activity for sometimes as much as 15 seconds.
That's ridiculous. Why can't it just race through the tests, idling at most
a second at a time? Why is my CPU staying cold?

I've also seen that if I run a single test, it goes swiftly. Then, if I run
a whole list of them, there appear to be completely random pauses between
the tests sometimes, and sometimes not. It's weirding me out...

Thanks for any help.
~Neels


Re: dirs_same_p and summarizing diff

Posted by Stefan Sperling <st...@elego.de>.
On Thu, Oct 09, 2008 at 07:28:43AM +0200, Neels J Hofmeyr wrote:
> Also, I haven't actually checked *all* of the commandline tests. It takes
> ages to complete, with apparent longer breaks of doing nothing at all, when
> watching the CPU and disk activity.
> 
> And then I have to run it twice, once for trunk and once for my patch, to be
> able to compare test outcomes. How do you handle this kind of situation?

Use a RAM disk to host temporary test data.
See http://subversion.tigris.org/faq.html#ramdisk-tests
 
> ...and, I think I'd like to open a micro-branch for this thing. I can just
> decide about that myself now, right?

Yes.
http://subversion.tigris.org/hacking.html#lightweight-branches says:
"There's no need to ask -- just do it."

:)
Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: dirs_same_p and summarizing diff

Posted by "Neels J. Hofmeyr" <ne...@elego.de>.

Julian Foad wrote:
> Mark Phippard wrote:
>> On Thu, Oct 9, 2008 at 1:28 AM, Neels J Hofmeyr <ne...@elego.de> wrote:
[...]
>>> ...and, I think I'd like to open a micro-branch for this thing. I can just
>>> decide about that myself now, right?
>> I have a "meta-question".  At some point are you going to make the
>> actual svn diff --summarize command support a WC to REPOS compare?  As
>> I said a few weeks ago, this is something that GUI clients would
>> really benefit from having work.  If that is the plan, does it make
>> sense to get that code into trunk and working first?
> 
> Yes, that is what we're doing here. We'll see the user-level "diff
> --summarize" support extended before or no later than tree conflicts
> make use of it.
> 
> - Julian

Created branch "diff-repos-wc" in r33591 and r33592.

-- 
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696  mobile: +49 177 2345869  fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194


Re: dirs_same_p and summarizing diff

Posted by Julian Foad <ju...@btopenworld.com>.
Mark Phippard wrote:
> On Thu, Oct 9, 2008 at 1:28 AM, Neels J Hofmeyr <ne...@elego.de> wrote:
> > What about another case I encountered today in ./merge_tests.py 88 89:
> >  foreign-repos <> wc
> >
> > A diff doesn't work, but a merge does (test script attached):
> >
[...]
> >
> > So, if we were to add a summarizing-diff before each merge, it would break
> > all foreign-repos merges...
> 
> Couldn't we just skip some of this stuff for foreign repository
> merges?  We already do some of that with merge tracking.  I think it
> is reasonable for some features to be removed when doing this.

To the principle that we shouldn't spend too much effort on making
foreign-repository merges work as well as same-repository merges, I
agree. Yes, we can explicitly disable this check if necessary.

I hope fixing it properly will actually turn out to be the simpler
solution (as many of such problems come from having overly special-cased
things in the first place), but I haven't looked into the details so
this may be naïve.

> > ...and, I think I'd like to open a micro-branch for this thing. I can just
> > decide about that myself now, right?
> 
> I have a "meta-question".  At some point are you going to make the
> actual svn diff --summarize command support a WC to REPOS compare?  As
> I said a few weeks ago, this is something that GUI clients would
> really benefit from having work.  If that is the plan, does it make
> sense to get that code into trunk and working first?

Yes, that is what we're doing here. We'll see the user-level "diff
--summarize" support extended before or no later than tree conflicts
make use of it.

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: dirs_same_p and summarizing diff

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Oct 9, 2008 at 1:28 AM, Neels J Hofmeyr <ne...@elego.de> wrote:
> What about another case I encountered today in ./merge_tests.py 88 89:
>  foreign-repos <> wc
>
> A diff doesn't work, but a merge does (test script attached):
>
> A         wc1/file
> Adding         wc1/file
> Transmitting file data .
> Committed revision 1.
> A         wc2/file
> + svn diff --old=file:////arch/elego/svn/test/repos1 --new=wc2
> subversion/libsvn_ra_local/ra_plugin.c:266: (apr_err=170000)
> svn: 'file:///arch/elego/svn/test/repos1'
> is not the same repository as
> 'file:///arch/elego/svn/test/repos2'
> + echo
>
> + svn merge -c 1 file:////arch/elego/svn/test/repos1 wc2
> Conflict discovered in 'wc2/file'.
> Select: (p) postpone, (df) diff-full, (e) edit,
>        (mc) mine-conflict, (tc) theirs-conflict,
>        (s) show all options: e
> Select: (p) postpone, (df) diff-full, (e) edit, (r) resolved,
>        (mc) mine-conflict, (tc) theirs-conflict,
>        (s) show all options: r
> --- Merging (from foreign repository) r1 into 'wc2':
> A    wc2/file
> C    wc2
>
>
> So, if we were to add a summarizing-diff before each merge, it would break
> all foreign-repos merges...

Couldn't we just skip some of this stuff for foreign repository
merges?  We already do some of that with merge tracking.  I think it
is reasonable for some features to be removed when doing this.

> ...and, I think I'd like to open a micro-branch for this thing. I can just
> decide about that myself now, right?

I have a "meta-question".  At some point are you going to make the
actual svn diff --summarize command support a WC to REPOS compare?  As
I said a few weeks ago, this is something that GUI clients would
really benefit from having work.  If that is the plan, does it make
sense to get that code into trunk and working first?

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

dirs_same_p and summarizing diff

Posted by Neels J Hofmeyr <ne...@elego.de>.
Hi Julian,

many thanks for your much-needed clarifications on the editor/diff callbacks
frameworks! I'll try to put them to practice.

Julian Foad wrote:
> Great. It doesn't seem to be working quite right yet, though. A simple
> test on the command line, in a Subversion WC with your patch applied,
> shows:
> 
> [[[
> $ svn status
> M       subversion/tests/cmdline/merge_tests.py
> M       subversion/libsvn_client/merge.c
> M       subversion/libsvn_client/diff.c
> M       subversion/libsvn_client/repos_diff_summarize.c
> 
> $ svn diff --summarize -r33519 subversion/
> M      subversion/subversion/libsvn_wc/update_editor.c
> ]]]

lol, that bad :)
I wonder why the merge tests using it to compare directories are passing, then.

> There are two different mechanisms for (streamily) representing a tree
> diff: the "delta editor" and the "diff callbacks". In the existing (svn
> 1.5) diff code, notice how the two different mechanisms are used:
> 
> (repos <> repos):
> 
>   normal-diff:
>       editor = svn_client__get_diff_editor(CALLBACKS);
>       reporter = svn_ra_do_diff3(editor);
>       reporter->finish_report();
> 
>   diff-summarize:
>       editor = svn_client__get_diff_summarize_editor(SUMMARIZE_FUNC);
>       reporter = svn_ra_do_diff3(editor);
>       reporter->finish_report();
> 
> (repos <> wc):
> 
>   normal-diff:
>       editor = svn_wc_get_diff_editor5(CALLBACKS);
>       reporter = svn_ra_do_diff3(editor);
>       svn_wc_crawl_revisions3(reporter);
> 
>   diff-summarize:
>       n/a
> 
> (wc <> wc):
> 
>   normal-diff:
>       svn_wc_diff5(CALLBACKS);
> 
>   diff-summarize:
>       n/a
> 

What about another case I encountered today in ./merge_tests.py 88 89:
 foreign-repos <> wc

A diff doesn't work, but a merge does (test script attached):

A         wc1/file
Adding         wc1/file
Transmitting file data .
Committed revision 1.
A         wc2/file
+ svn diff --old=file:////arch/elego/svn/test/repos1 --new=wc2
subversion/libsvn_ra_local/ra_plugin.c:266: (apr_err=170000)
svn: 'file:///arch/elego/svn/test/repos1'
is not the same repository as
'file:///arch/elego/svn/test/repos2'
+ echo

+ svn merge -c 1 file:////arch/elego/svn/test/repos1 wc2
Conflict discovered in 'wc2/file'.
Select: (p) postpone, (df) diff-full, (e) edit,
        (mc) mine-conflict, (tc) theirs-conflict,
        (s) show all options: e
Select: (p) postpone, (df) diff-full, (e) edit, (r) resolved,
        (mc) mine-conflict, (tc) theirs-conflict,
        (s) show all options: r
--- Merging (from foreign repository) r1 into 'wc2':
A    wc2/file
C    wc2


So, if we were to add a summarizing-diff before each merge, it would break
all foreign-repos merges...


>> A couple of tests still fail with my patch, I'll investigate.
>> merge_tests.py 3 88 89 114 115 119
>> diff_tests.py 45

I fixed all but merge_tests.py 88 and 89, the foreign-repos ones.

Also, I haven't actually checked *all* of the commandline tests. It takes
ages to complete, with apparent longer breaks of doing nothing at all, when
watching the CPU and disk activity.

And then I have to run it twice, once for trunk and once for my patch, to be
able to compare test outcomes. How do you handle this kind of situation?

...and, I think I'd like to open a micro-branch for this thing. I can just
decide about that myself now, right?

Thanks
~Neels


-- 
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696  mobile: +49 177 2345869  fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194


Re: diff_summarize: item batons vs. item cache

Posted by Neels J Hofmeyr <ne...@elego.de>.
Sorry, my previous post was too quick.
Comment inline.

Neels J Hofmeyr wrote:
> Hi,
> 
> As Julian suggested:
> 
> Julian Foad wrote:
>> However, it would make much more sense if it is possible to write a set
>> of summarizing diff-callbacks that could be passed to any of the three
>> existing (normal-diff) code paths:
>>
>> (repos <> repos):
>>       editor = svn_client__get_diff_editor(SUMMARIZE_CALLBACKS);
>>       ...
>>
>> (repos <> wc):
>>       editor = svn_wc_get_diff_editor5(SUMMARIZE_CALLBACKS);
>>       ...
>>
>> (wc <> wc):
>>       svn_wc_diff5(SUMMARIZE_CALLBACKS);
>>
> 
> , I'm trying to make SUMMARIZE_CALLBACKS based on the present diff callbacks
> svn_wc_diff_callbacks3_t. But there's this problem:
[...]
> Let's take a property change as example. Normal diff receives a property
[...]
> svn_wc_diff_callbacks3_t does not provide file|dir opened|closed callbacks
> and does not create distinct item batons.

Which is not a problem at all. I didn't see that there is actually just one
file_changed() callback which provides all the changes at once. I was
mislead by the existence of a separate sub-function to deal with props.

[...]
> Pity, though.
No, after all, it's still on :)

Thanks for bearing with my being a complete noob; sorry for the noise.

~Neels

-- 
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696  mobile: +49 177 2345869  fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194


diff_summarize: item batons vs. item cache

Posted by Neels J Hofmeyr <ne...@elego.de>.
Hi,

As Julian suggested:

Julian Foad wrote:
> However, it would make much more sense if it is possible to write a set
> of summarizing diff-callbacks that could be passed to any of the three
> existing (normal-diff) code paths:
> 
> (repos <> repos):
>       editor = svn_client__get_diff_editor(SUMMARIZE_CALLBACKS);
>       ...
> 
> (repos <> wc):
>       editor = svn_wc_get_diff_editor5(SUMMARIZE_CALLBACKS);
>       ...
> 
> (wc <> wc):
>       svn_wc_diff5(SUMMARIZE_CALLBACKS);
> 

, I'm trying to make SUMMARIZE_CALLBACKS based on the present diff callbacks
svn_wc_diff_callbacks3_t. But there's this problem:

Let's take a property change as example. Normal diff receives a property
change and prints that to screen. However, diff_summarize needs to save the
property change in a cache, later on record some file content changes as
well, and send out a complete summary on file_close. But
svn_wc_diff_callbacks3_t does not provide file|dir opened|closed callbacks
and does not create distinct item batons.

So I guess we do need the current design and have separate functions
  svn_client__get_diff_summarize_editor(SUMMARIZE_CALLBACKS);
  svn_wc_get_diff_summarize_editor(SUMMARIZE_CALLBACKS);
  svn_wc_diff_summarize(SUMMARIZE_CALLBACKS);
instead of reusing the existing diff editors and callback structure.

Unless we create some sort of item cache in the diff_summarize baton, as in
an apr_hash_t where node caches are added when prop changes are encountered
and deleted when content changes have come through.

Nah.
Pity, though.

~Neels


P.S.: This patch turns out to be quite educational. :)

-- 
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696  mobile: +49 177 2345869  fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194


Re: diff error? url@rev against local mods -- was: Re: diff_repos_wc question

Posted by Julian Foad <ju...@btopenworld.com>.
On Wed, 2008-10-08 at 05:20 +0200, Neels J Hofmeyr wrote:
> 
> Julian Foad wrote:
> >>> Is this really an error or am I still missing something?
> > 
> > It's really an error.
> > 
> > Can you maybe add a test for it, and/or file an issue, and/or fix it, or
> > just ignore this case, while still writing the directory compare thing?
> 
> Other than this error appearing in the tests, the directory compare thing is
> already working. Your suggested way of implementing diff_summarize looks
> very good in terms of avoiding to re-invent the wheel.

Great. It doesn't seem to be working quite right yet, though. A simple
test on the command line, in a Subversion WC with your patch applied,
shows:

[[[
$ svn status
M       subversion/tests/cmdline/merge_tests.py
M       subversion/libsvn_client/merge.c
M       subversion/libsvn_client/diff.c
M       subversion/libsvn_client/repos_diff_summarize.c

$ svn diff --summarize -r33519 subversion/
M      subversion/subversion/libsvn_wc/update_editor.c
]]]

(My WC BASE is presently at r33519, except for "update_editor.c" which
is at r33553.)

The output of "update_editor.c" is correct (r33519 of it differs from
r33553 of it) except that the file path is printed with an extra
"subversion/" component in it.

However, the four locally modified files are not shown. It looks like it
is showing -r33519:BASE whereas it should be showing -r33519:Working.

DEBUGGING

In diff_summarize_repos_wc(), it calls
svn_client__get_diff_summarize_editor(). However, that is for getting a
(repos <> repos) editor. We need a (repos <> wc) editor. The equivalent
non-summarize code path uses svn_wc_get_diff_editor5(). A call to that
is present but commented out, and we haven't written the set of diff
callbacks that need to be provided to it.

There are two different mechanisms for (streamily) representing a tree
diff: the "delta editor" and the "diff callbacks". In the existing (svn
1.5) diff code, notice how the two different mechanisms are used:

(repos <> repos):

  normal-diff:
      editor = svn_client__get_diff_editor(CALLBACKS);
      reporter = svn_ra_do_diff3(editor);
      reporter->finish_report();

  diff-summarize:
      editor = svn_client__get_diff_summarize_editor(SUMMARIZE_FUNC);
      reporter = svn_ra_do_diff3(editor);
      reporter->finish_report();

(repos <> wc):

  normal-diff:
      editor = svn_wc_get_diff_editor5(CALLBACKS);
      reporter = svn_ra_do_diff3(editor);
      svn_wc_crawl_revisions3(reporter);

  diff-summarize:
      n/a

(wc <> wc):

  normal-diff:
      svn_wc_diff5(CALLBACKS);

  diff-summarize:
      n/a


So it looks like one way to fill in the (repos <> wc) summarize case
would be with a new function:

  svn_wc_get_diff_summarize_editor()

which would be based on svn_client__get_diff_summarize_editor() and
svn_wc_get_diff_editor5().

However, it would make much more sense if it is possible to write a set
of summarizing diff-callbacks that could be passed to any of the three
existing (normal-diff) code paths:

(repos <> repos):
      editor = svn_client__get_diff_editor(SUMMARIZE_CALLBACKS);
      ...

(repos <> wc):
      editor = svn_wc_get_diff_editor5(SUMMARIZE_CALLBACKS);
      ...

(wc <> wc):
      svn_wc_diff5(SUMMARIZE_CALLBACKS);

That would simplify (reduce code duplication in) the existing (repos <>
repos) summarize case as well as trivially supporting the missing cases.


> Now, if I/we could manage to fix the cases where diff is just broken, it
> would be a nice side-effect of tree-conflicts. Is it a -- as stsp would say
> -- huge can of worms, though? What other diff failures are known?

I really don't want to go into that tonight. Like you say, let's make
the use of this diff framework be the incentive for fixing the broken
cases.

> I'd still +1 on carrying on with the diff_summarize solution of dirs_same_p,
> for now ignoring the breaking freak cases. It drastically raises the general
> incentive to fix (and exposure of) broken diff cases.
> 
> Julian, your original patch suggestion contains a change to files_same_p(),
> which I've dropped because I didn't understand it then. I'll still see
> whether I'll re-include it:
> * subversion/libsvn_client/merge.c
>   (files_same_p): Change to an implementation more like we're going to use
>     for directories - one that compares against the repository rather than
>     a local temporary file - to help me get up to speed on how to do this.
>   (merge_file_deleted): Use the new version of files_same_p().

There is no reason to include that change, unless you know of a reason.


> A couple of tests still fail with my patch, I'll investigate.
> merge_tests.py 3 88 89 114 115 119
> diff_tests.py 45

OK.

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: diff error? url@rev against local mods -- was: Re: diff_repos_wc question

Posted by Neels J Hofmeyr <ne...@elego.de>.

Julian Foad wrote:
>>> Is this really an error or am I still missing something?
> 
> It's really an error.
> 
> Can you maybe add a test for it, and/or file an issue, and/or fix it, or
> just ignore this case, while still writing the directory compare thing?

Other than this error appearing in the tests, the directory compare thing is
already working. Your suggested way of implementing diff_summarize looks
very good in terms of avoiding to re-invent the wheel.

Now, if I/we could manage to fix the cases where diff is just broken, it
would be a nice side-effect of tree-conflicts. Is it a -- as stsp would say
-- huge can of worms, though? What other diff failures are known?

I'd still +1 on carrying on with the diff_summarize solution of dirs_same_p,
for now ignoring the breaking freak cases. It drastically raises the general
incentive to fix (and exposure of) broken diff cases.

Julian, your original patch suggestion contains a change to files_same_p(),
which I've dropped because I didn't understand it then. I'll still see
whether I'll re-include it:
* subversion/libsvn_client/merge.c
  (files_same_p): Change to an implementation more like we're going to use
    for directories - one that compares against the repository rather than
    a local temporary file - to help me get up to speed on how to do this.
  (merge_file_deleted): Use the new version of files_same_p().


A couple of tests still fail with my patch, I'll investigate.
merge_tests.py 3 88 89 114 115 119
diff_tests.py 45

~Neels


-- 
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696  mobile: +49 177 2345869  fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194

Re: diff error? url@rev against local mods -- was: Re: diff_repos_wc question

Posted by Neels J Hofmeyr <ne...@elego.de>.

Julian Foad wrote:
> On Sun, 2008-10-05 at 04:17 +0200, Neels J Hofmeyr wrote: 
>>> Karl Fogel wrote:
>>>> "Neels J. Hofmeyr" <ne...@elego.de> writes:
>>>>> Let me rephrase that:
>>>>>
>>>>> Is it currently possible to do a diff between
>>>>>   URL@REV
>>>>> and
>>>>>   an unrelated[1] working copy with local mods
>>>>> , and if yes, how?
>>>>> [1] unrelated, as in not of the same URL.
>>>>    $ cd svn_trunk_working_copy
>>>>    $ echo "This is a local mod." >> README
>>>>    $ svn diff --old=http://svn.collab.net/repos/svn/trunk/INSTALL@33019 --new=README
>>>>
>>>> -Karl
>>> Ah, thanks for that. I was quite unlucky (cough) to have missed that
>>> combination.
>>>
>>> Alas, it looks like julianf and me found a case where `svn diff' reports
>>> wrongly. (It is a crucial diff case for implementing directory comparison
>>> for tree-conflicts.)
>>>
>>> I've attached test scripts and an output screenshot when using current trunk.
>>>
>>> "Create two similar trees, but leave some of the changes uncommitted in
>>> the second tree. Then diff the url of the first to the working copy of
>>> the second tree."
> [...] 
>>> When reading the "Results" in the output, note that
> [...] 
>>> 2) the other local change (the file addition of `baz') is sort of not seen
>>> by diff and reported as if baz was missing in `--new' (wrong).
> 
> Sadly, there are several cases of "svn diff" that are just broken. This
> appears to be one of them.
> 
>>> 3) Also, the diff is sort of ambiguous about reporting the file name:
>>> Which is it, quux/baz or foo/baz?
>>> "--- quux/baz  (.../file:///arch/elego/svn/test/repos/foo/baz) (revision 2)"
> 
> Yes, it's broken there too, though of course these header lines are
> unimportant for our purposes.
> 
>>> Is this really an error or am I still missing something?
> 
> It's really an error.
> 
> Can you maybe add a test for it, and/or file an issue, and/or fix it, or
> just ignore this case, while still writing the directory compare thing?

Filed issue #3295,
committed diff_tests.py 49 "diff URL against working copy with local mods"

For issue #1675,
committed diff_tests.py 50 "diff -r1 of removed file to its local addition"

-- 
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696  mobile: +49 177 2345869  fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194