You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Paul Burba <pa...@softlanding.com> on 2006/10/02 19:44:43 UTC

Re: [PATCH] Was: Enhancement needed in svn status -u

"Daniel L. Rall" <dl...@collab.net> wrote on 09/29/2005 07:17:17 PM:

> On Wed, 28 Sep 2005, Paul Burba wrote:
> 
> > "Daniel L. Rall" <dl...@finemaltcoding.com> wrote on 09/28/2005 12:58:20 
AM:
> > 
> > > On Mon, 26 Sep 2005, Paul Burba wrote:
> > > 
> > > > Here is the latest patch for improved out of date info when using 
svn 
> > > > status -u.
> > > > 
> > > > The background on all of this can be found here: 
> > > > http://svn.haxx.se/dev/archive-2005-08/0257.shtml
> > > ... 
> 
> Committed with a few formatting adjustments in r16344.

Dan,

A blast from the past...

Recall that r16344 was the ultimate solution for providing out of date 
info about a WC and that this originally arose due to problems with the 
Subclipse Synchronize view that Mark Phippard described in 
http://svn.haxx.se/subdev/archive-2005-08/0020.shtml.  Recently Mark asked 
me to look into a several problems/limitations that still adversely affect 
the Synchronize view, these are: 

1) Unless a directory has a pending child
   addition/deletion, or there is a prop change on the
   directory itself, the out of date (ood_*) fields
   in svn_wc_status2_t are not set, or are incorrectly
   set. 

2) The out of date (ood_*) fields for the root of WC
   are *never* set.

3) OOD info never bubbles up beyond an out of date
   path's immediate parent (if even that).  E.g. if
   WC/A/B/E/beta is out of date, A, and A/B don't
   have any ood info, and as described in #1, A/B/E
   might not have any either.

4) The OOD_last_cmt_rev field for deleted paths is
   always SVN_INVALID_REVNUM. 

The attached patch attempts to address all of these issues:

1) This problem results from a flaw in my original patch which was the 
basis for r16344 (the fault is definitely in my original code, not any 
change you made).  When we call tweak_statushash() for a dir_baton in 
close_directory() we are passing the db->parent_baton, but the ood info 
for dir_baton is in itself.  So when tweak_statushash() sets the ood info 
on the parent_baton it just resets all ood fields to the parent's existing 
values.  To fix this I added a second baton arg to tweak_statushash so the 
ood info is available.

2) Following your and Lieven's example from r21685, I set the 
eb->anchor_status->ood_* fields directly in close_directory().

3) This is closely related to #1: We only call tweak_statushash() in 
close_directory() in the first place if a dir is added, deleted, or has a 
prop change.  I changed this to also make the call if 
dir_baton->ood_last_commit_rev is set to anything but the default value of 
SVN_INVALID_REVNUM. 

4) This was a bit trickier to solve.  I saw no straightforward way to find 
out the revision an out of date path was deleted within libsvn_wc, so I 
added a new function to libsvn_repos: svn_repos_deleted_rev().  I then use 
this new function in reporter.c's delta_dirs() and update_entry() to get 
the rev a path was deleted and then pass this info as the revision 
argument to the svn_delta_editor_t's delete_entry() callback.  I'm not 
totally clear from this comment how the revision arg is supposed to be 
used, currently we just pass SVN_INVALID_REVNUM.  The doc string for 
delete_entry() says:

  /** Remove the directory entry named @a path, a child of the directory
   * represented by @a parent_baton.  If @a revision is set, it is used as 
a
   * sanity check to ensure that you are removing the revision of @a path
   * that you really think you are.
   *
   * All allocations should be performed in @a pool.
   */

But I'm not sure what a "sanity check" implies exactly.  I'd like to think 
that what I'm doing (passing the rev the path was deleted) is compatible 
with the doc string, but it seems *too* easy that an essentially unused 
arg already exists and does just what I require :-)

Assuming my changes to libsvn_repos are ok, there is still the problem of 
older servers not providing this information.  My solution is to look to 
deleted path's parent folder and use it's out of date info.  There is some 
chance that this info is actually correct, and even if it is wrong it 
would still be valid in the sense that it is a revision higher than the 
revision the path was deleted.  Re the Subclipse Synchronize use case this 
is at least somewhat helpful.  And since the present alternative is no 
info at all (i.e. SVN_INVALID_REVNUM) this slightly flawed alternative 
seems at least a partial improvement.
 
> I have written code to expose this additional status information via the
> JavaHL bindings (org.tigris.subversion.javahl.Status), and am in the 
process
> of figuring out how to add a unit test for that to
> BasicTests.testBasicStatus().

Did you create a test for this?  I didn't see anything committed, but if 
you have any significant work in progress could I have a copy?  Problem #1 
above highlights the need for a real test of this functionality. 

If anyone has some time to look at this I'd appreciate it, thanks,

Paul B.

[[[
Further improvements to status information on working copy items which
are out of date compared to the repository.

Follow-up to r16344 (and its subsequent follow-ups: r16494, 16784, 16796,
16829, and 17844).

* subversion/include/svn_repos.h
  (svn_repos_deleted_rev): New function to find the revision a path was
  most recently deleted within a give revision range.

* subversion/libsvn_repos/reporter.c
  (update_entry, delta_dirs): Use the new function svn_repos_deleted_rev()
  to determine the revision deleted paths were deleted and pass this to
  the delete_entry callback rather than defaulting to SVN_INVALID_REVNUM.

* subversion/libsvn_repos/rev_hunt.c
  (svn_repos_deleted_rev): New function definition.

* subversion/libsvn_wc/status.c
  (tweak_statushash): Add second baton argument which contains the out
  of date info for a dir baton when tweaking that baton's parent.  Add
  another argument to identify the revision a path was deleted when
  handling deletes.  When deleting paths: Construct the correct url for
  the path and record deleted path's deleted revision in the path's
  svn_wc_status2_t structure.  For pre-1.5 servers, which don't provide
  the deleted revision, use the parent's last committed rev as a best
  guess.
  (delete_entry, close_file): Supply new args to tweak_statushash()
  calls.
  (close_directory): Tweak status for directories even when the only
  change is that they have and an out of date descendents.  Supply new
  args to tweak_statushash() call.
]]]



P.S. In testing this patch I used the following hack to make the debug 
version of svn status show all the OOD info on the command line.  It also 
adds a temporary status test to setup various out of date scenarios.  It 
may prove useful to anyone who tries out this patch:


Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Daniel Rall <dl...@collab.net>.
On Wed, 25 Oct 2006, Paul Burba wrote:
....
> > > > Some of the Python status tests do check for "*", but don't test the
> > > > individual pieces of "out of date" information retrieved from 
> > > > repository.
> > > 
> > > You're making this statement for the benefit of others reading this 
> yes? 
> > > Or is there a question for me in here? :-)
> > 
> > Yup.
> 
> 'Yup' to which question?  "Would you like coffee or tea?", "YES!" :-)

Sorry, thought I'd trimmed off more of your post -- the comment was
only for posterity.

...
> > > B) Possible side effects of passing wrong revision to
> > > svn_delta_editor_t delete_entry() implementations.
> > 
> > Any findings here?
> 
> Short answer is no.  I gave the long answer earlier in this thread:
> 
>   http://svn.haxx.se/dev/archive-2006-10/0344.shtml
> 
> If my answer there isn't satisfactory let me know.

I'll respond there if I've anything to add.

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Paul Burba <pa...@softlanding.com>.
Paul Burba <pa...@softlanding.com> wrote on 10/27/2006 02:57:26 PM:

> Daniel Rall <dl...@collab.net> wrote on 10/25/2006 02:55:09 PM:
> 
> > I've created a temporary branch, "ood-status-info", so we have a place
> > to work on refining these fixes.
> > 
> > On Fri, 13 Oct 2006, Paul Burba wrote:
> <snip>
> > > Our remaining issues(?):
> > > 
> > > A) The performance of svn_repos_deleted_rev()
> > 
> > This is currently too slow to integrate into trunk.  Let's write a
> > faster implementation -- hopefully we can speed things up by moving
> > the logic down inside the FS layer.
> 
> I think I solved this performance problem, not in FS, but with a 
different 
> implmentation of the new function svn_repos_deleted_rev().
> 
> A quick recap the existing performance problem (in r22118 of 
> svn/branches/ood-status-info):
> 
>  * Start with a standard greek-tree test (fsfs) repos.
>  * Delete /A/D/H/omega in r2.
>  * Make a whole slew of (50,000+) of text mods to the other files.
>  * Check out r1 via file://
>  * Run "time svn st -u -v" in Cygwin
> 
> r22118 of svn_repos_deleted_rev() implementation takes almost 5 minutes 
on 
> my machine:
> 
> $ time svn st -v -u SVN/stat-tests-1/
>        *        1        1 jrandom      SVN\stat-tests-1\A\B\E\beta
>        *        1        1 jrandom      SVN\stat-tests-1\A\B\E\alpha
>                 1        1 jrandom      SVN\stat-tests-1\A\B\E
>        *        1        1 jrandom      SVN\stat-tests-1\A\B\lambda
>                 1        1 jrandom      SVN\stat-tests-1\A\B\F
>                 1        1 jrandom      SVN\stat-tests-1\A\B
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\G\pi
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\G\rho
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\G\tau
>                 1        1 jrandom      SVN\stat-tests-1\A\D\G
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\H\omega
>                 1        1 jrandom      SVN\stat-tests-1\A\D\H\psi
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\H\chi
>                 1        1 jrandom      SVN\stat-tests-1\A\D\H
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\gamma
>                 1        1 jrandom      SVN\stat-tests-1\A\D
>        *        1        1 jrandom      SVN\stat-tests-1\A\mu
>                 1        1 jrandom      SVN\stat-tests-1\A\C
>                 1        1 jrandom      SVN\stat-tests-1\A
>        *        1        1 jrandom      SVN\stat-tests-1\iota
>                 1        1 jrandom      SVN\stat-tests-1
> Status against revision:  53600
> 
> real    4m58.004s
> user    0m0.015s
> sys     0m0.031s
> 
> The slowness is due to svn_repos_deleted_rev's linear search that starts 

> at the last rev in the search range and works backwards, looking for the 

> last time a path existed.  In the above scenario this takes a long time 
> because a 50k rev files need to be opened.  At the time I'd thought the 
> linear was the only way to go because a path might be deleted, re-added 
> and deleted many times, so a binary search wouldn't work.  But as Mark 
> pointed out to me, we don't really care if a path of the same name was 
> later added then deleted, that is for the path's parent to care about in 

> it's ood info.  Even if a path is moved within the same directory this 
> still holds.
> 
> Looking at the problem in this new light, I used svn_fs_node_id() and 
> svn_fs_compare_ids() to reimplement svn_repos_deleted_rev() as a 
> non-recursive binary search and committed it to the ood branch in 
r22139.
> 
> The change from linear to logarithmic running time has the expected 
> results, the 5 minute operation now takes less than a second:
> 
> $ time svn st -u -v SVN/stat-tests-1
>        *        1        1 jrandom      SVN\stat-tests-1\A\B\E\beta
>        *        1        1 jrandom      SVN\stat-tests-1\A\B\E\alpha
>                 1        1 jrandom      SVN\stat-tests-1\A\B\E
>        *        1        1 jrandom      SVN\stat-tests-1\A\B\lambda
>                 1        1 jrandom      SVN\stat-tests-1\A\B\F
>                 1        1 jrandom      SVN\stat-tests-1\A\B
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\G\pi
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\G\rho
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\G\tau
>                 1        1 jrandom      SVN\stat-tests-1\A\D\G
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\H\omega
>                 1        1 jrandom      SVN\stat-tests-1\A\D\H\psi
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\H\chi
>                 1        1 jrandom      SVN\stat-tests-1\A\D\H
>        *        1        1 jrandom      SVN\stat-tests-1\A\D\gamma
>                 1        1 jrandom      SVN\stat-tests-1\A\D
>        *        1        1 jrandom      SVN\stat-tests-1\A\mu
>                 1        1 jrandom      SVN\stat-tests-1\A\C
>                 1        1 jrandom      SVN\stat-tests-1\A
>        *        1        1 jrandom      SVN\stat-tests-1\iota
>                 1        1 jrandom      SVN\stat-tests-1
> Status against revision:  53600
> 
> real    0m0.340s
> user    0m0.015s
> sys     0m0.015s
> 
> Anyone see any problems with this approach?
> 
> Paul B.

For anyone who missed it, we discussed this patch at length on IRC on 
10/27:



Essentially the conclusion was that a binary search would work to find the 
*first* time a path was deleted.  Problem was, my binary search algorithm 
in r22139 made incorrect assumptions about how svn_fs_compare_ids() 
behaved when comparing copied/modified nodes.  I addressed these problems 
in r22234 of svn/branches/ood-status-info.  I also made an effort to 
clearly document how the binary search works in the comments for 
svn_repos_deleted_rev()...it might be worth look at those comments first 
if you have any questions on this.

Paul B.

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Paul Burba <pa...@softlanding.com>.
Daniel Rall <dl...@collab.net> wrote on 10/25/2006 02:55:09 PM:

> I've created a temporary branch, "ood-status-info", so we have a place
> to work on refining these fixes.
> 
> On Fri, 13 Oct 2006, Paul Burba wrote:
<snip>
> > Our remaining issues(?):
> > 
> > A) The performance of svn_repos_deleted_rev()
> 
> This is currently too slow to integrate into trunk.  Let's write a
> faster implementation -- hopefully we can speed things up by moving
> the logic down inside the FS layer.

I think I solved this performance problem, not in FS, but with a different 
implmentation of the new function svn_repos_deleted_rev().

A quick recap the existing performance problem (in r22118 of 
svn/branches/ood-status-info):

 * Start with a standard greek-tree test (fsfs) repos.
 * Delete /A/D/H/omega in r2.
 * Make a whole slew of (50,000+) of text mods to the other files.
 * Check out r1 via file://
 * Run "time svn st -u -v" in Cygwin

r22118 of svn_repos_deleted_rev() implementation takes almost 5 minutes on 
my machine:

$ time svn st -v -u SVN/stat-tests-1/
       *        1        1 jrandom      SVN\stat-tests-1\A\B\E\beta
       *        1        1 jrandom      SVN\stat-tests-1\A\B\E\alpha
                1        1 jrandom      SVN\stat-tests-1\A\B\E
       *        1        1 jrandom      SVN\stat-tests-1\A\B\lambda
                1        1 jrandom      SVN\stat-tests-1\A\B\F
                1        1 jrandom      SVN\stat-tests-1\A\B
       *        1        1 jrandom      SVN\stat-tests-1\A\D\G\pi
       *        1        1 jrandom      SVN\stat-tests-1\A\D\G\rho
       *        1        1 jrandom      SVN\stat-tests-1\A\D\G\tau
                1        1 jrandom      SVN\stat-tests-1\A\D\G
       *        1        1 jrandom      SVN\stat-tests-1\A\D\H\omega
                1        1 jrandom      SVN\stat-tests-1\A\D\H\psi
       *        1        1 jrandom      SVN\stat-tests-1\A\D\H\chi
                1        1 jrandom      SVN\stat-tests-1\A\D\H
       *        1        1 jrandom      SVN\stat-tests-1\A\D\gamma
                1        1 jrandom      SVN\stat-tests-1\A\D
       *        1        1 jrandom      SVN\stat-tests-1\A\mu
                1        1 jrandom      SVN\stat-tests-1\A\C
                1        1 jrandom      SVN\stat-tests-1\A
       *        1        1 jrandom      SVN\stat-tests-1\iota
                1        1 jrandom      SVN\stat-tests-1
Status against revision:  53600

real    4m58.004s
user    0m0.015s
sys     0m0.031s

The slowness is due to svn_repos_deleted_rev's linear search that starts 
at the last rev in the search range and works backwards, looking for the 
last time a path existed.  In the above scenario this takes a long time 
because a 50k rev files need to be opened.  At the time I'd thought the 
linear was the only way to go because a path might be deleted, re-added 
and deleted many times, so a binary search wouldn't work.  But as Mark 
pointed out to me, we don't really care if a path of the same name was 
later added then deleted, that is for the path's parent to care about in 
it's ood info.  Even if a path is moved within the same directory this 
still holds.

Looking at the problem in this new light, I used svn_fs_node_id() and 
svn_fs_compare_ids() to reimplement svn_repos_deleted_rev() as a 
non-recursive binary search and committed it to the ood branch in r22139.

The change from linear to logarithmic running time has the expected 
results, the 5 minute operation now takes less than a second:

$ time svn st -u -v SVN/stat-tests-1
       *        1        1 jrandom      SVN\stat-tests-1\A\B\E\beta
       *        1        1 jrandom      SVN\stat-tests-1\A\B\E\alpha
                1        1 jrandom      SVN\stat-tests-1\A\B\E
       *        1        1 jrandom      SVN\stat-tests-1\A\B\lambda
                1        1 jrandom      SVN\stat-tests-1\A\B\F
                1        1 jrandom      SVN\stat-tests-1\A\B
       *        1        1 jrandom      SVN\stat-tests-1\A\D\G\pi
       *        1        1 jrandom      SVN\stat-tests-1\A\D\G\rho
       *        1        1 jrandom      SVN\stat-tests-1\A\D\G\tau
                1        1 jrandom      SVN\stat-tests-1\A\D\G
       *        1        1 jrandom      SVN\stat-tests-1\A\D\H\omega
                1        1 jrandom      SVN\stat-tests-1\A\D\H\psi
       *        1        1 jrandom      SVN\stat-tests-1\A\D\H\chi
                1        1 jrandom      SVN\stat-tests-1\A\D\H
       *        1        1 jrandom      SVN\stat-tests-1\A\D\gamma
                1        1 jrandom      SVN\stat-tests-1\A\D
       *        1        1 jrandom      SVN\stat-tests-1\A\mu
                1        1 jrandom      SVN\stat-tests-1\A\C
                1        1 jrandom      SVN\stat-tests-1\A
       *        1        1 jrandom      SVN\stat-tests-1\iota
                1        1 jrandom      SVN\stat-tests-1
Status against revision:  53600

real    0m0.340s
user    0m0.015s
sys     0m0.015s

Anyone see any problems with this approach?

Paul B.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Paul Burba <pa...@softlanding.com>.
Daniel Rall <dl...@collab.net> wrote on 10/25/2006 02:55:09 PM:

> I've created a temporary branch, "ood-status-info", so we have a place
> to work on refining these fixes.
> 
> On Fri, 13 Oct 2006, Paul Burba wrote:
> 
> > Daniel Rall <dl...@collab.net> wrote on 10/12/2006 05:37:45 PM:
> > 
> > > [Focusing on the Java/test portion of this thread.]
> > > 
> > > On Wed, 11 Oct 2006, Paul Burba wrote:
> > > 
> > > > Daniel Rall <dl...@collab.net> wrote on 10/06/2006 03:49:32 PM:
> ...
> > > and have committed this code in an XFAIL-style in r21908.
> ...
> > Though we'd need to detect XPASS no?
> 
> Yeah, we want to report an unexpected pass.  I've done so on trunk in
> r22112.
> 
> ...
> > > Some of the Python status tests do check for "*", but don't test the
> > > individual pieces of "out of date" information retrieved from 
> > > repository.
> > 
> > You're making this statement for the benefit of others reading this 
yes? 
> > Or is there a question for me in here? :-)
> 
> Yup.

'Yup' to which question?  "Would you like coffee or tea?", "YES!" :-)
 
> ...
> > Our remaining issues(?):
> > 
> > A) The performance of svn_repos_deleted_rev()
> 
> This is currently too slow to integrate into trunk.  Let's write a
> faster implementation -- hopefully we can speed things up by moving
> the logic down inside the FS layer.

I'll post a separate message to catch the attention of FS gurus and get 
their opinion.

> > B) Possible side effects of passing wrong revision to
> > svn_delta_editor_t delete_entry() implementations.
> 
> Any findings here?

Short answer is no.  I gave the long answer earlier in this thread:

  http://svn.haxx.se/dev/archive-2006-10/0344.shtml

If my answer there isn't satisfactory let me know.

Paul B.

> I've committed the following patch to the "ood-status-info" branch in
> r22116:
> 
> > [[[
> > Further improvements to status information on working copy items which
> > are out of date compared to the repository.
> > 
> > Follow-up to r16344 (and its subsequent follow-ups: r16494, 16784, 
16796,
> > 16829, 17844, and 21908).
> > 
> > * subversion/bindings/java/javahl/src/org/tigris/
> >   subversion/javahl/tests/BasicTests.java
> >   (testOODStatus): Remove XFAIL-style catch.
> > 
> > * subversion/include/svn_repos.h
> >   (svn_repos_deleted_rev): New function to find the revision a path 
was
> >   most recently deleted within a give revision range.
> > 
> > * subversion/libsvn_repos/reporter.c
> >   (update_entry, delta_dirs): Use the new function 
svn_repos_deleted_rev()
> >   to determine the revision deleted paths were deleted and pass this 
to
> >   the delete_entry callback rather than defaulting to 
SVN_INVALID_REVNUM.
> > 
> > * subversion/libsvn_repos/rev_hunt.c
> >   (svn_repos_deleted_rev): New function definition.
> > 
> > * subversion/libsvn_wc/status.c
> >   (tweak_statushash): Add second baton argument which contains the out
> >   of date info for a dir baton when tweaking that baton's parent.  Add
> >   another argument to identify the revision a path was deleted when
> >   handling deletes.  When deleting paths: Construct the correct url 
for
> >   the path and record deleted path's deleted revision in the path's
> >   svn_wc_status2_t structure.  For pre-1.5 servers, which don't 
provide
> >   the deleted revision, use the parent's last committed rev as a best
> >   guess.
> >   (delete_entry, close_file): Supply new args to tweak_statushash()
> >   calls.
> >   (close_directory): Tweak status for directories even when the only
> >   change is that they have and an out of date descendents.  Supply new
> >   args to tweak_statushash() call.
> > ]]]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Daniel Rall <dl...@collab.net>.
I've created a temporary branch, "ood-status-info", so we have a place
to work on refining these fixes.

On Fri, 13 Oct 2006, Paul Burba wrote:

> Daniel Rall <dl...@collab.net> wrote on 10/12/2006 05:37:45 PM:
> 
> > [Focusing on the Java/test portion of this thread.]
> > 
> > On Wed, 11 Oct 2006, Paul Burba wrote:
> > 
> > > Daniel Rall <dl...@collab.net> wrote on 10/06/2006 03:49:32 PM:
...
> > and have committed this code in an XFAIL-style in r21908.
...
> Though we'd need to detect XPASS no?

Yeah, we want to report an unexpected pass.  I've done so on trunk in
r22112.

...
> > Some of the Python status tests do check for "*", but don't test the
> > individual pieces of "out of date" information retrieved from
> > repository.
> 
> You're making this statement for the benefit of others reading this yes? 
> Or is there a question for me in here? :-)

Yup.

...
> Our remaining issues(?):
> 
> A) The performance of svn_repos_deleted_rev()

This is currently too slow to integrate into trunk.  Let's write a
faster implementation -- hopefully we can speed things up by moving
the logic down inside the FS layer.

> B) Possible side effects of passing wrong revision to
> svn_delta_editor_t delete_entry() implementations.

Any findings here?


I've committed the following patch to the "ood-status-info" branch in
r22116:

> [[[
> Further improvements to status information on working copy items which
> are out of date compared to the repository.
> 
> Follow-up to r16344 (and its subsequent follow-ups: r16494, 16784, 16796,
> 16829, 17844, and 21908).
> 
> * subversion/bindings/java/javahl/src/org/tigris/
>   subversion/javahl/tests/BasicTests.java
>   (testOODStatus): Remove XFAIL-style catch.
> 
> * subversion/include/svn_repos.h
>   (svn_repos_deleted_rev): New function to find the revision a path was
>   most recently deleted within a give revision range.
> 
> * subversion/libsvn_repos/reporter.c
>   (update_entry, delta_dirs): Use the new function svn_repos_deleted_rev()
>   to determine the revision deleted paths were deleted and pass this to
>   the delete_entry callback rather than defaulting to SVN_INVALID_REVNUM.
> 
> * subversion/libsvn_repos/rev_hunt.c
>   (svn_repos_deleted_rev): New function definition.
> 
> * subversion/libsvn_wc/status.c
>   (tweak_statushash): Add second baton argument which contains the out
>   of date info for a dir baton when tweaking that baton's parent.  Add
>   another argument to identify the revision a path was deleted when
>   handling deletes.  When deleting paths: Construct the correct url for
>   the path and record deleted path's deleted revision in the path's
>   svn_wc_status2_t structure.  For pre-1.5 servers, which don't provide
>   the deleted revision, use the parent's last committed rev as a best
>   guess.
>   (delete_entry, close_file): Supply new args to tweak_statushash()
>   calls.
>   (close_directory): Tweak status for directories even when the only
>   change is that they have and an out of date descendents.  Supply new
>   args to tweak_statushash() call.
> ]]]
...

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Paul Burba <pa...@softlanding.com>.
Daniel Rall <dl...@collab.net> wrote on 10/12/2006 05:37:45 PM:

> [Focusing on the Java/test portion of this thread.]
> 
> On Wed, 11 Oct 2006, Paul Burba wrote:
> 
> > Daniel Rall <dl...@collab.net> wrote on 10/06/2006 03:49:32 PM:
> ...
> > I added a second status test to
> > 
> 
subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/tests/BasicTests.
> java,
> > it's included in the attached patch.  It replicates what my
> > temporary test did, but actually *tests* the expected results rather
> > than relying on a manual check.
> 
> Status.getReposLastCmtKind() seems to be redundant with getReposKind()
> (they are both accessors for the same Status.reposKind instance
> field), which has existed since 1.3.  The method not used by your
> patch, so I think we can dump it.

Yes, good riddance to that. 

> testOODStatus() starts off by performing an "obstructed
> checkout"-oriented check, which appears to have been copied from
> testBasicCheckout().  This isn't appropriate for testOODStatus(), so
> we can dump it.

That was trash from the start of my work on the test.  Somehow it flew 
under my radar.

> The method's "throws" clause follows the poor pattern
> used throughout this class TestCase by declaring "Throwable".  We can
> narrow this down to "SubversionException, IOException".

Noted for any future tests I write.

> The method
> also contains some block comments with "*" formatting on every line,
> which can be switched to line comments.
 
> I made a bunch of other formatting tweaks and JavaDoc improvements,

Sorry about the indent formatting, I was following the style used 
elsewhere in the file thinking it was something we did specific to java.

> and have committed this code in an XFAIL-style in r21908.

So that's how a JavaHL "XFAIL" is done, thanks, I had no clue.  Though 
we'd need to detect XPASS no?
 
> ...
> > > > P.S. In testing this patch I used the following hack to make the 
debug 
> > > > version of svn status show all the OOD info on the command line. 
It 
> > > > also adds a temporary status test to setup various out of 
datescenarios.
> > > > It may prove useful to anyone who tries out this patch:
> > > 
> > > This status test won't be permanent, then?  Do we have any OOD info
> > > tests?
> > 
> > No, because it relies on the hack I made to the debug version of svn 
> > status, so it isn't really usefil.  The JavaHL test will replace it 
(see 
> > above).
> 
> Some of the Python status tests do check for "*", but don't test the
> individual pieces of "out of date" information retrieved from
> repository.

You're making this statement for the benefit of others reading this yes? 
Or is there a question for me in here? :-)

Updated log and patch to r21914 included.

Our remaining issues(?):

A) The performance of svn_repos_deleted_rev()

B) Possible side effects of passing wrong revision to svn_delta_editor_t 
delete_entry() implementations.

Paul B.

[[[
Further improvements to status information on working copy items which
are out of date compared to the repository.

Follow-up to r16344 (and its subsequent follow-ups: r16494, 16784, 16796,
16829, 17844, and 21908).

* subversion/bindings/java/javahl/src/org/tigris/
  subversion/javahl/tests/BasicTests.java
  (testOODStatus): Remove XFAIL-style catch.

* subversion/include/svn_repos.h
  (svn_repos_deleted_rev): New function to find the revision a path was
  most recently deleted within a give revision range.

* subversion/libsvn_repos/reporter.c
  (update_entry, delta_dirs): Use the new function svn_repos_deleted_rev()
  to determine the revision deleted paths were deleted and pass this to
  the delete_entry callback rather than defaulting to SVN_INVALID_REVNUM.

* subversion/libsvn_repos/rev_hunt.c
  (svn_repos_deleted_rev): New function definition.

* subversion/libsvn_wc/status.c
  (tweak_statushash): Add second baton argument which contains the out
  of date info for a dir baton when tweaking that baton's parent.  Add
  another argument to identify the revision a path was deleted when
  handling deletes.  When deleting paths: Construct the correct url for
  the path and record deleted path's deleted revision in the path's
  svn_wc_status2_t structure.  For pre-1.5 servers, which don't provide
  the deleted revision, use the parent's last committed rev as a best
  guess.
  (delete_entry, close_file): Supply new args to tweak_statushash()
  calls.
  (close_directory): Tweak status for directories even when the only
  change is that they have and an out of date descendents.  Supply new
  args to tweak_statushash() call.
]]]

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Daniel Rall <dl...@collab.net>.
[Focusing on the Java/test portion of this thread.]

On Wed, 11 Oct 2006, Paul Burba wrote:

> Daniel Rall <dl...@collab.net> wrote on 10/06/2006 03:49:32 PM:
...
> I added a second status test to
> subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/tests/BasicTests.java,
> it's included in the attached patch.  It replicates what my
> temporary test did, but actually *tests* the expected results rather
> than relying on a manual check.

Status.getReposLastCmtKind() seems to be redundant with getReposKind()
(they are both accessors for the same Status.reposKind instance
field), which has existed since 1.3.  The method not used by your
patch, so I think we can dump it.

testOODStatus() starts off by performing an "obstructed
checkout"-oriented check, which appears to have been copied from
testBasicCheckout().  This isn't appropriate for testOODStatus(), so
we can dump it.  The method's "throws" clause follows the poor pattern
used throughout this class TestCase by declaring "Throwable".  We can
narrow this down to "SubversionException, IOException".  The method
also contains some block comments with "*" formatting on every line,
which can be switched to line comments.

I made a bunch of other formatting tweaks and JavaDoc improvements,
and have committed this code in an XFAIL-style in r21908.


...
> > > P.S. In testing this patch I used the following hack to make the debug 
> > > version of svn status show all the OOD info on the command line.  It 
> > > also adds a temporary status test to setup various out of date scenarios.
> > > It may prove useful to anyone who tries out this patch:
> > 
> > This status test won't be permanent, then?  Do we have any OOD info
> > tests?
> 
> No, because it relies on the hack I made to the debug version of svn 
> status, so it isn't really usefil.  The JavaHL test will replace it (see 
> above).

Some of the Python status tests do check for "*", but don't test the
individual pieces of "out of date" information retrieved from
repository.

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Daniel Rall <dl...@collab.net>.
On Wed, 11 Oct 2006, Paul Burba wrote:
...
> > > I'm not totally clear from this comment how the revision arg is
> > > supposed to be used, currently we just pass SVN_INVALID_REVNUM.
> > > The doc string for delete_entry() says:
> > > 
> > >   /** Remove the directory entry named @a path, a child of the 
> directory
> > >    * represented by @a parent_baton.  If @a revision is set, it is 
> used as 
> > > a
> > >    * sanity check to ensure that you are removing the revision of @a 
> path
> > >    * that you really think you are.
> > >    *
> > >    * All allocations should be performed in @a pool.
> > >    */
> > >
> > > But I'm not sure what a "sanity check" implies exactly.  I'd like to 
> > > think 
> > > that what I'm doing (passing the rev the path was deleted) is 
> > > compatible 
> > > with the doc string, but it seems *too* easy that an essentially 
> > > unused arg already exists and does just what I require :-)
> > 
> > The "If @a revision is set, it is used as a sanity check" part of this
> > gives me concern that returning an incorrect "deleted rev" might have
> > side-effects.  What type of sanity check is performed on the revnum,
> > and by what code?
...
> I realized I needed to understand production and consumption of tree 
> deltas better.  I still have only a basic grasp of this, but here is how I 
> understand the problem:
> 
> If the following are all true...
> 
>   1) An svn operation causes a tree delta
>      to be produced in libsvn_repos.
> 
>   2) libsvn_repos/reporter.c's delta_dirs()
>      or update_entry() is used to construct
>      the tree delta.
> 
>   3) The delta's producer calls the
>      svn_delta_editor_t's delete_entry()
>      callback from delta_dirs()
>      or update_dirs(). 
> 
>   4) The delete_entry() implementation isn't
>      just a wrapper around the actual
>      implementation callback in the edit baton.
>      e.g. commit_util.c(1483):delete_entry()
>
>   5) The implementation isn't just the network
>      layer consuming the edit operation and
>      passing it to it's ultimate consumer. 
>      e.g. libsvn_ra_svn\editor.c(157):ra_svn_delete_entry() 

Yup.  The wrapping editor API implementation often adds additional
logic around its delegation to the wrapped editor callback.

> ...Then with my patch there is now the possibility of unintended 
> side-effects as the delta consumer's delete_entry() implementations are 
> now getting a revision argument other than SVN_INVALID_REVNUM.

Right.

> So if I understand the risk correctly, there are 6 delete_entry()
> implmentations that meet the above criteria and are potential risks:
> 
> libsvn_client\repos_diff.c(428):delete_entry(const char *path, 
> svn_revnum_t base_revision, void *parent_baton, apr_pool_t *pool)
> 
> libsvn_client\repos_diff_summarize.c(119):delete_entry(const char *path, 
> svn_revnum_t base_revision, void *parent_baton, apr_pool_t *pool)
> 
> libsvn_wc\diff.c(980):delete_entry(const char *path, svn_revnum_t 
> base_revision, void *parent_baton, apr_pool_t *pool)
> 
> libsvn_wc\update_editor.c(1037):delete_entry(const char *path, 
> svn_revnum_t revision, void *parent_baton, apr_pool_t *pool)
> 
> libsvn_delta\default_editor.c(75):delete_entry(const char *path, 
> svn_revnum_t revision, void *parent_baton, apr_pool_t *pool)
> 
> All six implmentations are all safe as none use the svn_revnum_t
> argument.

Additional logic which makes use of the REVISION argument could be
inserted by use of an externally-defined editor (e.g. by TortoiseSVN).
This is only going to be a risk if we make a "best guess" about the
revision being deleted in svn_repos_deleted_rev() (rather than
returning SVN_INVALID_REVNUM), and happen to return an incorrect
revision.

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Paul Burba <pa...@softlanding.com>.
Daniel Rall <dl...@collab.net> wrote on 10/06/2006 03:49:32 PM:

> On Mon, 02 Oct 2006, Paul Burba wrote:
> ....
> > > > > > The background on all of this can be found here: 
> > > > > > http://svn.haxx.se/dev/archive-2005-08/0257.shtml
> ...
> > Recall that r16344 was the ultimate solution for providing out of date 

> > info about a WC and that this originally arose due to problems with 
the 
> > Subclipse Synchronize view that Mark Phippard described in 
> > http://svn.haxx.se/subdev/archive-2005-08/0020.shtml.  Recently Mark 
asked 
> > me to look into a several problems/limitations that still adversely 
affect 
> > the Synchronize view, these are: 
> > 
> > 1) Unless a directory has a pending child
> >    addition/deletion, or there is a prop change on the
> >    directory itself, the out of date (ood_*) fields
> >    in svn_wc_status2_t are not set, or are incorrectly
> >    set. 
> > 
> > 2) The out of date (ood_*) fields for the root of WC
> >    are *never* set.
> > 
> > 3) OOD info never bubbles up beyond an out of date
> >    path's immediate parent (if even that).  E.g. if
> >    WC/A/B/E/beta is out of date, A, and A/B don't
> >    have any ood info, and as described in #1, A/B/E
> >    might not have any either.
> > 
> > 4) The OOD_last_cmt_rev field for deleted paths is
> >    always SVN_INVALID_REVNUM. 
> > 
> > The attached patch attempts to address all of these issues:
> > 
> > 1) ... When we call tweak_statushash() for a dir_baton in 
> > close_directory() we are passing the db->parent_baton, but the ood 
info 
> > for dir_baton is in itself.  So when tweak_statushash() sets the ood 
info 
> > on the parent_baton it just resets all ood fields to the parent's 
existing 
> > values.  To fix this I added a second baton arg to tweak_statushash so 
the 
> > ood info is available.
> 
> This is to let the OOD info bubble up?
> 
> > 2) Following your and Lieven's example from r21685, I set the 
> > eb->anchor_status->ood_* fields directly in close_directory().
> 
> +1.  This chunk of the patch can be committed at any time.  This patch
> is so big that I might lean towards multiple commits, even though it
> slightly complicates the backport.

Glad to break it up appropriately when the time comes.

> > 3) This is closely related to #1: We only call tweak_statushash() in 
> > close_directory() in the first place if a dir is added, deleted, or 
has a 
> > prop change.  I changed this to also make the call if 
> > dir_baton->ood_last_commit_rev is set to anything but the default 
value of 
> > SVN_INVALID_REVNUM. 
> 
> Okay.
> 
> > 4) This was a bit trickier to solve.  I saw no straightforward way to 
find 
> > out the revision an out of date path was deleted within libsvn_wc, so 
I 
> > added a new function to libsvn_repos: svn_repos_deleted_rev().  I then 
use 
> > this new function in reporter.c's delta_dirs() and update_entry() to 
get 
> > the rev a path was deleted and then pass this info as the revision 
> > argument to the svn_delta_editor_t's delete_entry() callback.
> 
> The algorithm looks functional, but I'd like to know more about the
> performance profile of this new code.  See my comments further on down
> below in its implementation.
> 
> > I'm not 
> > totally clear from this comment how the revision arg is supposed to be 

> > used, currently we just pass SVN_INVALID_REVNUM.  The doc string for 
> > delete_entry() says:
> > 
> >   /** Remove the directory entry named @a path, a child of the 
directory
> >    * represented by @a parent_baton.  If @a revision is set, it is 
used as 
> > a
> >    * sanity check to ensure that you are removing the revision of @a 
path
> >    * that you really think you are.
> >    *
> >    * All allocations should be performed in @a pool.
> >    */
> >
> > But I'm not sure what a "sanity check" implies exactly.  I'd like to 
think 
> > that what I'm doing (passing the rev the path was deleted) is 
compatible 
> > with the doc string, but it seems *too* easy that an essentially 
unused 
> > arg already exists and does just what I require :-)
> 
> The "If @a revision is set, it is used as a sanity check" part of this
> gives me concern that returning an incorrect "deleted rev" might have
> side-effects.  What type of sanity check is performed on the revnum,
> and by what code?

In IRC regarding the above:

[16:29] pburba: I'd need to look at every implementation of 
svn_delta_editor_t's delete_entry() and see what it does with the revision 
arg yes?
[16:32] dlr: Perhaps only the "client"-side implementations?
[16:38] pburba: I don't understand enough about that editors to answer 
that...why only client side?
[16:38] dlr: I don't think that any FS editor consumes the libsvn_repos 
module.
[16:39] dlr: Direct and indirect users of that module would be affected.
[16:39] dlr: e.g. libsvn_ra_local, mod_dav_svn, svnserve, libsvn_client, 
libsvn_wc?
[16:46] pburba: You've lost me I'm afraid...
[16:47] dlr: All I'm saying is that you ought to be able to ignore 
libsvn_fs, as far as usage of that "deleted rev" is concerned.

I realized I needed to understand production and consumption of tree 
deltas better.  I still have only a basic grasp of this, but here is how I 
understand the problem:

If the following are all true...

  1) An svn operation causes a tree delta
     to be produced in libsvn_repos.

  2) libsvn_repos/reporter.c's delta_dirs()
     or update_entry() is used to construct
     the tree delta.

  3) The delta's producer calls the
     svn_delta_editor_t's delete_entry()
     callback from delta_dirs()
     or update_dirs(). 

  4) The delete_entry() implementation isn't
     just a wrapper around the actual
     implementation callback in the edit baton.
     e.g. commit_util.c(1483):delete_entry()

  5) The implementation isn't just the network
     layer consuming the edit operation and
     passing it to it's ultimate consumer. 
     e.g. libsvn_ra_svn\editor.c(157):ra_svn_delete_entry() 

...Then with my patch there is now the possibility of unintended 
side-effects as the delta consumer's delete_entry() implementations are 
now getting a revision argument other than SVN_INVALID_REVNUM.  So if I 
understand the risk correctly, there are 6 delete_entry() implmentations 
that meet the above criteria and are potential risks:

libsvn_client\repos_diff.c(428):delete_entry(const char *path, 
svn_revnum_t base_revision, void *parent_baton, apr_pool_t *pool)

libsvn_client\repos_diff_summarize.c(119):delete_entry(const char *path, 
svn_revnum_t base_revision, void *parent_baton, apr_pool_t *pool)

libsvn_wc\diff.c(980):delete_entry(const char *path, svn_revnum_t 
base_revision, void *parent_baton, apr_pool_t *pool)

libsvn_wc\update_editor.c(1037):delete_entry(const char *path, 
svn_revnum_t revision, void *parent_baton, apr_pool_t *pool)

libsvn_delta\default_editor.c(75):delete_entry(const char *path, 
svn_revnum_t revision, void *parent_baton, apr_pool_t *pool)

All six implmentations are all safe as none use the svn_revnum_t argument. 

 
> > Assuming my changes to libsvn_repos are ok, there is still the problem 
of 
> > older servers not providing this information.  My solution is to look 
to 
> > deleted path's parent folder and use it's out of date info.  There is 
some 
> > chance that this info is actually correct, and even if it is wrong it 
> > would still be valid in the sense that it is a revision higher than 
the 
> > revision the path was deleted.
> 
> See my comment immediately above.
> 
> > Re the Subclipse Synchronize use case this 
> > is at least somewhat helpful.  And since the present alternative is no 

> > info at all (i.e. SVN_INVALID_REVNUM) this slightly flawed alternative 

> > seems at least a partial improvement.
> ...
> > Did you create a test for this?  I didn't see anything committed, but 
if 
> > you have any significant work in progress could I have a copy? Problem 
#1 
> > above highlights the need for a real test of this functionality. 
> 
> Looks like I did not create a test.  Paul mentioned on IRC that he has
> one in progress.

I added a second status test to 
subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/tests/BasicTests.java, 
it's included in the attached patch.  It replicates what my temporary test 
did, but actually *tests* the expected results rather than relying on a 
manual check.
 
> ...
> > P.S. In testing this patch I used the following hack to make the debug 

> > version of svn status show all the OOD info on the command line.  It 
also 
> > adds a temporary status test to setup various out of date scenarios. 
It 
> > may prove useful to anyone who tries out this patch:
> 
> This status test won't be permanent, then?  Do we have any OOD info
> tests?

No, because it relies on the hack I made to the debug version of svn 
status, so it isn't really usefil.  The JavaHL test will replace it (see 
above).
 
> > Index: subversion/include/svn_repos.h
> > ===================================================================
> > --- subversion/include/svn_repos.h   (revision 21732)
> > +++ subversion/include/svn_repos.h   (working copy)
> > @@ -835,6 +835,24 @@
> >                 apr_pool_t *pool);
> > 
> > 
> > +/**
> > + * Set @a *deleted to the revision @a path was most recently deleted
> > + * in @a fs, within the inclusive revision bounds set by @a start and
> > + * @end.  If @a path does not exist in @a root within the @a start 
and
> 
> This doc string should be "@a end" rather than "@end".

Fixed.
 
> > + * @a end bounds, or is not deleted within those bounds, set @a 
*deleted
> > + * to SVN_INVALID_REVNUM.  Use @a pool for memory allocation.
> > + *
> > + * @since New in 1.5.
> > + */
> > +svn_error_t *
> > +svn_repos_deleted_rev(svn_fs_t *fs,
> > +                      const char *path,
> > +                      svn_revnum_t start,
> > +                      svn_revnum_t end,
> > +                      svn_revnum_t *deleted,
> ...
> > Index: subversion/libsvn_repos/rev_hunt.c
> > ===================================================================
> > --- subversion/libsvn_repos/rev_hunt.c   (revision 21732)
> > +++ subversion/libsvn_repos/rev_hunt.c   (working copy)
> ...
> > +svn_error_t *
> > +svn_repos_deleted_rev(svn_fs_t *fs,
> > +                      const char *path,
> > +                      svn_revnum_t start,
> > +                      svn_revnum_t end,
> > +                      svn_revnum_t *deleted,
> > +                      apr_pool_t *pool)
> > +{
> > +  apr_pool_t *subpool = svn_pool_create(pool);
> 
> We can delay initialization of this sub-pool until after our START and
> END have been checked.

Moved...
 
> > +  svn_fs_root_t *root;
> > +  svn_revnum_t curr_rev;
> > +  *deleted = SVN_INVALID_REVNUM;
> > +
> > +  /* Validate the revision range. */
> > +  if (! SVN_IS_VALID_REVNUM(start))
> > +    return svn_error_createf 
> > +      (SVN_ERR_FS_NO_SUCH_REVISION, 0, 
> > +       _("Invalid start revision %ld"), start);
> > +  if (! SVN_IS_VALID_REVNUM(end))
> > +    return svn_error_createf 
> > +      (SVN_ERR_FS_NO_SUCH_REVISION, 0, 
> > +       _("Invalid end revision %ld"), end);
> 
> Init sub-pool here.

...to here.
 
> > +  /* Ensure that the input is ordered. */
> > +  if (start > end)
> > +    {
> > +      svn_revnum_t tmprev = start;
> > +      start = end;
> > +      end = tmprev;
> > +    }
> > +
> > +  curr_rev = end;
> > +
> > +  while (curr_rev >= start)
> > +    {
> > +      svn_node_kind_t kind_p;
> > +      svn_pool_clear(subpool);
> > +
> > +      /* Get a revision root for curr_rev. */
> > +      SVN_ERR(svn_fs_revision_root(&root, fs, curr_rev, subpool));
> > +
> > +      SVN_ERR(svn_fs_check_path(&kind_p, root, path, subpool));
> > +
> > +      if (kind_p == svn_node_none)
> > +        {
> > +          curr_rev--;
> > +        }
> > +      else
> > +        {
> > +          if (curr_rev != end)
> > +            *deleted = curr_rev + 1;
> > +          break;
> > +        }
> > +  }
> 
> This "while" loop seems rather performance-intensive.  On a repository
> with a long history (hundreds of thousands of revs+), how much time
> does this take?

Unfortunately it is very slow.  I created a greek test repos with 53600 
revisions.  In r2 path /A/D/H/omega is deleted.  The remaining revs are 
simply text mods to the other files.  I checked out r1 via file:// and ran 
"time svn st -u -v" in Cygwin:

With all of my patch in place except the two calls to 
svn_repos_deleted_rev() in reporter.c, the time to run the command was 
less than a second:

  real    0m0.390s
  user    0m0.015s
  sys     0m0.015s

With the the calls to svn_repos_deleted_rev() active the command took well 
over 5 minutes:

  real    5m44.236s
  user    0m0.015s
  sys     0m0.062s

This algorithm could be improved a bit to run much faster for less 
contrived cases, using a binary search to narrow the rev range before 
beginning the linear search for example.  But ultimately everything I've 
been able to think of is still O(n).  Is it even worth trying to improve 
the performance...

> If it does take a noticable amount of time, would we be able to speed
> things up by adding a svn_fs_deleted_rev() API which might be able to
> produce results in a more efficient manner?

...or should I take another route altogether.  I'm glad to attempt this, 
but I'm not even sure where to start.  Has anyone thought about this 
problem in the past?  Even a vague idea how to get started would be 
helpful.
 
> > +
> > +  svn_pool_destroy(subpool);
> > +  return SVN_NO_ERROR;
> > +}
> ...
> > Index: subversion/libsvn_wc/status.c
> > ===================================================================
> > --- subversion/libsvn_wc/status.c   (revision 21732)
> > +++ subversion/libsvn_wc/status.c   (working copy)
> > @@ -999,18 +999,42 @@
> >  /* Look up the key PATH in BATON->STATII.  IS_DIR_BATON indicates 
whether
> >     baton is a struct *dir_baton or struct *file_baton.  If the 
> value doesn't
> >     yet exist, and the REPOS_TEXT_STATUS indicates that this is an
> > -   addition, create a new status struct using the hash's pool.  Merge
> > -   REPOS_TEXT_STATUS and REPOS_PROP_STATUS into the status 
structure's
> > +   addition, create a new status struct using the hash's pool.
> > +
> > +   If IS_DIR_BATON is true, OOD_BATON is a *dir_baton cotaining the 
ood
> > +   information we want to set in BATON.  This is necessary because 
this
> > +   function tweaks the status of out of date directories (BATON 
> == OOD_BATON)
> > +   and out of date directories' parents (BATON == 
OOD_BATON->parent_baton).
> > +   In the latter case OOD_BATON contains the ood info we want to 
bubble up
> > +   to ancestor directories so these accurately reflect the fact they 
have
> > +   and ood descendent.
> > +
> > +   Merge REPOS_TEXT_STATUS and REPOS_PROP_STATUS into the status 
> structure's
> >     "network" fields.
> > +
> > +   Iff IS_DIR_BATON is true, DELETED_REV is used as follows, 
otherwise it
> > +   is ignored:
> > + 
> > +       If REPOS_TEXT_STATUS is svn_wc_status_deleted then DELETED_REV 
is
> > +       optionally the revision path was deleted, in all other cases 
it must
> > +       be set to SVN_INVALID_REVNUM.  If DELETED_REV is not
> > +       SVN_INVALID_REVNUM and REPOS_TEXT_STATUS is 
svn_wc_status_deleted,
> > +       then use DELETED_REV to set PATH's ood_last_cmt_rev field in 
BATON.
> > +       If DELETED_REV is SVN_INVALID_REVNUM and REPOS_TEXT_STATUS is
> > +       svn_wc_status_deleted, set PATH's ood_last_cmt_rev to it's 
parent's
> > +       ood_last_cmt_rev value - see comment below.
> > +
> >     If a new struct was added, set the repos_lock to REPOS_LOCK. */
> >  static svn_error_t *
> >  tweak_statushash(void *baton,
> > +                 void *ood_baton,
> 
> I can't quite put my finger on why, but I'd like a different name for
> OOD_BATON if we can come up with something more specific.

How about this_dir_baton?

~~~~~

Updated log and patch attached.

Paul B.

[[[
Further improvements to status information on working copy items which
are out of date compared to the repository.

Follow-up to r16344 (and its subsequent follow-ups: r16494, 16784, 16796,
16829, and 17844).

* 
subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/Status.java
  (getReposLastCmtKind): New accesor method.

* subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/
  tests/BasicTests.java
  (testPathValidation): New test of SVNClient.status' out of date
  functionality

* subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/tests/
  SVNTests.java
  (OneTest.checkStatus): Add overload which accepts a "checkServer" flag
  which indicates whether or not to check the WC's status against the
  repository.

* subversion/bindings/java/javahl/src/org/tigris/subversion/javahl/tests/
  WC.java
  (org.tigris.subversion.javahl.Revision, java.util.Date): Import.
  (setAllWorkingCopyRevision): New method which sets the revision number
  on all paths in the WC. 
  (setItemReposLastCmtRevision, setItemReposLastCmtAuthor,
  setItemReposLastCmtDate, setItemReposKind): New setter methods for the
  four new Item.repos* fields. 
  (setItemOODInfo): New method which wraps the four setItemRepos* methods.
  (check): Add overload which accepts a "checkServer" flag which indicates
  whether or not to check the expected vs. actual out-of-date statii 
fields.
  (Item.reposLastCmtRevision, Item.reposLastCmtDate, Item.reposKind,
  Item.reposLastCmtAuthor): New fields representing the WC's out of date
  status.

* subversion/include/svn_repos.h
  (svn_repos_deleted_rev): New function to find the revision a path was
  most recently deleted within a give revision range.

* subversion/libsvn_repos/reporter.c
  (update_entry, delta_dirs): Use the new function svn_repos_deleted_rev()
  to determine the revision deleted paths were deleted and pass this to
  the delete_entry callback rather than defaulting to SVN_INVALID_REVNUM.

* subversion/libsvn_repos/rev_hunt.c
  (svn_repos_deleted_rev): New function definition.

* subversion/libsvn_wc/status.c
  (tweak_statushash): Add second baton argument which contains the out
  of date info for a dir baton when tweaking that baton's parent.  Add
  another argument to identify the revision a path was deleted when
  handling deletes.  When deleting paths: Construct the correct url for
  the path and record deleted path's deleted revision in the path's
  svn_wc_status2_t structure.  For pre-1.5 servers, which don't provide
  the deleted revision, use the parent's last committed rev as a best
  guess.
  (delete_entry, close_file): Supply new args to tweak_statushash()
  calls.
  (close_directory): Tweak status for directories even when the only
  change is that they have and an out of date descendents.  Supply new
  args to tweak_statushash() call.
]]]

Re: [PATCH] Was: Enhancement needed in svn status -u

Posted by Daniel Rall <dl...@collab.net>.
On Mon, 02 Oct 2006, Paul Burba wrote:
....
> > > > > The background on all of this can be found here: 
> > > > > http://svn.haxx.se/dev/archive-2005-08/0257.shtml
...
> Recall that r16344 was the ultimate solution for providing out of date 
> info about a WC and that this originally arose due to problems with the 
> Subclipse Synchronize view that Mark Phippard described in 
> http://svn.haxx.se/subdev/archive-2005-08/0020.shtml.  Recently Mark asked 
> me to look into a several problems/limitations that still adversely affect 
> the Synchronize view, these are: 
> 
> 1) Unless a directory has a pending child
>    addition/deletion, or there is a prop change on the
>    directory itself, the out of date (ood_*) fields
>    in svn_wc_status2_t are not set, or are incorrectly
>    set. 
> 
> 2) The out of date (ood_*) fields for the root of WC
>    are *never* set.
> 
> 3) OOD info never bubbles up beyond an out of date
>    path's immediate parent (if even that).  E.g. if
>    WC/A/B/E/beta is out of date, A, and A/B don't
>    have any ood info, and as described in #1, A/B/E
>    might not have any either.
> 
> 4) The OOD_last_cmt_rev field for deleted paths is
>    always SVN_INVALID_REVNUM. 
> 
> The attached patch attempts to address all of these issues:
> 
> 1) ... When we call tweak_statushash() for a dir_baton in 
> close_directory() we are passing the db->parent_baton, but the ood info 
> for dir_baton is in itself.  So when tweak_statushash() sets the ood info 
> on the parent_baton it just resets all ood fields to the parent's existing 
> values.  To fix this I added a second baton arg to tweak_statushash so the 
> ood info is available.

This is to let the OOD info bubble up?

> 2) Following your and Lieven's example from r21685, I set the 
> eb->anchor_status->ood_* fields directly in close_directory().

+1.  This chunk of the patch can be committed at any time.  This patch
is so big that I might lean towards multiple commits, even though it
slightly complicates the backport.

> 3) This is closely related to #1: We only call tweak_statushash() in 
> close_directory() in the first place if a dir is added, deleted, or has a 
> prop change.  I changed this to also make the call if 
> dir_baton->ood_last_commit_rev is set to anything but the default value of 
> SVN_INVALID_REVNUM. 

Okay.

> 4) This was a bit trickier to solve.  I saw no straightforward way to find 
> out the revision an out of date path was deleted within libsvn_wc, so I 
> added a new function to libsvn_repos: svn_repos_deleted_rev().  I then use 
> this new function in reporter.c's delta_dirs() and update_entry() to get 
> the rev a path was deleted and then pass this info as the revision 
> argument to the svn_delta_editor_t's delete_entry() callback.

The algorithm looks functional, but I'd like to know more about the
performance profile of this new code.  See my comments further on down
below in its implementation.

> I'm not 
> totally clear from this comment how the revision arg is supposed to be 
> used, currently we just pass SVN_INVALID_REVNUM.  The doc string for 
> delete_entry() says:
> 
>   /** Remove the directory entry named @a path, a child of the directory
>    * represented by @a parent_baton.  If @a revision is set, it is used as 
> a
>    * sanity check to ensure that you are removing the revision of @a path
>    * that you really think you are.
>    *
>    * All allocations should be performed in @a pool.
>    */
>
> But I'm not sure what a "sanity check" implies exactly.  I'd like to think 
> that what I'm doing (passing the rev the path was deleted) is compatible 
> with the doc string, but it seems *too* easy that an essentially unused 
> arg already exists and does just what I require :-)

The "If @a revision is set, it is used as a sanity check" part of this
gives me concern that returning an incorrect "deleted rev" might have
side-effects.  What type of sanity check is performed on the revnum,
and by what code?
 
> Assuming my changes to libsvn_repos are ok, there is still the problem of 
> older servers not providing this information.  My solution is to look to 
> deleted path's parent folder and use it's out of date info.  There is some 
> chance that this info is actually correct, and even if it is wrong it 
> would still be valid in the sense that it is a revision higher than the 
> revision the path was deleted.

See my comment immediately above.

> Re the Subclipse Synchronize use case this 
> is at least somewhat helpful.  And since the present alternative is no 
> info at all (i.e. SVN_INVALID_REVNUM) this slightly flawed alternative 
> seems at least a partial improvement.
...
> Did you create a test for this?  I didn't see anything committed, but if 
> you have any significant work in progress could I have a copy?  Problem #1 
> above highlights the need for a real test of this functionality. 

Looks like I did not create a test.  Paul mentioned on IRC that he has
one in progress.

...
> P.S. In testing this patch I used the following hack to make the debug 
> version of svn status show all the OOD info on the command line.  It also 
> adds a temporary status test to setup various out of date scenarios.  It 
> may prove useful to anyone who tries out this patch:

This status test won't be permanent, then?  Do we have any OOD info
tests?

> Index: subversion/include/svn_repos.h
> ===================================================================
> --- subversion/include/svn_repos.h	(revision 21732)
> +++ subversion/include/svn_repos.h	(working copy)
> @@ -835,6 +835,24 @@
>                 apr_pool_t *pool);
>  
>  
> +/**
> + * Set @a *deleted to the revision @a path was most recently deleted
> + * in @a fs, within the inclusive revision bounds set by @a start and
> + * @end.  If @a path does not exist in @a root within the @a start and

This doc string should be "@a end" rather than "@end".

> + * @a end bounds, or is not deleted within those bounds, set @a *deleted
> + * to SVN_INVALID_REVNUM.  Use @a pool for memory allocation.
> + *
> + * @since New in 1.5.
> + */
> +svn_error_t *
> +svn_repos_deleted_rev(svn_fs_t *fs,
> +                      const char *path,
> +                      svn_revnum_t start,
> +                      svn_revnum_t end,
> +                      svn_revnum_t *deleted,
...
> Index: subversion/libsvn_repos/rev_hunt.c
> ===================================================================
> --- subversion/libsvn_repos/rev_hunt.c	(revision 21732)
> +++ subversion/libsvn_repos/rev_hunt.c	(working copy)
...
> +svn_error_t *
> +svn_repos_deleted_rev(svn_fs_t *fs,
> +                      const char *path,
> +                      svn_revnum_t start,
> +                      svn_revnum_t end,
> +                      svn_revnum_t *deleted,
> +                      apr_pool_t *pool)
> +{
> +  apr_pool_t *subpool = svn_pool_create(pool);

We can delay initialization of this sub-pool until after our START and
END have been checked.

> +  svn_fs_root_t *root;
> +  svn_revnum_t curr_rev;
> +  *deleted = SVN_INVALID_REVNUM;
> +
> +  /* Validate the revision range. */
> +  if (! SVN_IS_VALID_REVNUM(start))
> +    return svn_error_createf 
> +      (SVN_ERR_FS_NO_SUCH_REVISION, 0, 
> +       _("Invalid start revision %ld"), start);
> +  if (! SVN_IS_VALID_REVNUM(end))
> +    return svn_error_createf 
> +      (SVN_ERR_FS_NO_SUCH_REVISION, 0, 
> +       _("Invalid end revision %ld"), end);

Init sub-pool here.

> +  /* Ensure that the input is ordered. */
> +  if (start > end)
> +    {
> +      svn_revnum_t tmprev = start;
> +      start = end;
> +      end = tmprev;
> +    }
> +
> +  curr_rev = end;
> +
> +  while (curr_rev >= start)
> +    {
> +      svn_node_kind_t kind_p;
> +      svn_pool_clear(subpool);
> +
> +      /* Get a revision root for curr_rev. */
> +      SVN_ERR(svn_fs_revision_root(&root, fs, curr_rev, subpool));
> +
> +      SVN_ERR(svn_fs_check_path(&kind_p, root, path, subpool));
> +
> +      if (kind_p == svn_node_none)
> +        {
> +          curr_rev--;
> +        }
> +      else
> +        {
> +          if (curr_rev != end)
> +            *deleted = curr_rev + 1;
> +          break;
> +        }
> +  }

This "while" loop seems rather performance-intensive.  On a repository
with a long history (hundreds of thousands of revs+), how much time
does this take?

If it does take a noticable amount of time, would we be able to speed
things up by adding a svn_fs_deleted_rev() API which might be able to
produce results in a more efficient manner?


> +
> +  svn_pool_destroy(subpool);
> +  return SVN_NO_ERROR;
> +}
...
> Index: subversion/libsvn_wc/status.c
> ===================================================================
> --- subversion/libsvn_wc/status.c	(revision 21732)
> +++ subversion/libsvn_wc/status.c	(working copy)
> @@ -999,18 +999,42 @@
>  /* Look up the key PATH in BATON->STATII.  IS_DIR_BATON indicates whether
>     baton is a struct *dir_baton or struct *file_baton.  If the value doesn't
>     yet exist, and the REPOS_TEXT_STATUS indicates that this is an
> -   addition, create a new status struct using the hash's pool.  Merge
> -   REPOS_TEXT_STATUS and REPOS_PROP_STATUS into the status structure's
> +   addition, create a new status struct using the hash's pool.
> +
> +   If IS_DIR_BATON is true, OOD_BATON is a *dir_baton cotaining the ood
> +   information we want to set in BATON.  This is necessary because this
> +   function tweaks the status of out of date directories (BATON == OOD_BATON)
> +   and out of date directories' parents (BATON == OOD_BATON->parent_baton).
> +   In the latter case OOD_BATON contains the ood info we want to bubble up
> +   to ancestor directories so these accurately reflect the fact they have
> +   and ood descendent.
> +
> +   Merge REPOS_TEXT_STATUS and REPOS_PROP_STATUS into the status structure's
>     "network" fields.
> +
> +   Iff IS_DIR_BATON is true, DELETED_REV is used as follows, otherwise it
> +   is ignored:
> +   
> +       If REPOS_TEXT_STATUS is svn_wc_status_deleted then DELETED_REV is
> +       optionally the revision path was deleted, in all other cases it must
> +       be set to SVN_INVALID_REVNUM.  If DELETED_REV is not
> +       SVN_INVALID_REVNUM and REPOS_TEXT_STATUS is svn_wc_status_deleted,
> +       then use DELETED_REV to set PATH's ood_last_cmt_rev field in BATON.
> +       If DELETED_REV is SVN_INVALID_REVNUM and REPOS_TEXT_STATUS is
> +       svn_wc_status_deleted, set PATH's ood_last_cmt_rev to it's parent's
> +       ood_last_cmt_rev value - see comment below.
> +
>     If a new struct was added, set the repos_lock to REPOS_LOCK. */
>  static svn_error_t *
>  tweak_statushash(void *baton,
> +                 void *ood_baton,

I can't quite put my finger on why, but I'd like a different name for
OOD_BATON if we can come up with something more specific.

>                   svn_boolean_t is_dir_baton,
>                   svn_wc_adm_access_t *adm_access,
>                   const char *path,
>                   svn_boolean_t is_dir,
>                   enum svn_wc_status_kind repos_text_status,
>                   enum svn_wc_status_kind repos_prop_status,
> +                 svn_revnum_t deleted_rev,
>                   svn_lock_t *repos_lock)
>  {
>    svn_wc_status2_t *statstruct;
> @@ -1064,20 +1088,50 @@
>    /* Copy out of date info. */
>    if (is_dir_baton)
>      {
> -      struct dir_baton *b = baton;
> +      struct dir_baton *b = ood_baton;
> +
>        if (b->url)
> -        statstruct->url = apr_pstrdup(pool, b->url);
> -      statstruct->ood_kind = b->ood_kind;
> -      /* The last committed rev, date, and author for deleted items
> +        {
> +          if (statstruct->repos_text_status == svn_wc_status_deleted)
> +            {
> +              /* When deleting PATH, BATON is for PATH's parent,
> +                 so we must construct PATH's real statstruct->url. */
> +              statstruct->url =
> +                svn_path_url_add_component(b->url,
> +                                           svn_path_basename(path, pool),
> +                                           pool);
> +            }
> +          else
> +            statstruct->url = apr_pstrdup(pool, b->url);
> +        }
> +
> +      /* The last committed date, and author for deleted items
>           isn't available. */
> -      if (statstruct->repos_text_status != svn_wc_status_deleted)
> +      if (statstruct->repos_text_status == svn_wc_status_deleted)
>          {
> +          statstruct->ood_kind = is_dir ? svn_node_dir : svn_node_file;
> +          
> +          /* Pre 1.5 servers don't provide the revision a path was deleted.
> +             So we punt and use the last committed revision of the path's
> +             parent, which has some chance of being correct.  At worse it
> +             is a higher revision than the path was deleted, but this is
> +             better than nothing... */
> +          if (deleted_rev == SVN_INVALID_REVNUM)
> +            statstruct->ood_last_cmt_rev =
> +              ((struct dir_baton *) baton)->ood_last_cmt_rev;
> +          else
> +            statstruct->ood_last_cmt_rev = deleted_rev;
> +        }
> +      else
> +        {
> +          statstruct->ood_kind = b->ood_kind;
>            statstruct->ood_last_cmt_rev = b->ood_last_cmt_rev;
>            statstruct->ood_last_cmt_date = b->ood_last_cmt_date;
>            if (b->ood_last_cmt_author)
>              statstruct->ood_last_cmt_author =
>                apr_pstrdup(pool, b->ood_last_cmt_author);
>          }
> +
>      }
>    else
>      {
> @@ -1468,17 +1522,18 @@
>  
>    SVN_ERR(svn_wc_entries_read(&entries, adm_access, FALSE, pool));
>    if (apr_hash_get(entries, hash_key, APR_HASH_KEY_STRING))
> -    SVN_ERR(tweak_statushash(db, TRUE, eb->adm_access,
> +    SVN_ERR(tweak_statushash(db, db, TRUE, eb->adm_access,
>                               full_path, kind == svn_node_dir,
> -                             svn_wc_status_deleted, 0, NULL));
> +                             svn_wc_status_deleted, 0, revision, NULL));
>  
>    /* Mark the parent dir -- it lost an entry (unless that parent dir
>       is the root node and we're not supposed to report on the root
>       node).  */
>    if (db->parent_baton && (! *eb->target))
> -    SVN_ERR(tweak_statushash(db->parent_baton, TRUE, eb->adm_access,
> +    SVN_ERR(tweak_statushash(db->parent_baton, db, TRUE, eb->adm_access,
>                               db->path, kind == svn_node_dir,
> -                             svn_wc_status_modified, 0, NULL));
> +                             svn_wc_status_modified, 0, SVN_INVALID_REVNUM,
> +                             NULL));
>  
>    return SVN_NO_ERROR;
>  }
> @@ -1560,8 +1615,10 @@
>    struct edit_baton *eb = db->edit_baton;
>    svn_wc_status2_t *dir_status = NULL;
>    
> -  /* If nothing has changed, return. */
> -  if (db->added || db->prop_changed || db->text_changed)
> +  /* If nothing has changed and directory has no out of
> +     date descendents, return. */
> +  if (db->added || db->prop_changed || db->text_changed
> +      || db->ood_last_cmt_rev != SVN_INVALID_REVNUM)
>      {

This additional check fixes #3?

...
> +          /* If the root dir is out of date set the ood info directly too. */
> +          if (db->ood_last_cmt_rev != eb->anchor_status->entry->revision)
> +            {
> +              eb->anchor_status->ood_last_cmt_rev = db->ood_last_cmt_rev;
> +              eb->anchor_status->ood_last_cmt_date = db->ood_last_cmt_date;
> +              eb->anchor_status->ood_kind = db->ood_kind;
> +              eb->anchor_status->ood_last_cmt_author =
> +                apr_pstrdup(pool, db->ood_last_cmt_author);
> +            }

This fixes #2?

...