Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2004/06/01 02:51:21 UTC

Re: PATCH: Perform multi-file operations in sorted order

On Mon, 2004-05-31 at 18:34, Julian Foad wrote:
> I have as yet failed to get the hardest part working, which is sorting
> the combination of local and repository information for "svn diff
> -rX[:Y]" and "svn status --show-updates".  The crux of this seems to
> be sorting the various add/change/delete operations within
> subversion/libsvn_repos/reporter.c (delta_dirs), but when I try to do
> that various things fail.  It may be theoretically impossible but I do
> not yet see any reason why.  I may try that again and/or request help
> on it later, but, even without it, having local operations sorted
> seems very nice to me.

Well, it's possible, but it's not easy, so I'm not sure if it's worth
it.  There are two big cans of worms:

  #1: Instead of traversing the report information followed by the
target information followed by the source information, we have to
traverse the (sorted) source and target information at the same time as
the report (asking, at each step, which comes next lexically: a source
item, a target item, or a report item).  And we have to behave correctly
if the client's report isn't sorted (because the report could have come
from a 1.0 client).  (By "correctly" I mean we have to display all the
right output, but not necessarily in alphabetical order.)

  #2: Frequently, on the libsvn_wc side of things, the editor's
close_directory() handler will process everything the server didn't
mention.  For instance, "svn st -u" will (I presume) report on status
actions received from the server and then, when the directory is closed,
report on locally modified or added files not mentioned by the server
(and also recurse into subdirectories not mentioned by the server).  All
these editors would have to be changed to, each time the server mentions
something, iterate over all the local items which alphabetically precede
it.  Again, we have to behave correctly if the server's output isn't
sorted, since it might be a 1.0 server.
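
To make point #1 concrete, here is a minimal sketch in Python (not the C of reporter.c) of the "which comes next lexically" traversal, together with a fallback for an unsorted report. The names merge_three and traversal_order are hypothetical, and the entries are reduced to bare names; the real code would carry revision and path data alongside each name.

```python
def merge_three(source, target, report):
    """Yield (name, (in_source, in_target, in_report)) in lexical order.

    Each input must be a sorted sequence of unique entry names.
    """
    iters = [iter(source), iter(target), iter(report)]
    heads = [next(it, None) for it in iters]
    while any(h is not None for h in heads):
        # Ask which comes next lexically among the three current heads.
        name = min(h for h in heads if h is not None)
        present = tuple(h == name for h in heads)
        yield name, present
        # Advance every list whose head we just consumed.
        for i, hit in enumerate(present):
            if hit:
                heads[i] = next(iters[i], None)


def traversal_order(source, target, report):
    """Return the names to visit, sorted when the report allows it."""
    report = list(report)
    if report != sorted(report):
        # Unsorted report (e.g. from an older client): fall back to the
        # report-then-target-then-source order, visiting each name once.
        return list(dict.fromkeys([*report, *target, *source]))
    return [name for name, _ in merge_three(sorted(source), sorted(target), report)]
```

With a sorted report every name is visited once in alphabetical order; with an unsorted one every name is still visited once, just not alphabetically, which is the weaker guarantee described above.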

Given how hard it is to get it right, I'm not sure if we want to do it
half-assed.  It seems like it could lead to questions, or in the extreme
case, incorrect scripts, when people become used to seeing alphabetical
behavior in most-but-not-all cases.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: PATCH: Perform multi-file operations in sorted order

Posted by Greg Hudson <gh...@MIT.EDU>.
On Wed, 2004-06-02 at 09:41, Mark Benedetto King wrote:
> I think well-ordered reporters could eventually give us streamy tree
> compares.  That would be something.

The report is already guaranteed to be in depth-first order.  Since
we're historically willing to store whole directories in memory, that's
enough to avoid storing the entire report on the server (currently into
a temporary file) if we could start driving the editor before the report
is finished.  But that would require some serious protocol engineering,
both in ra_dav and ra_svn.



Re: PATCH: Perform multi-file operations in sorted order

Posted by Mark Benedetto King <mb...@lowlatency.com>.
On Tue, Jun 01, 2004 at 09:30:04PM -0400, Greg Hudson wrote:
> 
> I wouldn't bet on that.  Historically we're pretty good at punting hard
> stuff unless it solves a big problem.
> 

I think well-ordered reporters could eventually give us streamy tree
compares.  That would be something.

--ben




Re: PATCH: Perform multi-file operations in sorted order

Posted by Julian Foad <ju...@btopenworld.com>.
Greg Hudson wrote:
> On Tue, 2004-06-01 at 10:10, Julian Foad wrote:
> 
>>Probably I am failing to understand something about the meaning of
>>the report information and how it interacts with the local
>>information.
> 
> I'll explain in greater detail than you probably require.
[...]

Thanks - that's a useful level of detail.

I'll get back to it when I can.

- Julian


Re: PATCH: Perform multi-file operations in sorted order

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2004-06-01 at 10:10, Julian Foad wrote:
> Probably I am failing to understand something about the meaning of
> the report information and how it interacts with the local
> information.

I'll explain in greater detail than you probably require.

The report communicates the state of the working copy (minus local
mods), which is used as the source of the comparison.  It essentially
overlays s_entries, and sometimes t_entries as well.

For example, suppose I have a working copy at rev 5, except that (due to
a prior commit) foo.c is at rev 6.  And, due to a switch operation,
bar.c is actually representing a different URL within the repository. 
The report operations will look like:

  set_path("", 5)
  set_path("foo.c", 6)
  link_path("bar.c", "other-url", 5)

What we used to do is construct a database transaction representing the
state of the working copy; we'd start with the anchor at rev 5, then
replace the foo.c dirent with rev 6 of that file, and then replace the
bar.c dirent with whatever other-url is at rev 5.  Moreover, because
updates preserve switches, we'd also construct a database transaction
representing the target; we'd start with the anchor directory at rev
HEAD, and replace the bar.c dirent with whatever other-url is at rev
HEAD.

What we do now is dynamically overlay the source and target with the
report information (we now behave like "sed" rather than "ed").  We
still make the same comparisons as we would have made with the old code,
but we don't set up source and target transactions to do so.
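
As a rough illustration of the overlay idea, the following Python sketch resolves what a single entry is compared against, using the three report operations from the example. It is only a model: the dictionary representation of the report, the function name resolve, and the "anchor-url" and head arguments are assumptions made for illustration, and it ignores subdirectories, deleted paths and everything else the real reporter handles.

```python
def resolve(name, report, anchor_url, head):
    """Return ((src_url, src_rev), (tgt_url, tgt_rev)) for one entry.

    `report` maps a path to (rev, link_url); link_url is None for
    set_path entries and a URL for link_path entries.
    """
    base_rev, _ = report[""]              # set_path("", rev) for the anchor
    rev, link_url = report.get(name, (base_rev, None))
    # A switched entry keeps its own URL on both sides of the comparison.
    url = link_url or f"{anchor_url}/{name}"
    return (url, rev), (url, head)


# The three report operations from the example above.
report = {
    "": (5, None),                # set_path("", 5)
    "foo.c": (6, None),           # set_path("foo.c", 6)
    "bar.c": (5, "other-url"),    # link_path("bar.c", "other-url", 5)
}
```

Calling resolve("bar.c", report, "anchor-url", head) pairs other-url at rev 5 with other-url at the target revision, while an entry the report never mentions inherits rev 5 from the anchor. Nothing is built up front; each lookup consults the report as the traversal reaches that entry.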

+  /* Assume that for each report entry there is a corresponding target entry
+     (not sure about this), but not vice versa. */

Well, this is definitely not a good assumption.  I don't know if it's
your only problem.

> If we can't get it right, then yes, we don't want a half-solution
> in the long term.  What I was thinking is that this is all doable
> and should all be done in the long term, and seeing most output
> sorted now will show us how valuable it is and give us a bit more
> incentive to do the hard cases.  I'll do some more work on the
> hard cases later.

I wouldn't bet on that.  Historically we're pretty good at punting hard
stuff unless it solves a big problem.

(So, -0 on checking in the easy stuff without solving the hard cases. 
But not -1.)



Re: PATCH: Perform multi-file operations in sorted order

Posted by Julian Foad <ju...@btopenworld.com>.
Greg Hudson wrote:
> On Mon, 2004-05-31 at 18:34, Julian Foad wrote:
> 
>>I have as yet failed to get the hardest part working, which is sorting
>>the combination of local and repository information for "svn diff
>>-rX[:Y]" and "svn status --show-updates".  The crux of this seems to
>>be sorting the various add/change/delete operations within
>>subversion/libsvn_repos/reporter.c (delta_dirs), but when I try to do
>>that various things fail.  It may be theoretically impossible but I do
>>not yet see any reason why.  I may try that again and/or request help
>>on it later, but, even without it, having local operations sorted
>>seems very nice to me.
> 
> Well, it's possible, but it's not easy, so I'm not sure if it's worth
> it.  There are two big cans of worms:

Well, it's my number one itch, so I'm at least going to open the cans and dig around.

>   #1: Instead of traversing the report information followed by the
> target information followed by the source information, we have to
> traverse the (sorted) source and target information at the same time as
> the report (asking, at each step, which comes next lexically: a source
> item, a target item, or a report item).

This seems conceptually easy to do within reporter.c:delta_dirs().  I have tried in the attached patch.  As far as I can tell, I have implemented what you say, but many regression tests fail seriously (such as a basic checkout failing to complete).  I have tried coding it a few times in slightly different ways, so I don't think it's just a careless bug.  Probably I am failing to understand something about the meaning of the report information and how it interacts with the local information.

If you can shed any light on my attempt that would be great.  The "#if 0" part sorts and merges the source and target entries but still processes all of the report info first.  I believe that worked (though it may not be in a compilable state at the moment) in the sense that it proceeded in partially sorted order and passed regression tests.  I'll try it again some time to be sure.

>  And we have to behave correctly
> if the client's report isn't sorted (because the report could have come
> from a 1.0 client).  (By "correctly" I mean we have to display all the
> right output, but not necessarily in alphabetical order.)

Yes - I would make sure of that.  That's not difficult.

>   #2: Frequently, on the libsvn_wc side of things, the editor's
> close_directory() handler will process everything the server didn't
> mention.  For instance, "svn st -u" will (I presume) report on status
> actions received from the server and then, when the directory is closed,
> report on locally modified or added files not mentioned by the server
> (and also recurse into subdirectories not mentioned by the server).  All
> these editors would have to be changed to, each time the server mentions
> something, iterate over all the local items which alphabetically precede
> it.

Right.  I haven't attacked this yet.  That's a fair chunk of work.
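
For reference, the shape of that change can be sketched in Python as follows. This is a toy model, not libsvn_wc code: the class DirStatus and its methods server_mentions and close_directory are hypothetical stand-ins for an editor's per-directory state and callbacks, and "reporting" an item just appends it to a list.

```python
class DirStatus:
    """Toy per-directory state for a status-like editor."""

    def __init__(self, local_names):
        self.pending = sorted(local_names)  # local items not yet reported
        self.output = []

    def _flush_before(self, limit):
        # Report pending local items that sort before `limit`
        # (all of them when limit is None).
        keep = []
        for name in self.pending:
            if limit is None or name < limit:
                self.output.append(("local", name))
            else:
                keep.append(name)
        self.pending = keep

    def server_mentions(self, name):
        # Emit the local items that alphabetically precede this one first.
        self._flush_before(name)
        if name in self.pending:
            self.pending.remove(name)
        self.output.append(("server", name))

    def close_directory(self):
        # Whatever the server never reached is reported at the end.
        self._flush_before(None)
```

If the server's items arrive sorted, the combined output is sorted. If they arrive unsorted (a 1.0 server), each item is still reported exactly once, only not in alphabetical order.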

> Given how hard it is to get it right, I'm not sure if we want to do it
> half-assed.  It seems like it could lead to questions, or in the extreme
> case, incorrect scripts, when people become used to seeing alphabetical
> behavior in most-but-not-all cases.

If we can't get it right, then yes, we don't want a half-solution in the long term.  What I was thinking is that this is all doable and should all be done in the long term, and seeing most output sorted now will show us how valuable it is and give us a bit more incentive to do the hard cases.  I'll do some more work on the hard cases later.

- Julian