
Posted to dev@subversion.apache.org by Ben Collins-Sussman <su...@collab.net> on 2004/03/31 17:46:11 UTC

The "follow copy history" initiative

I wanted to let everyone know about my initiative to make svn
subcommands aware of copy history.  For example, if I run 

  svn cat -r100 foo.c

...and foo.c was at a different URL in r100, svn should automatically
be able to follow that copy/rename history and still show us the proper
file.  At the moment, we get an error about how foo.c's path "doesn't
exist in r100".

This problem is endemic to a bunch of subcommands:  cat, ls, blame,
diff, merge, and maybe others.  It's irritating our users to no end. 
Issue #1093 is the current focal point for this class of bugs.


I see three phases to fixing the problem:

1. write an internal client function which, given a "peg" (rev, path)
coordinate pair, is able to find the same object at a different path in
an earlier revision.

2. make our subcommands use this new client function.

3. [optimization]  make the server do the history searching, not the
client.


Step #1 is mostly done:  Mike Pilato has written this function in the
current patch attached to issue #1093.  I'm testing it now.  It's doing
the "clumsy" thing by calling RA->get_logs() to collect history.

Step #2 is yet to be done.  For some subcommands (like 'cat' or 'ls'),
the upgrade will be easy, no UI changes required.  IIRC, we might need
to have some UI discussions about upgrading diff/merge, due to possible
ambiguities.  Don't wanna go there yet, though.  :-)

Step #3 is something that Shlomi Fish is interested in working on,
possibly on a branch.  The idea is that we add a new RA function to the
bottom of our RA vtable (so as to be 1.1 compliant) which asks the
*repository* to do the history search using the fs_history API.  That
way the client gets back a single targeted answer, rather than slogging
through hundreds of logs.  Once the new RA function exists, we upgrade
the new client function (created in step #1) to use it instead of
RA->get_logs().
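
To make the step #1 idea concrete, here is a minimal sketch in C of
following copy history backwards from a peg.  It is a toy model under
stated assumptions, not the patch attached to issue #1093: the names
`struct copy_record` and `trace_back` are hypothetical, the records stand
in for whatever a log walk such as RA->get_logs() would report, and
deletions and replacements are ignored.

```c
#include <stdio.h>
#include <string.h>

/* One copy event as a log walk might report it (toy model, hypothetical). */
struct copy_record {
    long rev;                   /* revision that committed the copy */
    const char *path;           /* destination of the copy */
    const char *copyfrom_path;  /* source of the copy */
    long copyfrom_rev;          /* revision the source was taken from */
};

/* Given the peg (peg_rev, peg_path), compute the path the same line of
   history had at target_rev, where target_rev <= peg_rev.  RECORDS must be
   sorted newest first.  Returns 0 and fills OUT on success, -1 on bad
   input or if a path would not fit. */
int trace_back(const struct copy_record *records, size_t n,
               long peg_rev, const char *peg_path, long target_rev,
               char *out, size_t out_len)
{
    long cur_rev = peg_rev;
    size_t i;

    if (target_rev > peg_rev || strlen(peg_path) >= out_len)
        return -1;
    strcpy(out, peg_path);

    for (i = 0; i < n; i++) {
        const struct copy_record *r = &records[i];
        size_t plen = strlen(r->path);
        char tmp[1024];
        int len;

        /* Skip copies newer than where we stand, or not newer than the target. */
        if (r->rev > cur_rev || r->rev <= target_rev)
            continue;

        /* Skip copies that created neither our path nor one of its parents. */
        if (strncmp(out, r->path, plen) != 0
            || (out[plen] != '\0' && out[plen] != '/'))
            continue;

        /* Swap the copied prefix for its source and jump to the source rev. */
        len = snprintf(tmp, sizeof tmp, "%s%s", r->copyfrom_path, out + plen);
        if (len < 0 || (size_t)len >= sizeof tmp || (size_t)len >= out_len)
            return -1;
        strcpy(out, tmp);
        cur_rev = r->copyfrom_rev;
    }
    return 0;
}
```

With a single record saying /trunk/bar.c was copied from /trunk/foo.c in
r10, the peg (12, /trunk/bar.c) and a target of r7 resolve to
/trunk/foo.c.  A parent-directory copy is handled by the same prefix swap.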


That's the overview.  Anyone see any horrible problems here?



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: The "follow copy history" initiative

Posted by Chia-Liang Kao <cl...@clkao.org>.

On Wed, Mar 31, 2004 at 10:06:38PM +0200, Shlomi Fish wrote:
> +  while (history && (revision > past_revision))
> +    {
> +      SVN_ERR (svn_fs_history_prev (&new_history, history, TRUE, pool));
> +      history = new_history;
> +      if (history)
> +        {
> +          SVN_ERR (svn_fs_history_location (&path, &revision, history, pool));
> +        }
> +    }

You do want to use a subpool and clear it in the while loop here.

Cheers,
CLK
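
The subpool advice can be sketched with a toy stand-in for an APR pool.
One wrinkle worth showing: the history object from the previous step is
still an input to the next call, so a single scratch pool cleared on every
pass would release it too early; alternating two pools avoids that.
Everything here (`toy_pool`, `history_prev`, `walk_back`) is hypothetical
and only mimics the shape of the quoted loop.  It is not the Subversion or
APR API.

```c
#include <stddef.h>

#define POOL_SLOTS 4

struct step { long rev; };

/* Toy stand-in for a memory pool: a tiny bump allocator. */
struct toy_pool {
    struct step slots[POOL_SLOTS];
    size_t used;
    size_t high_water;  /* most slots ever in use at once */
};

struct step *pool_alloc(struct toy_pool *p)
{
    struct step *s;

    if (p->used == POOL_SLOTS)
        return NULL;
    s = &p->slots[p->used++];
    if (p->used > p->high_water)
        p->high_water = p->used;
    return s;
}

void pool_clear(struct toy_pool *p) { p->used = 0; }

/* Hypothetical analogue of a "previous history step" call: the result is
   allocated in POOL.  Returns NULL at the start of history (or if the toy
   pool is exhausted). */
struct step *history_prev(const struct step *cur, struct toy_pool *pool)
{
    struct step *prev;

    if (cur->rev <= 1)
        return NULL;
    prev = pool_alloc(pool);
    if (prev)
        prev->rev = cur->rev - 1;
    return prev;
}

/* Walk back from start_rev until reaching past_rev or running out of
   history.  Returns the last revision reached. */
long walk_back(long start_rev, long past_rev,
               struct toy_pool *pool_a, struct toy_pool *pool_b)
{
    struct toy_pool *lastpool = pool_a, *currpool = pool_b, *tmp;
    struct step first;
    const struct step *history = &first;
    long revision = start_rev;

    first.rev = start_rev;
    while (history && revision > past_rev) {
        /* The new step goes into currpool; the old one lives in lastpool. */
        history = history_prev(history, currpool);
        if (history)
            revision = history->rev;

        /* The old step is no longer needed: clear its pool, then swap. */
        pool_clear(lastpool);
        tmp = lastpool;
        lastpool = currpool;
        currpool = tmp;
    }
    return revision;
}
```

Because each pool is cleared as soon as its step has been consumed, the
walk never holds more than one step per pool, however long the history is.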


Re: The "follow copy history" initiative

Posted by Branko Čibej <br...@xbc.nu>.

Greg Hudson wrote:

>>Anyway. I propose that the API allows searches in any direction, and to
>>optionally return a list of paths.
>
>-0 on this.  I can see allowing forward search given true rename
>tracking, but multiple results for copies is going well beyond the
>scope of this function ("where is this file as of that rev?"), even if
>we provide that kind of functionality in the future.

O.K., let's agree to disagree on this point. :-)

>>>but "svn_fs_get_node_path" sounds like a simple FS operation.
>>
>>There's no reason why it shouldn't _be_ a simple operation. The fact
>>that our schema currently makes it a total pain is embarrassing, but
>>irrelevant.
>
>When I say "sounds like a simple FS operation," I meant it sounds like
>a simple getter function, like svn_fs_node_id.  A two-dimensional
>search simply can't be that simple; it can be efficient, but it can't
>be a simple retrieval operation in any sane implementation.  So a word
>like "trace" makes more sense than a word like "get".

Oh! All right, I'll buy that. And to make it consistent with current
terminology, "svn_fs_trace_node_locations" seems right -- too long
again, but right... Maybe just "svn_fs_trace_node"?

Dunno.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

> Anyway. I propose that the API allows searches in any direction, and to
> optionally return a list of paths.

-0 on this.  I can see allowing forward search given true rename
tracking, but multiple results for copies is going well beyond the
scope of this function ("where is this file as of that rev?"), even if
we provide that kind of functionality in the future.

>> but "svn_fs_get_node_path" sounds like a simple FS operation.

> There's no reason why it shouldn't _be_ a simple operation. The fact
> that our schema currently makes it a total pain is embarrassing, but
> irrelevant.

When I say "sounds like a simple FS operation," I meant it sounds like
a simple getter function, like svn_fs_node_id.  A two-dimensional
search simply can't be that simple; it can be efficient, but it can't
be a simple retrieval operation in any sane implementation.  So a word
like "trace" makes more sense than a word like "get".


Re: The "follow copy history" initiative

Posted by Branko Čibej <br...@xbc.nu>.

Ben Collins-Sussman wrote:

>There are two approaches here:
>
>   1. as ghudson suggests, run RA->get_logs() on (5, foo.c), and verify
>that the (3, foo.c) is actually the same object.
>
>   2. throw a specific error ("future searching not yet implemented"),
>so that callers can catch the error and make a helpful suggestion ("try
>running 'svn cat -rHEAD' on the URL instead of the working file.")
>
>I'm much more inclined to go the second route for now, because it's much
>simpler.  The first approach seems nice in an academic way, but I doubt
>it will be truly useful to users.
>
I absolutely agree. "sorry not implemented" is all right.

I still think the function should be able to return a list of results,
though.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


Re: The "follow copy history" initiative

Posted by Ben Reser <be...@reser.org>.

On Fri, Apr 02, 2004 at 07:37:18AM +0200, C.A.T.Magic wrote:
> C. Michael Pilato wrote:
> >Greg Hudson <gh...@MIT.EDU> writes:
> >>I'm not saying it's never useful to know what copies of a node were made
> >>in the past or the future.  But the purpose of this function is to aid
> >>commands like "svn cat -r N filename", and that should not traverse
> >>copies going forward.
> >
> >FWIW, I agree with you.
> 
> it's probably VERY useful -
> ( copied from thread "Dropping Subversion" )
> 
> cvs status -v foo
> ===================================================================
> File: foo        Status: Up-to-date
> 
>    Working revision:    1.4
>    Repository revision: 1.4     /CVS/Project/foo,v
>    Sticky Tag:          (none)
>    Sticky Date:         (none)
>    Sticky Options:      (none)
> 
>    Existing Tags:
>         foo-V01-00                    (revision: 1.1)
>         foo-V02-00                    (revision: 1.2)
>         foo-V03-01                    (revision: 1.3)
>         foo-V04-03                    (revision: 1.4)
> 
> this would require an svn command for something like
> 'list all node-copies of this file|dir that currently exist in -rHEAD 
> below /tags or /branches'

This is an entirely different issue.  Has nothing to do with what
they're talking about in this thread.  Please don't mix and match
threads.

-- 
Ben Reser <be...@reser.org>
http://ben.reser.org

"Conscience is the inner voice which warns us somebody may be looking."
- H.L. Mencken


Re: The "follow copy history" initiative

Posted by Martin Tomes <li...@tomes.org>.

Branko Čibej wrote:
> C. Michael Pilato wrote:
>>Greg Hudson <gh...@MIT.EDU> writes:
>>
>>>I'm not saying it's never useful to know what copies of a node were made
>>>in the past or the future. 

The attached perl program finds copy points on a file.  It isn't 
desperately efficient but it does do the job.

-- 
Martin

Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

On Fri, 2004-04-02 at 15:50, Branko Čibej wrote:
> Let me understand this: are you both saying that you'd rather have yet
> another API for full history tracking instead of having this one at
> least support the concept, even if the initial implementation doesn't do
> everything?

More or less.  I'm saying that this particular function, which is the
function that gets used to calculate the <peg-rev, path> X op-rev ->
path transform, should:

  * Follow copies and renames, going backwards
  * Follow renames but not copies, going forwards

Since we have neither rename tracking nor forward pointers in our FS, we
cannot currently follow renames going forwards; so for now, we should
only allow forward movement when the node-copy is in the same place
within the operative rev.

We should never follow copies going forwards because:

  * Getting multiple results is an uncomfortable exceptional case.
  * In the presence of rename tracking, the answer involving only
    renames (or the one involving neither copies nor renames) is
    better than the answer involving copies, so we might generate
    an exception unnecessarily.
  * A copy may indicate a branch, which is not generally appropriate
    to follow going forwards.
  * A copy may indicate a file split or other code operation which
    is not appropriate to follow going forwards.

I have no objections to a future, separate API which allows searching
for all the copies of a node, or all the copies derived from a
node-copy, or both.  (I have concerns about my ability to implement such
an API within my planned libsvn_fs_fs architecture.  One of the key
design points of libsvn_fs_fs is that, once a revision is committed, it
never changes, except of course for the rev props.  So we can't go back
and add forward pointers to the data within earlier revs.  Still, that's
an implementation issue, not an API correctness issue.)

C.A.T.Magic wrote:
> if you do not traverse copies forward,
> will you be able to skip over files that
> are deleted from a revision but resurrected
> through a copy at a later time?

Ideally, I'd say there should be an operation which lets a file be
resurrected without generating a new node-copy.  I don't know offhand
what the UI for that operation should be, though.

> why should a new copy of a file be handled differently
> than a rename of the file or a cp/rm pair?

Because it's a different operation.  A rename preserves the identity of
the object while changing its location, while a copy produces a new (but
related) object.  You're only used to the result being the same because
you're used to a filesystem without history.

Imagine for a moment that we had really good merge support.  Say I want
to perform the following merge:

  BASELINE: contains file "foo"
  MAINLINE: copies "foo" to "bar"; renames "foo" to "baz"
  BRANCH: modifies file "foo"

The best merge result we can expect from a version control tool has the
branch's modifications to "foo" merged into "baz", with "bar"
unchanged.  To get this result, we have to distinguish the rename from
the copy.

(That kind of merge support isn't a total pipe dream; arch has it,
although it doesn't track copies at all.)
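
The rename-versus-copy distinction in the merge example above can be
sketched as a toy model in C.  The assumption, labeled loudly, is that each
object carries an identity that a rename keeps and a copy does not; the
names `struct tree`, `tree_copy`, `tree_rename` and `tree_path_of` are
hypothetical and do not describe the Subversion filesystem schema.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16
#define TOY_PATH_LEN 32

struct entry { char path[TOY_PATH_LEN]; int object_id; };

/* A flat toy "tree": paths mapped to object identities. */
struct tree {
    struct entry entries[MAX_ENTRIES];
    size_t count;
    int next_id;
};

/* Index of PATH in the tree, or -1 if absent. */
int tree_find(const struct tree *t, const char *path)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].path, path) == 0)
            return (int)i;
    return -1;
}

/* Add a brand-new object at PATH.  Returns its id (> 0), or -1 on error. */
int tree_add(struct tree *t, const char *path)
{
    struct entry *e;

    if (t->count == MAX_ENTRIES || strlen(path) >= TOY_PATH_LEN
        || tree_find(t, path) >= 0)
        return -1;
    e = &t->entries[t->count++];
    strcpy(e->path, path);
    e->object_id = ++t->next_id;
    return e->object_id;
}

/* A copy produces a new, related object: DST gets a fresh identity. */
int tree_copy(struct tree *t, const char *src, const char *dst)
{
    if (tree_find(t, src) < 0)
        return -1;
    return tree_add(t, dst);
}

/* A rename keeps the identity and changes only the location. */
int tree_rename(struct tree *t, const char *src, const char *dst)
{
    int i = tree_find(t, src);

    if (i < 0 || strlen(dst) >= TOY_PATH_LEN || tree_find(t, dst) >= 0)
        return -1;
    strcpy(t->entries[i].path, dst);
    return t->entries[i].object_id;
}

/* Where does the object with this identity live now?  NULL if nowhere. */
const char *tree_path_of(const struct tree *t, int object_id)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (t->entries[i].object_id == object_id)
            return t->entries[i].path;
    return NULL;
}
```

Replaying MAINLINE (copy "foo" to "bar", then rename "foo" to "baz") and
asking for the path of the baseline object "foo" gives "baz", which is
where the branch's modifications belong; "bar" has a different identity and
stays untouched.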


Re: The "follow copy history" initiative

Posted by "C. Michael Pilato" <cm...@collab.net>.

Branko Čibej <br...@xbc.nu> writes:

> Let me understand this: are you both saying that you'd rather have yet
> another API for full history tracking instead of having this one at
> least support the concept, even if the initial implementation doesn't do
> everything?

Here's what I (think I) want:

   svn_error_t *svn_fs_history_next (apr_array_header_t **histories,
                                     svn_fs_history_t *history,
                                     ...
                                     apr_pool_t *pool);

And then I want an RA API and supporting svn_repos_* function which,
given a path, peg revision, and one or two other revisions, uses the
FS history stuff to locate the committed-path of path@peg's history
line for each of the other revisions, where forward searching is
instructed to ignore successors that were created by copy.  The
additional work of tracking N successor lines and returning a
potentially massive tree of choices to the client app is, in my
opinion, quite premature.


Re: The "follow copy history" initiative

Posted by Branko Čibej <br...@xbc.nu>.

C. Michael Pilato wrote:

>Greg Hudson <gh...@MIT.EDU> writes:
>
>>I'm not saying it's never useful to know what copies of a node were made
>>in the past or the future.  But the purpose of this function is to aid
>>commands like "svn cat -r N filename", and that should not traverse
>>copies going forward.
>
>FWIW, I agree with you.

Let me understand this: are you both saying that you'd rather have yet
another API for full history tracking instead of having this one at
least support the concept, even if the initial implementation doesn't do
everything?

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


Re: The "follow copy history" initiative

Posted by "C.A.T.Magic" <c....@gmx.at>.

C.A.T.Magic wrote:
>   echo foo >file1.txt 
>   mv file2.txt file1.txt
>   cp file2.txt file1.txt

should mean
     echo foo >file1.txt
     mv file1.txt file2.txt   < 1->2
     cp file2.txt file1.txt
of course.




Re: The "follow copy history" initiative

Posted by "C.A.T.Magic" <c....@gmx.at>.

C. Michael Pilato wrote:
> Greg Hudson <gh...@MIT.EDU> writes:
> 
>>I'm not saying it's never useful to know what copies of a node were made
>>in the past or the future.  But the purpose of this function is to aid
>>commands like "svn cat -r N filename", and that should not traverse
>>copies going forward.

if you do not traverse copies forward,
will you be able to skip over files that
are deleted from a revision but resurrected
through a copy at a later time?

echo foo >file.txt
svn add file.txt
svn commit -m "initial"  # rev 1
svn rm file.txt
svn commit -m "deleted"  # rev 2
svn cp -r1 (reposurl)/file.txt (reposurl)/

this is a very common case.
also consider copies from branches back into trunk.

-
why should a new copy of a file be handled differently
than a rename of the file or a cp/rm pair?

isn't it equal to say
   echo foo >file1.txt
   cp file1.txt file2.txt
or to say
   echo foo >file1.txt
   mv file2.txt file1.txt
   cp file2.txt file1.txt
?
from a user's point of view the result is the same
in both cases: two identical files with equal history.
but now you say svn would treat them differently,
depending on how they were created...

I'd much prefer if both cases would be equal and
svn reports and follows both copies.
(yes I know there are no forward pointers in the db yet...)

:-)
=====
c.a.t.


Re: The "follow copy history" initiative

Posted by "C.A.T.Magic" <c....@gmx.at>.

C. Michael Pilato wrote:
> Greg Hudson <gh...@MIT.EDU> writes:
>>I'm not saying it's never useful to know what copies of a node were made
>>in the past or the future.  But the purpose of this function is to aid
>>commands like "svn cat -r N filename", and that should not traverse
>>copies going forward.
> 
> FWIW, I agree with you.

it's probably VERY useful -
( copied from thread "Dropping Subversion" )

cvs status -v foo
===================================================================
File: foo        Status: Up-to-date

    Working revision:    1.4
    Repository revision: 1.4     /CVS/Project/foo,v
    Sticky Tag:          (none)
    Sticky Date:         (none)
    Sticky Options:      (none)

    Existing Tags:
         foo-V01-00                    (revision: 1.1)
         foo-V02-00                    (revision: 1.2)
         foo-V03-01                    (revision: 1.3)
         foo-V04-03                    (revision: 1.4)

this would require an svn command for something like
'list all node-copies of this file|dir that currently exist in -rHEAD 
below /tags or /branches'

:-)
======
c.a.t.



Re: The "follow copy history" initiative

Posted by "C. Michael Pilato" <cm...@collab.net>.

Greg Hudson <gh...@MIT.EDU> writes:

> I'm not saying it's never useful to know what copies of a node were made
> in the past or the future.  But the purpose of this function is to aid
> commands like "svn cat -r N filename", and that should not traverse
> copies going forward.

FWIW, I agree with you.


Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

On Thu, 2004-04-01 at 19:30, C.A.T.Magic wrote:
> > With proper rename tracking, an object can exist
> > in at most one place in the future.  A copy of the object doesn't count.
> 
> Maybe if there was a -tiny- difference between a normal 'cp'
> and a 'cp to create a branch', some things would be a lot easier?

I don't think so.  I really think a copy doesn't count, whether it was
made for a branch or for another purpose.  (If it was made for a rename,
it does count, but we should really have rename-tracking to fix that
problem; branch-copy-tracking is no substitute.)

I'm not saying it's never useful to know what copies of a node were made
in the past or the future.  But the purpose of this function is to aid
commands like "svn cat -r N filename", and that should not traverse
copies going forward.


Re: The "follow copy history" initiative

Posted by "C.A.T.Magic" <c....@gmx.at>.


Greg Hudson wrote:
> On Thu, 2004-04-01 at 15:21, Ben Collins-Sussman wrote:
> 
>>** And in some ideal future, we'd get back one of
>>
>>   1. "object doesn't exist by any name in rFUTURE."
>>
>>   2. "object exists at N different locations in rFUTURE, please try
>>again using one of these URLs:"
> 
> 
> No, I never said that.  With proper rename tracking, an object can exist
> in at most one place in the future.  A copy of the object doesn't count.

Maybe if there was a -tiny- difference between a normal 'cp'
and a 'cp to create a branch', some things would be a lot easier?
I'm thinking of something like
   svn cp --branch (reposurl)/A/ (reposurl)/branches/A-branch/
   (alias svn branch?)
which could add a small marker (or property) to the new
copy to tell that it is a new -branch-.
this way you could distinguish which file to follow on a rename and 
which not.
    svn cp (reposurl) A/ B/
        # a 'normal' rename, follow it on 'future' searches
    svn mv (reposurl) A/ B/
        # a 'normal' move, follow it on 'future' searches
    svn cp --branch (reposurl) A/ B/
        # a -special- copy, don't follow it unless told to
    svn cp --tag (reposurl) A/ B/
        # a -special- copy, don't follow it unless told to
        # additionally marks the target as readonly.

Or maybe a custom configured option could automatically
assume --branch if the source and destination url of a
'svn cp' match a certain pattern.
Maybe such a special --branch property could also make it easier to
automatically search for branches that need merging into trunk.
I know all this can be done by 'manually' looking at the
urls, but automated tools that don't "know" about the decided
(/trunk /tags /branches) repository structure could have it
much easier then.
A "cp --tag" option could additionally give the target folder
a 'read only' property to prevent accidental commits to it.

:-)
======
c.a.t.



Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

On Thu, 2004-04-01 at 15:21, Ben Collins-Sussman wrote:
> ** And in some ideal future, we'd get back one of
> 
>    1. "object doesn't exist by any name in rFUTURE."
> 
>    2. "object exists at N different locations in rFUTURE, please try
> again using one of these URLs:"

No, I never said that.  With proper rename tracking, an object can exist
in at most one place in the future.  A copy of the object doesn't count.

>    3. a successful reply, assuming the object exists in only one place
> in rFUTURE.

You'd get one of (1) or (3).



Re: The "follow copy history" initiative

Posted by Ben Collins-Sussman <su...@collab.net>.

On Thu, 2004-04-01 at 13:16, Greg Hudson wrote:

> I'm saying we should do the optimistic-future-searching thing mentioned
> in (1) above, when presented with a future operative revision.  How are
> we miscommunicating?

Ah, I understand now.  Let's do a checksum here.  I'll spell everything
out.

** At the moment, if I run 'svn subcommand -rFUTURE foo.c', I may get
one of two answers back:

   1. "path not found in rFUTURE."

   2. a successful reply, though the future thing sitting in foo.c's
path may be totally unrelated.

** In your proposal, we'd get back one of

   1. "path not found in rFUTURE."

   2. "path found in rFUTURE, but is unrelated to foo.c".

   3. a successful reply (assuming the same object is in the same place
in rFUTURE.)

** And in some ideal future, we'd get back one of

   1. "object doesn't exist by any name in rFUTURE."

   2. "object exists at N different locations in rFUTURE, please try
again using one of these URLs:"

   3. a successful reply, assuming the object exists in only one place
in rFUTURE.

All good?
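
The middle case (the three answers under "your proposal") can be sketched
as a small C function.  This is a toy model under stated assumptions: the
table of locations and the integer copy ids are illustrative stand-ins for
whatever the repository would report, and `check_future` and `copy_id_at`
are hypothetical names, not Subversion functions.

```c
#include <stddef.h>
#include <string.h>

/* Which node-copy sits at a path in a given revision (toy model).
   Copy ids are made-up non-negative integers, not real FS ids. */
struct location {
    long rev;
    const char *path;
    int copy_id;
};

enum future_result {
    FUTURE_PATH_NOT_FOUND,  /* "path not found in rFUTURE" */
    FUTURE_UNRELATED,       /* path exists, but is a different object */
    FUTURE_SAME_OBJECT      /* same object, same place: success */
};

/* Copy id found at (rev, path), or -1 if nothing is there. */
int copy_id_at(const struct location *table, size_t n,
               long rev, const char *path)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (table[i].rev == rev && strcmp(table[i].path, path) == 0)
            return table[i].copy_id;
    return -1;
}

/* Optimistic future lookup: no history search, just check whether the
   same path in future_rev still holds the object pegged at peg_rev.
   A missing peg is reported as unrelated. */
enum future_result check_future(const struct location *table, size_t n,
                                long peg_rev, long future_rev,
                                const char *path)
{
    int peg_id = copy_id_at(table, n, peg_rev, path);
    int future_id = copy_id_at(table, n, future_rev, path);

    if (future_id < 0)
        return FUTURE_PATH_NOT_FOUND;
    if (peg_id < 0 || peg_id != future_id)
        return FUTURE_UNRELATED;
    return FUTURE_SAME_OBJECT;
}
```

The point of the design is that it needs no forward pointers: it only
compares identities at two known (rev, path) coordinates, so it can never
follow a rename, but it also never hands back an unrelated object.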


Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

On Thu, 2004-04-01 at 13:26, Ben Collins-Sussman wrote:
> On Thu, 2004-04-01 at 11:34, Greg Hudson wrote:
> > On Thu, 2004-04-01 at 10:55, Ben Collins-Sussman wrote:
> > > There are two approaches here:
> > > 
> > >    1. as ghudson suggests, run RA->get_logs() on (5, foo.c), and verify
> > > that the (3, foo.c) is actually the same object.
> > > 
> > >    2. throw a specific error ("future searching not yet implemented"),
> > > so that callers can catch the error and make a helpful suggestion ("try
> > > running 'svn cat -rHEAD' on the URL instead of the working file.")
> > 
> > Have you ever cd'd to your Subversion working directory and run "svn log
> > -r HEAD"?  Do you really want to break that?
> 
> I think 'svn log' is a bad example here, because that's the one
> subcommand that *does* follow history already.

No, log is a fine example.  It currently follows history while searching
backwards, but it doesn't follow history when deciding what node-rev to
start searching from.  Try it:

  cd your-svn-working-dir/subversion/libsvn_fs/notes
  svn log structure  # Note history going back to r1
  svn log -r155 structure

> And how far does your argument go?  Are you suggesting that until we
> change the fs schema to make future-searching possible, we shouldn't add
> the past-searching feature to any subcommand?  Are bunches of behaviors
> going to break that I'm not considering?

I'm saying we should do the optimistic-future-searching thing mentioned
in (1) above, when presented with a future operative revision.  How are
we miscommunicating?



Re: The "follow copy history" initiative

Posted by Ben Collins-Sussman <su...@collab.net>.

On Thu, 2004-04-01 at 11:34, Greg Hudson wrote:
> On Thu, 2004-04-01 at 10:55, Ben Collins-Sussman wrote:
> > There are two approaches here:
> > 
> >    1. as ghudson suggests, run RA->get_logs() on (5, foo.c), and verify
> > that the (3, foo.c) is actually the same object.
> > 
> >    2. throw a specific error ("future searching not yet implemented"),
> > so that callers can catch the error and make a helpful suggestion ("try
> > running 'svn cat -rHEAD' on the URL instead of the working file.")
> 
> Have you ever cd'd to your Subversion working directory and run "svn log
> -r HEAD"?  Do you really want to break that?

I think 'svn log' is a bad example here, because that's the one
subcommand that *does* follow history already.  It never occurred to me
that it would need to be retrofitted with the new history-finding
routine.  Are you suggesting that?

And how far does your argument go?  Are you suggesting that until we
change the fs schema to make future-searching possible, we shouldn't add
the past-searching feature to any subcommand?  Are bunches of behaviors
going to break that I'm not considering?


Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

On Thu, 2004-04-01 at 10:55, Ben Collins-Sussman wrote:
> There are two approaches here:
> 
>    1. as ghudson suggests, run RA->get_logs() on (5, foo.c), and verify
> that the (3, foo.c) is actually the same object.
> 
>    2. throw a specific error ("future searching not yet implemented"),
> so that callers can catch the error and make a helpful suggestion ("try
> running 'svn cat -rHEAD' on the URL instead of the working file.")

Have you ever cd'd to your Subversion working directory and run "svn log
-r HEAD"?  Do you really want to break that?



Re: The "follow copy history" initiative

Posted by Ben Collins-Sussman <su...@collab.net>.

On Thu, 2004-04-01 at 07:54, Branko Cibej wrote:
> Quoting Greg Hudson <gh...@MIT.EDU>:
> > We discussed this in the earlier (pre-1.0) thread.  Consensus was that
> > we can't trace into the future across renames of the file or its
> > parent directories (because all copies look the same, whether or not they
> > were renames, and even if there's only one copy of the same node in the
> > target rev, it's too hard to find given the current FS schema), but we
> > can check if <future-rev,path> is the same node-copy, and use that if so.
> 
> Yes I know the current schema doesn't support the forward search, but the _API_
> should allow it.

I feel like we're getting ahead of ourselves here.  I've been working on
cmpilato's patch for a day or so, which deals with the exact same
issue/API, but it's a client-side function.  I'm posting the
patch-in-progress below for people to examine, because I think it should
have been seen first, before Shlomi's server-side stuff.  The theory is
that once we get subcommands using the new client history function, we
can switch the function from running RA->get_logs to
RA->some_new_fs_history_routine().  Back up a minute here.

Mike's new client routine *does* allow future searching... at least, the
API doesn't forbid it.  It takes a (rev, path) pair as a "peg" for a
line of history, and then two revisions X and Y that make up a range to
search.  The function potentially returns both (X, old-path-location)
and (Y, future-path-location).

The reason I've not yet committed this patch is because I'm precisely
bumping up against this wall of "searching into the future".  That is,
I've taught 'svn cat -rX foo.c' to use the new client history function
pre-emptively:  it uses (BASE, foo.c) as the peg, and passes X as the
"start" of the range to search, with an unspecified end range.  

However, in trans-test.py #8, BASE of foo.c happens to be 3, and HEAD is
5.  When the test runs 'svn cat -rHEAD foo.c', Mike's history function
throws an error.  It tries running RA->get_logs() on (3, foo.c), but
never finds the path in 5 (duh), and throws an error.

There are two approaches here:

   1. as ghudson suggests, run RA->get_logs() on (5, foo.c), and verify
that the (3, foo.c) is actually the same object.

   2. throw a specific error ("future searching not yet implemented"),
so that callers can catch the error and make a helpful suggestion ("try
running 'svn cat -rHEAD' on the URL instead of the working file.")

I'm much more inclined to go the second route for now, because it's much
simpler.  The first approach seems nice in an academic way, but I doubt
it will be truly useful to users.  Users will see *some* error:  either
"sorry, foo.c in the future isn't the same foo.c you have now!", or
"sorry, can't look into the future."  Either way, they're going to have
to work around the problem by using the URL.

Here's the log message + attached patch-in-progress:

-------------------------------------------------

A patch (mostly) from Mike Pilato: major step towards solving issue
#1093 -- "svn diff (and other subcommands) do not notice rename/copies."

This implements a new function which takes a revision range and a
(rev, path) pair as input (the "peg" for defining a line of history),
and returns new (rev, path) pairs at either end of the range.

As a proof of concept, 'svn cat -rN wcpath' now follows copy history
backwards in time.

* subversion/include/svn_client.h
  (svn_client_repos_locations): New.

* subversion/libsvn_client/client.h
  (svn_client__prev_log_path): New.

* subversion/libsvn_client/ra.c
  (struct log_message_baton, svn_client__prev_log_path, log_receiver,
   svn_client_repos_locations): New.

* subversion/libsvn_client/cat.c 
  (svn_client_cat): make 'svn cat' use the new history-finding
  function whenever invoked on a wc-path, but never on a URL.

Re: The "follow copy history" initiative

Posted by Branko Cibej <br...@xbc.nu>.

Quoting Greg Hudson <gh...@MIT.EDU>:

> On Wed, 2004-03-31 at 21:37, Branko Čibej wrote:
> > I have a problem with this patch: it only looks in the _past_. That
> > restriction is unnecessary (and evil :-), you should be able to peg
> > /foo/bar/baz@100 and ask where it is at revision 200.
> 
> We discussed this in the earlier (pre-1.0) thread.  Consensus was that
> we can't trace into the future across renames of the file or its
> parent directories (because all copies look the same, whether or not they
> were renames, and even if there's only one copy of the same node in the
> target rev, it's too hard to find given the current FS schema), but we
> can check if <future-rev,path> is the same node-copy, and use that if so.

Yes, I know the current schema doesn't support the forward search, but the _API_
should allow it. Adding forward pointers to the history graph isn't that much
of a big deal as far as schema change is concerned, and could theoretically be
done in the 1.x line, or a different backend could add them. If the function
returns "sorry not implemented" for peg < target, that's fine for now.

A search into the future can obviously return more than one path.
(Theoretically, so can a search into the past, because in general, logical node
history is a DAG, not a tree.) In both cases it is possible to select "the"
correct node version (and path) if branches aren't considered equal.

Anyway. I propose that the API allows searches in any direction, and to
optionally return a list of paths. For example,

    svn_fs_get_node_paths(path="...", peg=HEAD, target=HEAD, all=TRUE)

would return all the branches that the node currently exists on. Cute.
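
To make the shape of such an API concrete, here is a rough sketch in
Python.  It is not the proposed svn_fs function: the name `node_paths`,
its parameters, and the toy copy/delete records are all assumptions made
for illustration, and it presumes the forward information (which copies
were made from a path) is cheaply available, which the current schema
does not provide.  It only follows copies in one direction from the peg,
so it does not attempt the peg=HEAD, target=HEAD case above.

```python
def node_paths(copies, deletes, path, peg_rev, target_rev):
    """Return every path at target_rev that holds the pegged object.

    copies  -- list of (rev, src_path, dst_path) copy records
    deletes -- list of (rev, path) deletion records
    """
    paths = {path}
    if target_rev >= peg_rev:
        # Forward search: one object can fan out into several copies.
        for rev in range(peg_rev + 1, target_rev + 1):
            copied = {dst for (r, src, dst) in copies
                      if r == rev and src in paths}
            removed = {p for (r, p) in deletes if r == rev}
            paths = (paths - removed) | copied
    else:
        # Backward search: undo each copy that produced a current path.
        for rev in range(peg_rev, target_rev, -1):
            sources = {dst: src for (r, src, dst) in copies if r == rev}
            paths = {sources.get(p, p) for p in paths}
    return sorted(paths)
```

In this toy model a rename is a copy plus a delete in the same revision,
so a forward search across a rename returns only the new path, while a
plain copy leaves both paths in the result.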

> but "svn_fs_get_node_path" sounds like a simple FS operation.

There's no reason why it shouldn't _be_ a simple operation. The fact that our
schema currently makes it a total pain is embarrassing, but irrelevant.

(Sorry if the above seems like a rant, but we're past 1.0 now and no longer in
the "make it work" mode. We've made enough unfortunate API decisions based on
incomplete models as it is. While this was excusable before 1.0, IMHO it no
longer is.)

-- Brane

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: The "follow copy history" initiative

Posted by Greg Hudson <gh...@MIT.EDU>.

On Wed, 2004-03-31 at 21:37, Branko Čibej wrote:
> I have a problem with this patch: it only looks in the _past_. That
> restriction is unnecessary (and evil :-), you should be able to peg
> /foo/bar/baz@100 and ask where it is at revision 200.

We discussed this in the earlier (pre-1.0) thread.  Consensus was that
we can't trace into the future across renames of the file or its parent
directories (because all copies look the same, whether or not they were
renames, and even if there's only one copy of the same node in the
target rev, it's too hard to find given the current FS schema), but we
can check if <future-rev,path> is the same node-copy, and use that if
so.
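
The check described above can be sketched as follows, in Python and
against a toy model rather than the real FS.  The function names and
the `history` mapping are assumptions for illustration: each revision
maps a path to the path it was copied from, or to None if it was added
from scratch.  The idea is to trace backwards from <future-rev,path>
and accept the future location only if the trace lands on the peg.

```python
def trace_back(history, rev, path, target_rev):
    """Follow path from rev back to target_rev; None if it did not exist."""
    for r in range(rev, target_rev, -1):
        changes = history.get(r, {})
        if path in changes:
            source = changes[path]
            if source is None:
                return None  # added from scratch: no earlier location
            path = source
    return path


def same_object_in_future(history, peg_rev, peg_path, future_rev):
    """True if (future_rev, peg_path) traces back to (peg_rev, peg_path)."""
    return trace_back(history, future_rev, peg_path, peg_rev) == peg_path
```

This only answers "is the same path still the same object?"; it makes
no attempt to find the object if it has since been renamed.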

It looks like Shlomi's code just returns the unmodified pathname if the
peg_revision is earlier than past_revision, but that's dangerous.

> I also don't like the name of the function, obviously, because it's not
> just about files. I suggest svn_fs_get_node_path.

I don't like the name either (it's too long and, as brane said, it
shouldn't be specific to files), but "svn_fs_get_node_path" sounds like
a simple FS operation.  I don't immediately have a good suggestion.

Re: The "follow copy history" initiative

Posted by Branko Čibej <br...@xbc.nu>.

Shlomi Fish wrote:

>On Wed, 31 Mar 2004, Ben Collins-Sussman wrote:
>
>>Step #3 is something that Shlomi Fish is interested in working on,
>>possibly on a branch.  The idea is that we add a new RA function to the
>>bottom of our RA vtable (so as to be 1.1 compliant) which asks the
>>*repository* to do the history search using the fs_history API.  That
>>way the client gets back a single targeted answer, rather than sloughing
>>through hundreds of logs.  Once the new RA function exists, we upgrade
>>the new client function (created in step #1) to use it instead of
>>RA->get_logs().
>
>Here is my initial patch to it. I implemented an svn_fs function to do
>just that. I also added a function to svn_ra_plugin_t to call it, and
>bound it only in ra_local (for the time being).


I have a problem with this patch: it only looks in the _past_. That
restriction is unnecessary (and evil :-), you should be able to peg
/foo/bar/baz@100 and ask where it is at revision 200.

I also don't like the name of the function, obviously, because it's not
just about files. I suggest svn_fs_get_node_path.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/

Re: The "follow copy history" initiative

Posted by Shlomi Fish <sh...@vipe.stud.technion.ac.il>.

On Wed, 31 Mar 2004, Ben Collins-Sussman wrote:

> Step #3 is something that Shlomi Fish is interested in working on,
> possibly on a branch.  The idea is that we add a new RA function to the
> bottom of our RA vtable (so as to be 1.1 compliant) which asks the
> *repository* to do the history search using the fs_history API.  That
> way the client gets back a single targeted answer, rather than sloughing
> through hundreds of logs.  Once the new RA function exists, we upgrade
> the new client function (created in step #1) to use it instead of
> RA->get_logs().
>

Here is my initial patch for it. I implemented an svn_fs function to do
just that. I also added a function to svn_ra_plugin_t to call it, and
bound it only in ra_local (for the time being).

Regards,

	Shlomi Fish

----------------------------------------------------------------------
Shlomi Fish        shlomif@vipe.technion.ac.il
Home Page:         http://shlomif.il.eu.org/

You are banished! You are banished! You are banished!

Hey? I'm just kidding!