Posted to dev@subversion.apache.org by kf...@collab.net on 2005/05/20 17:17:40 UTC

Stage 1 of true rename support.

History of the problem:
=======================

Subversion's implementation of renames is problematic.  We implement
'svn rename foo bar' as 'svn copy foo bar; svn delete foo'.  This is
speaking loosely; we don't actually turn one client operation into
two, of course.  But the effect is the same.  In the repository, a
rename looks like two events that happen as part of one commit: a copy
of SRC to DST, and a removal of SRC.

The reason we do things this way is, well, stupid.  Basically, we let
a detail of working copy implementation impose itself on core
Subversion.  The motivating scenario was this:

   $ svn mv foo bar
   $ svn ci -m "Committing only one side of a rename." bar

(Could just as easily be 'foo' instead of 'bar' in the commit.)

We thought, "Well, if we just treat rename as copy+delete, then the
above scenario works consistently, no matter whether the user
specifies both SRC and DST or just one."  Of course, this solution
makes it difficult to distinguish between renames and copies/deletes
in the repository, and it loses object identity (!).  Somehow we
thought that was okay at the time.  I don't know what we were smoking.
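To make the ambiguity concrete, here is a minimal sketch in Python.  It is
a toy model only: a commit is a plain list of tuples, and every function
name is hypothetical rather than part of Subversion's API.  It shows that
a rename recorded as copy+delete cannot be told apart from a real copy
plus a real delete, whereas a dedicated rename record could be.

```python
# Toy model: a commit is just a list of (action, path...) tuples.
# None of these names come from Subversion's real API.

def rename_as_copy_delete(src, dst):
    """How a rename is recorded today: a copy of SRC to DST plus a removal of SRC."""
    return [("copy", src, dst), ("delete", src)]

def unrelated_copy_and_delete(src, dst):
    """A user who really copies SRC to DST and also deletes SRC in the same commit."""
    return [("copy", src, dst), ("delete", src)]

def true_rename(src, dst):
    """What a dedicated rename operation could record instead (assumption)."""
    return [("rename", src, dst)]

def looks_like_rename(commit):
    """The best a history reader can do with copy+delete: guess from the pair."""
    copies = {(a[1], a[2]) for a in commit if a[0] == "copy"}
    deletes = {a[1] for a in commit if a[0] == "delete"}
    return any(src in deletes for src, _ in copies)
```

In this model the first two functions return identical commits, so
looks_like_rename() can only guess, while the explicit "rename" record
needs no guessing at all.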

I was one of the advocates of this line of thinking, perhaps the
principal advocate, and I hereby recant.  I was wrong.  I was so
wrong.  Honey, can you ever forgive me for what I did?  Can't we make
this work, somehow?

A Multi-Staged Solution:
========================

Fixing rename should happen in stages, because it involves several
entirely separable tasks.  The stages I have in mind are:

   1. Implement true renames in the repository.  That is, implement
      svn_fs_rename() and whatever associated machinery goes along
      with it, and make 'svn rename URL1 URL2' use it.

      Complications: 
      Search in issue #898 for the word "subtle", and you will see a
      big diagram from me and Mike Pilato that shows why implementing
      svn_fs_rename() is not quite as trivial as one might think.

      Notes:

      Commits from a working copy would not use this functionality
      yet.  That comes in stage 2.  Also, updates would not try to do
      any fancy copying to be efficient; they'd continue to receive a
      'delete' directive and an 'add with history' directive, and get
      the whole new file from the RA layer, just as they do now.
      Making updates more efficient is stage 3.

   2. Support true renames from working copy commits.  This means that
      when you do 'svn rename foo bar' in a working copy, foo's entry
      knows that it is a rename source and knows the dest, and bar's
      entry knows it is a rename dest and knows the source.

      If you try to commit just one side of a rename, the client
      library will error articulately, telling you that you must
      commit either all or none of a rename, and pointing you to the
      other side.  I think this is appropriate, because by running
      'svn rename' (or the GUI equivalent), the user has expressed
      their intention that this is all one operation.  The client
      library should protect the user from contradicting themselves
      later.

      This stage would also have to ensure support for the
      rename+modification scenario described in issue #898.  As the
      issue says, there really aren't any major complications there,
      it really should just work "out of the box".  However, stage 2
      is the first moment where this becomes an issue, so we should
      not 

   3. (optional) Support efficient use of working-copy-side data when
      update receives a rename.  This is merely a matter of
      performance, not correctness.  The idea is that if the RA layer
      transmits a rename from server to client, and the client already
      has the source, then the server can just tell the client to mv
      the source to the dest, and maybe apply a diff if there are
      modifications too.
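The all-or-none commit rule from stage 2 could be sketched roughly as
follows.  This is an illustration under stated assumptions, not the
client library's code: the `renames` mapping stands in for whatever the
working copy entries would record, targets are treated as exact paths
(real commits also accept subtrees), and the exception name is made up.

```python
class PartialRenameError(Exception):
    """Raised when only one side of a rename is named for commit (hypothetical)."""

def check_commit_targets(targets, renames):
    """Reject a commit that names exactly one side of a rename.

    targets: paths named on the command line (exact paths only in this sketch).
    renames: dict mapping rename source -> rename destination, standing in
             for the metadata the working copy entries would carry.
    """
    targets = set(targets)
    for src, dst in renames.items():
        has_src = src in targets
        has_dst = dst in targets
        if has_src != has_dst:
            present = src if has_src else dst
            missing = dst if has_src else src
            raise PartialRenameError(
                f"'{present}' is one side of a rename; "
                f"commit '{missing}' as well, or neither"
            )
```

With renames = {"foo": "bar"}, committing ["foo", "bar"] or an unrelated
path passes, while committing only ["bar"] raises an error that points
at 'foo'.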

What To Do Right Now:
=====================

I think we should aim to do stage 1 for Subversion 1.3.

That means finding a solution for the repository node-revision ID
problem described in issue #898.

I'm not going to resummarize here the problem described in the issue,
or the solutions outlined there.  Anyone who cares about this enough
to enter the discussion, please just look at #898.  Note that the
diagram is in the extra-formal style preferred by me and (apparently)
no one else: the path components do not label the directories they
appear in, but rather represent directory *entries* and name the thing
they're pointing to in the next level down.  Thus, all rectangular
objects are directories, and the lone file just appears as "(file)".

So, seeing the problem there, does anyone have any thoughts for an
elegant solution?

-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Stage 1 of true rename support.

Posted by "C. Michael Pilato" <cm...@collab.net>.
"Peter S. Housel" <ho...@acm.org> writes:

> Greg Hudson <gh...@MIT.EDU> wrote:
> > On Fri, 2005-05-20 at 13:45 -0500, C. Michael Pilato wrote:
> >> Your particular concern, for example, is one which I've been hearing
> >> lately from certain of CollabNet's corporate Subversion-using
> >> customers, who have equal hopes of having true renames alleviate this
> >> pain.
> > Didn't we largely address this problem in r12616 (i.e. in svn 1.2) by
> > transmitting deletes before adds within a directory?
> 
> That doesn't help if the object to be deleted has local mods.

Right.  In my customer's case, it's a source tree with generated
(unversioned) files lying about in it.

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by "Peter S. Housel" <ho...@acm.org>.
Greg Hudson <gh...@MIT.EDU> wrote:
> On Fri, 2005-05-20 at 13:45 -0500, C. Michael Pilato wrote:
>> Your particular concern, for example, is one which I've been hearing
>> lately from certain of CollabNet's corporate Subversion-using
>> customers, who have equal hopes of having true renames alleviate this
>> pain.
> 
> Didn't we largely address this problem in r12616 (i.e. in svn 1.2) by
> transmitting deletes before adds within a directory?

That doesn't help if the object to be deleted has local mods.

-Peter-


---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Greg Hudson <gh...@MIT.EDU>.
On Fri, 2005-05-20 at 13:45 -0500, C. Michael Pilato wrote:
> Your particular concern, for example, is one which I've been hearing
> lately from certain of CollabNet's corporate Subversion-using
> customers, who have equal hopes of having true renames alleviate this
> pain.

Didn't we largely address this problem in r12616 (i.e. in svn 1.2) by
transmitting deletes before adds within a directory?


---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by "C. Michael Pilato" <cm...@collab.net>.
Joseph Galbraith <ga...@vandyke.com> writes:

> So step 3 is important to my non-voting, non-counting, insignificant
> self-- and I would claim it is more than just a matter of
> performance.

Easy there.  The feedback of real users is often faaaaaar more
valuable than that which comes from folks too close to the code.  :-)

Your particular concern, for example, is one which I've been hearing
lately from certain of CollabNet's corporate Subversion-using
customers, who have equal hopes of having true renames alleviate this
pain.

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Joseph Galbraith <ga...@vandyke.com>.
>    3. (optional) Support efficient use of working-copy-side data when
>       update receives a rename.  This is merely a matter of
>       performance, not correctness.  The idea is that if the RA layer
>       transmits a rename from server to client, and the client already
>       has the source, then the server can just tell the client to mv
>       the source to the dest, and maybe apply a diff if there are
>       modifications too.

One of the reasons I want true rename support is because
I (in my ignorance, perhaps), think it could solve one
of the major pains of mixed mode unix / windows use
of subversion: the very specific case of someone renaming
a file and changing only the case.

If such an operation in the repository was reflected to
the working copy as a rename during update and revert
(and hence to the OS as a MoveFile operation) this
would work under windows.

And since our commit hook prevents people from committing
multiple files with names differing only in case, but can't
possibly prevent people (especially windows users) from
committing files with the wrong case, this is a real
problem.
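For illustration, the kind of check such a commit hook performs might
look like the sketch below.  It is not our actual hook; the function
names are made up, and it assumes that Python's str.casefold() is a
good enough stand-in for how a case-insensitive filesystem compares
names.

```python
from collections import defaultdict

def case_collisions(paths):
    """Return groups of paths that differ only in case, sorted for stable output."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)

def is_case_only_rename(src, dst):
    """True when a rename changes nothing but the case of the path."""
    return src != dst and src.casefold() == dst.casefold()
```

A check like case_collisions() can reject two names that clash, but it
has no way to know which spelling of a single name is the intended one,
which is why the case-only rename still has to be possible afterwards.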

So step 3 is important to my non-voting, non-counting,
insignificant self-- and I would claim it is more than
just a matter of performance.

Thanks,

Joseph

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by kf...@collab.net.
Chia-liang Kao <cl...@clkao.org> writes:
> Also a fast lookup through the history for "possible locations" of a
> given file/rev has to be implemented to efficiently make use of the
> information to support merge across renames.

/me hears the hint (re issue #1970)

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Chia-liang Kao <cl...@clkao.org>.
 <kfogel <at> collab.net> writes:

>    1. Implement true renames in the repository.  That is, implement
>       svn_fs_rename() and whatever associated machinery goes along
>       with it, and make 'svn rename URL1 URL2' use it.
> 
>    2. Support true renames from working copy commits.  This means that
>       when you do 'svn rename foo bar' in a working copy, foo's entry
>       knows that it is a rename source and knows the dest, and bar's
>       entry knows it is a rename dest and knows the source.

Too fast, note that we will need an additional editor method to support
this.

Also a fast lookup through the history for "possible locations" of a
given file/rev has to be implemented to efficiently make use of the
information to support merge across renames.

Cheers,
CLK



---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Daniel Patterson <da...@danpat.net>.
kfogel@collab.net wrote:
> Daniel Patterson <da...@danpat.net> writes:
> 
>>   Just a note (and I'm not sure how helpful) on how Clearcase handles
>>   this operation.  For Clearcase, the rename of a file bumps
>>   the version number of the containing directory, but not of the
>>   item being renamed itself.  This avoids the issue of having to
>>   commit a change to the file itself (and the confusion about which
>>   side the commit should happen to).
> 
> 
> Are you assuming that both src and dst are always in the same
> directory?  That's not necessarily the case...

   No, and this is one of the areas where Clearcase's lack
   of atomic commits is a bit of a pain.

   If you do basically "cleartool mv file1 ../newdir", you can
   get yourself in a bit of a poo.  If you commit the source
   directory, then the file is now missing in the repository.
   If you only commit the dest directory, you now have two
   copies of the same file in two places.  This happens quite
   a bit in our installation of Clearcase, particularly among
   the less experienced developers.

   It's much closer to regular filesystem semantics than how
   Subversion works.  Clearcase basically treats a rename as
   an unlink() and link(), while bumping the versions of the
   containing directories, with no guarantee that the unlink()
   and link() happen in the same transaction.
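   As a rough illustration of the two failure modes, here is a toy
   model in Python.  It is not how Clearcase is implemented; the
   repository is just a dict of directory name to set of entries, and
   each directory is "committed" independently, which is the whole
   point.

```python
import copy

def move_in_two_steps(repo, name, src_dir, dst_dir, commit_src, commit_dst):
    """Return the repository state after committing each directory on its own.

    repo maps directory -> set of entry names (hypothetical model).
    """
    result = copy.deepcopy(repo)
    if commit_src:
        result[src_dir].discard(name)   # the unlink() half
    if commit_dst:
        result[dst_dir].add(name)       # the link() half
    return result

def locations(repo, name):
    """Directories in which the entry is currently visible."""
    return sorted(d for d, entries in repo.items() if name in entries)

start = {"olddir": {"file1"}, "newdir": set()}
both = move_in_two_steps(start, "file1", "olddir", "newdir", True, True)
only_src = move_in_two_steps(start, "file1", "olddir", "newdir", True, False)
only_dst = move_in_two_steps(start, "file1", "olddir", "newdir", False, True)
```

   Committing both directories leaves file1 only in newdir; committing
   only the source leaves it nowhere; committing only the dest leaves
   it in both places.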

   I'm definitely not suggesting that Subversion should do it this
   way.  Just an FYI and a "here's how someone else implements this,
   and here are the drawbacks to that approach".

daniel

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by kf...@collab.net.
Daniel Patterson <da...@danpat.net> writes:
>    Just a note (and I'm not sure how helpful) on how Clearcase handles
>    this operation.  For Clearcase, the rename of a file bumps
>    the version number of the containing directory, but not of the
>    item being renamed itself.  This avoids the issue of having to
>    commit a change to the file itself (and the confusion about which
>    side the commit should happen to).

Are you assuming that both src and dst are always in the same
directory?  That's not necessarily the case...

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Daniel Patterson <da...@danpat.net>.
kfogel@collab.net wrote:
>       If you try to commit just one side of a rename, the client
>       library will error articulately, telling you that you must
>       commit either all or none of a rename, and pointing you to the
>       other side.  I think this is appropriate, because by running
>       'svn rename' (or the GUI equivalent), the user has expressed
>       their intention that this is all one operation.  The client
>       library should protect the user from contradicting themselves
>       later.

   Just a note (and I'm not sure how helpful) on how Clearcase handles
   this operation.  For Clearcase, the rename of a file bumps
   the version number of the containing directory, but not of the
   item being renamed itself.  This avoids the issue of having to
   commit a change to the file itself (and the confusion about which
   side the commit should happen to).

   However, it makes it somewhat difficult to calculate what name a
   particular revision of a file had at any point in time, as the change
   of name is not a part of the file's history, but only of its containing
   directory.

daniel

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Erik Huelsmann <eh...@gmail.com>.
On 5/23/05, Marc Haisenko <ha...@webport.de> wrote:
> Just a minor user question: consider the following "recipe":
> 
> # svn mv foo bar
> # echo "Hello" >foo
> # svn add foo
> # svn commit
> 
> Right now this works, the editor shows "R foo" and reports that foo is being
> replaced. However, this doesn't work:
> 
> # svn rm foo
> # svn mv bar foo
> svn: 'foo' is scheduled for deletion; it must be committed before being
> overwritten
> 
> Will this work when true renames are implemented ?

This is an implementation problem in the working copy library and
unrelated to true renames. It can be fixed, but only in a minor
release (not in a patch release). Ben Reser has been working on fixing
it, but the fix isn't done yet. (It's in a branch in the repository,
so if you feel like working on it, please do!)

> Thanks,
>        Marc

Bye,


Erik.

---------------------------------------------------------------------


Re: Stage 1 of true rename support.

Posted by Matthias Wächter <Ma...@tttech.com>.
Marc Haisenko <ha...@webport.de> wrote on 23.05.2005 10:44:06:

> Just a minor user question: consider the following "recipe":
> 
> # svn mv foo bar
> # echo "Hello" >foo
> # svn add foo
> # svn commit
> 
> Right now this works, the editor shows "R foo" and reports that foo is being
> replaced. However, this doesn't work:
> 
> # svn rm foo
> # svn mv bar foo
> svn: 'foo' is scheduled for deletion; it must be committed before being 
> overwritten

Which brings me (again) to the question of atomic multi-operations, for
example mixing local and server-side operations or aggregating multiple 
server-side operations. For example, after checkout, make the following 
operation(s) atomic:

svn co http://server/repos/trunk proj; cd proj

svn rm http://server/repos/trunk/lib
svn cp http://server/repos/tags/lib-1.5 http://server/repos/trunk/lib
svn rm README.oldlibs
svn cp README README.oldlibs
sed -i -e "s/lib-1\.4/lib-1.5/g" README
svn commit


Any thoughts about this issue/feature? I think that pending server-side
operations in particular, and mixing them with local commits, will be a
big challenge, if they are implementable at all.

- Matthias
-- 
Das letzte Wort ist noch nicht committed
The final word is not yet committed


---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Marc Haisenko <ha...@webport.de>.
Just a minor user question: consider the following "recipe":

# svn mv foo bar
# echo "Hello" >foo
# svn add foo
# svn commit

Right now this works, the editor shows "R foo" and reports that foo is being 
replaced. However, this doesn't work:

# svn rm foo
# svn mv bar foo
svn: 'foo' is scheduled for deletion; it must be committed before being 
overwritten

Will this work when true renames are implemented?

Thanks,
	Marc

-- 
Marc Haisenko
Systemspezialist
Webport IT-Services GmbH
mailto: haisenko@webport.de

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by "Ph. Marek" <ph...@bmlv.gv.at>.
On Friday 20 May 2005 22:10, kfogel@collab.net wrote:
> Eric Hanchrow <of...@blarg.net> writes:
> > Just curious: if svn gets true rename support, would that imply that
> > it could, in effect, implement "hard links" in the repository?  Put
> > another way, would it be possible (even if it's a bad idea) to have a
> > single item in the repository have two different names?  That way
> > editing one would affect the other.
>
> It wouldn't necessarily imply hard links in the fs.  Currently I think
> we need to prevent them, because we can't let cycles happen.
Apart from having true hardlinks in the repository (to get something like
svn:externals on a per-file basis), issue 2286, "Identical files should
share storage space in repository"
(http://subversion.tigris.org/issues/show_bug.cgi?id=2286), may be worth
considering here.  It seems to me that it, too, could require a change in
the repository format.

Issue 2286 is about "hardlinking" only to other, always older, revisions,
and only for files, so there won't be any cycles.
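One way to picture the storage sharing asked for in issue 2286 is a
content-addressed store, sketched below.  This is only an assumption
about how it could be modelled, not a description of Subversion's
repository format; the class and method names are invented.  Names
point at blobs and blobs point at nothing, so no cycle can form.

```python
import hashlib

class BlobStore:
    """Toy content-addressed store: identical file contents are kept once."""

    def __init__(self):
        self.blobs = {}   # digest -> file contents (bytes)
        self.names = {}   # (revision, path) -> digest

    def put(self, revision, path, data):
        """Record PATH at REVISION; reuse an existing blob if the bytes match."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.names[(revision, path)] = digest
        return digest

    def get(self, revision, path):
        """Return the contents recorded for PATH at REVISION."""
        return self.blobs[self.names[(revision, path)]]
```

Storing the same bytes under two names, or in two revisions, adds two
entries to `names` but only one entry to `blobs`.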


Regards,

Phil


---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by kf...@collab.net.
Eric Hanchrow <of...@blarg.net> writes:
> Just curious: if svn gets true rename support, would that imply that
> it could, in effect, implement "hard links" in the repository?  Put
> another way, would it be possible (even if it's a bad idea) to have a
> single item in the repository have two different names?  That way
> editing one would affect the other.

It wouldn't necessarily imply hard links in the fs.  Currently I think
we need to prevent them, because we can't let cycles happen.


---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Eric Hanchrow <of...@blarg.net>.
Just curious: if svn gets true rename support, would that imply that
it could, in effect, implement "hard links" in the repository?  Put
another way, would it be possible (even if it's a bad idea) to have a
single item in the repository have two different names?  That way
editing one would affect the other.

I don't actually know if this is a good idea, even if it's possible; I
just thought of it because Visual SourceSafe has a similar feature.

-- 
... belief in the omniscient hacker is indistinguishable from
belief in a Supreme Being.  There is simply no argument one can
give that will dissuade a true believer, yet when the believer is
asked for a demonstration he is unable to produce one.
        -- Michael Shamos


---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by kf...@collab.net.
Martin Furter <mf...@rola.ch> writes:
> Why specify both names?
> 
> As you said the client knows the other side of the rename, so it could
> automatically add it to the commit.
> 
> But, wait, it's one operation, so there aren't really two 'sides' ;)
>
> What I really want to say is: I'm a lazy guy and filename completion on
> the command line works for the new name only, so I'd like to specify only
> one of those two names to commit the rename.

Most commits use implied arguments -- the target is a subtree or
subtrees (sometimes implied '.').  Thus, usually you will not be in
the position of having to name either side of the rename.

But when the user *does* specify exactly one side, I think it would be
too much "Do What [you guess] I Mean" instead of "Do What I Say" for
Subversion to commit the other side of the rename as well.  I feel
it's safer to just error out.  At any rate, I'd like to at least start
with that behavior and see how it works out.

(However, this is a minor detail anyway.  If I get outvoted on this
one, I'm not going to feel too bad.)

---------------------------------------------------------------------

Re: Stage 1 of true rename support.

Posted by Martin Furter <mf...@rola.ch>.

On Fri, 20 May 2005 kfogel@collab.net wrote:

>    2. Support true renames from working copy commits.  This means that
>       when you do 'svn rename foo bar' in a working copy, foo's entry
>       knows that it is a rename source and knows the dest, and bar's
>       entry knows it is a rename dest and knows the source.
> 
>       If you try to commit just one side of a rename, the client
>       library will error articulately, telling you that you must
>       commit either all or none of a rename, and pointing you to the
>       other side.  I think this is appropriate, because by running
>       'svn rename' (or the GUI equivalent), the user has expressed
>       their intention that this is all one operation.  The client
>       library should protect the user from contradicting themselves
>       later.

Why specify both names?

As you said the client knows the other side of the rename, so it could
automatically add it to the commit.

But, wait, it's one operation, so there aren't really two 'sides' ;)

What I really want to say is: I'm a lazy guy and filename completion on
the command line works for the new name only, so I'd like to specify only
one of those two names to commit the rename.

Martin

