Posted to dev@subversion.apache.org by Jeffrey Pierson <jp...@realgreen.com> on 2011/09/07 06:53:38 UTC

Thoughts about issue #3625 (shelving)

With the new patch features from 1.7 would it make sense to implement this
feature internally as a stored patch with some extra changes to the client
to allow exporting the shelved changes as a patch and more obviously shelve
and unshelve?

As a TortoiseSVN user I'm thinking such a feature could be added to an
individual client as an internally managed patch without any changes to the
core SVN API. This being said, client side API changes would make it more
widely adoptable and consistent. My guess is that with or without this
feature, clients will likely grow this feature post 1.7 because of the patch
enhancements.

Re: Thoughts about issue #3625 (shelving)

Posted by Daniel Shahaf <da...@elego.de>.

Greg Stein wrote on Thu, Sep 08, 2011 at 23:43:04 -0400:
> On Thu, Sep 8, 2011 at 19:33, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > Stefan Sperling wrote on Thu, Sep 08, 2011 at 00:36:05 +0200:
> >...
> >> Sure. But in the initial implementation we could just restore the former
> >> working copy state, including mixed-revisioness etc. Just rewind everything
> >> back to where it was and let a subsequent update sort out the conflicts.
> >> That would already be a big improvement over the diff/patch approach.
> 
> To do this, we'd also want to move op_depth==0 over to the
> SHELF_NODES. Probably a good idea to just copy all of it, rather than
> piecemeal like I suggested, as I suspect that we'd eventually find a
> reason to have BASE present.
> 

Forward-porting of patches?  i.e., locally maintaining a patch on top of
upstream source?  And then when you port the patch it might be nice to
do a 3-way merge (old BASE, new BASE, (old BASE)+patch -> (new BASE)+patch).

> > If we do this, it would be nice if 'restore the former state' could be
> > done offline --- e.g., by retaining the relevant pristines as long as
> > the shelf exists.
> 
> Absolutely. There shouldn't be *any* reason to contact the server. All
> the pristines would be held via SHELF_NODES and SHELF_ACTUAL.
> 

Thought so, just calling out the functional aspect (offline unshelving)
as opposed to the implementation one (this and that tables).

> > (Now, if the working copy is at state S and someone asks to return to
> > a previously-shelved state, what do we do with S?  Do we discard it, or
> > do we first save it somewhere?  And if we save it... do we save it as
> > a new shelf?)
> 
> I *do* envision multiple shelves, and that makes it (imo) even more
> important to look at "unshelve" as merging changes onto the current
> state.
> 
> Regarding current state S... I think "unshelve" should NOT be allowed
> if local mods exist. Thus, we never need to worry about S. The user
> can always explicitly shelve that, return to a no-mods state, and then
> unshelve state-R.
> 
> And yes, we can have an --ignore-local-mods switch to unshelve even
> when local mods exist.
> 

No one said that there are local mods in S; it might be a state with
various mixed-revisions and switched trees, but no local mods.  The
problems that these two introduce might be similar to the problem that
'plain' local mods (text/prop edits) introduce, though?

> Also consider: the shelves can then act as multiplexors for the
> working copy. You could have one shelf for trunk, one for
> branches/1.7.x, one for 1.6.x, one for branches/fs-successor-ids, and
> one for some trunk changes that you set aside.
> 
> I've had to use git lately, and our shelves could almost look like
> git's branches. Swap around among them based on what you're doing at
> the time.
> 
> Cheers,
> -g

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Stein <gs...@gmail.com>.

On Fri, Sep 9, 2011 at 10:38, Greg Hudson <gh...@mit.edu> wrote:
> On Fri, 2011-09-09 at 08:09 -0400, Greg Stein wrote:
>> Greg Hudson said this is more akin to git stash than branches. I
>> haven't used git's stashes to see how it actually differs from
>> branches. I guess it is simply that changing branches leaves local
>> mods, rather than stashing pseudo-reverts the local mods.
>
> * Branches record tree states, while stashes record changesets as
> applied to the working copy and index.  You can "git stash", "git
> checkout branchname", and "git stash pop" the changeset such that it is
> now applied against the new branch, if it applies cleanly.  You can do
> similar things with branches using rebase, but the sequence of
> operations would be different and more complicated.
>
> * Stashes are a working copy feature, and aren't part of the history
> model.  This isn't necessarily an interesting distinction for us, but it
> has some consequences within the universe of git--a subsidiary
> repository's git fetch won't retrieve stashes, they won't be in the
> target space of commit specifiers, you don't have to create commit
> messages for them, etc.
>
> Stashes don't make git more expressive than local branches and rebase,
> but in some ways it's a useful UI concept to keep them separate.

Thanks for the extra detail!

>> Mercurial calls it shelving.
>
> Aha.  I'll note that shelving isn't a core feature of Mercurial but an
> extension.  Even if there are aliases so the command is accessible via
> both names, the feature needs to have a primary name (which will be how
> it's documented in the book, etc.).

Yup.

$ svn praise --help
blame (praise, annotate, ann): Output the content of specified files or
URLs with revision and author information in-line.
usage: blame TARGET[@REV]...
...

I will be lobbying for 'stash' as primary, and 'shelve' as an alias.

Cheers,
-g

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Hudson <gh...@MIT.EDU>.

On Fri, 2011-09-09 at 08:09 -0400, Greg Stein wrote:
> Greg Hudson said this is more akin to git stash than branches. I
> haven't used git's stashes to see how it actually differs from
> branches. I guess it is simply that changing branches leaves local
> mods, rather than stashing pseudo-reverts the local mods.

* Branches record tree states, while stashes record changesets as
applied to the working copy and index.  You can "git stash", "git
checkout branchname", and "git stash pop" the changeset such that it is
now applied against the new branch, if it applies cleanly.  You can do
similar things with branches using rebase, but the sequence of
operations would be different and more complicated.

* Stashes are a working copy feature, and aren't part of the history
model.  This isn't necessarily an interesting distinction for us, but it
has some consequences within the universe of git--a subsidiary
repository's git fetch won't retrieve stashes, they won't be in the
target space of commit specifiers, you don't have to create commit
messages for them, etc.

Stashes don't make git more expressive than local branches and rebase,
but in some ways it's a useful UI concept to keep them separate.

> Mercurial calls it shelving.

Aha.  I'll note that shelving isn't a core feature of Mercurial but an
extension.  Even if there are aliases so the command is accessible via
both names, the feature needs to have a primary name (which will be how
it's documented in the book, etc.).

Re: Thoughts about issue #3625 (shelving)

Posted by Daniel Shahaf <da...@elego.de>.

Philip Martin wrote on Fri, Sep 09, 2011 at 13:43:35 +0100:
> Daniel Shahaf <da...@elego.de> writes:
> 
> > <idea age="old">
> >
> > % cat .subversion/config
> > [shorthand]
> > s7 = https://svn.apache.org/repos/asf/subversion/branches/1.7.x
> > % svn info ^s7/ | grep URL
> > URL: https://svn.apache.org/repos/asf/subversion/branches/1.7.x
> >
> > </idea>
> 
> If we implemented that would substitution occur in the client or the
> libraries?  Would it apply to svn:externals?

We could break the API into two levels: a lower-level API that simply
exposes a hash of shortnames -> URLs and a higher-level API that takes
a URL such as ^foo/trunk and returns an absolute URL.

I don't think having ^s7/ URLs in externals is a good idea as it makes
the property values dependent upon ~/.subversion/config entries.  We
might want to have ^svn:foo/some/dir/here/ style URLs too, though, where
^svn:foo/ are expanded, not by looking for a key called 'svn:foo' in the
config hash, but by some other means specific to each svn:%s key.
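
The two-level split described above could look roughly like the following
Python sketch. It is only an illustration: the function names
load_shorthands and expand_url are hypothetical, the [shorthand] section is
the one from the quoted idea (not an existing Subversion config section),
and Python's configparser stands in for Subversion's own config reader.

```python
import configparser


def load_shorthands(config_text: str) -> dict[str, str]:
    """Lower level: return the shortname -> URL mapping (hypothetical API)."""
    # Interpolation is disabled so '%' in URLs is kept literally, and only
    # '=' separates keys from values so ':' in URLs is never a delimiter.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(config_text)
    if not parser.has_section("shorthand"):
        return {}
    return dict(parser["shorthand"])


def expand_url(url: str, shorthands: dict[str, str]) -> str:
    """Higher level: expand '^name/rest' into an absolute URL.

    Anything that does not start with a known '^name/' prefix, including
    the ordinary '^/path' form, is returned unchanged.
    """
    if not url.startswith("^"):
        return url
    name, sep, rest = url[1:].partition("/")
    if not sep or name not in shorthands:
        return url
    base = shorthands[name].rstrip("/")
    return base + "/" + rest if rest else base
```

With the config from the quoted idea, expand_url("^s7/STATUS", ...) would
yield the 1.7.x branch URL with /STATUS appended, while "^/trunk" is left
alone for the existing repository-root handling.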

Re: Thoughts about issue #3625 (shelving)

Posted by Philip Martin <ph...@wandisco.com>.

Daniel Shahaf <da...@elego.de> writes:

> <idea age="old">
>
> % cat .subversion/config
> [shorthand]
> s7 = https://svn.apache.org/repos/asf/subversion/branches/1.7.x
> % svn info ^s7/ | grep URL
> URL: https://svn.apache.org/repos/asf/subversion/branches/1.7.x
>
> </idea>

If we implemented that would substitution occur in the client or the
libraries?  Would it apply to svn:externals?

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: Thoughts about issue #3625 (shelving)

Posted by Daniel Shahaf <da...@elego.de>.

Greg Stein wrote on Fri, Sep 09, 2011 at 08:09:51 -0400:
> On Fri, Sep 9, 2011 at 07:05, Daniel Shahaf <da...@elego.de> wrote:
> > Once we have a .svn area shared by multiple working copies, though,
> > something like that would be useful --- perhaps, 'switch this wc to
> > the most recent snapshot of branches/fs-progress that you have in .svn'.
> > ('svn switch ^/subversion/branches/fs-progress@WCDB', or perhaps give it
> > a symbolic name (and make 'update' change what the symbolic name points
> > to)).
> 
> I've found that git has symbolic names for repositories. Kinda handy
> to refer to "myrepos" vs "upstream", rather than remembering the URLs.
> 

<idea age="old">

% cat .subversion/config
[shorthand]
s7 = https://svn.apache.org/repos/asf/subversion/branches/1.7.x
% svn info ^s7/ | grep URL
URL: https://svn.apache.org/repos/asf/subversion/branches/1.7.x

</idea>

> Cheers,
> -g

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Stein <gs...@gmail.com>.

On Fri, Sep 9, 2011 at 07:05, Daniel Shahaf <da...@elego.de> wrote:
> Greg Stein wrote on Thu, Sep 08, 2011 at 23:43:04 -0400:
>> Also consider: the shelves can then act as multiplexors for the
>> working copy. You could have one shelf for trunk, one for
>> branches/1.7.x, one for 1.6.x, one for branches/fs-successor-ids, and
>> one for some trunk changes that you set aside.
>
> Why do you have separate NODES and SHELF_NODES tables then?  I'd
> intuitively expect the NODES table to be replaced by the SHELF_NODES
> table, i.e., every working copy state --- including the one immediately
> after 'checkout' --- becomes a shelf.  (Though perhaps the first shelf
> is 'special' in one or more to-be-determined ways.)

True! Could certainly do that. A little trickier to upgrade, and lots
of queries would need to be altered... but sure. We'd only have to take
that hit once, and add a 'shelf_id' to the NODES table.

>> I've had to use git lately, and our shelves could almost look like
>> git's branches. Swap around among them based on what you're doing at
>> the time.
>
> Usually want to hack on them concurrently --- i.e., to create a backport
> branch and 'make check' it while at the same time adding it to STATUS
> and looking at wc-queries.sql on trunk for someone on IRC.  Having
> multiple shelves within a single working copy isn't good enough for
> such use.

Yup. Just saying.

Greg Hudson said this is more akin to git stash than branches. I
haven't used git's stashes to see how it actually differs from
branches. I guess it is simply that changing branches leaves local
mods, rather than stashing pseudo-reverts the local mods.

> Once we have a .svn area shared by multiple working copies, though,
> something like that would be useful --- perhaps, 'switch this wc to
> the most recent snapshot of branches/fs-progress that you have in .svn'.
> ('svn switch ^/subversion/branches/fs-progress@WCDB', or perhaps give it
> a symbolic name (and make 'update' change what the symbolic name points
> to)).

I've found that git has symbolic names for repositories. Kinda handy
to refer to "myrepos" vs "upstream", rather than remembering the URLs.

Cheers,
-g

Re: Thoughts about issue #3625 (shelving)

Posted by Daniel Shahaf <da...@elego.de>.

Greg Stein wrote on Thu, Sep 08, 2011 at 23:43:04 -0400:
> Also consider: the shelves can then act as multiplexors for the
> working copy. You could have one shelf for trunk, one for
> branches/1.7.x, one for 1.6.x, one for branches/fs-successor-ids, and
> one for some trunk changes that you set aside.
> 

Why do you have separate NODES and SHELF_NODES tables then?  I'd
intuitively expect the NODES table to be replaced by the SHELF_NODES
table, i.e., every working copy state --- including the one immediately
after 'checkout' --- becomes a shelf.  (Though perhaps the first shelf
is 'special' in one or more to-be-determined ways.)

> I've had to use git lately, and our shelves could almost look like
> git's branches. Swap around among them based on what you're doing at
> the time.

Usually want to hack on them concurrently --- i.e., to create a backport
branch and 'make check' it while at the same time adding it to STATUS
and looking at wc-queries.sql on trunk for someone on IRC.  Having
multiple shelves within a single working copy isn't good enough for
such use.

Once we have a .svn area shared by multiple working copies, though,
something like that would be useful --- perhaps, 'switch this wc to
the most recent snapshot of branches/fs-progress that you have in .svn'.
('svn switch ^/subversion/branches/fs-progress@WCDB', or perhaps give it
a symbolic name (and make 'update' change what the symbolic name points
to)).

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Stein <gs...@gmail.com>.

On Fri, Sep 9, 2011 at 00:44, Greg Hudson <gh...@mit.edu> wrote:
> On Thu, 2011-09-08 at 23:43 -0400, Greg Stein wrote:
>> I've had to use git lately, and our shelves could almost look like
>> git's branches. Swap around among them based on what you're doing at
>> the time.
>
> I think this is closer to git's "stash" feature than git branches.  In
> fact, I was thinking of jumping in and asking why this was being called
> something gratuitously different.

Mercurial calls it shelving. (personally, I tend to refer to 'git
stash' rather than 'hg shelve' given the former's increased
mindshare.. but this thread started with the 'shelve' term)

I expected us to alias 'shelve' and 'stash' to the same operation,
much like we have both 'svn blame' and 'svn praise'.

Cheers,
-g

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Hudson <gh...@MIT.EDU>.

On Thu, 2011-09-08 at 23:43 -0400, Greg Stein wrote:
> I've had to use git lately, and our shelves could almost look like
> git's branches. Swap around among them based on what you're doing at
> the time.

I think this is closer to git's "stash" feature than git branches.  In
fact, I was thinking of jumping in and asking why this was being called
something gratuitously different.

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Stein <gs...@gmail.com>.

On Thu, Sep 8, 2011 at 19:33, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> Stefan Sperling wrote on Thu, Sep 08, 2011 at 00:36:05 +0200:
>...
>> Sure. But in the initial implementation we could just restore the former
>> working copy state, including mixed-revisioness etc. Just rewind everything
>> back to where it was and let a subsequent update sort out the conflicts.
>> That would already be a big improvement over the diff/patch approach.

To do this, we'd also want to move op_depth==0 over to the
SHELF_NODES. Probably a good idea to just copy all of it, rather than
piecemeal like I suggested, as I suspect that we'd eventually find a
reason to have BASE present.

> If we do this, it would be nice if 'restore the former state' could be
> done offline --- e.g., by retaining the relevant pristines as long as
> the shelf exists.

Absolutely. There shouldn't be *any* reason to contact the server. All
the pristines would be held via SHELF_NODES and SHELF_ACTUAL.

> (Now, if the working copy is at state S and someone asks to return to
> a previously-shelved state, what do we do with S?  Do we discard it, or
> do we first save it somewhere?  And if we save it... do we save it as
> a new shelf?)

I *do* envision multiple shelves, and that makes it (imo) even more
important to look at "unshelve" as merging changes onto the current
state.

Regarding current state S... I think "unshelve" should NOT be allowed
if local mods exist. Thus, we never need to worry about S. The user
can always explicitly shelve that, return to a no-mods state, and then
unshelve state-R.

And yes, we can have an --ignore-local-mods switch to unshelve even
when local mods exist.
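
As a toy model of that rule (not Subversion code), the Python sketch below
refuses to unshelve while local mods exist unless an override is given.
The class and method names are hypothetical, and the behaviour under the
override (shelved entries win on overlap) is purely an assumption, since
the merge semantics are not settled in this thread.

```python
class LocalModsError(Exception):
    """Raised when unshelving into a working copy that has local mods."""


class WorkingCopy:
    """Toy stand-in for a working copy: just a dict of local mods."""

    def __init__(self):
        self.local_mods = {}  # path -> modified content
        self.shelves = {}     # shelf name -> saved local_mods dict

    def shelve(self, name):
        """Set the current local mods aside and return to a no-mods state."""
        self.shelves[name] = self.local_mods
        self.local_mods = {}

    def unshelve(self, name, ignore_local_mods=False):
        """Restore a shelf; refuse if local mods exist unless overridden."""
        if self.local_mods and not ignore_local_mods:
            raise LocalModsError("local mods exist; shelve them first")
        # Assumption: with the override, shelved entries win on overlap.
        self.local_mods.update(self.shelves.pop(name))
```

In this model, a user sitting on state S with local mods would call
shelve("S") first and then unshelve("R"), which matches the explicit
sequence described above.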

Also consider: the shelves can then act as multiplexors for the
working copy. You could have one shelf for trunk, one for
branches/1.7.x, one for 1.6.x, one for branches/fs-successor-ids, and
one for some trunk changes that you set aside.

I've had to use git lately, and our shelves could almost look like
git's branches. Swap around among them based on what you're doing at
the time.

Cheers,
-g

Re: Thoughts about issue #3625 (shelving)

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Stefan Sperling wrote on Thu, Sep 08, 2011 at 00:36:05 +0200:
> On Wed, Sep 07, 2011 at 05:48:47PM -0400, Greg Stein wrote:
> > On Wed, Sep 7, 2011 at 03:51, Stefan Sperling <st...@elego.de> wrote:
> > > A first-class 'shelving' feature wouldn't have to worry about conflicts.
> > > It would simply restore the working copy to the shelved state (either
> > > destroying unrelated local modifications, or raising an error in case
> > > of their presence).
> > 
> > I think unshelving can create a full set of conflicts. As above, even
> > a simple add/add conflict.
> >
> > If local mods exist, then we can simply disallow the unshelving. For
> > now. With some additional work on conflict handling, we could do a
> > full merge of the local mods and the shelved mods.
> 
> Sure. But in the initial implementation we could just restore the former
> working copy state, including mixed-revisioness etc. Just rewind everything
> back to where it was and let a subsequent update sort out the conflicts. 
> That would already be a big improvement over the diff/patch approach.

If we do this, it would be nice if 'restore the former state' could be
done offline --- e.g., by retaining the relevant pristines as long as
the shelf exists.

(Now, if the working copy is at state S and someone asks to return to
a previously-shelved state, what do we do with S?  Do we discard it, or
do we first save it somewhere?  And if we save it... do we save it as
a new shelf?)

Re: Thoughts about issue #3625 (shelving)

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Sep 07, 2011 at 05:48:47PM -0400, Greg Stein wrote:
> On Wed, Sep 7, 2011 at 03:51, Stefan Sperling <st...@elego.de> wrote:
> > A first-class 'shelving' feature wouldn't have to worry about conflicts.
> > It would simply restore the working copy to the shelved state (either
> > destroying unrelated local modifications, or raising an error in case
> > of their presence).
> 
> I think unshelving can create a full set of conflicts. As above, even
> a simple add/add conflict.
>
> If local mods exist, then we can simply disallow the unshelving. For
> now. With some additional work on conflict handling, we could do a
> full merge of the local mods and the shelved mods.

Sure. But in the initial implementation we could just restore the former
working copy state, including mixed-revisioness etc. Just rewind everything
back to where it was and let a subsequent update sort out the conflicts. 
That would already be a big improvement over the diff/patch approach.

Re: Thoughts about issue #3625 (shelving)

Posted by Greg Stein <gs...@gmail.com>.

Hi Jeffrey,

Thanks for writing. Stefan answers most of the high points, and I'll
describe my own thoughts in-line below. I, or anyone who beats
me to it, should probably start a notes/shelving document.

Below:

On Wed, Sep 7, 2011 at 03:51, Stefan Sperling <st...@elego.de> wrote:
> On Wed, Sep 07, 2011 at 12:53:38AM -0400, Jeffrey Pierson wrote:
>> With the new patch features from 1.7 would it make sense to implement this
>> feature internally as a stored patch with some extra changes to the client
>> to allow exporting the shelved changes as a patch and more obviously shelve
>> and unshelve?
>...
>  1) svn patch does not support all kinds of tree changes. For instance,
>   if your saved changes included copy operations they will appear as
>   additions after applying the patch. If your saved changes included
>   replaced directory trees, with some of the children of the replacing
>   tree having been deleted or also replaced, a patch cannot represent that.

Right.

> A first-class 'shelving' feature would save the entire state of the
> working copy database. It could represent the shelved state with much
> more accuracy than is possible with a patch file -- it could preserve
> all types of tree changes, conflict markers, etc.

This was my thought. To be more concrete:

Create a new SHELVED_NODES table, modeled almost exactly like NODES.
One more column would be added to primary key, 'shelf_id'. When
changes are shelved, we create an ID for that shelving, and then move
all nodes with op_depth > 0 into SHELVED_NODES.

Second, we have a SHELVED_ACTUAL table, modeled after ACTUAL_NODE with
two more columns: 'shelf_id', 'actual_checksum'. The ACTUAL_NODE
contents would be moved into this table, with the ID from above. The
contents of the file would be moved into the pristine system, and the
checksum stored into 'actual_checksum'.

We also need some details from op_depth==0. For example, what revision
are the text/prop changes against? Tree changes such as add/copy don't
need data from BASE (I think), but move/delete is removing a specific
revision. When the shelf is restored, we need to deal with conflicts
around those moves/deletions.

Hmm. We may need more info from BASE for adds. For example, before
shelving, it may be just an add (no BASE). But when unshelving, a node
exists, which means an add/add conflict.

... anyway. That is the basic approach that I was considering for the
feature. We can capture the entire state with these extra couple
tables.
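
A minimal sketch of the first of those tables, using Python's sqlite3
module. The nodes table here is a heavily simplified stand-in (the real
NODES table has many more columns, and wc_id is omitted), the shelve()
helper is hypothetical, and the way shelf IDs are allocated is an
assumption. SHELVED_ACTUAL and the pristine handling are left out.

```python
import sqlite3

# Simplified stand-ins for wc.db tables; column sets are illustrative only.
SCHEMA = """
CREATE TABLE nodes (
    local_relpath TEXT NOT NULL,
    op_depth      INTEGER NOT NULL,
    presence      TEXT NOT NULL,
    checksum      TEXT,
    PRIMARY KEY (local_relpath, op_depth)
);
CREATE TABLE shelved_nodes (
    shelf_id      INTEGER NOT NULL,  -- the extra primary-key column
    local_relpath TEXT NOT NULL,
    op_depth      INTEGER NOT NULL,
    presence      TEXT NOT NULL,
    checksum      TEXT,
    PRIMARY KEY (shelf_id, local_relpath, op_depth)
);
"""


def shelve(db: sqlite3.Connection) -> int:
    """Move every op_depth > 0 row into shelved_nodes under a new shelf_id."""
    with db:  # one transaction: either all rows move or none do
        # Assumption: shelf IDs are simply the next unused integer.
        (shelf_id,) = db.execute(
            "SELECT COALESCE(MAX(shelf_id), 0) + 1 FROM shelved_nodes"
        ).fetchone()
        db.execute(
            "INSERT INTO shelved_nodes "
            "SELECT ?, local_relpath, op_depth, presence, checksum "
            "FROM nodes WHERE op_depth > 0",
            (shelf_id,),
        )
        db.execute("DELETE FROM nodes WHERE op_depth > 0")
    return shelf_id
```

After shelve() only the op_depth == 0 (BASE) rows remain in nodes, which is
the point where the questions above about needing BASE information for
moves, deletes, and adds come in.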

>...
> A first-class 'shelving' feature wouldn't have to worry about conflicts.
> It would simply restore the working copy to the shelved state (either
> destroying unrelated local modifications, or raising an error in case
> of their presence).

I think unshelving can create a full set of conflicts. As above, even
a simple add/add conflict.

If local mods exist, then we can simply disallow the unshelving. For
now. With some additional work on conflict handling, we could do a
full merge of the local mods and the shelved mods.

>...

Cheers,
-g

Re: Thoughts about issue #3625 (shelving)

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Sep 07, 2011 at 12:53:38AM -0400, Jeffrey Pierson wrote:
> With the new patch features from 1.7 would it make sense to implement this
> feature internally as a stored patch with some extra changes to the client
> to allow exporting the shelved changes as a patch and more obviously shelve
> and unshelve?

What extra changes do you have in mind?

It is possible to shelve a set of changes like this:

  svn diff > mypatch
  svn revert -R .

and unshelve them later:

  svn revert -R .
  svn patch mypatch
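
For a client that wants to automate those two command sequences, a rough
Python wrapper might look like this. It only shells out to the same svn
commands shown above; the function names and the injectable run parameter
(useful for testing without a working copy) are hypothetical.

```python
import subprocess
from pathlib import Path


def shelve(wc_dir: str, patch_file: str, run=subprocess.run) -> None:
    """Save local changes to patch_file, then revert the working copy."""
    # Equivalent to: svn diff > mypatch
    result = run(["svn", "diff"], cwd=wc_dir, check=True,
                 capture_output=True, text=True)
    Path(patch_file).write_text(result.stdout)
    # Equivalent to: svn revert -R .
    run(["svn", "revert", "-R", "."], cwd=wc_dir, check=True)


def unshelve(wc_dir: str, patch_file: str, run=subprocess.run) -> None:
    """Revert the working copy, then re-apply the saved patch."""
    run(["svn", "revert", "-R", "."], cwd=wc_dir, check=True)
    # Equivalent to: svn patch mypatch (absolute path, since cwd is wc_dir)
    patch_path = str(Path(patch_file).resolve())
    run(["svn", "patch", patch_path], cwd=wc_dir, check=True)
```

Such a wrapper inherits every limitation of the diff/patch approach; it
only saves typing.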

There are caveats with using patches for shelving (with or without
an internal mechanism to handle the shelved patch):

 1) svn patch does not support all kinds of tree changes. For instance,
   if your saved changes included copy operations they will appear as
   additions after applying the patch. If your saved changes included
   replaced directory trees, with some of the children of the replacing
   tree having been deleted or also replaced, a patch cannot represent that.

A first-class 'shelving' feature would save the entire state of the
working copy database. It could represent the shelved state with much
more accuracy than is possible with a patch file -- it could preserve
all types of tree changes, conflict markers, etc.

 2) svn patch cannot do a 3-way merge. If the working copy state has
   changed in an incompatible way you'll get rejects instead of conflicts.
   To mitigate this problem you can update back in time to the revision
   the patch was made against before applying the patch.
   But this can get tedious if the diff was made from a mixed-revision
   working copy. Every path affected by the patch might have to be
   updated to some other revision. Of course, a Subversion client could
   automate this as part of a 'shelving' implementation based on 'svn diff'
   and 'svn patch'.

A first-class 'shelving' feature wouldn't have to worry about conflicts.
It would simply restore the working copy to the shelved state (either
destroying unrelated local modifications, or raising an error in case
of their presence).

> As a TortoiseSVN user I'm thinking such a feature could be added to an
> individual client as an internally managed patch without any changes to the
> core SVN API. This being said, client side API changes would make it more
> widely adoptable and consistent. My guess is that with or without this
> feature, clients will likely grow this feature post 1.7 because of the patch
> enhancements.

I don't think there is a huge problem with clients building shelving
support on top of svn diff and svn patch. As soon as Subversion supports
shelving as a first class feature these clients can replace their custom
implementation with API calls into the Subversion libraries.

Note that the very same thing happened with 'svn patch' and TortoiseSVN.
TortoiseSVN has had a 'patch' feature even before Subversion 1.7 existed.
TortoiseSVN's own patch implementation was basic. I don't think it supported
changes to properties, or even patching lines that had moved elsewhere in
the patch target. TortoiseSVN versions based on Subversion 1.7 replaced the
custom 'patch' implementation with a call into Subversion's libraries.
TortoiseSVN gained new functionality for its existing 'patch' feature.