Posted to dev@subversion.apache.org by "Ed S. Peschko" <es...@pge.com> on 2006/10/12 00:44:49 UTC

named changesets

hey all,

I was wondering - how are named changesets implemented in subversion? We are looking
at (amongst others) subversion and perforce, and this seems to be a feature that we
couldn't really live without.. Basically I'd like to say:

	svn commit --name=<changeset_name>

and at a later point, say something like

	svn merge --name=<changeset_name> --branch=HEAD

and have subversion keep track of whether or not this is a legal operation or not.

Thanks,

Ed

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: named changesets

Posted by Ben Collins-Sussman <su...@red-bean.com>.

On 10/12/06, Ed S. Peschko <es...@pge.com> wrote:

> ok, but if you said something like:
>
> svn commit (list of files) --name 'my_changeset'
>
> wouldn't it be possible to store the named changeset on the server, rather than
> relying on the number of the changeset?

Yes, it would, but the subversion server already assigns a permanent
global changeset number to the commit.   It *could* also attach a
human name to the commit... but the number is usually all one needs.

>
> Suppose I'm an administrator.
> Someone checks in, on a branch, four changesets with:
>
>     svn commit --name 'my_changeset' a b c
>     svn commit --name 'my_changeset2' d e f
>     svn commit --name 'my_changeset3' g h i
>     svn commit --name 'my_changeset4' j k l
>
>
> As an administrator, I now want to apply these changesets to the trunk. I don't
> have access to the working copy of the developer who checked these in, and hence
> don't have access to their metadata.

You have the complete history of the repository at your fingertips:
every changeset ever committed by every developer is browseable.
Choose the ones you want, and tell 'svn merge' to merge exactly those
commit numbers.

In other words, subversion already does what you want in this
scenario.  It's just using global numbers to refer to committed
changesets, rather than human-invented names.   Perforce is the same
way.
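
To make that concrete, here is a minimal sketch of how a team that wants human-friendly names could keep its own name-to-revision table and still hand plain revision numbers to 'svn merge'. Everything specific in it is an assumption made for illustration: the alias table, the revision numbers and the helper names merge_command and merge_commands are invented, and the functions only build argument lists rather than running svn. It relies on the 'svn merge -c REV SOURCE WCPATH' form, which merges the change committed in revision REV.

```python
# Hypothetical alias table: the names and revision numbers are made up.
ALIASES = {
    "my_changeset": 1201,
    "my_changeset2": 1202,
    "my_changeset3": 1207,
    "my_changeset4": 1215,
}


def merge_command(name, source_url, wc_path="."):
    """Build (but do not run) an 'svn merge -c REV SOURCE WCPATH' command."""
    rev = ALIASES[name]  # raises KeyError if the name was never recorded
    return ["svn", "merge", "-c", str(rev), source_url, wc_path]


def merge_commands(names, source_url, wc_path="."):
    """Build commands for several named changesets, oldest revision first."""
    ordered = sorted(names, key=lambda n: ALIASES[n])
    return [merge_command(n, source_url, wc_path) for n in ordered]
```

Sorting by revision number keeps the merges in commit order, which matters once two changesets touch the same file.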

> > This scenario is fantasy.  :-)   Neither perforce nor subversion are
> > able to associate a file with more than one changeset at a time.
> > Sorry!
>
> well, it's not fantasy wrt Clearcase. At least that is their claim; that you
> can create multiple changesets containing the same files. And I think that
> bitkeeper may be able to do this as well, although I'm not sure about that.
>

I don't know Clearcase well, maybe someone can speak up on that.

>
> And as long as the changesets are logically distinct, I don't see why this would
> be impossible to do.

Yes, it's certainly a doable thing.  It involves making the working
copy sophisticated enough to specify that 'this changed hunk here' is
part of one changeset, and 'this other changed hunk there, in the same
file' is part of a different changeset.  Perforce and subversion only
track changesets at the file-level, but yes, it's conceivable that a
higher-resolution system could be written to associate 'hunks' rather
than files.  No argument there.

Keep in mind that I'm just talking about managing overlapping
changesets within a working copy.  It's a very common thing that two
changesets committed at different points in history affect the same
file.  And as you've already said, the order in which you merge those
changesets starts becoming important.  That's already the case with
subversion.

> It's not too much to ask a source control tool to give you a road map
> of what's going on, what has or has not been committed. I don't even think
> it would be that bad, scalability-wise.

Yes, it's bad scalability-wise.  Try using that model with thousands
of engineers.  It doesn't scale.  Trust me.  :-)  Dozens?  No problem.

You need to understand where Subversion is coming from:  like CVS,
it's designed for use by untold thousands of anonymous users.  The
idea of managing/tracking every working copy is sort of a bizarre
requirement in the world of open source, where who knows how many
hundreds of users are toying with the code, hacking on it, etc.

In a corporate environment, sure, it's nice to keep tabs on your team.
But there are many other ways of doing that.  Folks have already
suggested having coders do their work on private branches.  That's a
great technique.


Re: named changesets

Posted by "Ed S. Peschko" <es...@pge.com>.

> The 'name' is just some metadata stored within a user's working copy,
> to refer to some local edits that haven't yet been committed.  Anyone
> using the particular working copy can see the name.  However, it's not
> considered a good practice to have multiple users sharing a single
> working copy.

ok, but if you said something like:

svn commit (list of files) --name 'my_changeset'

wouldn't it be possible to store the named changeset on the server, rather than
relying on the number of the changeset?

Suppose I'm an administrator. 
Someone checks in, on a branch, four changesets with: 

    svn commit --name 'my_changeset' a b c
    svn commit --name 'my_changeset2' d e f
    svn commit --name 'my_changeset3' g h i
    svn commit --name 'my_changeset4' j k l


As an administrator, I now want to apply these changesets to the trunk. I don't 
have access to the working copy of the developer who checked these in, and hence
don't have access to their metadata. 

So how do I refer to these changesets if their names are not stored somewhere 
on the server?  

ie:
	svn merge ???

> >And finally, how robust are the changesets?
> >
> >Now suppose (s)he wants to merge changeset #1 into the
> >repository, without merging #2 or #3. Is it possible to merge
> >*just* the A,B, and D changes in #1 without touching the
> >changes in #2 and #3?
> 
> 
> This scenario is fantasy.  :-)   Neither perforce nor subversion are
> able to associate a file with more than one changeset at a time.
> Sorry!

well, it's not fantasy wrt Clearcase. At least that is their claim; that you
can create multiple changesets containing the same files. And I think that
bitkeeper may be able to do this as well, although I'm not sure about that. 


And as long as the changesets are logically distinct, I don't see why this would 
be impossible to do. I could see difficulties if someone, say:

	1)  modified a portion of the file in one changeset
	2) modified the modification in another, 
	3) tried to merge in the later changeset but not the former

But if someone put a line at the beginning of the file in one changeset, 
and put a line at the end of the file in another changeset, there's no reason 
why these two changesets couldn't be applied separately.  

To simplify things, at first you could maybe insist that these changesets be 
applied in chronological order, but I see no reason why that should even
be an issue as long as one changeset doesn't depend on another.
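
The distinction drawn above can be sketched as a simple overlap check. This is only an illustration under stated assumptions: a changeset is modelled as a mapping from file path to inclusive (start_line, end_line) ranges measured against a common base version of the file, and the names conflicting_hunks, header_fix, footer_fix and header_rework are hypothetical. A real tool would also have to account for line numbers shifting as earlier hunks are applied.

```python
def ranges_overlap(a, b):
    """True if two inclusive (start, end) line ranges share any line."""
    return a[0] <= b[1] and b[0] <= a[1]


def conflicting_hunks(cs1, cs2):
    """Return (path, hunk1, hunk2) triples where both changesets touch the same lines.

    Each changeset maps a file path to a list of inclusive
    (start_line, end_line) ranges in the common base file.
    """
    conflicts = []
    for path in sorted(set(cs1) & set(cs2)):
        for h1 in cs1[path]:
            for h2 in cs2[path]:
                if ranges_overlap(h1, h2):
                    conflicts.append((path, h1, h2))
    return conflicts


# One changeset edits the top of the file, another the bottom.
header_fix = {"main.c": [(1, 1)]}
footer_fix = {"main.c": [(200, 200)]}
# A third changeset modifies the lines the first one already touched.
header_rework = {"main.c": [(1, 3)]}
```

Here header_fix and footer_fix share a file but no lines, so they could be applied separately, while header_rework overlaps header_fix and so depends on it.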


> 
> >eg: One of the things that *really* bothers me about CVS
> >as an administrator is the fact that you have no way of knowing
> >who has changed what locally.
> 
> Why does it matter to you?  That's a curious request.

Because it gives an administrator/tech lead of a project a quick snapshot of
what everyone is working on and prevents 'orphan changes' - where someone
has worked on something in one tree and forgets about it.

It's not too much to ask a source control tool to give you a road map
of what's going on, what has or has not been committed. I don't even think 
it would be that bad, scalability-wise.


You'd just need a small daemon that runs locally, that polls for changes in 
repositories, and when it finds a change, tells the server about it. The server 
then stores it in *its* metadata, and when a checkin is done, removes the 
difference from the database. 


In fact, I think that this still respects your design as well as its
scalability; the client is still talking to the server on a per-need basis,
and the communication is only going from client to server. 
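
A minimal sketch of the polling step such a daemon might perform is below. It is an assumption-heavy illustration, not anything Subversion ships: the function names snapshot and local_changes are hypothetical, the comparison is done by hashing file contents, and the step that would actually send the report to a server is left out entirely. The '.svn' administrative directories are skipped so that only the user's own files are compared.

```python
import hashlib
import os


def snapshot(root):
    """Map each file path (relative to root) to a SHA-1 digest of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune Subversion's administrative directories from the walk.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha1(fh.read()).hexdigest()
    return digests


def local_changes(baseline, current):
    """Compare two snapshots and report what the daemon would send upstream."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            p for p in baseline.keys() & current.keys() if baseline[p] != current[p]
        ),
    }
```

The daemon would take one snapshot after each update or commit and compare later snapshots against it on every poll.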

Ed


Re: named changesets

Posted by Lars Gullik Bjønnes <la...@gullik.net>.

"Michael Sinz" <Mi...@sinz.org> writes:

| On 10/12/06, Ben Collins-Sussman <su...@red-bean.com> wrote:
| >
| > On 10/12/06, Ed S. Peschko <es...@pge.com> wrote:
| > > (
| > > ps - is the rule about the server not tracking any of the
| > > client data a rock solid one?
| >
| > Yes, that's what makes subversion scalable to thousands of users over
| > a WAN.  Perforce is great, but is really only designed for a fast LAN.
| 
| 
| ...and with a full time administrator too :-)

... but subversion does not take advantage of a fast LAN at all...

So with multiple users spread out over a lot of different machines and
networks, subversion is bliss, but with fewer users working on the
same machines, I/O thrashing kills all svn performance. If in such a
setting it could take advantage of the super fast LAN, I am sure much
could be gained.

-- 
	Lgb


Re: named changesets

Posted by Michael Sinz <Mi...@sinz.org>.

On 10/12/06, Ben Collins-Sussman <su...@red-bean.com> wrote:
>
> On 10/12/06, Ed S. Peschko <es...@pge.com> wrote:
> > (
> > ps - is the rule about the server not tracking any of the
> > client data a rock solid one?
>
> Yes, that's what makes subversion scalable to thousands of users over
> a WAN.  Perforce is great, but is really only designed for a fast LAN.

...and with a full time administrator too :-)

> > eg: One of the things that *really* bothers me about CVS
> > as an administrator is the fact that you have no way of knowing
> > who has changed what locally.
>
> Why does it matter to you?  That's a curious request.

I was wondering the same thing but I sort of see this happening in
my group - mainly because of certain engineers' hesitance to make
branches and check in code.  There are some multi-month development
efforts (changes/updates) that have never been checked in anywhere,
and the code is being passed around manually to the other engineers
that depend on the new code.

A lot of this may be due to past poor tool sets and thus learned behaviors
that do not really make sense.

Knowing what a developer has changed locally - for whatever reason - is
usually not productive and can actually be counter-productive (especially
when it is a trial-balloon type of change or hack to test something).
However, if the behavior is like I described above, then what is really
going on is trying to make the tool work around bad behavior.

> FYI:  if you use perforce, and everyone checks out working copies into
> shared NFS home directories, then you *can* ask the perforce server to
> show changesets-in-progress to you which belong to other people.  But
> the NFS bit is key.
>
> > I think perforce can do this; are there any plans on adding this type of
> > functionality (maybe a daemon to send this info back to the server). Or
> > is it easy to write your own daemons that could do something like this?
>
> No plans to do this.  The only advantage of this feature is that it
> makes it easy to review your neighbor's changeset.  Instead of having
> them post a patch to a mailing list for review, you can ask perforce
> to just show you your neighbor's changeset.  It makes the code review
> process a touch nicer.  But subversion is an inherently disconnected
> system;  it would really violate the spirit and design to have the
> subversion server do any sort of client-side tracking.

Actually, I have pushed for change-set reviews, but this is done within
a short-lived branch.  The branch is then destroyed once merged.
This has the added benefit that the change set can be worked on
by more than one person and the review of the changes is rather
easy as it is exactly the same process as the review of a patch or merge.

(This is mainly for larger changes - a spelling change or other such
minor thing usually does not go through the same level of code review)

-- 
Michael Sinz               Technology and Engineering Director/Consultant
"Starting Startups"                          mailto:Michael.Sinz@sinz.org
My place on the web                      http://www.sinz.org/Michael.Sinz

Re: named changesets

Posted by Ben Collins-Sussman <su...@red-bean.com>.

On 10/12/06, Ed S. Peschko <es...@pge.com> wrote:

> I'm also curious - you said that there are numbers associated with
> changesets. Are they stored on the client as well, or on the
> server?

The numbers are immutable names for changesets, and they exist only in
the server after a changeset has been committed.  The client can refer
to them, of course, when asking for server data.

> Since the names are stored on the client, I'm assuming that
> an administrator can't use the same name to reference the changeset
> as the client does... Is this right?

The 'name' is just some metadata stored within a user's working copy,
to refer to some local edits that haven't yet been committed.  Anyone
using the particular working copy can see the name.  However, it's not
considered a good practice to have multiple users sharing a single
working copy.

>
> And finally, how robust are the changesets?
>
> Here's an example of what I have in mind - suppose developer
> X creates four files (A B C and D). X then makes three changesets
> to those files, done in the following order:
>
> Changeset #1:
>
>         modifies A
>         modifies B
>         modifies D
>
> Changeset #2:
>
>         modifies B
>         modifies C
>         modifies D
>
> Changeset #3:
>
>         modifies A
>         modifies B
>         modifies C
>
> Now suppose (s)he wants to merge changeset #1 into the
> repository, without merging #2 or #3. Is it possible to merge
> *just* the A,B, and D changes in #1 without touching the
> changes in #2 and #3?

This scenario is fantasy.  :-)   Neither perforce nor subversion are
able to associate a file with more than one changeset at a time.
Sorry!
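
To see why the example is a problem for file-level tracking, here is a small sketch that takes the three changesets from the question and lists the files each one shares with the others. The helper names shared_files and separable are hypothetical, and the rule they encode (a changeset can be split out only if none of its files belongs to another pending changeset) is a simplified reading of the limitation described above.

```python
# The three pending changesets from the example, as sets of files.
changesets = {
    1: {"A", "B", "D"},
    2: {"B", "C", "D"},
    3: {"A", "B", "C"},
}


def shared_files(changesets, wanted):
    """Files in changeset `wanted` that some other changeset also modifies."""
    others = set().union(
        *(files for cs, files in changesets.items() if cs != wanted)
    )
    return sorted(changesets[wanted] & others)


def separable(changesets, wanted):
    """With file-level tracking, a changeset can be split out only if it shares no files."""
    return not shared_files(changesets, wanted)
```

For changeset #1, every one of its files (A, B and D) is also modified by #2 or #3, so a working copy that tracks changesets per file cannot commit it on its own.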

>
> Thanks again,
>
> Ed
>
> (
> ps - is the rule about the server not tracking any of the
> client data a rock solid one?

Yes, that's what makes subversion scalable to thousands of users over
a WAN.  Perforce is great, but is really only designed for a fast LAN.

> eg: One of the things that *really* bothers me about CVS
> as an administrator is the fact that you have no way of knowing
> who has changed what locally.

Why does it matter to you?  That's a curious request.

FYI:  if you use perforce, and everyone checks out working copies into
shared NFS home directories, then you *can* ask the perforce server to
show changesets-in-progress to you which belong to other people.  But
the NFS bit is key.

> I think perforce can do this; are there any plans on adding this type of
> functionality (maybe a daemon to send this info back to the server). Or
> is it easy to write your own daemons that could do something like this?

No plans to do this.  The only advantage of this feature is that it
makes it easy to review your neighbor's changeset.  Instead of having
them post a patch to a mailing list for review, you can ask perforce
to just show you your neighbor's changeset.  It makes the code review
process a touch nicer.  But subversion is an inherently disconnected
system;  it would really violate the spirit and design to have the
subversion server do any sort of client-side tracking.


Re: named changesets

Posted by Johnathan Gifford <jg...@wernervas.com>.

>>> On Thu, Oct 12, 2006 at  2:26 PM, in message <20...@venus>,
"Ed S. Peschko" <es...@pge.com> wrote:
> On Thu, Oct 12, 2006 at 09:40:12AM +0200, Ph. Marek wrote:
>> Hello Ed,
>> 
>> 
>> what you want are (maybe) branches.
>> 
>> Just create a branch with your specified name, switch to that branch, commit,
>> switch back if needed.
>> If the feature is completed, do a "svn merge" of that branch.
>> 
>> Does that help?
> 
> Yeah, it helps a bit, I guess - your idea is that we would have to implement one feature
> per branch, and then branch off of the branch in order to implement another feature..
>

No, they are saying to create a branch from the trunk for each feature,
not off another branch.  When a feature is ready, merge it back to
trunk.  When you need to work on a new feature, create another branch
from trunk.

> 
> Am I understanding this right? The main problem with this is that we would be
> branching *all the time*, and that we would have to branch in place because our
> directory structure is tied to our environments, and those are expensive to build.
>

Subversion uses cheap copies to create branches.  Essentially, the file
in the branch is really a pointer to the original in the trunk until
that file is changed in the branch.  So there is no expensive overhead
in Subversion for creating branches.  As far as your systems go, they
shouldn't know whether they are running a branch or the trunk, as the file
structures under the branch should be identical to the trunk.
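
The pointer idea can be shown with a toy model. This is a sketch only, not Subversion's actual storage format: the Node class and the cheap_copy and modify helpers are hypothetical, and a tree is modelled as a plain dictionary from path to node. The point it illustrates is that a new branch shares the trunk's file nodes and gets a node of its own only for a file that is changed in the branch.

```python
class Node:
    """An immutable file version stored once in the repository."""

    def __init__(self, content):
        self.content = content


def cheap_copy(tree):
    """Create a branch: a new path-to-node table that shares every node."""
    return dict(tree)  # copies the references, not the file contents


def modify(tree, path, content):
    """Changing a file in one tree gives only that tree a new node."""
    tree[path] = Node(content)


trunk = {"src/main.c": Node("int main() {}"), "README": Node("hello")}
branch = cheap_copy(trunk)
modify(branch, "README", "hello from the branch")
```

After the modification, 'src/main.c' is still the very same object in both trees, while 'README' differs only in the branch and the trunk's copy is untouched.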

> 
> I'd rather say that this particular branch contains independent features
> 'a', 'b', and 'c', and that 'a' and 'b' are ready and can be merged, but to
> hold off on 'c'.
> 

If a, b, and c are contained in their own branch, that would be no
problem to do.

>
> Maybe though, it's just my state of mind here...
> 
> Ed
> 
>

Remember, Subversion is pretty dang flexible.  You can take either a
stable trunk or unstable trunk approach.  You can even do a combination
of both if your development team can keep it straight. While there are
recommendations to use tags, trunk, and branches, it's not a hard-and-fast
rule or requirement in Subversion.  A number of folks like using
'labels' rather than 'tags'.

Johnathan Gifford
Subversion Administrator
Werner Enterprises, Inc.


Re: named changesets

Posted by "Ed S. Peschko" <es...@pge.com>.

On Thu, Oct 12, 2006 at 09:40:12AM +0200, Ph. Marek wrote:
> Hello Ed,
> 
> 
> what you want are (maybe) branches.
> 
> Just create a branch with your specified name, switch to that branch, commit, 
> switch back if needed.
> If the feature is completed, do a "svn merge" of that branch.
> 
> Does that help?

Yeah, it helps a bit, I guess - your idea is that we would have to implement one feature
per branch, and then branch off of the branch in order to implement another feature..

Am I understanding this right? The main problem with this is that we would be 
branching *all the time*, and that we would have to branch in place because our
directory structure is tied to our environments, and those are expensive to build.

I'd rather say that this particular branch contains independent features 
'a', 'b', and 'c', and that 'a' and 'b' are ready and can be merged, but to 
hold off on 'c'. 

Maybe though, it's just my state of mind here...

Ed


Re: named changesets

Posted by "Ph. Marek" <ph...@bmlv.gv.at>.

Hello Ed,


what you want are (maybe) branches.

Just create a branch with your specified name, switch to that branch, commit, 
switch back if needed.
If the feature is completed, do a "svn merge" of that branch.

Does that help?


Regards,

Phil


Re: named changesets

Posted by "Ed S. Peschko" <es...@pge.com>.

On Wed, Oct 11, 2006 at 08:47:31PM -0500, Ben Collins-Sussman wrote:
> On 10/11/06, Ed S. Peschko <es...@pge.com> wrote:
> >hey all,
> >
> >I was wondering - how are named changesets implemented in subversion? We 
> >are looking
> >at (amongst others) subversion and perforce, and this seems to be a 
> >feature that we
> >couldn't really live without.. Basically I'd like to say:
> >
> >        svn commit --name=<changeset_name>
> 
> Perforce tracks all changesets on the server.  Subversion doesn't do
> that;  the server doesn't track any sort of client data whatsoever.
> 
> However, a subversion client *does* have simple bookkeeping that
> allows you to define and name changesets, and commit them exactly as
> you show above.
> 

> 
> >
> >and at a later point, say something like
> >
> >        svn merge --name=<changeset_name> --branch=HEAD
> >
> 
> Once you commit to subversion, the commit gets a permanent changeset
> name in the server, just a large integer.  When you want to merge a
> changeset from one branch to another, you just use the changeset
> number.  So yeah, it's pretty darn close to what you're showing.

first of all, thanks for the reply.. I'm just curious though, 
if there is a number stored somewhere that represents a changeset, 
it would seem trivial to give an option to the commit to make an 
alias of a name for that number on the server side..

I'm also curious - you said that there are numbers associated with
changesets. Are they stored on the client as well, or on the 
server? Since the names are stored on the client, I'm assuming that 
an administrator can't use the same name to reference the changeset 
as the client does... Is this right?

And finally, how robust are the changesets?

Here's an example of what I have in mind - suppose developer 
X creates four files (A B C and D). X then makes three changesets 
to those files, done in the following order:

Changeset #1:

	modifies A
	modifies B
	modifies D

Changeset #2:

	modifies B
	modifies C
	modifies D

Changeset #3:

	modifies A 
	modifies B
	modifies C

Now suppose (s)he wants to merge changeset #1 into the 
repository, without merging #2 or #3. Is it possible to merge 
*just* the A,B, and D changes in #1 without touching the 
changes in #2 and #3?

Thanks again,

Ed

(
ps - is the rule about the server not tracking any of the 
client data a rock solid one? 

eg: One of the things that *really* bothers me about CVS 
as an administrator is the fact that you have no way of knowing 
who has changed what locally. 

However, I don't want to force people to atomically lock everything 
in order to get an idea of what is going on on other people's 
machines. 

I'd like a way to query individual clients who have talked to my
server, and see if any non-committed changes still exist out there. 

I think perforce can do this; are there any plans on adding this type of 
functionality (maybe a daemon to send this info back to the server). Or
is it easy to write your own daemons that could do something like this?
)


Re: named changesets

Posted by Ben Collins-Sussman <su...@red-bean.com>.

On 10/11/06, Ed S. Peschko <es...@pge.com> wrote:
> hey all,
>
> I was wondering - how are named changesets implemented in subversion? We are looking
> at (amongst others) subversion and perforce, and this seems to be a feature that we
> couldn't really live without.. Basically I'd like to say:
>
>         svn commit --name=<changeset_name>

Perforce tracks all changesets on the server.  Subversion doesn't do
that;  the server doesn't track any sort of client data whatsoever.

However, a subversion client *does* have simple bookkeeping that
allows you to define and name changesets, and commit them exactly as
you show above.

>
> and at a later point, say something like
>
>         svn merge --name=<changeset_name> --branch=HEAD
>

Once you commit to subversion, the commit gets a permanent changeset
name in the server, just a large integer.  When you want to merge a
changeset from one branch to another, you just use the changeset
number.  So yeah, it's pretty darn close to what you're showing.
