Posted to users@subversion.apache.org by "N. Thomas" <nt...@cise.ufl.edu> on 2004/03/10 22:45:16 UTC

branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

* Lawrence Kesteloot <lk...@teamten.com> [2004-03-10 12:07:53 -0800]:
> I must admit I'm similarly confused by people on this list who branch
> several times a day.  I have a hard enough time keeping track of the
> one branch we do per release.

I don't do it as often as once per day, but every time I add a new
feature to the code that I'm working on, I will branch, do the write,
compile, test, commit cycle a bunch of times and then merge back into
the trunk, deleting the branch when I am done. (This is for a personal
project of mine, but I suppose if I were working on it full-time, and I
had enough bite-sized features, then I could imagine myself branching
more often.)
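
As a rough sketch, that cycle could be spelled out as Subversion command lines. The Python below only builds the argument lists and does not run anything. The repository URL, feature name and revision numbers are made up for illustration; the subcommands themselves (`svn copy`, `svn switch`, `svn merge -r`, `svn commit`, `svn delete`) are the standard ones.

```python
def feature_branch_cycle(repo_url, feature, branch_rev, last_rev):
    """Return the svn command lines for one branch/merge/delete cycle.

    repo_url, feature and the revision numbers are hypothetical inputs;
    nothing is executed here.
    """
    trunk = f"{repo_url}/trunk"
    branch = f"{repo_url}/branches/{feature}"
    return [
        # 1. Branch: a cheap server-side copy of the trunk.
        ["svn", "copy", trunk, branch, "-m", f"Create branch for {feature}"],
        # 2. Point the working copy at the branch, then write/test/commit there.
        ["svn", "switch", branch],
        # 3. Back on the trunk, merge the revisions that were made on the branch.
        ["svn", "switch", trunk],
        ["svn", "merge", "-r", f"{branch_rev}:{last_rev}", branch],
        ["svn", "commit", "-m",
         f"Merge {feature} (r{branch_rev}:{last_rev}) into trunk"],
        # 4. Delete the branch; its history stays in the repository.
        ["svn", "delete", branch, "-m", f"Remove merged branch {feature}"],
    ]


if __name__ == "__main__":
    url = "http://svn.example.com/repos/proj"   # made-up repository URL
    for argv in feature_branch_cycle(url, "search-box", 341, 350):
        print(" ".join(argv))
```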

Two nice things about doing it this way:

    - the trunk always has the features I want fully implemented and is
      never in a state of unfinished, partly-working/partly-broken

    - I can work on adding multiple components to my system this way by
      putting them all in separate branches. It would be terribly
      confusing if I were to do it all in the main trunk.

Also, to quote a recent Slashdot posting:

    I don't know what your development model is, but branching and tagging are
    often some of the most frequent (and slowest, in CVS) operations.

    Many projects follow the "make branch, fix bug in branch, test branch and
    then merge" cycle, which makes a lot of sense.

                               -- Slashdot comment #8495286, by aurum42 (712010)

and here is another one:

    "Consider GCC"

    Once a week, a snapshot release is made. That means a tag is added. This
    operation takes, on average, 40 minutes, because the GCC source tree is
    large.

    Every time someone makes a branch, they create a tag just before branching
    (for use later on, with diffs and merging). 40 minutes to tag, another 40
    minutes to branch.

    All because these are, stupidly, O(n) operations instead of O(1). 

                             -- Slashdot comment #8495755, by devphil (51341)
                                (Obviously he is talking about CVS here.)

Branches in Subversion are cheap, so we exploit them to the max in this way.
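
To make the O(n) versus O(1) point concrete, here is a toy model in Python. It is only an illustration under simplifying assumptions, not how CVS or Subversion are implemented: the class names are invented, and "cost" is counted as the number of records written.

```python
class PerFileTagStore:
    """CVS-like toy: a tag is recorded separately for every file."""

    def __init__(self, paths):
        self.revs = {path: 1 for path in paths}    # path -> current revision
        self.tags = {path: {} for path in paths}   # path -> {tag name: revision}
        self.writes = 0

    def tag(self, name):
        for path in self.revs:                     # one write per file: O(n)
            self.tags[path][name] = self.revs[path]
            self.writes += 1


class TreeCopyStore:
    """Subversion-like toy: a tag or branch is one new name for an existing tree."""

    def __init__(self, paths):
        self.trees = {"trunk": {path: 1 for path in paths}}
        self.writes = 0

    def copy(self, src, dst):
        # Both names now refer to the same tree object: O(1).
        # A real system would copy on write when one side is changed later.
        self.trees[dst] = self.trees[src]
        self.writes += 1


if __name__ == "__main__":
    paths = [f"src/file{i}.c" for i in range(10_000)]
    cvs_like = PerFileTagStore(paths)
    cvs_like.tag("snapshot")
    svn_like = TreeCopyStore(paths)
    svn_like.copy("trunk", "tags/snapshot")
    print("per-file tag writes:", cvs_like.writes)
    print("tree copy writes:   ", svn_like.writes)
```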

Thomas

-- 
N. Thomas
nthomas@cise.ufl.edu
Etiamsi occiderit me, in ipso sperabo

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by Brad Appleton <br...@bradapp.net>.
Wow! Thanks for the kind words Mark.

For those who are interested in some similar material that is freely available online (some stuff that is in the book, and other stuff that isn't), you can peek at the following websites:

 * www.scmpatterns.com -- has a quick-reference from the book, and a summary of the patterns, plus a publications section with some online articles

 * acme.bradapp.net -- aside from some links and a page-ful of SCM definitions, also has several papers that were initial material from the book (not all of it made it into the book). Of particular note are the "Streamed Lines" paper on branching best-practices, and some of the "ClearCase Best Practices" slides are generally applicable to other tools that support branching.

 * www.cmcrossroads.com -- has a monthly newsletter to which I and several others contribute. One recent relevant article is "codeline merging and locking: continuous updates, two-phased commits" from the November newsletter

On Thu, Mar 11, 2004 at 01:05:13PM -0700, Mark wrote:
> For all those who, like me, are converting from VSS and are struggling to
> get their heads around "real" version control/SCM, I'd like to recommend
> Brad's book (mentioned in his sig).
> 
> I'm reading it ATM and it's a big help in figuring out how SCM *should* be
> done, which (it turns out) is not how VSS encourages it to be done.

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by Brad Appleton <br...@bradapp.net>.
On Fri, Mar 12, 2004 at 11:18:12AM +0100, Stefan Haller wrote:
> Brad Appleton <br...@bradapp.net> wrote:
> 
> > So if you primarily work on one task at a time, you have a
> > single branch all to yourself. When you are done with your
> > change (and after you have "updated" from the main trunk)
> > you "commit" your change to the main trunk.
> 
> I'm not sure I really understand what you mean.  Are you
> saying you would first merge from the trunk to your branch
> (the changes that other people have committed to the trunk
> in the meantime), and then merge back from your branch to
> the trunk?

Yes. This is a commonly recurring standard best practice in
most VC tool "communities" where the tool has decent branching
support (i.e., not VSS :). In CVS and SVN the "update" command
does this. In ClearCase/UCM it is called rebase (short for
"rebaseline"), I think Perforce uses "sync". Bitkeeper calls it
"pull". Other tools call it "import" or "merge-in".

The idea is this: you are about to commit your changes to the
codeline. If other changes have been committed to the codeline
since you started your change, then your sandbox is not "up to
date" with the latest "good" state of the codeline. Hence if you
commit your changes, you will have potential inconsistencies and
even merge-conflicts to reconcile, and you may "break" the build
of the codeline. If you break the build, it impacts the whole
team because none of them can commit their changes now either.

So the prevailing wisdom that has emerged says, find a way to
test the result of my changes + the codeline such that if the
result fails, it only impacts me and my sandbox and not the rest
of the team. There are two ways to do this:
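
The rule described above (bring your sandbox up to date and test there before you commit) can be sketched as a small Python model. The `Codeline` and `Sandbox` classes are invented for illustration and do not correspond to any tool's API.

```python
class Codeline:
    """Toy shared codeline: a list of committed change descriptions."""

    def __init__(self):
        self.history = []

    @property
    def head(self):
        return len(self.history)


class Sandbox:
    def __init__(self, codeline):
        self.codeline = codeline
        self.base = codeline.head   # codeline revision this sandbox last synced to
        self.tested_at = None       # base revision at which local tests last passed

    def update(self):
        """Bring in everything committed to the codeline since the last sync."""
        self.base = self.codeline.head

    def run_tests(self, passed=True):
        """Record a local build+test result against the current base."""
        self.tested_at = self.base if passed else None

    def commit(self, change):
        if self.base != self.codeline.head:
            raise RuntimeError("sandbox out of date: update and retest first")
        if self.tested_at != self.base:
            raise RuntimeError("changes not tested against the current codeline")
        self.codeline.history.append(change)
        self.base = self.codeline.head


if __name__ == "__main__":
    line = Codeline()
    alice, bob = Sandbox(line), Sandbox(line)
    alice.run_tests()
    alice.commit("alice: feature A")
    try:
        bob.run_tests()
        bob.commit("bob: feature B")   # bob's base is older than the head
    except RuntimeError as err:
        print("refused:", err)
    bob.update()                       # any breakage now shows up in bob's sandbox only
    bob.run_tests()
    bob.commit("bob: feature B")
    print(line.history)
```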

A) Don't use "Latest-and-Greatest"!
----------------------------------
Instead, only use the most recently "blessed" (e.g. promoted)
baseline (label or tag). This offloads a lot of merge and
build work and resultant labeling+promoting to a buildmeister
and/or build-blesser. The downside is that changes that are not
"in sync" with the latest stuff become increasingly common, and
it takes increasingly longer for builds to be blessed and for
the codeline and sandboxes to be "in sync".

The upside is that it is easier to isolate the set of
changes that you had to make, because you don't have to
checkout/merge/add any files/lines for changes that you had to
merge-in from elsewhere. If your VC tool has decent support
for being able to figure out which changes were REALLY made
by you and which ones were simply carried-forward by you,
this is less of a traceability concern.

OTOH, it might be easier to "reuse" the un-synced changes
in your workspace to "propagate forward" into a subsequent
parallel supported release. (Then again, it might not be any
easier, and could even be harder).

B) Update your Sandbox to Keep Current
--------------------------------------
Use latest-and-greatest. Do an update as often as desired
when there are new commits to the codeline. Keep your sandbox
(and branch) in sync with the latest state of the codeline so
that you don't have a "big bang" merge at the end of your task
and have to reconcile a maximal number of changes and your own
rework efforts. Instead do regular, frequent, and incremental
integration into your own sandbox so you only merge small and
easy chunks at a time, and decrease the amount of time and the
likelihood of occurrence that the codeline may be broken and that
you will have to do major rework before committing your changes.

The upside is that frequent incremental integration helps keep
everyone current and reduces the size and complexity of merge
conflicts and eases their reconciliation. It also minimizes
the window of time between when you are ready to commit your
changes and when you have finished committing them and have
verified the result is still consistent/correct.

The downside is your branch contains lots of changes that were
carried forward by you but not necessarily made by you. Again,
this is more of a traceability concern. Some would say it also
makes it harder to "subtract" the added functionality from the
codeline if desired at a later date - and this is true to some
extent. At the same time, following this practice decreases the
likelihood that it will be necessary as well as the likelihood
that a change will "break the build" (whereas if you haven't
done it, and you hear about this, you worry about how to undo a
broken build because you are more used to it happening because
you don't sync as frequently - a bit of a catch-22)

So which is best?
=================
In general, most small and medium projects prefer the Frequent
Incremental Update approach - what I call "Continuous Update"
in my article "Codeline Merging and Locking: Continuous Update,
Two-Phased Commits" in Nov'03 CMCrossroads news at:
  <http://www.cmcrossroads.com/newsletter/articles/agilenov03.pdf>

Larger projects, particularly those with dedicated
build-meisters who typically don't let developers commit their
own changes, tend to eschew the "Latest-and-Greatest" and insist
on using static, formally identified/blessed labels.  It is
more careful and controlled but also adds a lot of development
"friction" and wait-time in exchange for reducing the cost of
rework by preventing the "big merges" (rather than amortizing
them over small frequent chunks :-).

In the end, both are different risk-management approaches that
have their own appeal to their own audiences. There is "pay
now" (the static baseline), and there is "pay later" (don't
use anything and wait till it burns you), and there is "pay
as you go" (the frequent and disciplined use of incremental
integration, even during one's own change-task).

However, I have noticed in the last 5 years that more and more
shops are leaning toward developer "push"-style integration
(allowing developers to merge/commit their own changes), and
requiring them to rebase before committing. To mitigate the
risk, they use what I call a "docking line" and the developers
push ("dock") their changes to this "active" development line,
and then the SCM/Build folks can preview/audit the stability
of what is there before deciding to "pull" the "docked" changes
from the active development line over to the mainline or
release-line branch.
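
A minimal sketch of the "docking line" flow in Python, assuming a very simplified picture: lines are plain lists of change names, and `is_stable` is a hypothetical callback standing in for whatever build/test audit the SCM/Build folks run. None of these names come from a real tool.

```python
class Line:
    """Toy codeline holding an ordered list of change names."""

    def __init__(self, name):
        self.name = name
        self.changes = []


def dock(active_line, change):
    """Developer "push": deliver a rebased, tested change to the active line."""
    active_line.changes.append(change)


def promote(active_line, mainline, is_stable):
    """Build team "pull": move docked changes to the mainline if the audit passes.

    Returns the list of changes that were promoted (empty if none).
    """
    pending = [c for c in active_line.changes if c not in mainline.changes]
    if pending and is_stable(active_line.changes):
        mainline.changes.extend(pending)
        return pending
    return []


if __name__ == "__main__":
    active, main = Line("active-dev"), Line("mainline")
    dock(active, "task-101")
    dock(active, "task-102")
    print(promote(active, main, is_stable=lambda changes: True))
    print(main.changes)
```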

In my experience, the more frequent and
more incremental approach gives better overall stability and
suitability PROVIDED that developers are disciplined about 
making sure their stuff works and won't-break-the-build before
merging it and learn how to successfully merge, and generally
do a good job of using encapsulation and modularity in their 
coding. It also means "code ownership" (e.g. of a module/class)
can not be "exclusive" but is more like "stewardship" than
ownership (exclusive code ownership makes it difficult to do
this, and forces a more sequential-locking approach, and more
"wait-time" for the code-owner to make the changes you would
otherwise get their help on when reconciling  merge conflicts).

Good design, discipline and collaboration keep codelines
consistent, correct and coherent, and make LATEST-AND-GREATEST
with continuous/hyperfrequent integration+updates be very
effective and HIGHLY productive. If you don't have all three
of those things (the encapsulation/modularity, the discipline
to test what you have to ensure you don't break the build,
and the ability to collaborate well to resolve merging of
concurrent changes) then you break something for either
the SCM/Buildmeisters, or the QA/V&V, or the code-owners,
and ultimately for project management. In those cases the
formal static baselining and throw-it-over-the-wall "pull
model" of integration is more rampant, and takes more time,
but gives more reliable quality results (and results in more
adversarial relationships between those competing roles).

For more info on the "Docking Line" pattern, you can see the
two sets of powerpoint slides from previous RUC conference
presentations I've given at:
  http://acme.bradapp.net/#ClearCase

For more info on "Active Development Line", "Release Line"
and "Mainline" patterns you can see the "SCM Patterns" book
(www.scmpatterns.com) and also see precursor descriptions
of them in a rather comprehensive (and lengthy :) branching
best practices paper at:
  http://acme.bradapp.net/branching/

For more info in particular on "Continuous Update" and 
several companion practices that accompany it, see the
aforementioned paper on codeline merging and locking
  http://www.cmcrossroads.com/newsletter/articles/agilenov03.pdf

It talks about the following dozen or so locking-related
practices and the circumstances (context) in which each is
appropriate to use.  Alternatives range from no locking and
a single integration machine, to an integration token, to
various forms of codeline locking.

Continuous Workspace Update
 * Workspace Update
   + Post-Commit Notification
 * Private Checkpoint/Versions
   + Private Archive
   + Private Branch
   + Task Branch
   + Checkpoint Label

Two-phased "Commit"
 (where the commit "transaction" is viewed as having two phases:
  a commit-phase, and a "preparation" phase that consists of:
  rebase+reconcile, rebuild+retest, resolve)
 * Pre-Commit Validation
 * Codeline Locking (and factors of team-size, build/test-time,
   parallel tasks, likelihood of collisions/conflicts,
   commit-duration and overlap)
   + Single Release Point (e.g., single integration machine)
   + Integration Token
   + Codeline Write-Lock
      - Full Codeline Lock
      - Partial Codeline Lock
      - Double-Checked Codeline Lock
      - Phased Codeline Lock
   It discusses appropriate context for the locking patterns
   based on the above mentioned factors.
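
As one possible reading of the two-phased commit combined with an integration token, here is a short Python sketch. It assumes the token is modeled as a `threading.Lock` and that the three steps are supplied as callables; those hooks and the function name are hypothetical, not part of the article or of any SCM tool.

```python
import threading

# Only the holder of the token may commit to the codeline.
integration_token = threading.Lock()


def two_phased_commit(rebase, rebuild_and_test, commit):
    """Run the preparation phase, then the commit phase, while holding the token.

    rebase, rebuild_and_test and commit are hypothetical hooks supplied by
    the caller; rebuild_and_test returns True when the merged result is good.
    Returns True if the change was committed.
    """
    with integration_token:
        rebase()                      # preparation: rebase + reconcile
        if not rebuild_and_test():    # preparation: rebuild + retest
            return False              # resolve in the sandbox, then try again
        commit()                      # commit phase
        return True


if __name__ == "__main__":
    steps = []
    committed = two_phased_commit(
        rebase=lambda: steps.append("rebase"),
        rebuild_and_test=lambda: steps.append("test") or True,
        commit=lambda: steps.append("commit"),
    )
    print(committed, steps)
```

Holding the token for the whole preparation phase serializes integrators, which is simple but adds wait-time; the finer-grained locking variants listed above trade that simplicity for more overlap.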

All of those locking-related patterns are successfully recurring
solutions in common practice. But the context is important. Use
a pattern in the wrong context, and at best you might simply
be doing more than you really need to (at worst you could really
foul things up).

Hope that helps!!!
-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


SVN, AccuRev and BitKeeper (was Re: branching several times a day)

Posted by Brad Appleton <br...@bradapp.net>.
On Tue, Mar 23, 2004 at 10:48:03AM -0600, Brad Appleton wrote:
> One thing I find interesting in all of this is the different
> approaches exemplified by SVN, Accu-Rev and Bitkeeper:
> 
> - In SVN, branches and labels (tags) are both working-copies.
>   One can take the POV that "everything is a copy" and branches
>   and workspaces are equivalent in that regard.
> 
> - In Accu-Rev, the overarching concept is that of the "stream"
>   (e.g., branch or codeline). In essence, everything is a stream.
>   A workspace is an instance of a stream. A label is simply a 
>   "static" stream.
> 
> - In Bitkeeper, the overarching concept is that of the "workspace".
>   In essence, a workspace is a first-class repository. Hence you
>   automatically have "private versioning" where you can checkin
>   private versions without making them visible to others, but
>   without having to branch. Branching doesn't happen until you
>   "pull" a change from another workspace, and then it only takes
>   place for the files that had true parallel activity. Change-sets
>   are supported, but labels are different beasts from workspaces
>   or branches or change-sets.

So, to summarize it seems that regarding branches/streams,
workspaces, and labels/tags, the above three can be
oversimplified just a tad as follows:
 - SVN: branches and tags are both just workspaces - all are equivalent
 - Accu-Rev: branches and tags are both "streams"; workspaces are separate
   but related (and configured with one or more streams)
 - BK: workspaces are "branches" since they're 1st-class repositories;
   labels are separate from both of them.

ASCII art time (okay - a "table" is not really "art")

           | Workspaces | Codelines  | Baselines  |
-----------+------------+------------+------------+
Subversion | Equivalent | Equivalent | Equivalent |
-----------+------------+------------+------------+
Accu-Rev   | DIFFERENT  | Equivalent | Equivalent |
-----------+------------+------------+------------+
BitKeeper  | Equivalent | Equivalent | DIFFERENT  |
-----------+------------+------------+------------+

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day

Posted by Brad Appleton <br...@bradapp.net>.
On Tue, Mar 23, 2004 at 04:57:20PM +0200, Nuutti Kotivuori wrote:
> Well, I guess we view complexity in a different way. For me, something
> like this is less complex:
> 
>   http://www.cmcrossroads.com/bradapp/acme/branching/branch-creation.html#BranchPerTask

You're absolutely right! I forgot my own advice about
the "wholeness" versus "separateness" value. I suspect
branch-per-task (private branch) approach will look
more appealing if a greater emphasis is placed on
separateness/separability, whereas the regularly rebased
branch-per-developer (private branch) approach will probably
look more appealing if a greater emphasis is placed on having
a continuously integrated whole and as few codelines as possible.

One thing I find interesting in all of this is the different
approaches exemplified by SVN, Accu-Rev and Bitkeeper:

- In SVN, branches and labels (tags) are both working-copies.
  One can take the POV that "everything is a copy" and branches
  and workspaces are equivalent in that regard.

- In Accu-Rev, the overarching concept is that of the "stream"
  (e.g., branch or codeline). In essence, everything is a stream.
  A workspace is an instance of a stream. A label is simply a 
  "static" stream.

- In Bitkeeper, the overarching concept is that of the "workspace".
  In essence, a workspace is a first-class repository. Hence you
  automatically have "private versioning" where you can checkin
  private versions without making them visible to others, but
  without having to branch. Branching doesn't happen until you
  "pull" a change from another workspace, and then it only takes
  place for the files that had true parallel activity. Change-sets
  are supported, but labels are different beasts from workspaces
  or branches or change-sets.

> So such a person does not have the emotional baggage left over from
> other version control systems, or the conceptual baggage from reading
> branching and tagging operations as somehow special.

Yes. I think it was Piaget who said we tend to try to understand
the unfamiliar in terms of that which is already familiar to us.

> Obviously when dealing with humans, such things must be accounted for.

 :-)

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day

Posted by Nuutti Kotivuori <na...@iki.fi>.
Brad Appleton wrote:
> Here I'm thinking the one with fewer branches (and more
> "change-sets" on a branch) is the less complex.

[...]

> SVN doesn't care because branches and tags are both copies.  But the
> vtree is conceptually cleaner because instead of lots of
> disconnected sequential but separate "segments" it has a single line
> with a single name/identity and the same number of "tagged"
> reference points for each activity.

Well, I guess we view complexity in a different way. For me, something
like this is less complex:

  http://www.cmcrossroads.com/bradapp/acme/branching/branch-creation.html#BranchPerTask

than something where things from a single branch get merged several
times, like in the example figure you posted.

However, I do see your point - there will be lots of disconnected
segments and a lot of branch operations in the tree, instead of just
having one branch and a lot of merges from that branch.

>> Also, you seem to be assuming that copying would be a more
>> heavyweight operation than, say, changing a line in a file - other
>> version control systems avoid copying by doing other changes, such
>> as recording branch-tags somewhere, or merge history - so you
>> cannot say that copying costs anything, since the alternative
>> cannot be not doing anything at all.
>>
>> So it seems to me to be avoiding some cost that just isn't there.
>
> With SVN that may be true. With other systems where branches and
> tags are separate things with separate mechanisms, the story
> would be different.

Oh, definitely. I was talking purely in a Subversion perspective here
- i.e., what's best practice for Subversion.

> Even with SVN, the cost is still there, it's just that it doesn't
> change based on whether you do a branch or a tag because the same
> mechanism is used for both.

Well, the cost is the same as any versioned change to the repository.

>>> Plus there is a difference between being lightweight, and being
>>> "perceived" as lightweight.

[...]

>> Well yeah, a lot of people have emotional baggage left over from
>> previous version control systems that can be hard to deal with.
>
> Emotional baggage and conceptual baggage are different things.  Just
> as a domain/analysis model can have differences from a
> design/implementation model - the differences between the two aren't
> necessarily emotional, and do correspond to important user needs. I
> have learned the hard way that it is unwise to casually dismiss such
> a difference as emotional, and that whenever there is a difference
> between the conceptual "domain" model and the design/implementation
> model, that ignoring or dismissing the difference as unimportant
> usually will come back to bite me in the end.
>
> So while I agree it is different, I would urge you to please not be
> so quick to dismiss the difference as emotional/unimportant just
> because the technical cost may be the same. Those conceptual
> mismatches usually have something important to tell us about a
> fundamental need that would do us well to better understand.

Oh sorry - I certainly didn't mean to dismiss it as
emotional/unimportant. I actually meant a different thing.

What I meant was this: suppose somebody who has never used a version
control system in his life reads the Subversion book as a first
introduction. That person will first understand that Subversion can do
cheap copies, and after that he will understand that cheap copies can
be used for branching and tagging, for example. To such a person,
branching and tagging *are* perceived as lightweight and as nothing
special - since it is no different than him trying to version his
project by taking copies of it, it just takes less space and is faster.

So such a person does not have the emotional baggage left over from
other version control systems, or the conceptual baggage from reading
branching and tagging operations as somehow special.

Obviously when dealing with humans, such things must be accounted for.

-- Naked


Re: branching several times a day

Posted by Brad Appleton <br...@bradapp.net>.
On Tue, Mar 23, 2004 at 12:34:42AM +0200, Nuutti Kotivuori wrote:
> And as for the associated storage or sandbox - in Subversion it is
> trivial to change the branch that a sandbox is associated to, while
> keeping local edits (unfinished changes) in the sandbox. You can even
> create a new branch from the changes (and versions) that exist right
> now in your sandbox. So there is no need to take sandboxes into
> account at all when deciding whether to re-use the same branch-tag or
> to come up with a new.

Yes - for SVN the above sounds correct when deciding which of
the two patterns to prefer (Private Branch versus Task Branch)
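
For reference, the two sandbox operations Nuutti mentions map onto `svn switch` and the working-copy-to-URL form of `svn copy`. The Python helpers below only build the command lines; the helper names and any URL passed to them are illustrative assumptions.

```python
def switch_sandbox(branch_url):
    """Re-point the current working copy at another branch; local edits are kept."""
    return ["svn", "switch", branch_url]


def branch_from_sandbox(new_branch_url, message):
    """Create a new branch from the current working copy, local edits included.

    Uses the working-copy -> URL form of "svn copy", which commits immediately.
    """
    return ["svn", "copy", ".", new_branch_url, "-m", message]


if __name__ == "__main__":
    url = "http://svn.example.com/repos/proj/branches/task-42"   # made-up URL
    print(" ".join(branch_from_sandbox(url, "Branch for task 42")))
    print(" ".join(switch_sandbox(url)))
```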

> > The reason for doing that instead of a new-branch per task would be
> > if I could still get the benefits of task-based change-set grouping
> > without having to create so many additional branch-names and
> > associated "copies". If I could reuse the "copy", and let the "name"
> > be associated with revision resulting from porting the changes to
> > the trunk, then I'm still creating new names (that get associated
> > with trunk revisions) but I'm not creating new branches/tags nor
> > their associated copies, and the version tree has a much simpler
> > structure, and I'm still doing a whole lot less copying (even tho it
> > is a lightweight operation, it's still not as lightweight as not
> > doing it at all)
> 
> Well, yes, by re-using the branches the version tree has a much
> simpler structure in that not so much branching and tagging is going
> on. But on the other hand the logical structure of the version tree is
> more complex, since a single branch has several successive tasks, with
> lulls in between possibly, instead of just having small branches with
> short lifetimes that get merged back to trunk, having only one task
> per branch.

Here I'm thinking the one with fewer branches (and more
"change-sets" on a branch) is the less complex. Think of 
a UML sequence diagram. Let's say each "branch" gets its
own "object lifecycle" instance and "swim-lane" and
each change task/activity gets its own little "box" of
activity on the lifeline of the branch the activity takes
place upon. The result might look very similar to
   http://acme.bradapp.net/branching/#VTreeDiagrams
(only top-to-bottom instead of left-to-right :-)

Here, "featureA" and "featureB" are examples of "activities"
that take place on the same branch instead of a separate branch.
So I'm thinking if I create fewer branches but with multiple
sequential activities on development branches, I have a
single branch and multiple activities per "lane" instead of
multiple branches+activities per swim-lane. Instead of having
a branch+tag per activity per developer, I have a branch per
developer with multiple activities and tags on it.

SVN doesn't care because branches and tags are both copies.
But the vtree is conceptually cleaner because instead of lots
of disconnected sequential but separate "segments" it has a
single line with a single name/identity and the same number of
"tagged" reference points for each activity.

> Also, you seem to be assuming that copying would be a more
> heavyweight operation than, say, changing a line in a file - other
> version control systems avoid copying by doing other changes, such as
> recording branch-tags somewhere, or merge history - so you cannot say
> that copying costs anything, since the alternative cannot be not doing
> anything at all.
> 
> So it seems to me to be avoiding some cost that just isn't there.

With SVN that may be true. With other systems where branches and
tags are separate things with separate mechanisms, the story
would be different. Even with SVN, the cost is still there,
it's just that it doesn't change based on whether you do a
branch or a tag because the same mechanism is used for both.

> > Plus there is a difference between being lightweight, and being
> > "perceived" as lightweight. For some folks, no matter how well you
> > explain to them that branching in SVN is inexpensive+lightweight, it
> > still seems conceptually "heavyweight" to them. And even if the
> > implementation of doing so is lightweight, it will still seem
> > conceptually more complex (as will having to delete+rebranch,
> > something they would regard as an indication of additionally
> > technical "residue" required by the additional conceptual
> > complexity, for something they see as not being necessary in the
> > first place).
> 
> Well yeah, a lot of people have emotional baggage left over from
> previous version control systems that can be hard to deal with.

Emotional baggage and conceptual baggage are different things.
Just as a domain/analysis model can have differences from a
design/implementation model - the differences between the two
aren't necessarily emotional, and do correspond to important
user needs. I have learned the hard way that it is unwise
to casually dismiss such a difference as emotional, and that
whenever there is a difference between the conceptual "domain"
model and the design/implementation model, that ignoring or
dismissing the difference as unimportant usually will come
back to bite me in the end.

So while I agree it is different, I would urge you to please
not be so quick to dismiss the difference as emotional/unimportant
just because the technical cost may be the same. Those conceptual
mismatches usually have something important to tell us about a
fundamental need that would do us well to better understand.

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day

Posted by Nuutti Kotivuori <na...@iki.fi>.
Brad Appleton wrote:
> On Sun, Mar 21, 2004 at 12:39:17AM +0200, Nuutti Kotivuori wrote:
>> Then again - "cvs update -j" is used for *merging*.
[...]
> Yes - that is the usage I meant (sorry about that).
[...]

Ah, okay. Then it all makes sense.

>> In the original context where I used that expression it is all
>> pretty much the same. The whole point is that after a commit to the
>> mainline, there are no changes in the branch that are uncommitted -
>> so "rebranching" is only to let the version control system know
>> that further changes are against the mainline at that point, and
>> not against the last edit on the previous change done on the branch
>> - there are no changes to merge, just metadata to inform.
>
> Right! (thank you for stating it so clearly). So what I'm seeing is
> that while SVN does indeed support change-sets, there is an implicit
> assumption by "svn log" that historical information about a branch
> starts with the beginning of the branch,

Right. Or by default it actually assumes that the historical
information starts from the beginning of the entire codeline - and you
have to specifically tell it to not follow branches to stop at the
beginning of the branch.
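
In command-line terms, the option for this is `svn log --stop-on-copy`. A small Python helper that builds either form of the command, without running it, might look like the sketch below; the helper name and the example URL are made up.

```python
def branch_log_command(branch_url, stop_at_branch_point=False):
    """Build an "svn log" command line for a branch.

    By default the log follows history back through the copy that created
    the branch; with --stop-on-copy it stops where the branch was made.
    """
    argv = ["svn", "log"]
    if stop_at_branch_point:
        argv.append("--stop-on-copy")
    argv.append(branch_url)
    return argv


if __name__ == "__main__":
    url = "http://svn.example.com/repos/proj/branches/task-42"   # made-up URL
    print(" ".join(branch_log_command(url)))
    print(" ".join(branch_log_command(url, stop_at_branch_point=True)))
```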

> and while there may be revisions of a branch that correspond to
> change sets, the notion that multiple successive revisions on the
> branch may be part of a larger task that are "delivered to the
> trunk" as a whole, doesn't really transport across branches.

Correct.

> The trunk knows that the stuff committed to it since the previous
> trunk revision is all one group. But the branch from which those
> changes came has no knowledge of that. The branch just knows about
> commits to itself, not bundles of changes-sets merged from/to some
> other branch.

Correct. Even more specifically, the trunk doesn't even know that a
bundle of change-sets was merged to it - it just gets a commit of
edits and doesn't care what created them. So the information that the
commit is a result of a merge operation pulling some change-sets from
a branch needs to be described in the log message, if that information
is to be retained.

> So I hear you telling me that the only way to tell a branch that it
> should "reset" its history-logger after it's been "resynced" with
> the codeline is to explicitly "rebranch" it so that it thinks its
> branch-off point is the new-tip of the trunk instead of a previous
> one.

Yes.

> The other interesting thing is that the branch name apparently is
> not associated with the revision on the trunk that resulted from
> merging that branch to the trunk. So folks that created a branch for
> a specific task in their tracking system (e.g.  bugzilla or Jira or
> scarab, etc.) who may have used the task-id in the branch-name,
> often expect that branch-name (containing the task-id) to somehow
> "live on" in the history of the trunk once the changes are merged to
> the trunk.

Again correct. The only way for the branch-name to live on is to
record it in the log message, so people can trace it later.

> It seems they don't get that right now, based on a recent request I
> saw about wanting the ability to create some kind of "alias name"
> for a revision number that is separate from the notion of a "tag"
> (which I can fully understand, tho I agree calling it a "tag" rather
> than a revision alias would cause confusion, or perhaps a "Name"
> property of a revision, if properties can be associated with a
> revision)

Yes, that's right.

As a sidetrack from this - to implement smart merging, one usually
needs to record what has already been merged, so it does not get
merged again. As a side product of this tracking, it obviously means
that branch-name (or something similar) must be recorded in the
history of trunk, along with the actual change-sets that were
merged. So in a way you were right earlier when you said that smart
merging would fix this - as a side effect, yes - it would also record
what has been merged where, so it doesn't have to be written in the
log messages.
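
To make the bookkeeping idea concrete, here is a minimal sketch in
Python of what "recording what has already been merged" could look
like. It is a toy model, not how Subversion works today - the class
name MergeLedger, its methods and the branch path /branches/task-42
are all made up for illustration.

```python
from collections import defaultdict


class MergeLedger:
    """Toy record of which source-branch revisions a target already absorbed.

    Hypothetical illustration only; not part of any Subversion API.
    """

    def __init__(self):
        # source branch path -> set of revision numbers already merged
        self._merged = defaultdict(set)

    def pending(self, source, first, last):
        """Return revisions in first..last (inclusive) not yet merged from source."""
        return [r for r in range(first, last + 1) if r not in self._merged[source]]

    def record(self, source, revisions):
        """Remember that these revisions of source have been merged."""
        self._merged[source].update(revisions)


ledger = MergeLedger()
todo = ledger.pending("/branches/task-42", 355, 364)
ledger.record("/branches/task-42", todo)

# Asking again for an overlapping range only yields the revisions that are new.
again = ledger.pending("/branches/task-42", 360, 366)
print(again)
```

Because the ledger is keyed by branch path, the branch name ends up in
the target's records as a side effect, which is the point made above.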

>> Is there a reason why one should not delete a private branch after
>> a commit to the mainline, and recreate it when starting a new one?
>
> Why recreate the branch rather than create a new branch by a new
> name?

Well, keeping the name was mostly just to show that it is possible -
in practise, there isn't often a reason to keep the name.

> If you delete it and then recreate it, I would say that you are, in
> essence, still using the private branch pattern, where the
> rebranching is part of a tool-implementation specific tactic for
> "delivering" changes from the private branch to the trunk that also
> tells the "delivering" branch it is now "resynced" as far as history
> is concerned.

Yes, obviously it is the way things are often done in Subversion. To
be even more exact, the deletion is to tell the branch that it is
"delivered", and the recreation is what happens at the start of the
new task to tell that the changes are against trunk at that
point. There might be some time and some commits on trunk between
these two events.

But it might as well be said that creating a new branch at the start
of a task and deleting it at the end of it, only keeping the same name,
is the native private branch pattern, and re-using the old branch is
just a tool-implementation specific tactic to avoid a cumbersome
re-branching operation.

> If I create a new branch for a new task instead of reusing (via
> deleting and then rebranching with the same name), then it is either
> because I still want the branch-tag to be associated with the
> delivered set of changes and their history, or because I still want
> the associated storage or sandbox. It seems I can't get the former
> (the delivered changeset => name association) without the latter.

Well, neither is exactly correct. In any case, the reference to the
delivered set of changes and their history should be recorded in the
log message on the commit to the trunk. For example, "merged 354:364
from /branches/naked-private-task to /trunk". This information will
never disappear or change, so it will forever be associated with that
commit. Keeping the branch around and not deleting it will only mean that
it is more easily located, since it is still alive at the HEAD
revision of the repository. But even that is not too useful, since the
branch does not know when it was merged to trunk - or even, if it ever
was - that information is only in the history of the trunk.

So, if you want to keep branch-tag as a mnemonic for the delivered set
of changes and their history, you *tag* the trunk revision at which
the changes were merged to the trunk. Then people can look at the
change-set produced at that revision to see the actual change - and
look in the log message of that revision to see where to find the
entire set of changes and their history as they were in the branch.
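
If the log message is the only place the branch name survives, it
helps to keep the wording regular enough to be read back by a script.
A small Python sketch, assuming every merge commit follows the exact
wording of the example above ("merged N:M from SOURCE to TARGET") -
that convention and the function name parse_merge_note are assumptions
for illustration, nothing Subversion enforces:

```python
import re

# Assumed convention, modelled on the example log message in the text:
#   "merged 354:364 from /branches/naked-private-task to /trunk"
MERGE_NOTE = re.compile(
    r"merged\s+(?P<start>\d+):(?P<end>\d+)\s+from\s+(?P<source>\S+)\s+to\s+(?P<target>\S+)",
    re.IGNORECASE,
)


def parse_merge_note(log_message):
    """Return the merge provenance found in a log message, or None if absent."""
    match = MERGE_NOTE.search(log_message)
    if match is None:
        return None
    return {
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "source": match.group("source"),
        "target": match.group("target"),
    }


note = parse_merge_note("merged 354:364 from /branches/naked-private-task to /trunk")
print(note)
```

A log message that does not follow the convention simply yields None,
so ordinary commits on the trunk are skipped.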

And as for the associated storage or sandbox - in Subversion it is
trivial to change the branch that a sandbox is associated with, while
keeping local edits (unfinished changes) in the sandbox. You can even
create a new branch from the changes (and versions) that exist right
now in your sandbox. So there is no need to take sandboxes into
account at all when deciding whether to re-use the same branch-tag or
to come up with a new one.

>> I mean, sure, you can keep on editing in the same branch, you can
>> use revisions to mark ranges, you could build support into
>> Subversion to record the latest commit and only show diffs and logs
>> until that and smart merging will save your ass if you merge
>> changes twice - but why bother? Is there something to be gained by
>> all that?
>
> The reason for doing that instead of a new-branch per task would be
> if I could still get the benefits of task-based change-set grouping
> without having to create so many additional branch-names and
> associated "copies". If I could reuse the "copy", and let the "name"
> be associated with revision resulting from porting the changes to
> the trunk, then I'm still creating new names (that get associated
> with trunk revisions) but I'm not creating new branches/tags nor
> their associated copies, and the version tree has a much simpler
> structure, and I'm still doing a whole lot less copying (even tho it
> is a lightweight operation, it's still not as lightweight as not
> doing it at all)

Well, yes, by re-using the branches the version tree has a much
simpler structure in that not so much branching and tagging is going
on. But on the other hand the logical structure of the version tree is
more complex, since a single branch has several successive tasks, with
lulls in between possibly, instead of just having small branches with
short lifetimes that get merged back to trunk, having only one task
per branch. Also, you seem to be assuming that copying would be a more
heavyweight operation than, say, changing a line in a file - other
version control systems avoid copying by doing other changes, such as
recording branch-tags somewhere, or merge history - so you cannot say
that copying is an extra cost, since the alternative is never to do
nothing at all.

So it seems to me to be avoiding some cost that just isn't there.

> Plus there is a difference between being lightweight, and being
> "perceived" as lightweight. For some folks, no matter how well you
> explain to them that branching in SVN is inexpensive+lightweight, it
> still seems conceptually "heavyweight" to them. And even if the
> implementation of doing so is lightweight, it will still seem
> conceptually more complex (as will having to delete+rebranch,
> something they would regard as an indication of additional
> technical "residue" required by the additional conceptual
> complexity, for something they see as not being necessary in the
> first place).

Well yeah, a lot of people have emotional baggage left over from
previous version control systems that can be hard to deal with.

But for a person who doesn't have such baggage, it is conceptually
very straightforward to operate by: when you want to make a change to
the mainline, branch from the mainline - when you have committed
(merged) the changes to the mainline, delete the branch.
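
Spelled out as commands, that cycle could look like the following
Python sketch, which only builds the svn command lines and does not
run them. The repository URL, the branch name task-42 and the helper
name task_branch_commands are made up, and the /trunk and /branches
layout is assumed; the revision range has to be given by hand because
there is no merge tracking.

```python
def task_branch_commands(repo_url, branch, first_rev, last_rev):
    """Return the svn invocations for one branch-per-task cycle as argv lists.

    Hypothetical helper: repo_url and branch are placeholders, and the
    conventional /trunk and /branches layout is assumed.
    """
    trunk_url = f"{repo_url}/trunk"
    branch_url = f"{repo_url}/branches/{branch}"
    merge_range = f"{first_rev}:{last_rev}"
    return [
        # 1. Branch from the mainline (a cheap copy inside the repository).
        ["svn", "copy", trunk_url, branch_url, "-m", f"branch {branch} from trunk"],
        # 2. After working on the branch: merge it, run from a trunk working copy.
        ["svn", "merge", "-r", merge_range, branch_url],
        # 3. Commit the merge result, recording its origin in the log message.
        ["svn", "commit", "-m", f"merged {merge_range} from /branches/{branch} to /trunk"],
        # 4. The task is delivered, so delete the branch.
        ["svn", "delete", branch_url, "-m", f"delete {branch}, merged to trunk"],
    ]


commands = task_branch_commands("http://svn.example.org/repo", "task-42", 354, 364)
for argv in commands:
    print(" ".join(argv))
```

The commits made on the branch between steps 1 and 2 are left out; the
sketch only shows the four operations that touch the branch itself.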

> But I think the answer to your question is that with SVN, what you
> described (deleting+rebranching) is really just part of the SVN
> implementation of using a "private branch" (assuming you are still
> using the same branch-name when you rebranch). I had thought you
> were doing something more like the "task branch" pattern, and
> wondered why not "reuse" the existing branch rather than create a
> new branch by a new name. It sounds like that's not what you're
> doing.

Actually, I joined this conversation half-way, to try and clear up the
facts on what Subversion can do and what it can't do right now - so
there isn't anything I am actually doing myself. And this sidetrack
here was just because I was wondering *why* one should reuse the
existing branch, be it a private branch or a task branch.

> In SVN, because branches and tags are both "copies", it can be
> difficult to discuss branching and labeling concepts in ways that
> treat both as separate/separable things

Yeah, it doesn't translate as easily to other version control systems
(or concepts) as a more traditional model would.

> (and that is essentially one of the main differences between a
> private branch and a task-branch. One reuses the work-stream and the
> work-space, while the other creates a new work-stream and workspace
> for the purpose of keeping the branch-name around as a mnemonic for
> the revision corresponding to the delivered change-set)

Well, as discussed above, the situation isn't exactly the same with
Subversion. So the difference mostly boils down to just how you name
the branch and how you end up using it.

-- Naked



Re: branching several times a day

Posted by Brad Appleton <br...@bradapp.net>.
On Sun, Mar 21, 2004 at 12:39:17AM +0200, Nuutti Kotivuori wrote:
> Then again - "cvs update -j" is used for *merging*. It is confusingly
> the same command name, but it is really a merge operation. "cvs update
> -j" can be used to port changes between branches and the codeline, and
> can be used to rebase a branch.

Yes - that is the usage I meant (sorry about that). I
understand cvs update has several uses. One of which is simply
to update/refresh a workspace with the "latest stuff" from the
"right" codeline, typically before beginning work on a new task
(so I can reuse the current sandbox rather than scrap it and
make a new one) and another is to port/rebase changes from
the trunk (including merging any parallel changes)

> In the original context where I used that expression it is all pretty
> much the same. The whole point is that after a commit to the mainline,
> there are no changes in the branch that are uncommitted - so
> "rebranching" is only to let the version control system know that
> further changes are against the mainline at that point, and not
> against the last edit on the previous change done on the branch -
> there are no changes to merge, just metadata to inform.

Right! (thank you for stating it so clearly). So what I'm
seeing is that while SVN does indeed support change-sets,
there is an implicit assumption by "svn log" that historical
information about a branch starts with the beginning of the
branch, and while there may be revisions of a branch that
correspond to change sets, the notion that multiple successive
revisions on the branch may be part of a larger task that are
"delivered to the trunk" as a whole, doesn't really transport
across branches.

The trunk knows that the stuff committed to it since the
previous trunk revision is all one group. But the branch from
which those changes came has no knowledge of that. The branch
just knows about commits to itself, not bundles of changes-sets
merged from/to some other branch.

So I hear you telling me that the only way to tell a branch
that it should "reset" its history-logger after it's been
"resynced" with the codeline is to explicitly "rebranch" it
so that it thinks its branch-off point is the new-tip of the
trunk instead of a previous one.

The other interesting thing is that the branch name apparently
is not associated with the revision on the trunk that resulted
from merging that branch to the trunk. So folks that created
a branch for a specific task in their tracking system (e.g.
bugzilla or Jira or scarab, etc.) who may have used the task-id
in the branch-name, often expect that branch-name (containing
the task-id) to somehow "live on" in the history of the trunk
once the changes are merged to the trunk.

It seems they don't get that right now, based on a recent
request I saw about wanting the ability to create some kind
of "alias name" for a revision number that is separate from
the notion of a "tag" (which I can fully understand, tho I
agree calling it a "tag" rather than a revision alias would
cause confusion, or perhaps a "Name" property of a revision,
if properties can be associated with a revision)

> Is there a reason why one should not delete a private branch after a
> commit to the mainline, and recreate it when starting a new one?

Why recreate the branch rather than create a new branch by
a new name?  If you delete it and then recreate it, I would
say that you are, in essence, still using the private branch
pattern, where the rebranching is part of a tool-implementation
specific tactic for "delivering" changes from the private branch
to the trunk that also tells the "delivering" branch it is now
"resynced" as far as history is concerned.

If I create a new branch for a new task instead of reusing
(via deleting and then rebranching with the same name), then it
is either because I still want the branch-tag to be associated
with the delivered set of changes and their history, or because
I still want the associated storage or sandbox. It seems I can't
get the former (the delivered changeset => name association)
without the latter.

> I
> mean, sure, you can keep on editing in the same branch, you can use
> revisions to mark ranges, you could build support into Subversion to
> record the latest commit and only show diffs and logs until that and
> smart merging will save your ass if you merge changes twice - but why
> bother? Is there something to be gained by all that?

The reason for doing that instead of a new-branch per task
would be if I could still get the benefits of task-based
change-set grouping without having to create so many additional
branch-names and associated "copies". If I could reuse the
"copy", and let the "name" be associated with revision resulting
from porting the changes to the trunk, then I'm still creating
new names (that get associated with trunk revisions) but I'm not
creating new branches/tags nor their associated copies, and
the version tree has a much simpler structure, and I'm still
doing a whole lot less copying (even tho it is a lightweight
operation, it's still not as lightweight as not doing it at all)

Plus there is a difference between being lightweight,
and being "perceived" as lightweight. For some folks,
no matter how well you explain to them that branching in
SVN is inexpensive+lightweight, it still seems conceptually
"heavyweight" to them. And even if the implementation of doing
so is lightweight, it will still seem conceptually more complex
(as will having to delete+rebranch, something they would regard
as an indication of additional technical "residue" required
by the additional conceptual complexity, for something they
see as not being necessary in the first place).

But I think the answer to your question is that with SVN, what
you described (deleting+rebranching) is really just part of the
SVN implementation of using a "private branch" (assuming you are
still using the same branch-name when you rebranch). I had thought
you were doing something more like the "task branch" pattern,
and wondered why not "reuse" the existing branch rather than
create a new branch by a new name. It sounds like that's not what
you're doing.

In SVN, because branches and tags are both "copies", it can
be difficult to discuss branching and labeling concepts in
ways that treat both as separate/separable things (and that
is essentially one of the main differences between a private
branch and a task-branch. One reuses the work-stream and the
work-space, while the other creates a new work-stream and
workspace for the purpose of keeping the branch-name around
as a mnemonic for the revision corresponding to the delivered
change-set)

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day

Posted by Nuutti Kotivuori <na...@iki.fi>.
Brad Appleton wrote:
> On Wed, Mar 17, 2004 at 01:46:55AM +0200, Nuutti Kotivuori wrote:
>> What is the term usually used to describe a single change (with a
>> log message usually) that has been "checked-in" to any branch?
>
> If it's "checked-in" to a private-branch or a task-branch and
> the corresponding task is not yet finished, the common term
> I've heard (which the PrivateVersions pattern uses) is a
> "checkpoint", and the process of doing it is often called
> "checkpointing" (e.g., I've created a private "checkpoint"
> of my changes on my branch of my working-copy, but until I
> actually merge my changes to the codeline, I haven't really
> "committed" them (to the codeline) because they aren't yet
> visible to the rest of the team).

Okay, well, perhaps just talking about changes that are checked-in,
but have not been committed, will do.

>>> Why is it necessary to rebranch the branch (I assume you mean
>>> "reparent" the branch to a subsequent version of the trunk - which
>>> is almost what a "rebase" or "rebaseline" operation does, in
>>> theory).
>>
>> Almost, but not exactly. "Reparenting" or "rebaselining" usually
>> means taking a change-set against an earlier version of mainline
>> (or baseline, heh), and adapting it to be against a later version
>> of mainline - eg. keeping the changes made on the branch, but just
>> making them be based against a later version of trunk.
>
> Okay. So for me, branch reparenting, rebranching, and branch
> rebaselining have subtle but important differences.

Okay.

[...]

> * Rebasing (or Rebaselining) is "porting" the latest stuff from
> the codeline (trunk) into my branch. See slides 19-20 at
> http://www.cmcrossroads.com/bradapp/acme/clearcase/RUC2003_SCMA34.zip
> I thought that "cvs update" did the same thing?

Ah, I believe there is a tiny mixup here.

First of all - a plain "cvs update" is used to bring a working copy up
to date with the HEAD of the current branch. So assuming that there
are a bunch of changes as just local edits in the working copy
(changes that have not been checked-in or checkpointed), then yes -
"cvs update" does "port" the latest stuff from the codeline to the
working copy. However, if the changes are already checked-in or
checkpointed into a branch, "cvs update" does no rebasing whatsoever.

Then again - "cvs update -j" is used for *merging*. It is confusingly
the same command name, but it is really a merge operation. "cvs update
-j" can be used to port changes between branches and the codeline, and
can be used to rebase a branch. The results of the merge appear as
local edits though, so conflict resolution can be done, and the
results of the merge (or port) need to be checked-in or checkpointed
after that by "cvs commit".

[...]

I'm afraid I don't grok powerpoint, so your references were not of
much use to me. I believe I have some sort of an idea, however.

In the original context where I used that expression it is all pretty
much the same. The whole point is that after a commit to the mainline,
there are no changes in the branch that are uncommitted - so
"rebranching" is only to let the version control system know that
further changes are against the mainline at that point, and not
against the last edit on the previous change done on the branch -
there are no changes to merge, just metadata to inform.

>> And again - smart merging only alleviates the merge conflicts and
>> allows you to do things differently, but it does not change the
>> logical issue here.
>
> Agreed. I don't usually see it causing any problems tho. But
> see the "Whole IS the sum of its parts" section of my earlier
> post - in those cases, it can be problematic.

Yes, with smart merging, there usually isn't a problem with it.

> Actually - I have read it, but it's been a while (I think two
> years) and there have been changes, and I haven't been able to
> play with it to keep it fresh in my mind. Your examples were
> useful to me. Many thanks!!!

The conversation has been enlightening and helpful in seeing
alternative views and alternative terminology for the subject. Thank
you.

One small matter is bugging me a bit, though. We've talked here about
how different things *can* be done in Subversion and how to handle the
problems resulting from some of them - but not the rationale behind it.

Is there a reason why one should not delete a private branch after a
commit to the mainline, and recreate it when starting a new one? I
mean, sure, you can keep on editing in the same branch, you can use
revisions to mark ranges, you could build support into Subversion to
record the latest commit and only show diffs and logs until that and
smart merging will save your ass if you merge changes twice - but why
bother? Is there something to be gained by all that?

-- Naked



Re: branching several times a day

Posted by Brad Appleton <br...@bradapp.net>.
On Wed, Mar 17, 2004 at 01:46:55AM +0200, Nuutti Kotivuori wrote:
> What is the term usually used to describe a single change (with a log
> message usually) that has been "checked-in" to any branch?

If it's "checked-in" to a private-branch or a task-branch and
the corresponding task is not yet finished, the common term
I've heard (which the PrivateVersions pattern uses) is a
"checkpoint", and the process of doing it is often called
"checkpointing" (e.g., I've created a private "checkpoint"
of my changes on my branch of my working-copy, but until I
actually merge my changes to the codeline, I haven't really
"committed" them (to the codeline) because they aren't yet
visible to the rest of the team).

> > Why is it necessary to rebranch the branch (I assume you mean
> > "reparent" the branch to a subsequent version of the trunk - which
> > is almost what a "rebase" or "rebaseline" operation does, in
> > theory).
> 
> Almost, but not exactly. "Reparenting" or "rebaselining" usually means
> taking a change-set against an earlier version of mainline (or
> baseline, heh), and adapting it to be against a later version of
> mainline - eg. keeping the changes made on the branch, but just making
> them be based against a later version of trunk.

Okay. So for me, branch reparenting, rebranching, and branch rebaselining
have subtle but important differences.

* Rebasing (or Rebaselining) is "porting" the latest stuff from
  the codeline (trunk) into my branch. See slides 19-20 at
  http://www.cmcrossroads.com/bradapp/acme/clearcase/RUC2003_SCMA34.zip
  I thought that "cvs update" did the same thing?
  
* What I call "Rebranching" is described in an SCM-9 paper by
  Buffenbarger and Gruell, see
  http://cs.boisestate.edu/~buff/publications/scm9/abstract.html
  And slides 21-22 at
  http://www.cmcrossroads.com/bradapp/acme/clearcase/RUC2003_SCMA34.zip

(slide 38 discusses some pros-vs-cons of rebasing and rebranching)

* Branch reparenting is described well in Michael Bays's book
  "Software Release Methodology". It's a "trick" that has
  the same effect as "rebranching" but without having to create
  a new/separate branch. I think that in terms of history tho,
  it's slightly different from rebasing and rebranching (not
  sure about that)

> And again - smart merging only alleviates the merge conflicts and
> allows you to do things differently, but it does not change the
> logical issue here.

Agreed. I don't usually see it causing any problems tho. But
see the "Whole IS the sum of its parts" section of my earlier
post - in those cases, it can be problematic.

> branches. Both of these issues would probably become clear as water by
> reading the Subversion book. It really isn't that bad of a read -
> and if you are looking to cut time, reading just Chapter 4 should be
> enough, with first peeking through Chapter 2 Section 3.2 in case
> there's something uncertain as to what revisions in Subversion
> actually mean.

Actually - I have read it, but it's been a while (I think two
years) and there have been changes, and I haven't been able to
play with it to keep it fresh in my mind. Your examples were
useful to me. Many thanks!!!

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day

Posted by Nuutti Kotivuori <na...@iki.fi>.
Brad Appleton wrote:
> On Sat, Mar 13, 2004 at 02:10:19AM +0200, Nuutti Kotivuori wrote:
>> Each commit is its own change-set and there is no way independent
>> from branches to group different commits (change-sets) as belonging
>> to a certain task.
>
> Okay. I was assuming that the "stuff" in between commits
> constitutes a change-set at each commit. That much sounds correct.

Hm. I didn't quite understand this, but it sounds like you mean the
same thing. Just in case there's any confusion, please read "Subversion
and Changesets", at the end of the page here:

http://svnbook.red-bean.com/html-chunk/ch04s03.html#svn-ch-4-sect-3.2

> Where I think I go astray in is my use of "commit" versus
> Subversion's.  The way I use "commit" and normally see it used, it
> refers to one's changes being checked-in to "the" codeline (not my
> private branch or task branch, but to the integration
> codeline/mainline that the rest of the team uses to see the latest
> and greatest "stuff")

Right. In Subversion (and CVS) lingo, "commit" is equal to "check-in"
and is used for all modifications that are transmitted to the
repository.

What is the term usually used to describe a single change (with a log
message usually) that has been "checked-in" to any branch? I will
continue to use commit in Subversion terms below, unless otherwise
noted.

> So I had assumed there was an easy way to tell "svn log" to show me
> all the stuff on my branch since the most recent commit to the
> codeline. Sounds like this is not the case (bummer).

Again, Subversion (and CVS) lingo calls this operation merging, or
porting, the changes from the branch to the trunk.

> Which means, as a previous poster pointed out, I need to tag every
> commit I do on the mainline, and every intentional "checkpoint" on
> my own private branch. I want to believe there is some easier way
> than this - is there?

Ascii art time! Whoop-pahh! (mangled from design document)

 _____     _____     _____     _____      _____      _____
|     |   |     |   |     |   |     |    |     |    |     |
| A:1 |-->| A:2 |-->| A:4 |-->| A:6 |-.->| A:7 |-.->| A:9 |
|_____|   |_____|   |_____|   |_____| |  |_____| |  |_____|
             \                        /          |
              \                      /           |
               \  _____     _____   / _____      |
                \|     |   |     | / |     |    /
                 | B:3 |-->| B:5 |-->| B:8 |->-'
                 |_____|   |_____|   |_____|

Here, let's assume that "A" is "the" codeline, or trunk, or whatever,
and "B" is your private branch. If you are at a working copy of B
which is at revision 5, and issue "svn log", you will get log messages
for B:5, B:3, A:2 and A:1. If you instead issue "svn log
--stop-on-copy", you will get only B:5 and B:3. Then again if you are
at a working copy of B which is at revision 9, and issue "svn log",
you will get log messages for B:8, B:5, B:3, A:2 and A:1. And if you
issue "svn log --stop-on-copy", you will get B:8, B:5 and B:3. But if
you wish to obtain only the difference since your last merge (or
commit to mainline), you have to explicitly say "svn log -r HEAD:5" in
the working copy.

Now consider the alternative, where the branch *is* deleted, and
re-created:

 _____     _____     _____     _____      _____             _____ 
|     |   |     |   |     |   |     |    |     |           |     |
| A:1 |-->| A:2 |-->| A:4 |-->| A:6 |-.->| A:7 |--------.->| A:9 |
|_____|   |_____|   |_____|   |_____| |  |_____|        |  |_____|
             \                        /     \           |
              \                      /       \          |
               \  _____     _____   /         \  _____  |
                \|     |   |     | /           \|     |/
                 | B:3 |-->| B:5 |-             | B:8 |
                 |_____|   |_____|              |_____|

Here, the usage does not differ, and nobody needs to provide manual
revision numbers.
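
For those who prefer code to ascii art, here is a toy Python model of
the two figures. It is only a sketch of the behaviour described above,
not of Subversion's real implementation: each revision simply points
at its predecessor, merges are left out on purpose (history does not
record them), and the names KEPT_BRANCH, RECREATED_BRANCH and log are
made up.

```python
# Each entry maps a revision label to its predecessor in path history.
# Merges are deliberately absent: history does not record them.
KEPT_BRANCH = {
    "A:1": None, "A:2": "A:1", "A:4": "A:2", "A:6": "A:4", "A:7": "A:6", "A:9": "A:7",
    "B:3": "A:2",  # branch created by copying A:2
    "B:5": "B:3",
    "B:8": "B:5",  # first figure: same branch reused for the next task
}
# Second figure: branch deleted after the merge and re-created from A:7.
RECREATED_BRANCH = {**KEPT_BRANCH, "B:8": "A:7"}


def log(history, start, stop_on_copy=False):
    """Walk predecessors from start, newest first, like a simplified 'svn log'."""
    seen = []
    current = start
    while current is not None:
        seen.append(current)
        previous = history[current]
        # A copy shows up as a predecessor on a different line (A versus B).
        if stop_on_copy and previous is not None and previous[0] != current[0]:
            break
        current = previous
    return seen


print(log(KEPT_BRANCH, "B:5"))
print(log(KEPT_BRANCH, "B:8", stop_on_copy=True))
print(log(RECREATED_BRANCH, "B:8", stop_on_copy=True))
```

In this model the re-created branch yields only the current task with
stop_on_copy, while the reused branch still drags along B:5 and B:3
unless a revision range is given by hand.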

>> Again, you can 'fix' this by supplying revision numbers, which gets
>> you only the current task commit messages.
>
> That should work, at least for the commits on my own branch (which
> is perhaps good enough). Is there some "builtin" notion of
> "previous" or "last" (or MRC), which refers to the revision of the
> most-recent-commit on the "current" or named branch? (that sure
> would be nice). I'd love to be able to associate a name/alias with
> those and other revision numbers without having to create a "copy"
> in order to tag.  Can I do this?

The revision number of the latest change (eg. latest commit in
Subversion terms) in the branch of the current working copy is called
COMMITTED. The revision before that is PREV. But there is no tag or
anything for the merge done to the mainline.

>> branch, nothing more. If you wish to tell Subversion that the next
>> commit is not a change-set on top of the old changes, but a
>> change-set against the trunk again, you need to re-branch the
>> branch (what a twist of words).
>
> Why is it necessary to rebranch the branch (I assume you mean
> "reparent" the branch to a subsequent version of the trunk - which
> is almost what a "rebase" or "rebaseline" operation does, in
> theory).

Almost, but not exactly. "Reparenting" or "rebaselining" usually means
taking a change-set against an earlier version of mainline (or
baseline, heh), and adapting it to be against a later version of
mainline - eg. keeping the changes made on the branch, but just making
them be based against a later version of trunk.

What I meant was an operation equivalent to simply deleting the
current branch and creating it again with the same name from trunk.
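
A minimal sketch of that delete-and-recreate step, assuming URL-to-URL
operations so that no working copy is involved. The helper name, the
URLs in the usage note, and the log messages are hypothetical; "svn rm"
and "svn cp" with -m are the ordinary Subversion commands.

```python
def rebranch_cmds(trunk_url, branch_url, task):
    """Return the two commands that re-create branch_url from trunk_url.

    Each command is a separate commit in the repository: first the old
    branch is deleted, then a fresh copy of trunk takes its name.
    """
    return [
        ["svn", "rm", branch_url, "-m", "Remove branch, previous task is merged"],
        ["svn", "cp", trunk_url, branch_url, "-m", f"Re-create branch from trunk for {task}"],
    ]
```

For example, rebranch_cmds("http://svn.example.com/repo/trunk",
"http://svn.example.com/repo/branches/nthomas", "hellouniverse") gives
the two argument lists to run in order, for instance with
subprocess.run(cmd, check=True).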

> If I did an update to "rebase" my WC on my branch, why can't that
> have the same effect? (is it related to why SVN can't do "smart"
> merging - yet)

Um, now this is getting really confused. What does a working copy have
to do with this whole thing? And "update" is again a bad choice of
words since Subversion uses update to bring a working copy up to date
with changes committed in the repository mainline or branch. It does
not bring in changes from any branch (or the mainline) other than the
one the working copy is on.

And again - smart merging only alleviates the merge conflicts and
allows you to do things differently, but it does not change the
logical issue here.

>> So, what you do in Subversion is to signify that the new commits
>> you are making are again changes against the mainline of the
>> development, and not against your earlier change by... you guessed
>> it... re-branching.
>
> Ouch! Me no like that :-(

To re-cap the whole thing, let's look at the pictures above again.

In the first figure, revision B:8 is based against B:5, so all of
Subversion handles it as such - a successor of B:5 - and is totally
unaware of the merge (or commit to mainline as by your terminology)
that was committed as A:7. Hence you must specify in everything you do
that you are interested only in changes (or log messages) that
happened since B:5.

In the second figure, revision B:8 is based against A:7, so it is
already totally independent of the earlier changes made in the branch
and it is handled as what it actually is - a separate change against
the trunk again.

***

I believe a lot of the confusion here is about terms and such - and
the rest is about how Subversion handles change-sets and
branches. Both of these issues would probably become clear as water by
reading the Subversion book. It really isn't that bad of a read -
and if you are looking to cut time, reading just Chapter 4 should be
enough, with first peeking through Chapter 2 Section 3.2 in case
there's something uncertain as to what revisions in Subversion
actually mean.

-- Naked


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: branching several times a day

Posted by Brad Appleton <br...@bradapp.net>.
On Sat, Mar 13, 2004 at 02:10:19AM +0200, Nuutti Kotivuori wrote:
> Each commit is its own change-set and there is no way independent
> from branches to group different commits (change-sets) as belonging to
> a certain task.

Okay. I was assuming that the "stuff" in between commits
constitutes a change-set at each commit. That much sounds
correct.

Where I think I go astray is in my use of "commit" versus
Subversion's.  The way I use "commit" and normally see it used,
it refers to one's changes being checked-in to "the" codeline
(not my private branch or task branch, but to the integration
codeline/mainline that the rest of the team uses to see the
latest and greatest "stuff")

So I had assumed there was an easy way to tell "svn log" to
show me all the stuff on my branch since the most recent commit
to the codeline. Sounds like this is not the case (bummer).
Which means, as a previous poster pointed out, I need to tag
every commit I do on the mainline, and every intentional
"checkpoint" on my own private branch. I want to believe there
is some easier way than this - is there?

> Again, you can 'fix' this by supplying revision numbers, which gets
> you only the current task commit messages.

That should work, at least for the commits on my own branch
(which is perhaps good enough). Is there some "builtin"
notion of "previous" or "last" (or MRC), which refers to
the revision of the most-recent-commit on the "current"
or named branch? (that sure would be nice). I'd love to be
able to associate a name/alias with those and other revision
numbers without having to create a "copy" in order to tag.
Can I do this?

> branch, nothing more. If you wish to tell Subversion that the next commit
> is not a change-set on top of the old changes, but a change-set
> against the trunk again, you need to re-branch the branch (what a
> twist of words).

Why is it necessary to rebranch the branch (I assume you mean
"reparent" the branch to a subsequent version of the trunk -
which is almost what a "rebase" or "rebaseline" operation does,
in theory). If I did an update to "rebase" my WC on my branch,
why can't that have the same effect? (is it related to why
SVN can't do "smart" merging - yet)

> So, what you do in Subversion is to signify that the new commits you
> are making are again changes against the mainline of the development, and
> not against your earlier change by... you guessed it... re-branching.

Ouch! Me no like that :-(
-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day

Posted by Nuutti Kotivuori <na...@iki.fi>.
Phew, this message seems to be really hard to split in pieces since
every part refers to every other part. So, I left just enough context
on the comments to understand what it refers to, but read the original
mail to understand what is being talked about. Also, since I have a
bit of trouble with my mails, I don't know if somebody else already
answered this fully. And I'm also not so sure I know what I am talking
about.

Brad Appleton wrote:
> If you created a branch-per-task, then the branch is a grouping
> mechanism for your resulting change-set for that task.

Yes.

> Does SVN already have a grouping mechanism (independent from
> branches) that identifies what my change-set was and all the
> participating files and file-changes from the beginning-to-end of my
> task?

No. Each commit is its own change-set and there is no way independent
from branches to group different commits (change-sets) as belonging to
a certain task.

> If it has that (I had the impression it does), then do I need the
> branch to do that grouping on a per-task basis? Or can I just use it
> as a private workstream in which I work on one task at a time, each
> task having its own change-set which I can still identify and diff
> against as a change-set (rather than as an entire branch).

If you work on one task at a time, then the way you can get the entire
change-set for the task is to get a diff from the first revision
(commit, change-set) of the task to the last revision (commit,
change-set) of the task. That diff is then the combined change-set for
the task - and that is obviously what people do when they merge a
branch back to trunk. So you have to mark down (or check from history)
the revisions in a case like this.
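
A small sketch of that diff, with one detail worth spelling out:
"svn diff -r A:B" compares the tree as of revision A with the tree as
of revision B, so to include the changes made by the task's first
commit the range has to start one revision earlier. The helper name and
the revision numbers in the usage note are made up for illustration.

```python
def task_diff_cmd(branch_url, first_rev, last_rev):
    """Command producing the combined change-set of one task on a branch.

    first_rev and last_rev are the first and last commits of the task,
    as noted down or looked up with 'svn log'. The range starts at
    first_rev - 1 because the diff shows changes made after its left end.
    """
    if first_rev < 1 or last_rev < first_rev:
        raise ValueError("expected 1 <= first_rev <= last_rev")
    return ["svn", "diff", "-r", f"{first_rev - 1}:{last_rev}", branch_url]
```

With a task committed as revisions 12 through 15 on a hypothetical
branch URL, task_diff_cmd(url, 12, 15) asks for the range 11:15.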

>> Also, I wouldn't want the history from each feature to be muddled
>> with other ones -- something that happens if you don't delete the
>> branch.
>
> Why would that happen if you only work on one task at any given time
> in your branch and don't start a new task until the previous one is
> completed?

It doesn't, as such - but since Subversion's history (log) traverses
copies (branches) a few things happen.

If you have a branch created for a task, and some commits on it - when
you ask for log on the branch, you first get log messages for your
commits on the branch and then log messages for the trunk before the
point you branched from. If you ask log (history) to not traverse
copies, then you will get only the commit messages that you have made
on the branch - since it does not traverse back to the trunk.

But if you have a branch that survives several tasks in succession,
instead of getting logs for the current task and then logs for the
trunk, you get logs for the current task, then logs for every other
task performed in this branch and then the trunk. And if you ask log
not to traverse copies, you get commit messages for all tasks ever
performed on the branch.

Again, you can 'fix' this by supplying revision numbers, which gets
you only the current task commit messages.
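
The three cases above can be played through with a toy model. This is
only an illustration of the traversal logic, not how Subversion is
implemented; the Commit class, the log() function, the paths and the
revision numbers are all invented. The stop_on_copy flag stands in for
"svn log --stop-on-copy" and min_rev for supplying a revision range.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Commit:
    rev: int
    path: str                                      # branch the commit landed on
    msg: str
    copied_from: Optional[Tuple[str, int]] = None  # (source path, source rev) for a branch-creating copy


def log(history: List[Commit], path: str, stop_on_copy: bool = False, min_rev: int = 0) -> List[int]:
    """Return revision numbers newest-first, following copies unless told not to."""
    out: List[int] = []
    peg: Optional[int] = max(c.rev for c in history)
    current: Optional[str] = path
    while current is not None:
        next_path, next_peg = None, None
        for c in sorted(history, key=lambda c: c.rev, reverse=True):
            if c.path != current or c.rev > peg or c.rev < min_rev:
                continue
            out.append(c.rev)
            if c.copied_from:
                # The path was created here; older history lives at the copy source.
                if not stop_on_copy:
                    next_path, next_peg = c.copied_from
                break
        current, peg = next_path, next_peg
    return out


# One private branch reused for two tasks.
reused = [
    Commit(1, "trunk", "initial import"),
    Commit(2, "branches/me", "create branch", copied_from=("trunk", 1)),
    Commit(3, "branches/me", "task 1 work"),
    Commit(4, "trunk", "merge task 1"),
    Commit(5, "branches/me", "task 2 work"),
]

# Same work, but the branch is re-created from trunk before task 2.
rebranched = reused[:4] + [
    Commit(5, "branches/me", "re-create branch", copied_from=("trunk", 4)),
    Commit(6, "branches/me", "task 2 work"),
]
```

In this model log(reused, "branches/me", stop_on_copy=True) still
returns the task 1 commit, and only a revision bound such as min_rev=5
narrows it to task 2, whereas log(rebranched, "branches/me",
stop_on_copy=True) stops at the re-creation of the branch by itself.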

> Each change should happen independently of the other ones - right? 
> And the version of the codeline that you would want the subsequent
> change-task to be based off of is already right there in your branch
> and working-copy.

The version of the codeline that you want the subsequent change-task
to be based off is only the right one if nobody else makes any changes in
the trunk in the meantime. Otherwise you want to refresh the version
of the codeline you wish to be working on top of from the head of the
development at that time, either by merging or copying.

> The private branch gives you your own private working-space and
> versioning-space (the branch) to checkout and checkin changes in
> isolation before committing them.

Yes.

> The existing change-set mechanism should give you the mechanism for
> identifying and referencing/comparing/diffing each individual change
> (unmuddled from the others since they were non-overlapping), and
> should be able to do that without needing the branch.

Only if you supply revision numbers to identify each task.

> Or am I misunderstanding/overstating what an SVN change-set
> does for you?

A bit.

>> But this doesn't make sense since I have now put two unrelated
>> features into the same work stream of my SCM.
>
> If you instead created a new branch, you would branch it off the
> trunk that already contained the changes for your previously
> committed task, yes? So then I'm thinking that in either case, the
> initial configuration of the codeline that is in your workstream at
> the beginning of that next task is identical either way.

Again, only if there are no intervening commits after you finished
your last task and before you started a new one. Otherwise you wish to
merge or copy those changes in before.

And yes, the contents of the working copy may be identical, but the
history is not. If you continue on working on the branch directly,
history will show that it is just further development on the same
branch, nothing more. If you wish to tell Subversion that the next commit
is not a change-set on top of the old changes, but a change-set
against the trunk again, you need to re-branch the branch (what a
twist of words). Of course, since branches are so cheap and either
operation is just a simple 'svn cp' on URLs, it hardly matters if you
just rebranch your private workspace, or create a new branch.

> Is there something stopping me from telling svn log to give me the
> history since my last commit (as opposed to since I created the
> branch)? Can I do that without having to create a tag if SVN tracks
> the change-set for my task-commit(s)?

Um. Yes, you can do that by specifying exact revision numbers.

> [It could be I don't have a good enough understanding of SVN here]

Okay, there is one big thing that is probably causing the confusion.

When you "commit" your entire change-set (that is the product of
several smaller change-sets, eg. commits most likely) to the mainline
of the development - which in Subversion is a merge operation - that
is then a change on the *mainline* (trunk) only! It does not affect
your branch in any way. Nothing tells the branch that "this task or
change-set is now finished". So when you keep on developing on the
same branch, it is as if you are still making changes on the old
change-set and only revision numbers can separate those changesets.

So, what you do in Subversion is to signify that the new commits you
are making are again changes against the mainline of the development, and
not against your earlier change by... you guessed it... re-branching.

-- Naked


Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by Brad Appleton <br...@bradapp.net>.
On Fri, Mar 12, 2004 at 04:14:37PM -0500, N. Thomas wrote:
> The only thing I have a question about is this: why would you not delete
> your branch when you are done with it? It makes more sense to start a
> new private branch with a pristine copy -- especially more so for
> Subversion where branching is O(1).

Why did you create your branch? Did you create it so that
you could have a separate stream dedicated to a particular
activity? Or did you create it so you could have a separate
place in which to work on one activity at a time?

If you created a branch-per-task, then the branch is a grouping
mechanism for your resulting change-set for that task. Does SVN
already have a grouping mechanism (independent from branches)
that identifies what my change-set was and all the participating
files and file-changes from the beginning-to-end of my task?

If it has that (I had the impression it does), then do I need
the branch to do that grouping on a per-task basis? Or can I
just use it as a private workstream in which I work on one
task at a time, each task having its own change-set which I
can still identify and diff against as a change-set (rather
than as an entire branch).

> Also, I wouldn't want the history from each feature to be
> muddled with other ones -- something that happens if you
> don't delete the branch.

Why would that happen if you only work on one task at any
given time in your branch and don't start a new task until
the previous one is completed? Each change should happen
independently of the other ones - right? And the version of
the codeline that you would want the subsequent change-task
to be based off of is already right there in your branch
and working-copy. The private branch gives you your own
private working-space and versioning-space (the branch) to
checkout and checkin changes in isolation before committing
them. The existing change-set mechanism should give you the
mechanism for identifying and referencing/comparing/diffing
each individual change (unmuddled from the others since they
were non-overlapping), and should be able to do that without
needing the branch.

Or am I misunderstanding/overstating what an SVN change-set
does for you?

> Now if I understand your concept of private branches, I would initially
> branch like this:
> 
>     /
>     /tags
>     /trunk
>     /branches
>     /branches/nthomas
> 
> I would add my goodbyeworld feature into /branches/nthomas, and then
> when everything is merged into trunk and synced up, when I want to add
> my hellouniverse feature, I would still work in /branches/nthomas.

Yes.

> But this doesn't make sense since I have now put two unrelated features
> into the same work stream of my SCM.

If you instead created a new branch, you would branch it
off the trunk that already contained the changes for your
previously committed task, yes? So then I'm thinking that in
either case, the initial configuration of the codeline that
is in your workstream at the beginning of that next task is
identical either way.

Is there something stopping me from telling svn log to give
me the history since my last commit (as opposed to since I
created the branch)? Can I do that without having to create
a tag if SVN tracks the change-set for my task-commit(s)?

[It could be I don't have a good enough understanding of SVN here]
-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost


Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by "N. Thomas" <nt...@cise.ufl.edu>.
* Brad Appleton <br...@bradapp.net> [2004-03-10 16:57:05 -0600]:
> Have you considered the "Private Branch" (a.k.a. "Personal Branch")
> pattern of using a branch per developer per active/concurrent
> feature or task?

I like your idea about private branches and such. (Although my
particular repo has only one person accessing it, this makes a ton of
sense where there are multiple devs.)

> And instead of deleting or obsoleting the branch (it's not needed for
> change-set purposes) you use it for the next task you are about to
> start work on (because it should be the same content as the main trunk
> at that time).
>
> Seems to me this still provides you the following benefits, but with
> far fewer branches/copies needing to be created.

The only thing I have a question about is this: why would you not delete
your branch when you are done with it? It makes more sense to start a
new private branch with a pristine copy -- especially more so for
Subversion where branching is O(1).

Also, I wouldn't want the history from each feature to be muddled with
other ones -- something that happens if you don't delete the branch.

Suppose my hello-world repo looks like this:

    /
    /tags
    /trunk
    /branches

now I add a task-branch:

    /
    /tags
    /trunk
    /branches
    /branches/goodbyeworld

Anytime I run "svn log ." in /branches/goodbyeworld, I get the history
for my goodbyeworld feature that I am adding to the project.

And when I am done with that and want to add another feature, I merge my
changes back into the trunk.

Now I add another feature, so I branch once again:

    /
    /tags
    /trunk
    /branches
    /branches/hellouniverse

Now if I understand your concept of private branches, I would initially
branch like this:

    /
    /tags
    /trunk
    /branches
    /branches/nthomas

I would add my goodbyeworld feature into /branches/nthomas, and then
when everything is merged into trunk and synced up, when I want to add
my hellouniverse feature, I would still work in /branches/nthomas.

But this doesn't make sense since I have now put two unrelated features
into the same work stream of my SCM.

Thomas

-- 
N. Thomas
nthomas@cise.ufl.edu
Etiamsi occiderit me, in ipso sperabo



Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by Stefan Haller <ha...@ableton.com>.
Brad Appleton <br...@bradapp.net> wrote:

> So if you primarily work on one task at a time, you have a
> single branch all to yourself. When you are done with your
> change (and after you have "updated" from the main trunk)
> you "commit" your change to the main trunk.

I'm not sure I really understand what you mean.  Are you saying you
would first merge from the trunk to your branch (the changes that other
people have committed to the trunk in the meantime), and then merge back
from your branch to the trunk?  Isn't this going to cause lots of
problems?  It has been my experience that it's best to always merge only
one way between branches.


-- 
Stefan Haller
Ableton
http://www.ableton.com/


RE: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by Mark <ma...@msdhub.com>.
 
For all those who, like me, are converting from VSS and are struggling to
get their heads around "real" version control/SCM, I'd like to recommend
Brad's book (mentioned in his sig).

I'm reading it ATM and it's a big help in figuring out how SCM *should* be
done, which (it turns out) is not how VSS encourages it to be done.

Mark

-----Original Message-----
From: Brad Appleton [mailto:brad@bradapp.net] 
Sent: Wednesday, March 10, 2004 3:57 PM
To: N. Thomas
Cc: users@subversion.tigris.org
Subject: Re: branching several times a day (was Re: Sourcesafe user needs
primer on branching source control)

[snip good advice]

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost



Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

Posted by Brad Appleton <br...@bradapp.net>.
On Wed, Mar 10, 2004 at 05:45:16PM -0500, N. Thomas wrote:
> I don't do it as often as once per day, but every time I add a new
> feature to the code that I'm working on, I will branch, do the write,
> compile, test, commit cycle a bunch of times and then merge back into
> the trunk, deleting the branch when I am done. (This is for a personal
> project of mine, but I suppose if I were working on it full-time, and I
> had enough bite-sized features, then I could imagine myself branching
> more often.)

So you are using the "Task Branch" pattern (branch per task/feature).

Have you considered the "Private Branch" (a.k.a. "Personal Branch")
pattern of using a branch per developer per active/concurrent
feature or task?

So if you primarily work on one task at a time, you have a
single branch all to yourself. When you are done with your
change (and after you have "updated" from the main trunk)
you "commit" your change to the main trunk. And instead of
deleting or obsoleting the branch (it's not needed for change-set
purposes) you use it for the next task you are about to start
work on (because it should be the same content as the main
trunk at that time). And you don't create a new "private branch"
unless you have to work on some other task in parallel with
what you are currently working on in your initial private branch.

Seems to me this still provides you the following benefits,
but with far fewer branches/copies needing to be created.

> Two nice things about doing it this way:
> 
>     - the trunk always has the features I want fully implemented and is
>       never in a state of unfinished, partly-working/partly-broken
> 
>     - I can work on adding multiple components to my system this way by
>       putting them all in separate branches. It would be terribly
>       confusing if I were to do it all in the main trunk.

The main reason why folks would do a Task-Branch instead of a
Private-Branch is iff the task-branch was also identifying
the corresponding change-set. But if you already have an
existing separate mechanism for that (as SVN does, as does
Perforce, and several other tools) then you don't need it
for that purpose, you only need it for the purposes you
state above (because it provides not just an isolated
work area, but an isolated work "stream" in which you can
safely do checkins and make private versions/checkpoints
of incomplete changes without "breaking the build" for the
rest of the team)

-- 
Brad Appleton <br...@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost
