Posted to dev@subversion.apache.org by Alexy Khrabrov <al...@setup.org> on 2002/08/06 14:17:05 UTC

SVN is not truly distributed?

In an interview with Larry McVoy, BitKeeper's founder,
there's a remark that contrasts BitKeeper with all other VCS's:

[http://www.kerneltrap.org/node.php?id=222]

  I will predict that you will never see a centralized system evolve into 
  a distributed system. So CVS/Subversion/ClearCase/Perforce/etc will all 
  stay with the centralized client/server architecture. 
  They may try to replicate distributed systems and it will sort of work, 
  but all the corner cases will not work. 
  You need to design a distributed system to be distributed from day one.

Is it the feeling of Subversion's founders that Subversion is not truly
distributed?  I thought that the web-based nature of SVN allows for easy
replication of repositories and multiple repositories, and that the modular
architecture allows one to manage those locally and add syncing globally in
any n-way fashion needed.  

What exactly can be missing in SVN "from day 1"?
I'd like  to understand if there's a fundamental issue here 
SVN needs to address in order to provide everything BitKeeper
can eventually.

Cheers,
Alexy

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: SVN is not truly distributed?

Posted by Zack Weinberg <za...@codesourcery.com>.
On Thu, Aug 08, 2002 at 01:28:02PM +0100, Stephen C. Tweedie wrote:
> > BitKeeper still doesn't support true branches ("lines of development").
> 
> Subversion doesn't support true branches, either.  It has a
> sort-of-equivalent feature in the form of copies.  bk does more or
> less the same thing --- a branch is a clone, not a copy, and resides
> outside the original development directory in the repository, just
> like on subversion.
> 
> The "lines of development" feature being talked about in bitkeeper is
> _not_ about branches.  It's about tagging specific subsets of the
> history lines in the merge tree of *one* bitkeeper repository.

It was my understanding that the multiple cloned trees technique was
not as convenient for some tasks as LODs were supposed to be.  Larry
always described it as a kludge until LODs worked.  My information may
be out of date - I last paid serious attention to BK development in
1999.

zw

RE: Re: SVN is not truly distributed?

Posted by Bill Tutt <ra...@lyra.org>.
> From: Stephen C. Tweedie [mailto:sct@redhat.com]
> 
> Hi,
> 
> On Wed, Aug 07, 2002 at 03:12:03PM -0700, Zack Weinberg wrote:
> 
> > BitKeeper still doesn't support true branches ("lines of
> > development").
> 
> Subversion doesn't support true branches, either.  It has a
> sort-of-equivalent feature in the form of copies.  bk does more or
> less the same thing --- a branch is a clone, not a copy, and resides
> outside the original development directory in the repository, just
> like on subversion.
> 

Quite frankly, I'm not sure what you think true branches are then.
Copies in Subversion function exactly like branches. However, Subversion
isn't a multi-dimensional version space like ClearCase. It's a flat,
filesystem-like version space. A branch's name is the entire repository
path to the destination of the copy.  Branch nesting is completely
supported. In fact, branching branches with sub-branches in O(1) time is
(after the appropriate bug is fixed) completely supported. All branch
related operations are O(1) in fact. The fact that the user interface
calls it a copy is just a user interface decision, and has little
relevance to what actually happens in the data model. You can also
(using the data model, and not existing APIs unfortunately) iterate over
all branches that contain a specific file in preparation for determining
a possible merge source.

Heck, we even support easily converting your working copy from one
branch to the other using "svn switch".
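
To illustrate the claim that a branch is just a cheap copy in a flat,
path-addressed version space, here is a minimal Python sketch. It is a toy
model, not Subversion's filesystem code: the names `lookup`, `with_entry`,
and `cheap_copy` are hypothetical, and directories are modelled as plain
dicts that are never mutated in place. The copy rebuilds only the
directories along the destination path and shares the source subtree by
reference, so its cost does not depend on the size of the tree being
branched.

```python
def _parts(path):
    """Split 'a/b/c' into ['a', 'b', 'c'], ignoring empty components."""
    return [p for p in path.split("/") if p]


def lookup(root, path):
    """Return the node stored at `path` under `root`."""
    node = root
    for part in _parts(path):
        node = node[part]
    return node


def with_entry(root, path, node):
    """Return a new root in which `path` points at `node`.

    Only the directories along `path` are rebuilt (shallow copies);
    every other subtree is shared with the old root.
    """
    parts = _parts(path)
    if not parts:
        return node
    head, rest = parts[0], "/".join(parts[1:])
    new_root = dict(root)  # shallow copy of this one directory only
    new_root[head] = with_entry(root.get(head, {}), rest, node)
    return new_root


def cheap_copy(root, src, dst):
    """'Branch' src to dst by sharing the src subtree under a new path."""
    return with_entry(root, dst, lookup(root, src))


# Hypothetical repository layout, assumed for illustration only.
rev1 = {"trunk": {"src": {"main.c": "int main(){}"}}, "branches": {}}
rev2 = cheap_copy(rev1, "trunk", "branches/feature")
```

In this model the branch's name is simply its path (`branches/feature`),
and a branch of a branch is just another `cheap_copy` call on the new root.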

Bill


Re: SVN is not truly distributed?

Posted by "Stephen C. Tweedie" <sc...@redhat.com>.
Hi,

On Wed, Aug 07, 2002 at 03:12:03PM -0700, Zack Weinberg wrote:

> BitKeeper still doesn't support true branches ("lines of development").

Subversion doesn't support true branches, either.  It has a
sort-of-equivalent feature in the form of copies.  bk does more or
less the same thing --- a branch is a clone, not a copy, and resides
outside the original development directory in the repository, just
like on subversion.

The "lines of development" feature being talked about in bitkeeper is
_not_ about branches.  It's about tagging specific subsets of the
history lines in the merge tree of *one* bitkeeper repository.

> Branching in bk always 
> This has been the number one missing feature for three years now.

Odd, I've got about a dozen local branches of the bitkeeper 2.4 kernel
tree, which I use to automate merging between various different
feature sets I've got locally (eg. uml, kdb, ext3 development trees,
etc.)

--Stephen

Re: SVN is not truly distributed?

Posted by Steven Shaw <st...@iprimus.com.au>.
Stellation?

There seem to be a few configuration management systems on the brew at the
moment: subversion, opencm, stellation ....

I've read a bit about OpenCM. It has had support for disconnected commits
(aka commit to local repository). Unfortunately that functionality has been
disabled in the initial public release (pending a stable version, I think).
It uses Xdelta compression, although it doesn't seem to use Josh MacDonald's
library but reimplements a slightly altered algorithm. The system makes
heavy use of cryptographic hashes (for globally unique names). In opencm
there is no server-side support for directories. I think this is opencm's
primary weakness. I think this leads to a bloated repository (which,
according to one report I read, was being fixed with compression/gzip?).
I could be way wrong.
The information at their website is a good read. It probably holds useful
information for Subversion gurus who want to think about adding distributed
operation.

I'll be sure to check out the Stellation homepage, too!

Cheers, Steve.


Re: SVN is not truly distributed?

Posted by "Stephen C. Tweedie" <sc...@redhat.com>.
Hi,

On Wed, Aug 07, 2002 at 09:16:17PM -0400, Mark C. Chu-Carroll wrote:

> This is what kills me about Larry's endless claims that you
> can't build a truly distributed SCM system on a non-distributed one.
> BitKeeper *is* built using a non-distributed system as its basis

Right.

> Doing true distribution in an SCM system is *very* hard. But ultimately,
> what you need is essentially a globally unique identifier for branches
> and versions; an ability to recognize differences between different 
> repositories using those globally unique branch/version IDs;

Amen --- global identification of complete histories is precisely what
makes proper distributed operation possible.  Too many people get hung
up on "unless it works in such-and-such a manner internally, you can't
distribute it".  It's not the internals that count as much as whether
you can maintain appropriate labels for the versioned objects and
their versions.

--Stephen

Re: SVN is not truly distributed?

Posted by "Mark C. Chu-Carroll" <mc...@watson.ibm.com>.
On Wednesday 07 August 2002 06:12 pm, Zack Weinberg wrote:
> On Wed, Aug 07, 2002 at 04:54:56PM -0500, Jim Blandy wrote:
> > Statements like "you need to design a distributed system to be
> > distributed from day one" are sort of mythological.  Who says?  Why?
> > It's not really an assertion that one can argue for or against based
> > on the code --- it's more an assertion about the code's soul, or
> > something.  So I don't really know how to respond to it.  Maybe a
> > shrug is about right.
> >
> > Doesn't BitKeeper itself have some corner cases?  Some approximations?
> > Are those cut corners and approximations more or less annoying than
> > those Subversion will inherit from its centralized-server origins?  Or
> > is it all just a matter of how much time you put into the details?
>
> BitKeeper still doesn't support true branches ("lines of development").
> This has been the number one missing feature for three years now.  I
> know what the gory details of that problem were in 1999; the situation
> may have changed superficially, but I would bet it still boils down to
> constraints on the implementation coming from the attempt to maintain
> compatibility with SCCS.

This is what kills me about Larry's endless claims that you
can't build a truly distributed SCM system on a non-distributed one.
BitKeeper *is* built using a non-distributed system as its basis, and
it still suffers from some nasty warts as a result; but when it comes
to the distributed part, that doesn't seem to be much of a barrier.

I'm far from an expert on the subversion internals. But I understand
a bit of the basics. And I'm working on an SCM system as well, which
is designed from the ground up to support distribution, despite the fact
that that feature is not implemented in our current system. (We planned
and designed for the capability to support it; we haven't implemented
on that design yet.)

Doing true distribution in an SCM system is *very* hard. But ultimately,
what you need is essentially a globally unique identifier for branches
and versions; an ability to recognize differences between different 
repositories using those globally unique branch/version IDs; and
an ability to package up the differences between different repositories
in changesets that allow repositories to reconcile their shared data.
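
The three ingredients listed above (globally unique IDs, detecting
differences by ID, and packaging the differences for reconciliation) can
be sketched in a few lines of Python. This is an illustrative toy, not how
BitKeeper, Stellation, or Subversion work: the `Repo` class and its methods
are hypothetical, and deriving the ID from a SHA-1 hash of the parents and
payload is just one assumed way to get identifiers that are unique across
repositories.

```python
import hashlib


def changeset_id(parents, payload):
    """Assumed ID scheme: hash of parent IDs plus payload, so the same
    change on the same history gets the same ID in every repository."""
    h = hashlib.sha1()
    for parent in sorted(parents):
        h.update(parent.encode())
        h.update(b"\0")
    h.update(payload.encode())
    return h.hexdigest()


class Repo:
    def __init__(self):
        self.changesets = {}  # id -> (tuple of parent ids, payload)

    def commit(self, parents, payload):
        cid = changeset_id(parents, payload)
        self.changesets[cid] = (tuple(parents), payload)
        return cid

    def missing_from(self, other):
        """IDs that `other` has and this repository lacks."""
        return set(other.changesets) - set(self.changesets)

    def pull(self, other):
        """Copy missing changesets from `other`, parents before children."""
        todo = self.missing_from(other)
        applied = []
        while todo:
            ready = sorted(
                cid for cid in todo
                if all(p in self.changesets for p in other.changesets[cid][0])
            )
            if not ready:
                raise ValueError("changesets refer to unknown parents")
            for cid in ready:
                self.changesets[cid] = other.changesets[cid]
                applied.append(cid)
                todo.remove(cid)
        return applied
```

Two repositories that start from a common changeset and then diverge can
call `pull` on each other in either order and end up with the same set of
changeset IDs; merging the resulting heads is a separate problem that this
sketch does not attempt.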

I think that Subversion doesn't yet support globally unique identifiers in
the sense required for Bitkeeper style distribution. But that's *far* from
difficult to add. Subversion currently does, I think, do a style of
changeset in the form of deltas -- which is the basis of the complete
changeset clusters that you need for repository reconciliation. It's
all very possible to build on Subversion. Not easy, mind you - but
there's nothing in the core of Subversion that makes it impossible. To
be honest, it's not even going to be that much harder for Subversion,
which didn't plan support for it in the basic design, than it will be for
Stellation, which planned for it from day one. And I say that as the
leader of the Stellation project, which gives me a clear bias to 
give extra credit to the Stellation design!

	-Mark

-- 
Mark Craig Chu-Carroll,  IBM T.J. Watson Research Center  
*** The Stellation project: Advanced SCM for Collaboration
***		http://www.eclipse.org/stellation
*** Work Email: mcc@watson.ibm.com  ------- Personal Email: markcc@bestweb.net



Re: SVN is not truly distributed?

Posted by Zack Weinberg <za...@codesourcery.com>.
On Wed, Aug 07, 2002 at 04:54:56PM -0500, Jim Blandy wrote:
> Statements like "you need to design a distributed system to be
> distributed from day one" are sort of mythological.  Who says?  Why?
> It's not really an assertion that one can argue for or against based
> on the code --- it's more an assertion about the code's soul, or
> something.  So I don't really know how to respond to it.  Maybe a
> shrug is about right.
> 
> Doesn't BitKeeper itself have some corner cases?  Some approximations?
> Are those cut corners and approximations more or less annoying than
> those Subversion will inherit from its centralized-server origins?  Or
> is it all just a matter of how much time you put into the details?

BitKeeper still doesn't support true branches ("lines of development").
This has been the number one missing feature for three years now.  I
know what the gory details of that problem were in 1999; the situation
may have changed superficially, but I would bet it still boils down to
constraints on the implementation coming from the attempt to maintain
compatibility with SCCS.

zw

Re: SVN is not truly distributed?

Posted by Jim Blandy <ji...@red-bean.com>.
Alexy Khrabrov <al...@setup.org> writes:
> In an interview with Larry McVoy, BitKeeper's founder,
> there's a remark that contrasts BitKeeper with all other VCS's:
> 
> [http://www.kerneltrap.org/node.php?id=222]
> 
>   I will predict that you will never see a centralized system evolve into 
>   a distributed system. So CVS/Subversion/ClearCase/Perforce/etc will all 
>   stay with the centralized client/server architecture. 
>   They may try to replicate distributed systems and it will sort of work, 
>   but all the corner cases will not work. 
>   You need to design a distributed system to be distributed from day one.
> 
> Is it the feeling of Subversion's founders that Subversion is not truly
> distributed?  I thought that the web-based nature of SVN allows for easy
> replication of repositories and multiple repositories, and that the modular
> architecture allows one to manage those locally and add syncing globally in
> any n-way fashion needed.  
> 
> What exactly can be missing in SVN "from day 1"?
> I'd like  to understand if there's a fundamental issue here 
> SVN needs to address in order to provide everything BitKeeper
> can eventually.

I think I've heard Karl say he thought Subversion could support
changesets, or at least their most important qualities.  I haven't
thought it all through, but I think he may be right.

I know Subversion will adapt to meet various needs that we've put off
for now.  Maybe we'll meet them well, maybe poorly.  If we maintain
the energy around Subversion it has now, I suspect we'll do pretty
well.

Statements like "you need to design a distributed system to be
distributed from day one" are sort of mythological.  Who says?  Why?
It's not really an assertion that one can argue for or against based
on the code --- it's more an assertion about the code's soul, or
something.  So I don't really know how to respond to it.  Maybe a
shrug is about right.

Doesn't BitKeeper itself have some corner cases?  Some approximations?
Are those cut corners and approximations more or less annoying than
those Subversion will inherit from its centralized-server origins?  Or
is it all just a matter of how much time you put into the details?

Re: SVN is not truly distributed?

Posted by "Mark C. Chu-Carroll" <mc...@watson.ibm.com>.
This is going to sound a bit harsh, but that's the way it goes.

Larry McVoy is wrong.

The core of BitKeeper is an SCCS ripoff. That is, a non-distributed
SCM system. It implemented distributed functionality on that
core.

It's hard to implement a fully distributed SCM system. Of that,
there is no doubt. It's far from straightforward to move from a
typical client/server SCM system to a fully distributed one. 
But it's not impossible. Far from it. And in fact, his livelihood is
entirely derived from a system that does exactly that.  BitKeeper
itself proves him wrong.

	-Mark


On Tuesday 06 August 2002 10:17 am, Alexy Khrabrov wrote:
> In an interview with Larry McVoy, BitKeeper's founder,
> there's a remark that contrasts BitKeeper with all other VCS's:
>
> [http://www.kerneltrap.org/node.php?id=222]
>
>   I will predict that you will never see a centralized system evolve into
>   a distributed system. So CVS/Subversion/ClearCase/Perforce/etc will all
>   stay with the centralized client/server architecture.
>   They may try to replicate distributed systems and it will sort of work,
>   but all the corner cases will not work.
>   You need to design a distributed system to be distributed from day one.
>
> Is it the feeling of Subversion's founders that Subversion is not truly
> distributed?  I thought that the web-based nature of SVN allows for easy
> replication of repositories and multiple repositories, and that the modular
> architecture allows one to manage those locally and add syncing globally in
> any n-way fashion needed.
>
> What exactly can be missing in SVN "from day 1"?
> I'd like  to understand if there's a fundamental issue here
> SVN needs to address in order to provide everything BitKeeper
> can eventually.
>
> Cheers,
> Alexy

-- 
Mark Craig Chu-Carroll,  IBM T.J. Watson Research Center  
*** The Stellation project: Advanced SCM for Collaboration
***		http://www.eclipse.org/stellation
*** Work Email: mcc@watson.ibm.com  ------- Personal Email: markcc@bestweb.net



Re: SVN is not truly distributed?

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Aug 06, 2002 at 10:31:51AM -0400, Greg Hudson wrote:
> On Tue, 2002-08-06 at 10:17, Alexy Khrabrov wrote:
> > [ Larry McVoy stated ]
> >   I will predict that you will never see a centralized system evolve into 
> >   a distributed system.
> 
> To some extent, this is like someone from Microsoft saying that you can
> never build a good desktop GUI on top of a Unix-like system, that you
> have to design your whole operating system around your GUI.  Microsoft
> has a vested interest in people believing that, and I think most people
> would say that MacOS X proves they were wrong.

hehe... reading between the lines, are you saying that Larry has a vested
interest in people believing that SVN will never be distributable [like
BitKeeper] ??

:-) :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

RE: Re: SVN is not truly distributed?

Posted by Bill Tutt <ra...@lyra.org>.
> From: Greg Hudson [mailto:ghudson@MIT.EDU]
> 
> On Tue, 2002-08-06 at 10:17, Alexy Khrabrov wrote:
> >   I will predict that you will never see a centralized system evolve into
> >   a distributed system.
> 
> To some extent, this is like someone from Microsoft saying that you can
> never build a good desktop GUI on top of a Unix-like system, that you
> have to design your whole operating system around your GUI.  Microsoft
> has a vested interest in people believing that, and I think most people
> would say that MacOS X proves they were wrong.
> 
> But fundamentally, Larry is right: distributed operation is not one of
> Subversion's design goals, and that is likely to show.  Subversion 1.0
> won't be able to copy or merge files between repositories, and you
> certainly won't be able to commit several sets of changes to a local
> repository and propagate them, with history, from there to a central
> repository at a later date.  These features might be added at a later
> time, but they might run into fundamental assumptions with resulting
> clumsiness.  For instance, the way we handle copy history right now is
> very much tied to a single repository, so if you "svn cp" a file from
> one repository to another, it may turn out that nothing will remember
> where that copy came from, so (unlike a copy within a repository) "svn
> log" won't show the full history of the resulting file.
> 

That's only half true: the underlying data model is very well factored
for extension to a more complex multi-repository model at a later date.
This is especially true after cmpilato finishes changing the source
information for a Copy to be Transaction based instead of
RepositoryRevision based. Only one part of Subversion's data model
determines any ordering in the system, and that's the
RepositoryRevision information. This should make it very easy to morph
Subversion into a point-in-time/everything-is-a-branch distributed
system.

Subversion 1.0 isn't a distributed repository system, but it also wasn't
meant to be. That was one of the 1.0 lines in the sand to somewhat
reduce the scope of the problem enough to get traction on getting some
code written. 

> The real question, which is yet to be determined, is whether we will be
> able to handle the 90% of distributed operation which people actually
> want.  Some people say that Subversion is already good enough for them
> because you can do local diffs and status operations without talking to
> any repository at all.  So you can do a fair amount of work on an
> airplane, you just can't do intermediate commits.
> 

I think the answer is likely to be yes. Larry made an SCCS store into a
distributed system. I don't see why we can't do the same.

Admittedly, there will be corresponding data model changes (mostly table
key extensions) and another repository reload afterwards, but there you
go. 

Bill


Re: SVN is not truly distributed?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2002-08-06 at 10:17, Alexy Khrabrov wrote:
>   I will predict that you will never see a centralized system evolve into 
>   a distributed system.

To some extent, this is like someone from Microsoft saying that you can
never build a good desktop GUI on top of a Unix-like system, that you
have to design your whole operating system around your GUI.  Microsoft
has a vested interest in people believing that, and I think most people
would say that MacOS X proves they were wrong.

But fundamentally, Larry is right: distributed operation is not one of
Subversion's design goals, and that is likely to show.  Subversion 1.0
won't be able to copy or merge files between repositories, and you
certainly won't be able to commit several sets of changes to a local
repository and propagate them, with history, from there to a central
repository at a later date.  These features might be added at a later
time, but they might run into fundamental assumptions with resulting
clumsiness.  For instance, the way we handle copy history right now is
very much tied to a single repository, so if you "svn cp" a file from
one repository to another, it may turn out that nothing will remember
where that copy came from, so (unlike a copy within a repository) "svn
log" won't show the full history of the resulting file.
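
The copy-history problem described above can be made concrete with a small
Python sketch. It is only an assumed model, not Subversion's actual
schema: the `Origin` record, its optional `repo_id` field, and the
`trace_history` function are all hypothetical. A copy source recorded as
just (path, revision) can only be resolved inside one repository; following
history across repositories needs the record to say which repository the
source lives in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Origin:
    """Where a node came from. repo_id=None means 'same repository'."""
    path: str
    revision: int
    repo_id: Optional[str] = None  # hypothetical cross-repository extension


def trace_history(repos, repo_id, path, revision):
    """Walk back through recorded origins, returning the visited
    (repo_id, path, revision) triples, newest first."""
    history = []
    while True:
        history.append((repo_id, path, revision))
        origin = repos[repo_id].get((path, revision))
        if origin is None:
            return history  # nothing remembers where this came from
        if origin.repo_id is not None:
            repo_id = origin.repo_id
        path, revision = origin.path, origin.revision


# Made-up example data: repository B copied a file from repository A.
repos_with_ids = {
    "A": {("lib/util.c", 7): Origin("lib/util.c", 3)},
    "B": {("vendor/util.c", 2): Origin("lib/util.c", 7, repo_id="A")},
}
# Same copy, but nothing recorded about its source.
repos_without_ids = {"A": repos_with_ids["A"], "B": {}}
```

With the source repository recorded, the walk from B continues into A's
history; without it, the walk stops at the copy itself, which is the
truncated "svn log" behaviour the paragraph above worries about.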

The real question, which is yet to be determined, is whether we will be
able to handle the 90% of distributed operation which people actually
want.  Some people say that Subversion is already good enough for them
because you can do local diffs and status operations without talking to
any repository at all.  So you can do a fair amount of work on an
airplane, you just can't do intermediate commits.

