Posted to users@subversion.apache.org by David Weintraub <qa...@gmail.com> on 2011/10/06 18:22:33 UTC

Subversion and Remote Repository Synchronization

Let's say I have a team in the U.S. where my Subversion repository is
kept, and I have a remote team in India. The remote team in India is
complaining about the length of time for checkouts and commits. Is
there a solution to this particular issue in Subversion?

I could create a local svnsync repository, but that's read-only. Is it
possible for the Indian users to check out from the svnsync repository,
then do a relocate to the U.S. main repository, and then check in
their changes? Would this be any faster than directly checking out
from the U.S. repository?
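
For reference, setting up such a read-only mirror could look roughly
like the Python sketch below. The paths and URLs are placeholders, not
values from any real setup, and the helper names are made up for
illustration; only the svnadmin and svnsync subcommands are real.

```python
import subprocess


def svnsync_commands(mirror_path, mirror_url, master_url):
    """Return the argv lists that create and populate a read-only mirror.

    All paths and URLs are placeholders supplied by the caller.
    """
    return [
        # Create an empty repository to hold the mirror.
        ["svnadmin", "create", mirror_path],
        # Record the master as the sync source. The mirror needs a
        # pre-revprop-change hook that exits 0, because svnsync keeps
        # its bookkeeping in revision properties.
        ["svnsync", "initialize", mirror_url, master_url],
        # Copy every revision not yet mirrored; safe to repeat.
        ["svnsync", "synchronize", mirror_url],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(svnsync_commands(...)) once would create the mirror;
after that, only the last command needs to run on a schedule.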

What about using "git-svn"? We could have an automated process that
pulls data from the Subversion repository in the U.S. and creates a
local Git repository in India using "git-svn". This could be done when
there's no one in the Indian office. Developers could then check out
and commit their changes to their local Git repository. In the middle
of the night, the Git repository could then push its changes to
Subversion using "git-svn". Is this a possibility?

I know other revision control systems have a variety of methods in
handling this issue:

* Git, of course, can easily create a local Indian copy of the
repository, and everyone there can checkout and commit to that local
repository. Changes in the local Indian Git repository can then be
pushed to the U.S. main Git repository.

* Perforce can use repository proxies
<http://www.perforce.com/customers/white_papers/distributed_software_development_perforce>.
The proxies will deliver local copies of a requested checkout if it
exists, or fetch the copy from the remote server when necessary. There
are no synchronization issues, but later checkouts are fairly fast. In
fact, many sites have a nightly process that pre-fetches the data from
the remote repository to the proxy since the first request for a
particular version of a file will take a long time.

* ClearCase has the most interesting (and complex) solution. ClearCase
has something called MultiSite. With MultiSite you create a local copy
of the remote repository. This is similar to svnsync. However, what
MultiSite does is only give one site read-write permission on a per
branch basis. Other sites will be able to see that branch, but it will
be read-only. Instead they'll have their own read-write branch (which
is read-only to everyone else).

For example, I have a site in India and in the U.S. My repository is
in the U.S. With MultiSite, I create a duplicate repository in India.
My U.S. office can read and write to our "trunk" (/main in ClearCase
parlance), but the India office can only read data from the "trunk".
The Indian office creates a branch based upon the trunk called
"india".  The Indian office can read and write to that branch. The
U.S. office only has read capabilities on this branch.

This will allow the U.S. office to merge the changes from the "india"
branch to "trunk" and allow the Indian office to synchronize the U.S.
changes from "trunk" to the "india" branch.

I was thinking of implementing some sort of MultiSite in Subversion,
but although the branches would "match", there would be an issue with
revision numbering. For example, in ClearCase, both the Indian office
and the U.S. office would be talking about the same version when they
talk about revision #6 of a particular file on the trunk or version
#24 of a particular file on the Indian branch. This is because each
file and each branch has its own numbering scheme. However, in
Subversion, since the whole repository is revisioned, what would be
revision 6 of /module/trunk/build.xml in the U.S. could be revision 15
in India.
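
The mismatch can be reproduced with a toy model. The sketch below is
not Subversion code; the Repository class, the paths other than
build.xml, and the commit counts are invented for illustration. It
only assumes what is said above: every commit bumps one
repository-wide counter.

```python
class Repository:
    """Toy model of a repository with a single global revision counter."""

    def __init__(self):
        self.head = 0
        self.last_changed = {}  # path -> revision of the last commit touching it

    def commit(self, *paths):
        """Record one commit touching the given paths; return its revision."""
        self.head += 1
        for path in paths:
            self.last_changed[path] = self.head
        return self.head


us = Repository()
india = Repository()

# Both sites start from the same five revisions on trunk.
for site in (us, india):
    for _ in range(5):
        site.commit("/module/trunk/src/Main.java")

# India commits nine times on its own branch before the trunk change arrives.
for _ in range(9):
    india.commit("/module/branches/india/src/Main.java")

# The same change to build.xml is then committed at each site.
us_rev = us.commit("/module/trunk/build.xml")
india_rev = india.commit("/module/trunk/build.xml")

print(us_rev, india_rev)  # 6 15
```

The content of build.xml is identical at both sites, but the number
people would quote for it is not.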

Any ideas?

-- 
David Weintraub
qazwart@gmail.com

Re: Subversion and Remote Repository Synchronization

Posted by Ed <SV...@0x1b.com>.
Subversion is not distributed. Try svnsync for a while; most of the
pain should go away. Otherwise, check out http://www.wandisco.com/

On Thu, Oct 6, 2011 at 9:22 AM, David Weintraub <qa...@gmail.com> wrote:
> Let's say I have a team in the U.S. where my Subversion repository is
> kept, and I have a remote team in India. The remote team in India is
> complaining about the length of time for checkouts and commits. Is
> there a solution to this particular issue in Subversion?
>
> I could create a local Svnsync repository, but that's read-only. Is it
> possible for the Indian users to checkout from the SvnSync repository,
> then do a relocate to the U.S. main repository, and then check in
> their changes?

The check-in is "proxied" by the mirror, so don't mess up the WC with a relocate.

> Would this be any faster than directly checking out
> from the U.S. repository?
>
> What about using "svngit"? We could have an automated process that
> pulls data from the Subversion repository in the U.S. and creates a
> local Git repository in India using "svngit'. This could be done when
> there's no one in the Indian office. Developers could then checkout
> and commit their changes to their local Git repository. In the middle
> of the night, the Git repository could then push its changes to
> Subversion using "gitsvn" Is this a possibility?
>
> I know other revision control systems have a variety of methods in
> handling this issue:
>
> * Git, of course, can easily create a local Indian copy of the
> repository, and everyone there can checkout and commit to that local
> repository. Changes in the local Indian Git repository can then be
> pushed to the U.S. main Git repository.
>
> * Perforce can use repository proxies
> <http://www.perforce.com/customers/white_papers/distributed_software_development_perforce>.
> The proxies will deliver local copies of a requested checkout if it
> exists, or fetch the copy from the remote server when necessary. There
> is no synchronization issues, but later checkouts are fairly fast. In
> fact, many sites have a nightly process that pre-fetches the data from
> the remote repository to the proxy since the first request for a
> particular version of a file will take a long time.
>
> * ClearCase has the most interesting (and complex) solution. ClearCase
> has something called MultiSite. With MultiSite you create a local copy
> of the remote repository. This is similar to SVNSync. However, what
> MultiSite does is only give one site read-write permission on a per
> branch basis. Other sites will be able to see that branch, but it will
> be read-only. Instead they'll have their own read-write branch (which
> is read-only to everyone else).
>
> For example, I have a site in India and in the U.S. My repository is
> in the U.S., with MultiSite, I create a duplicate repository in India.
> My U.S. office can read and write to our "trunk" (/main in ClearCase
> parlance), but the India office can only read data from the "trunk".
> The Indian office creates a branch based upon the trunk called
> "india".  The Indian office can read and write to that branch. The
> U.S. office only has read capabilities on this branch.
>
> This will allow the U.S. office to merge the changes from the "india"
> branch to "trunk" and allow the Indian office to synchronize the U.S.
> changes from "trunk" to the "india" branch.
>
> I was thinking of implementing some sort of MultiSite in Subversion,

that would be WANdisco

> but although the branches would "match", there would be an issue with
> revision numbering. For example, in ClearCase, both the Indian office
> and the U.S. office would be talking about the same version when they
> talk about revision #6 of a particular file on the trunk or version
> #24 of a particular file on the Indian branch. This is because each
> file and each branch has their own numbering scheme. However, in
> Subversion, since the whole repository is revisioned, what would be
> revision 6 of /module/trunk/build.xml in the U.S. could be revision 15
> in India.
>
> Any ideas?
>
> --
> David Weintraub
> qazwart@gmail.com
>

RE: Subversion and Remote Repository Synchronization

Posted by Les Mikesell <le...@gmail.com>.
On Oct 7, 2011 8:27 AM, "Bob Archer" <Bo...@amsi.com> wrote:
>
> > On Thu, Oct 6, 2011 at 12:55 PM, Daniel Shahaf <d....@daniel.shahaf.name>
> > wrote:

>
> Of course, the other solution is get a faster/bigger pipe to reduce the
> transfer times.
>
The issue may be more latency on operations that wait for handshakes than
pure bandwidth.  Using svnserve might be slightly better than other access
methods.

-- 
  Les Mikesell
     lesmikesell@gmail.com

RE: Subversion and Remote Repository Synchronization

Posted by Bob Archer <Bo...@amsi.com>.
> On Thu, Oct 6, 2011 at 12:55 PM, Daniel Shahaf <d....@daniel.shahaf.name>
> wrote:
> > David Weintraub wrote on Thu, Oct 06, 2011 at 12:22:33 -0400:
> >> What about using "svngit"? We could have an automated process that
> >> pulls data from the Subversion repository in the U.S. and creates a
> >> local Git repository in India using "svngit'. This could be done when
> >> there's no one in the Indian office. Developers could then checkout
> >> and commit their changes to their local Git repository. In the middle
> >> of the night, the Git repository could then push its changes to
> >> Subversion using "gitsvn" Is this a possibility?
> >
> > And what do you do when the push step fails due to the Subversion
> > repository having changed after the pull?
> 
> I think you are supposed to branch for your local git work, then 'rebase' the
> svn copy (equivalent to upate) before merging your branch and using
> dcommit to push it back to the svn master.  Conceptually it shouldn't be
> different than the repository changing compared to an outstanding modified
> svn working copy.
> 

I don't think using git-svn solves the problem that updates from the remote server are going to take time. That lag will still be there.

Isn't there a way to set up some type of write-through proxy when using svnsync?

Of course, the other solution is to get a faster/bigger pipe to reduce the transfer times.

BOb


Re: Subversion and Remote Repository Synchronization

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Oct 6, 2011 at 4:23 PM, David Weintraub <qa...@gmail.com> wrote:
>> On Thu, Oct 6, 2011 at 1:51 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>>
>>>> >> What about using "svngit"? We could have an automated process that
>>>> >> pulls data from the Subversion repository in the U.S. and creates a
>>>> >> local Git repository in India using "svngit'. This could be done when
>>>> >> there's no one in the Indian office. Developers could then checkout
>>>> >> and commit their changes to their local Git repository. In the middle
>>>> >> of the night, the Git repository could then push its changes to
>>>> >> Subversion using "gitsvn" Is this a possibility?
>
> On Thu, Oct 6, 2011 at 3:49 PM, Les Mikesell <le...@gmail.com> wrote:
>> I don't have experience with using git-svn myself, but it seems to be
>> designed to handle that scenario.
>
> Svn-git was mainly designed to allow you to checkpoint your work
> without checking in code you know might not work.
>
> What I am thinking is something a bit more radical: Using Git as a
> local repository for a remote Subversion repository. Now, you're
> talking about 10 to 20 or more users in your remote site using Git,
> pushing their changes to the local central Git repository, and then
> having that central Git repository sync itself back to the Subversion
> repository. You're hope is that the site that has the Subversion
> repository isn't doing to many changes while the remote site is
> active. And, the remote site isn't active while the local site is
> committing changes into Subversion.

Have you seen this?

http://subgit.com/

It is something the guys that make SVNKit have been working on.


-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Subversion and Remote Repository Synchronization

Posted by David Weintraub <qa...@gmail.com>.
> On Thu, Oct 6, 2011 at 1:51 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>
>>> >> What about using "svngit"? We could have an automated process that
>>> >> pulls data from the Subversion repository in the U.S. and creates a
>>> >> local Git repository in India using "svngit'. This could be done when
>>> >> there's no one in the Indian office. Developers could then checkout
>>> >> and commit their changes to their local Git repository. In the middle
>>> >> of the night, the Git repository could then push its changes to
>>> >> Subversion using "gitsvn" Is this a possibility?

On Thu, Oct 6, 2011 at 3:49 PM, Les Mikesell <le...@gmail.com> wrote:
> I don't have experience with using git-svn myself, but it seems to be
> designed to handle that scenario.

git-svn was mainly designed to allow you to checkpoint your work
without checking in code you know might not work.

What I am thinking is something a bit more radical: Using Git as a
local repository for a remote Subversion repository. Now, you're
talking about 10 to 20 or more users in your remote site using Git,
pushing their changes to the local central Git repository, and then
having that central Git repository sync itself back to the Subversion
repository. Your hope is that the site that has the Subversion
repository isn't doing too many changes while the remote site is
active. And, the remote site isn't active while the local site is
committing changes into Subversion.

I like the Perforce proxy. It's practically invisible to the user and
really speeds things up.

I was thinking that I could script a MultiSite solution with
Subversion without too much difficulty. I started working on the
project and realized that there was a major problem. With ClearCase
MultiSite, your repositories will match version for version. In my
Subversion imitation, I was getting the sync done correctly, and the
trees and changes matched. However, the revision numbering was off
between the two sites.

Since most people use the Subversion revision numbering as an
alternative to tags, the mismatch in revision numbers would be
unacceptable. And, the "svn log" output doesn't match, so it becomes
difficult to match the history.

-- 
David Weintraub
qazwart@gmail.com

Re: Subversion and Remote Repository Synchronization

Posted by Les Mikesell <le...@gmail.com>.
On Thu, Oct 6, 2011 at 1:51 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:

>> >> What about using "svngit"? We could have an automated process that
>> >> pulls data from the Subversion repository in the U.S. and creates a
>> >> local Git repository in India using "svngit'. This could be done when
>> >> there's no one in the Indian office. Developers could then checkout
>> >> and commit their changes to their local Git repository. In the middle
>> >> of the night, the Git repository could then push its changes to
>> >> Subversion using "gitsvn" Is this a possibility?
>> >
>> > And what do you do when the push step fails due to the Subversion
>> > repository having changed after the pull?
>>
>> I think you are supposed to branch for your local git work, then
>> 'rebase' the svn copy (equivalent to upate) before merging your branch
>> and using dcommit to push it back to the svn master.  Conceptually it
>> shouldn't be different than the repository changing compared to an
>> outstanding modified svn working copy.
>>
>
> I thought David described a solution that implied machine merging, so
> I wanted to point out that that Doesn't Always Work.  Of course, if
> a developer does the merging then my concern doesn't apply.

I don't have experience with using git-svn myself, but it seems to be
designed to handle that scenario.  However, I think the main value
would be the ability to do more work 'offline' with the commits back
to svn done in batches.  If you are trying to coordinate changes
between different teams working on the same project, you'll probably
want frequent commits anyway.  If the work is mostly/all being done
at the remote site you could move the live repository there and use an
automated svnsync to keep a local read-only copy as a backup and for
test builds.

-- 
   Les Mikesell
     lesmikesell@gmail.com

Re: Subversion and Remote Repository Synchronization

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Les Mikesell wrote on Thu, Oct 06, 2011 at 13:25:58 -0500:
> On Thu, Oct 6, 2011 at 12:55 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > David Weintraub wrote on Thu, Oct 06, 2011 at 12:22:33 -0400:
> >> What about using "svngit"? We could have an automated process that
> >> pulls data from the Subversion repository in the U.S. and creates a
> >> local Git repository in India using "svngit'. This could be done when
> >> there's no one in the Indian office. Developers could then checkout
> >> and commit their changes to their local Git repository. In the middle
> >> of the night, the Git repository could then push its changes to
> >> Subversion using "gitsvn" Is this a possibility?
> >
> > And what do you do when the push step fails due to the Subversion
> > repository having changed after the pull?
> 
> I think you are supposed to branch for your local git work, then
> 'rebase' the svn copy (equivalent to upate) before merging your branch
> and using dcommit to push it back to the svn master.  Conceptually it
> shouldn't be different than the repository changing compared to an
> outstanding modified svn working copy.
> 

I thought David described a solution that implied machine merging, so
I wanted to point out that that Doesn't Always Work.  Of course, if
a developer does the merging then my concern doesn't apply.

> -- 
>   Les Mikesell
>     lesmikesell@gmail.com

Re: Subversion and Remote Repository Synchronization

Posted by Les Mikesell <le...@gmail.com>.
On Thu, Oct 6, 2011 at 12:55 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> David Weintraub wrote on Thu, Oct 06, 2011 at 12:22:33 -0400:
>> What about using "svngit"? We could have an automated process that
>> pulls data from the Subversion repository in the U.S. and creates a
>> local Git repository in India using "svngit'. This could be done when
>> there's no one in the Indian office. Developers could then checkout
>> and commit their changes to their local Git repository. In the middle
>> of the night, the Git repository could then push its changes to
>> Subversion using "gitsvn" Is this a possibility?
>
> And what do you do when the push step fails due to the Subversion
> repository having changed after the pull?

I think you are supposed to branch for your local git work, then
'rebase' the svn copy (equivalent to update) before merging your branch
and using dcommit to push it back to the svn master.  Conceptually it
shouldn't be different than the repository changing compared to an
outstanding modified svn working copy.
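
Spelled out as commands, that sequence might look like the Python
sketch below. The branch names are assumptions ("master" as the branch
that tracks Subversion, plus a hypothetical topic branch for the local
work), and the helper functions are illustrative only. "git svn
rebase" and "git svn dcommit" are the real git-svn subcommands.

```python
import subprocess


def git_svn_sync_commands(topic_branch, tracking_branch="master"):
    """Return the argv lists for: rebase from svn, merge local work, dcommit."""
    return [
        # Switch to the branch that tracks the Subversion repository.
        ["git", "checkout", tracking_branch],
        # Fetch new Subversion revisions and replay local commits on top.
        ["git", "svn", "rebase"],
        # Bring in the work done on the local topic branch.
        ["git", "merge", topic_branch],
        # Send the resulting commits back to Subversion.
        ["git", "svn", "dcommit"],
    ]


def run_until_failure(commands, runner=subprocess.run):
    """Run commands in order; return the first argv that fails, or None."""
    for argv in commands:
        result = runner(argv)
        if result.returncode != 0:
            return argv
    return None
```

Stopping at the first failing step leaves a conflict for a developer
to resolve instead of merging blindly, which is the case raised
earlier about the push step failing.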

-- 
  Les Mikesell
    lesmikesell@gmail.com

Re: Subversion and Remote Repository Synchronization

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
David Weintraub wrote on Thu, Oct 06, 2011 at 12:22:33 -0400:
> What about using "svngit"? We could have an automated process that
> pulls data from the Subversion repository in the U.S. and creates a
> local Git repository in India using "svngit'. This could be done when
> there's no one in the Indian office. Developers could then checkout
> and commit their changes to their local Git repository. In the middle
> of the night, the Git repository could then push its changes to
> Subversion using "gitsvn" Is this a possibility?

And what do you do when the push step fails due to the Subversion
repository having changed after the pull?

Re: Subversion and Remote Repository Synchronization

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Oct 6, 2011 at 12:22 PM, David Weintraub <qa...@gmail.com> wrote:
> Let's say I have a team in the U.S. where my Subversion repository is
> kept, and I have a remote team in India. The remote team in India is
> complaining about the length of time for checkouts and commits. Is
> there a solution to this particular issue in Subversion?
>
> I could create a local Svnsync repository, but that's read-only. Is it
> possible for the Indian users to checkout from the SvnSync repository,
> then do a relocate to the U.S. main repository, and then check in
> their changes? Would this be any faster than directly checking out
> from the U.S. repository?

There is a simple low-tech option.  Set up a cron job that does svn
update on a local checkout, tars the result and posts it on a server
in the LAN in India.  When developers need to do a checkout they can
download and extract the tar and just run svn update to catch up to
HEAD quickly.  They can easily use switch to change to a branch pretty
efficiently.  I know a lot of users that use this successfully.
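
A minimal sketch of such a job in Python follows. The function names
and paths are hypothetical, and the script only covers the update and
tar steps; publishing the tarball on the LAN server is left out.

```python
import subprocess
import tarfile
from pathlib import Path


def update_command(working_copy):
    """Return the argv that brings the working copy up to HEAD."""
    return ["svn", "update", "--non-interactive", str(working_copy)]


def make_snapshot(working_copy, tarball):
    """Pack the working copy, including its .svn metadata, into a tarball."""
    working_copy = Path(working_copy)
    with tarfile.open(str(tarball), "w:gz") as archive:
        # Keep the top-level directory name so extraction yields one folder.
        archive.add(str(working_copy), arcname=working_copy.name)
    return tarball


def refresh_snapshot(working_copy, tarball):
    """Update the working copy, then rebuild the snapshot tarball."""
    subprocess.run(update_command(working_copy), check=True)
    return make_snapshot(working_copy, tarball)
```

The .svn metadata has to be in the tarball, since that is what lets
developers run svn update on the extracted copy. The tarball should be
written outside the working copy so it does not end up inside itself.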

If you use Apache, you can use the WebDAV proxy.  This takes your
svnsync idea a step further by making the local server act as a proxy
for the master on write operations.  There are several advantages of
this approach over your suggestion.

1) Developers do not need to use switch.  They just check out from
their local mirror and work as normal.  The mirror handles proxying
requests back to the master when needed.

2) Since developers do not need to use switch, they can use the mirror
for more than checkout.  Commands like log/diff/update/switch/merge
are all processed by their local mirror and get the same performance
benefit.

This is what the ASF uses for their repository.  They have a mirror in
Europe that committers in Europe connect through when working with the
ASF repository.
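
On the mirror's Apache server, the key piece is mod_dav_svn's
SVNMasterURI directive. The Python sketch below just renders a minimal
Location block; the URL path, repository path and master URL are
placeholders, the function name is made up, and a real setup needs
more than this (the mirror still has to be kept current with svnsync,
and access control is omitted).

```python
def render_proxy_location(url_path, repo_path, master_uri):
    """Render a minimal Apache <Location> block for a write-through mirror."""
    lines = [
        f"<Location {url_path}>",
        "  DAV svn",
        # Local, svnsync-maintained copy that serves read requests.
        f"  SVNPath {repo_path}",
        # Write requests are passed through to this master repository.
        f"  SVNMasterURI {master_uri}",
        "</Location>",
    ]
    return "\n".join(lines) + "\n"


config = render_proxy_location(
    "/svn", "/srv/svn/mirror", "https://svn.example.com/svn"
)
print(config)
```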

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/