Posted to users@subversion.apache.org by listman <li...@burble.net> on 2006/08/25 18:02:12 UTC
subversion performance issues (was perforce Vs subversion)
Hi, I wanted to summarize some discussions I've been having with some
of the SVN developers offline as well as the discussion we've started
to have here.
The issue:
Subversion's handling of large binary files can be very slow.
The Subversion assumption:
Subversion assumes that network bandwidth is the bottleneck when
handling large binary files, so that it is more efficient to diff the
current and previous (current-1) versions and transmit only the delta.
The argument is that on a slow modem connection or a flaky US-India
cable you would rather pay the diff compute time than the time taken
to transmit the whole file.
Why this doesn't make sense in many situations:
1. Unfortunately, a lot of binary databases don't diff well after
incremental changes to the user input. The diffs can often be as
large as the original file.
2. Often all the users are on a local network, or the repositories are
mirrored between sites, and the available network bandwidth is very good.
3. For large files the diffs can take an extremely long time, much
more than the time taken to transmit the entire file, even under high
network load.
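The trade-off in point 3 can be put in rough numbers. The following is
only a back-of-the-envelope sketch: the function names are hypothetical,
the figures are made up for illustration rather than measured, and none
of it is Subversion code. It just states that sending a delta wins only
when the time to compute it plus the time to send it is less than the
time to send the whole file.

```python
def full_transfer_seconds(file_bytes, bandwidth_bytes_per_s):
    """Time to send the whole file with no diff step."""
    return file_bytes / bandwidth_bytes_per_s


def delta_transfer_seconds(delta_bytes, bandwidth_bytes_per_s, diff_seconds):
    """Time to compute the delta and then send it."""
    return diff_seconds + delta_bytes / bandwidth_bytes_per_s


def delta_is_worthwhile(file_bytes, delta_bytes, bandwidth_bytes_per_s, diff_seconds):
    """True when diff-then-send beats sending the full file."""
    return (delta_transfer_seconds(delta_bytes, bandwidth_bytes_per_s, diff_seconds)
            < full_transfer_seconds(file_bytes, bandwidth_bytes_per_s))


# Illustrative figures only (assumptions, not measurements).
FILE_BYTES = 500 * 1024**2    # a 500 MiB binary database
DELTA_BYTES = 450 * 1024**2   # a delta almost as large as the file (point 1)
DIFF_SECONDS = 120.0          # assumed time to compute the delta
LAN = 100 * 1024**2           # 100 MiB/s local network (point 2)
MODEM = 5 * 1024              # 5 KiB/s slow link

print(delta_is_worthwhile(FILE_BYTES, DELTA_BYTES, LAN, DIFF_SECONDS))    # False
print(delta_is_worthwhile(FILE_BYTES, DELTA_BYTES, MODEM, DIFF_SECONDS))  # True
```

With these assumed figures the full file crosses the LAN in 5 seconds,
while the diff alone costs 120, so the delta only pays off on the slow
link.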
Another complication:
After talking to various developers, it seems that Subversion is
actually doing a binary diff at both the client and the server end,
which is redundant. So we have 2x the number of (time-consuming) diffs
we need.
The fix:
a) We need to remove the redundant diff operations that currently occur.
b) One of the developers needs to profile Subversion and determine
the bottlenecks under the following scenarios:
i) doing an initial import of a large binary file to a fresh repository
ii) committing a new version of a large binary file to an existing
repository
This will likely throw up a list of other possible improvements.
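As a starting point for (b), the two scenarios could be timed from the
outside before anyone reaches for a real profiler. This is a minimal
sketch, assuming the svn command-line client is on the PATH; the helper
names, file names and URLs are placeholders, and wall-clock timing only
shows how slow each operation is, not where the time goes.

```python
import subprocess
import time


def import_command(local_file, file_url):
    """Scenario i: initial import of one file into a fresh repository.

    For a single file, the URL names the file inside the repository.
    """
    return ["svn", "import", "-m", "initial import", local_file, file_url]


def commit_command(working_copy_path):
    """Scenario ii: commit a modified file from an existing working copy."""
    return ["svn", "commit", "-m", "new version", working_copy_path]


def time_command(argv):
    """Run argv and return wall-clock seconds; raises if the command fails."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start
```

To try it, create a scratch repository with svnadmin create, call
time_command(import_command("big.bin", "file:///tmp/repo/big.bin")),
check out a working copy, overwrite the file with its next version, and
call time_command(commit_command("wc/big.bin")). The paths here are
examples only.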
c) On the user group list, Talden suggested adding a new property to
Subversion that lets users designate files that shouldn't be diffed.
"svn:diffasnew" was his suggested name; it would instruct both client
and server to treat the file as a complete replacement. This seems
like a good suggestion.
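To make the intended behaviour of (c) concrete, here is a rough sketch
of the decision such a property would drive. "svn:diffasnew" is only the
name proposed in this thread, not an existing Subversion property, and
the functions below are hypothetical illustrations rather than anything
in the Subversion code base.

```python
# Name proposed in this thread; NOT an existing Subversion property.
DIFF_AS_NEW = "svn:diffasnew"


def transfer_mode(properties):
    """Return "fulltext" when the file is flagged, otherwise "delta"."""
    return "fulltext" if DIFF_AS_NEW in properties else "delta"


def payload_for(old_bytes, new_bytes, properties, compute_delta):
    """Pick what to send: the whole new file, or a delta against the old one.

    compute_delta is a caller-supplied function (old, new) -> bytes; it is
    never called for flagged files, which is where the time would be saved.
    """
    if transfer_mode(properties) == "fulltext":
        return new_bytes
    return compute_delta(old_bytes, new_bytes)
```

Client and server would both have to make the same choice for a given
file, which is why the proposal attaches the flag to the file as a
property instead of leaving it to a client-side option.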
I'm willing to pay a bounty to developers who are interested in
working on this. Please contact me for more details.
Subversion is my preferred tool for my software activities. I'd love
to be able to use it for all my design data, but unfortunately it's
just not an option at the moment.
Re: subversion performance issues (was perforce Vs subversion)
Posted by Daniel Berlin <db...@dberlin.org>.
If you give me repos and files I can use to simulate the problems, I
will make it take less time :)