Posted to dev@subversion.apache.org by Joshua Jensen <jj...@workspacewhiz.com> on 2002/05/25 18:28:59 UTC

Perforce/Subversion Timing Statistics #2

This test handles 10x as many files in the repository.  It is an example
of how the numbers scale with growing lists of files.

All operations are performed on the following directory, using a BRAND
NEW repository on an Athlon 1.4 GHz Windows XP Professional machine with
7200 rpm hard drives, 512 MB RAM, and 98% CPU time available.  The
Perforce statistics use the free 2-user Perforce server available
on their website.

14 binaries
172 text files
53 dirs
919kb total disk space

-------------------
Adding to database
-------------------
svn import file:///e:/svn/repos TestDir      *** 1 set: 33 seconds -- 10 sets: 4:26 ***

The database for 1 set is 4 megs (919kb original footprint).  The
database for 10 sets is 28,751kb (9,191kb original footprint).

Perforce:                                    *** 1 set: 3 seconds -- 10 sets: 0:31 ***

The database for 10 sets is 14.7mb including journal (9,191kb original
footprint).

-------------------------------------
First time retrieval from repository
-------------------------------------
svn co file:///e:/svn/repos                  *** 1 set: 12 seconds -- 10 sets: 2:23 ***

Perforce: p4 sync ...                        *** 1 set: <2 seconds -- 10 sets: 0:12 ***

-------------------------------------------------------------
Grabbing the latest updates from repository (there are none)
-------------------------------------------------------------
svn update                                   *** 1 set: 2-3 seconds -- 10 sets: 0.03 ***

Perforce: p4 sync ...                        *** 1 set: INSTANT -- 10 sets: INSTANT ***

------------------
Deleting a file
------------------
svn delete File.zip (215k)                   *** INSTANT ***
svn commit                                   *** 1 set: 2 seconds -- 10 sets: INSTANT ***

p4 delete File.zip (215k)                    *** INSTANT ***
p4 submit                                    *** 1 set: INSTANT -- 10 sets: INSTANT ***

-------------------------
Branching the whole tree
-------------------------
svn copy Dir1 Dir2                           *** 10 sets: 2:46 ***
svn commit                                   *** 10 sets: 2:21 ***

p4 integrate Dir1 Dir2                       *** 10 sets: 0:44 *** (includes every filename printed to stdout)
p4 submit                                    *** 10 sets: 2.5 seconds ***

No local file copy:
p4 integrate -v Dir1 Dir2                    *** 10 sets: 2.78 seconds *** (includes every filename printed to stdout)
p4 submit                                    *** 10 sets: 2.5 seconds ***

Joshua Jensen
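
One way to read the numbers above is to divide each 10-set time by its 1-set time: a factor near 10 suggests the operation scales roughly linearly with the file count. The sketch below is only arithmetic on the timings reported in this message. The helper names are made up for illustration, "Perforce add" is just a label for the unnamed Perforce import step, and the "<2 seconds" and "INSTANT" entries are left out because they are not exact figures.

```python
def to_seconds(text):
    """Convert 'M:SS' or a plain number of seconds to float seconds."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(text)

# (operation label, 1-set time, 10-set time) as reported in the message
timings = [
    ("svn import", "33", "4:26"),
    ("Perforce add", "3", "0:31"),
    ("svn co", "12", "2:23"),
]

def scaling_factors(rows):
    """Map each operation to (10-set time) / (1-set time)."""
    return {name: to_seconds(ten) / to_seconds(one) for name, one, ten in rows}

if __name__ == "__main__":
    for name, factor in scaling_factors(timings).items():
        print(f"{name}: {factor:.1f}x the time for 10x the files")
```

The same division applies to the database sizes: 28,751kb / 9,191kb for Subversion's 10-set repository against its original footprint.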


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Perforce/Subversion Timing Statistics #2

Posted by Mark Brown <br...@sirena.org.uk>.
On Sat, May 25, 2002 at 06:08:03PM -0600, Joshua Jensen wrote:

> In a Perforce environment, you don't even need to be connected to the
> server to edit a file.  You can just as easily make the file writable
> and run a script the next time you connect to the server to check out
> the appropriate files.  I do it all the time.  This is a fast operation,
> too, as it only looks at writable files.

You *do* need to be connected to the server to run revision control
operations, though.  Depending on how you're using things this can be a
noticeable problem.

-- 
"You grabbed my hand and we fell into it, like a daydream - or a fever."

Re: Perforce/Subversion Timing Statistics #2

Posted by Ben Collins-Sussman <su...@collab.net>.
"Stephen C. Tweedie" <sc...@redhat.com> writes:

> I was quite surprised that the copy took that long:
> 
> $ time svn copy file:///home/rcs/SVN/kernel/{base,linux-2.4.19-pre9} 
> 
> Committed revision 6.
> 
> real	1m9.354s
> user	0m19.060s
> sys	0m1.290s

Yikes, I just reproduced this.  It's *definitely* a bug.  The whole
design of our repository was to make tagging a constant-time
operation.  It should only take a couple seconds at most... it's just
adding a single copied node.

It turns out that this is a bug (cmpilato thinks) from a recent change
to the way we deltify previous revisions after a commit.

However, I don't think we'll fix this bug;  we're about to have the
entire guts of libsvn_fs rewritten this week.  It's not worth it.
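
To illustrate why a tag is expected to cost the same regardless of tree size, here is a toy model. It is not Subversion's repository code; the `Node` class and the `cheap_copy` function are hypothetical. The idea it shows is that a copy adds one new directory node that shares the existing subtree instead of duplicating every file under it.

```python
class Node:
    """A directory (children is a dict) or a file (children is None)."""

    def __init__(self, children=None, copied_from=None):
        self.children = children
        self.copied_from = copied_from  # name this node was copied from, if any


def count_nodes(node, seen=None):
    """Count the distinct Node objects reachable from node."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return 0
    seen.add(id(node))
    if node.children is None:
        return 1
    return 1 + sum(count_nodes(child, seen) for child in node.children.values())


def cheap_copy(root, src, dst):
    """Add dst under root as a single new node sharing src's children."""
    source = root.children[src]
    root.children[dst] = Node(children=source.children, copied_from=src)


if __name__ == "__main__":
    base = Node({f"file{i}.c": Node() for i in range(1000)})
    root = Node({"base": base})
    before = count_nodes(root)
    cheap_copy(root, "base", "linux-2.4.19-pre9")
    print(count_nodes(root) - before, "new node(s) created by the copy")
```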


> 
> Even doing a simple "svn up" null operation where the entire wc is
> already at the latest version is taking a long time --- about 18
> seconds if the entire tree is already in cache, 1 minute 5 seconds
> from cold, and strace is showing a lot of write activity inside the wc
> for what should be a noop.

It is a noop from the server's point of view, but not the client's.

'svn up' has to stat every single file in your working copy no matter
what -- it has to send a complete description of the working copy to
the server.  If your working copy is truly up-to-date, then the server
sends no data back.  (CVS works the same way... except that our
state-reporting is actually much faster.)

Nevertheless, oftentimes you may have just committed the latest
repository revision... meaning that a few files in your wc are at
HEAD, and all the rest are at HEAD-1.  When you 'svn up', the server
still sends back no data, but every item in your wc must have its
working revision "bumped" up to HEAD.  This means reading and
re-writing every .svn/entries file in the whole tree.  That's the
write activity you're seeing.
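
The client-side cost described above can be sketched as a walk that stats every file, so the work grows with the size of the working copy even when nothing has changed. This is an illustration only, not how the svn client is implemented; `scan_working_copy` is a hypothetical name.

```python
import os
import tempfile


def scan_working_copy(root):
    """Stat every file under root; return {relative path: (size, mtime)}."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            state[os.path.relpath(full, root)] = (info.st_size, info.st_mtime)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as wc:
        for i in range(5):
            with open(os.path.join(wc, f"f{i}.txt"), "w") as fh:
                fh.write("x" * i)
        print(len(scan_working_copy(wc)), "files examined")
```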




Re: Perforce/Subversion Timing Statistics #2

Posted by "Stephen C. Tweedie" <sc...@redhat.com>.
Hi,

On Tue, May 28, 2002 at 12:50:16PM -0500, Ben Collins-Sussman wrote:
 
>    svn cp wc-path1 wc-path2
>    svn commit
> 
> I think you'll see a near-instantaneous result.

I tried that earlier today, after importing and tagging the new
linux-2.4.19-pre9 release into svn.  The import took over 6 minutes.
The tag (a simple server-side url copy) took just over a minute.  

I was quite surprised that the copy took that long:

$ time svn copy file:///home/rcs/SVN/kernel/{base,linux-2.4.19-pre9} 

Committed revision 6.

real	1m9.354s
user	0m19.060s
sys	0m1.290s

Even doing a simple "svn up" null operation where the entire wc is
already at the latest version is taking a long time --- about 18
seconds if the entire tree is already in cache, 1 minute 5 seconds
from cold, and strace is showing a lot of write activity inside the wc
for what should be a noop.

--Stephen


Re: Perforce/Subversion Timing Statistics #2

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
>
>
> Again, this is just a matter of Berkeley DB.  Not a surprise that
> a BDB database is bigger than a perforce DB.  Someday svn will be
> using any number of SQL back-ends... they'll probably be even bigger.
> :-)
>

Count on it :-)

gat





Re: Perforce/Subversion Timing Statistics #2

Posted by Ben Collins-Sussman <su...@collab.net>.
"Joshua Jensen" <jj...@workspacewhiz.com> writes:

> > Easy. You've been misled about 'svn cp' :-)
> > Neener neener. We fooled you :-)
> 
> Thanks for clearing this up.  When I first started using svn cp, I
> thought that you had to commit it (and you did, since it was client
> side).  When the server version was introduced to me with no commit, it
> SEEMED as if the client version didn't commit, either.  Instead, they
> have two separate behaviors.  I've got that now.

Josh, I know there isn't any formal documentation for svn yet, but I
think you might have a lot of questions answered by running 'svn
help'.  We at least try to keep our interface descriptions up-to-date
within the client.  :-)

(For example, 'svn help cp' is quite verbose.)


RE: Perforce/Subversion Timing Statistics #2

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> Easy. You've been misled about 'svn cp' :-)
> Neener neener. We fooled you :-)

Thanks for clearing this up.  When I first started using svn cp, I
thought that you had to commit it (and you did, since it was client
side).  When the server version was introduced to me with no commit, it
SEEMED as if the client version didn't commit, either.  Instead, they
have two separate behaviors.  I've got that now.

Josh



RE: Perforce/Subversion Timing Statistics #2

Posted by Greg Stein <gs...@lyra.org>.
On Fri, May 31, 2002 at 02:11:28AM -0600, Joshua Jensen wrote:
>...
> But not before the mistake has already been made.  How do you, for
> example, perform a project reorganization without hitting the server?

Easy. You've been misled about 'svn cp' :-)

> I'm moving around, let's say, 20 files between different directories.
> In order to do it the "right" way and maintain history, I need to use
> 'svn cp'.  However, svn cp instantly commits the change to the
> repository.

Nope. 'svn cp http://example.com/repos/foo http://example.com/repos/bar'
will hit the server. But 'svn cp my-working-copy/foo my-working-copy/bar'
will definitely NOT hit the server.

The basic difference is this:

* svn cp URL-src URL-dst

  This operation occurs entirely on the server. The client doesn't have to
  make copies of content within the working copy (heck: a working copy
  doesn't even need to exist!).

* svn cp WC-src WC-dst

  This operation occurs entirely on the client. The client will copy the
  source to the destination, and record that a copy was made. When you
  finally go and 'svn commit', the appropriate information is transferred to
  the server (which includes the copy history).

Note that you can also copy from the WC up to the server (which does an
auto-commit), or from the server down into your WC (which is just a local
modification for later commit).
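
The four combinations described above can be summarized in a small lookup. This is a sketch of the behaviour as stated in this message, not code from the svn client; `is_url` and `copy_behaviour` are hypothetical helpers, and treating any argument containing "://" as a URL is a simplifying assumption.

```python
def is_url(target):
    """Assume anything with a scheme separator is a repository URL."""
    return "://" in target


def copy_behaviour(src, dst):
    """Describe what 'svn cp src dst' does for each source/destination kind."""
    kind = ("URL" if is_url(src) else "WC", "URL" if is_url(dst) else "WC")
    table = {
        ("URL", "URL"): "server-side copy, committed immediately",
        ("WC", "WC"): "local copy, sent to the server on the next commit",
        ("WC", "URL"): "copied up to the server, committed immediately",
        ("URL", "WC"): "copied down into the working copy, committed later",
    }
    return table[kind]


if __name__ == "__main__":
    print(copy_behaviour("http://example.com/repos/foo",
                         "http://example.com/repos/bar"))
    print(copy_behaviour("my-working-copy/foo", "my-working-copy/bar"))
```

In this model, only a URL destination results in an immediate commit, which is why a reorganization done with working-copy paths stays local until 'svn commit'.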

> I need to email my team and tell them I'm locking down the
> repository for some amount of time to not only perform the proper
> Subversion operations, but also to change makefiles and any number of
> other project configuration parameters.

Nope :-)

>...
> My point is merely this... most other Subversion commands have to be
> committed.  Why does SVN cp get to do an auto-commit?

Neener neener. We fooled you :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


RE: RE: Perforce/Subversion Timing Statistics #2

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> In order to fully mirror what svn cp lets you do, you need to 
> also time the cost of the appropriate Perforce commit operation. 
> 
> i.e. does the following Perforce command sequence work:
> p4 integrate -v <src> <target>
> p4 commit

The Perforce submission is near instant in this case.  Let me reiterate:
with the p4 integrate -v option and the subsequent p4
submit, there is no working copy data going across the network and the
branch happens entirely on the server.

I just branched 64 megs of data in Perforce through 935 files and 191
directories.  The integrate was the cost of printing 935 files to stdout
(which was just a few seconds).  The submit was the cost of printing 935
files to stdout (which was even faster).  That being said, the Perforce
operation is technically as fast as the Subversion operation is supposed
to be (when the code is fixed).

> Even if "svn copy" were delayed until a commit, it still 
> should be substantially faster than Perforce. (If we've 
> understood Perforce's storage model correctly.)

The timings speak for themselves.  Perforce's branching and Subversion's
branching (when it is fixed) are near identical in speed.

> > Before I commit, if I don't like the way I branched it, I just revert
> > the change.  All is done locally until I submit.  That includes the p4
> > integrate -v option.  This is a much safer way, in my opinion, than the
> > instant update on the server implied above (but not tested by me, so I
> > could just be spouting a bunch of crap... ;) ).
> > 
> 
> SVN also lets you instantly fix the problem as well:
> svn delete <target>
> svn commit

But not before the mistake has already been made.  How do you, for
example, perform a project reorganization without hitting the server?
I'm moving around, let's say, 20 files between different directories.
In order to do it the "right" way and maintain history, I need to use
'svn cp'.  However, svn cp instantly commits the change to the
repository.  I need to email my team and tell them I'm locking down the
repository for some amount of time to not only perform the proper
Subversion operations, but also to change makefiles and any number of
other project configuration parameters.

Since Perforce does the commit as a separate operation, I can move my
files freely, revert them when I make a mistake, take three days to do
the whole reorganization, test it thoroughly, and THEN hand it off to my
team in a single commit.

My point is merely this... most other Subversion commands have to be
committed.  Why does SVN cp get to do an auto-commit?

Thanks,
Josh



RE: RE: Perforce/Subversion Timing Statistics #2

Posted by Bill Tutt <ra...@lyra.org>.
> From: Joshua Jensen [mailto:jjensen@workspacewhiz.com]
> 
> > Perforce stores files in RCS layout, and the metadata in db format.
> 
> In fact, the metadata is supposedly (according to the Perforce mailing
> list) in Berkeley DB format with a proprietary schema/format...
> 
> > > > * Branching files is significantly faster than Subversion.  This
> > > > operation will happen far less frequently, though, but still takes a
> > > > long, long time.
> >
> > I didn't look closely at the test, but are you sure this is
> > correct? Perforce has a different way of branching than any
> > other setup I have seen. Requires two steps, one to create
> > the branch (which is just a description on the server) and
> > the other is to sync/commit the branch. That took a
> > considerable amount of time for me using perforce (basically
> > a full checkout, then commit).
> >
> > As BenS points out, server-side branching (the only kind I
> > will ever do) is a 2 second operation regardless of the size
> > of the dataset being branched. Subversion clearly wins in this case.
> 
> I provided both a local and network branch test for Perforce.  By adding
> the -v option to the p4 integrate command line, it does the branch "on
> the server."  At best, they come close to tying (which is fantastic).
> 

In order to fully mirror what svn cp lets you do, you need to also time
the cost of the appropriate Perforce commit operation. 

i.e. does the following Perforce command sequence work:
p4 integrate -v <src> <target>
p4 commit

The Subversion folks are guessing that, even if it does work without
requiring an intervening "p4 sync", the time to complete both of those
operations is O(N) as opposed to O(1), regardless of whether or not
Perforce's constant factor is much smaller given how it knows your
client's current state.

> One of the reasons I like the delayed approach is it fits in way better
> with the atomic commits.  For example... I'm preparing a new tree that
> is a combination of three trees.  In Perforce, I branch each tree where
> I want it.  I check the physical representation of it on my hard drive,
> if I wish.  Then I commit the change, and the branch is created in one
> atomic transaction.
> 

Even if "svn copy" were delayed until a commit, it still should be
substantially faster than Perforce. (If we've understood Perforce's
storage model correctly.)

> Before I commit, if I don't like the way I branched it, I just revert
> the change.  All is done locally until I submit.  That includes the p4
> integrate -v option.  This is a much safer way, in my opinion, than the
> instant update on the server implied above (but not tested by me, so I
> could just be spouting a bunch of crap... ;) ).
> 

SVN also lets you instantly fix the problem as well:
svn delete <target>
svn commit

Bill



RE: Perforce/Subversion Timing Statistics #2

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> Perforce stores files in RCS layout, and the metadata in db format.

In fact, the metadata is supposedly (according to the Perforce mailing
list) in Berkeley DB format with a proprietary schema/format...

> > > * Branching files is significantly faster than Subversion.  This
> > > operation will happen far less frequently, though, but still takes a
> > > long, long time.
> 
> I didn't look closely at the test, but are you sure this is 
> correct? Perforce has a different way of branching than any 
> other setup I have seen. Requires two steps, one to create 
> the branch (which is just a description on the server) and 
> the other is to sync/commit the branch. That took a 
> considerable amount of time for me using perforce (basically 
> a full checkout, then commit).
> 
> As BenS points out, server-side branching (the only kind I 
> will ever do) is a 2 second operation regardless of the size 
> of the dataset being branched. Subversion clearly wins in this case.

I provided both a local and network branch test for Perforce.  By adding
the -v option to the p4 integrate command line, it does the branch "on
the server."  At best, they come close to tying (which is fantastic).

One of the reasons I like the delayed approach is it fits in way better
with the atomic commits.  For example... I'm preparing a new tree that
is a combination of three trees.  In Perforce, I branch each tree where
I want it.  I check the physical representation of it on my hard drive,
if I wish.  Then I commit the change, and the branch is created in one
atomic transaction.

Before I commit, if I don't like the way I branched it, I just revert
the change.  All is done locally until I submit.  That includes the p4
integrate -v option.  This is a much safer way, in my opinion, than the
instant update on the server implied above (but not tested by me, so I
could just be spouting a bunch of crap... ;) ).

Josh



Re: Perforce/Subversion Timing Statistics #2

Posted by Ben Collins <bc...@debian.org>.
On Tue, May 28, 2002 at 12:50:16PM -0500, Ben Collins-Sussman wrote:
> 
> I'm not sure why I'm replying to this, as it seems like
> flamebait... however Joshua has been very polite in his findings.  :-)
> 
> "Joshua Jensen" <jj...@workspacewhiz.com> writes:
> 
> > No, I don't believe my tests are missing the whole picture.  I have
> > proven that:
> > 
> > * Adding files to Perforce is significantly faster than Subversion.
> 
> This seems true... Berkeley DB is probably the culprit here.  I
> wouldn't be surprised to learn that BDB is slower than whatever
> proprietary DB perforce is using.

Perforce stores files in RCS layout, and the metadata in db format.

> > * Branching files is significantly faster than Subversion.  This
> > operation will happen far less frequently, though, but still takes a
> > long, long time.

I didn't look closely at the test, but are you sure this is correct?
Perforce has a different way of branching than any other setup I have
seen. Requires two steps, one to create the branch (which is just a
description on the server) and the other is to sync/commit the branch.
That took a considerable amount of time for me using perforce (basically
a full checkout, then commit).

As BenS points out, server-side branching (the only kind I will ever do)
is a 2 second operation regardless of the size of the dataset being
branched. Subversion clearly wins in this case.

-- 
Debian     - http://www.debian.org/
Linux 1394 - http://linux1394.sourceforge.net/
Subversion - http://subversion.tigris.org/
Deqo       - http://www.deqo.com/


Re: Perforce/Subversion Timing Statistics #2

Posted by Ben Collins-Sussman <su...@collab.net>.
"Joshua Jensen" <jj...@workspacewhiz.com> writes:

> > > * Retrieving new files from Perforce is significantly faster than
> > > Subversion.
> > 
> > The svn client must 'walk' the entire working copy in order
> > to report its state to the server.  Perforce doesn't need to do this.
> 
> I can't imagine this is it.  There isn't a working copy yet.  All files
> being transferred are new.  No directories have been created.

Oh sorry, I was describing how 'svn up' works.

For checkouts, we're building a whole lot of metadata in .svn, and
we're currently very inefficient about reading/writing our entries
files.  My guess is that at least 50% of the time required for 'svn
co' is just lots of disk-diddling within .svn/ areas as the files
arrive.  Perforce, as I understand it, is storing the metadata
server-side, so it has a big advantage there.  (Of course, svn still
has tons of room for optimization here -- since CVS checkouts are
still much faster.)


> > Try creating a branch server-side:
> > 
> >    svn cp URL1 URL2
> 
> I will try this later.
> 

Actually, this feature accidentally broke about a week ago, so don't
try it ATM.  It will be fixed when we switch to our new repository
code RSN!  :-)


> > 'svn merge' is receiving diffs from the server, and applying
> > them to the working copy.  The slowness you're seeing is in 
> > the commit, as you say.  
> > 
> > Just as with updates, it takes 30 seconds for svn to commit,
> > because it must 'walk' over the entire working copy, 
> > searching for changed files.  Perforce doesn't need to do this.
> 
> Could be, except other commits only took a few seconds, as I previously
> reported, on the same directory structure.  Something must be different.

Huh?  Not sure what data you're referring to here.  Maybe you're
witnessing OS caching of files on second runs?

> Assuming everyone is going to have large amounts of disk space is like
> assuming all servers will have 2+ gigs of RAM.  It isn't going to
> happen.  Do you really want Subversion to be confined to a niche market
> or do you want everyone to use it?  I'm far more likely to use a piece
> of software that makes efficient use of the resources it is given.  Just
> my opinion...

Careful in your accusations -- svn may seem like it's making
"inefficient use" of disk resources, but it's actually a very
deliberate *tradeoff* decision we've made.  The tradeoff is that svn
makes significantly less use of network than CVS does -- 'svn status'
requires no network access, and neither does 'svn revert'.  The client
also sends diffs to the server during commits.  The tradeoff, of
course, is that your working copy is twice as big.
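
The tradeoff can be illustrated with a toy working copy that keeps an untouched copy of each file next to the editable one. This is a sketch under stated assumptions, not the real .svn layout: the `.pristine` directory name and all three functions are hypothetical. It shows why status and revert need no network access, and why the working copy takes twice the disk space.

```python
import filecmp
import os
import shutil
import tempfile

PRISTINE_DIR = ".pristine"  # hypothetical name, not the real .svn layout


def checkout_file(wc, name, content):
    """Write a working file plus an untouched pristine copy of it."""
    os.makedirs(os.path.join(wc, PRISTINE_DIR), exist_ok=True)
    for path in (os.path.join(wc, name), os.path.join(wc, PRISTINE_DIR, name)):
        with open(path, "w") as fh:
            fh.write(content)


def is_modified(wc, name):
    """Local 'status': compare the working file with its pristine copy."""
    return not filecmp.cmp(os.path.join(wc, name),
                           os.path.join(wc, PRISTINE_DIR, name),
                           shallow=False)


def revert(wc, name):
    """Local 'revert': restore the working file from the pristine copy."""
    shutil.copyfile(os.path.join(wc, PRISTINE_DIR, name),
                    os.path.join(wc, name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as wc:
        checkout_file(wc, "main.c", "int main(void) { return 0; }\n")
        with open(os.path.join(wc, "main.c"), "a") as fh:
            fh.write("/* local edit */\n")
        print("modified:", is_modified(wc, "main.c"))
        revert(wc, "main.c")
        print("modified after revert:", is_modified(wc, "main.c"))
```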

Similarly, using BDB or some giant SQL db on the server is indeed much
larger than a tree of RCS files, but the tradeoff is huge
flexibility.  Once we have SQL going, you'll be able to query the
repository in ways undreamed of in CVS.

It all depends on your priorities.


RE: Perforce/Subversion Timing Statistics #2

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> I'm not sure why I'm replying to this, as it seems like
> flamebait... however Joshua has been very polite in his findings.  :-)

I'm not sure whether to take offense to that or not... ;)

The point is, a programmer's productivity is of utmost concern to me in
my job.  Subversion is a very important product to me, but I can't
recommend a product (when it hits 1.0) if it is going to waste a lot of
time.

I've just published some new statistics (including CVS) from my machine.
All I can say is I must've set up the Apache server wrong or something,
because the numbers for the activities I've tested thus far are not very
good for Subversion.  :(

> This seems true... Berkeley DB is probably the culprit here.
> I wouldn't be surprised to learn that BDB is slower than 
> whatever proprietary DB perforce is using.

Perforce uses Berkeley DB with a proprietary schema for its database
files and some form of RCS for everything else.

> > * Retrieving new files from Perforce is significantly faster than
> > Subversion.
> 
> The svn client must 'walk' the entire working copy in order
> to report its state to the server.  Perforce doesn't need to do this.

I can't imagine this is it.  There isn't a working copy yet.  All files
being transferred are new.  No directories have been created.

> > * Branching files is significantly faster than Subversion.  This
> > operation will happen far less frequently, though, but still takes a
> > long, long time.
> 
> Try creating a branch server-side:
> 
>    svn cp URL1 URL2

I will try this later.

> 'svn merge' is receiving diffs from the server, and applying
> them to the working copy.  The slowness you're seeing is in 
> the commit, as you say.  
> 
> Just as with updates, it takes 30 seconds for svn to commit,
> because it must 'walk' over the entire working copy, 
> searching for changed files.  Perforce doesn't need to do this.

Could be, except other commits only took a few seconds, as I previously
reported, on the same directory structure.  Something must be different.

> Again, this is just a matter of Berkeley DB.  Not a surprise
> that a BDB database is bigger than a perforce DB.  Someday 
> svn will be using any number of SQL back-ends... they'll 
> probably be even bigger.
> :-)

I'm not a system admin or anything, but I do know this.  It is a pain in
the neck to continually upgrade hard drives at my place of work.  It is
downtime for everyone.  It happens unexpectedly, because there is no
dedicated admin monitoring the disk space every three seconds.  It is
frustrating.

Assuming everyone is going to have large amounts of disk space is like
assuming all servers will have 2+ gigs of RAM.  It isn't going to
happen.  Do you really want Subversion to be confined to a niche market
or do you want everyone to use it?  I'm far more likely to use a piece
of software that makes efficient use of the resources it is given.  Just
my opinion...

> Honestly, Joshua, these are the benchmarks we care most
> about.  At this time, Subversion's goal is to replace CVS... 
> *not* to be explicitly better than Perforce, Bitkeeper, 
> Clearcase, etc.

Subversion's features largely make it on par with Perforce.  That's why
I provide (and will continue to provide) the Perforce comparison.

> If you repeat your tests comparing CVS and SVN, we'd be very
> interested.  It would be even better if you can run your 
> tests with a network separating the client from the repository.

The timings are posted.  I was unable to have the repository on a
separate network.  I don't believe those timings are as valid, though,
as network bandwidth can dramatically affect the outcome.  If it isn't
efficient locally, it definitely won't be efficient remotely.

Josh



Re: Perforce/Subversion Timing Statistics #2

Posted by Ben Collins-Sussman <su...@collab.net>.
I'm not sure why I'm replying to this, as it seems like
flamebait... however Joshua has been very polite in his findings.  :-)

"Joshua Jensen" <jj...@workspacewhiz.com> writes:

> No, I don't believe my tests are missing the whole picture.  I have
> proven that:
> 
> * Adding files to Perforce is significantly faster than Subversion.

This seems true... Berkeley DB is probably the culprit here.  I
wouldn't be surprised to learn that BDB is slower than whatever
proprietary DB perforce is using.

> * Retrieving new files from Perforce is significantly faster than
> Subversion.

The svn client must 'walk' the entire working copy in order to report
its state to the server.  Perforce doesn't need to do this.

> * Branching files is significantly faster than Subversion.  This
> operation will happen far less frequently, though, but still takes a
> long, long time.

Try creating a branch server-side:

   svn cp URL1 URL2

instead of client side (as you did):

   svn cp wc-path1 wc-path2
   svn commit

I think you'll see a near-instantaneous result.


> * Merging files across branches is way faster in Perforce.  I've been
> benchmarking this, and the merge and commit is just a few seconds in
> Perforce for a small number of files.  The merge itself in Subversion is
> very quick, but EVERY time I commit, it takes OVER 30 seconds (that's
> when I quit looking at the clock and just hope it will finish).

'svn merge' is receiving diffs from the server, and applying them to
the working copy.  The slowness you're seeing is in the commit, as you
say.  

Just as with updates, it takes 30 seconds for svn to commit, because
it must 'walk' over the entire working copy, searching for changed
files.  Perforce doesn't need to do this.

> * From an administration standpoint, server disk space is far less in
> Perforce.

Again, this is just a matter of Berkeley DB.  Not a surprise that
a BDB database is bigger than a perforce DB.  Someday svn will be
using any number of SQL back-ends... they'll probably be even bigger.
:-)

>  Even worse, I did some benchmarks with CVS (and how I hate
> that software) earlier today (at the request of a CVS user), and CVS was
> WAY faster than Subversion, too.

Honestly, Joshua, these are the benchmarks we care most about.  At
this time, Subversion's goal is to replace CVS... *not* to be
explicitly better than Perforce, Bitkeeper, Clearcase, etc.

If you repeat your tests comparing CVS and SVN, we'd be very
interested.  It would be even better if you can run your tests with a
network separating the client from the repository.




RE: Perforce/Subversion Timing Statistics #2

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> From: Ben Collins [mailto:bcollins@debian.org] 
> On Sat, May 25, 2002 at 12:28:59PM -0600, Joshua Jensen wrote:
> > This test handles 10x as many files in the repository.  It is an 
> > example of how the numbers scale with growing lists of files.
> 
> One advantage (or disadvantage depending on your 
> perspective), is that perforce retains all working copy 
> metadata on the server end. So the server "knows" where the 
> client sits. If you edit a file, you tell the server. So 
> everything can be deduced without any transfer of data from 
> the client to the server.

When it comes to performance, this is a great way to do it.  In fact, a
combination of server metadata and client metadata would make the
operation even faster.

In a Perforce environment, you don't even need to be connected to the
server to edit a file.  You can just as easily make the file writable
and run a script the next time you connect to the server to check out
the appropriate files.  I do it all the time.  This is a fast operation,
too, as it only looks at writable files.
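
The reconnect script described above could look roughly like the following Python sketch. It assumes the default Perforce client behavior where synced files are read-only until opened, so a set write bit marks a file edited offline. The function names are hypothetical, and the sketch only builds the `p4 edit` command lines rather than running them.

```python
import os
import stat


def writable_files(root):
    """Return paths under root whose owner-write bit is set.

    Only the permission bits are inspected; no file contents are read
    or compared.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.stat(full).st_mode & stat.S_IWUSR:
                found.append(full)
    return sorted(found)


def p4_edit_commands(root):
    """Build one 'p4 edit <path>' argument list per writable file.

    The lists are returned rather than executed so they can be reviewed
    first, or passed to subprocess.run() once connected to the server.
    """
    return [["p4", "edit", path] for path in writable_files(root)]
```

The tree is still walked, but each file costs only a stat() call, which is why this stays fast on a large client.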

> This is good for obvious performance gains. Not to mention 
> that the server can gain more statistics and information on 
> the clients (for administration purposes, this is a win). 
> However, it is extremely costly for the client to validate 
> itself. If you need to compare your working copy against what 
> the server thinks you have, you are looking at a good bit of 
> time and cpu consumption.

I work in an environment where we have thousands of clients accessing
gigabytes and gigabytes of data via the appropriate source control
servers daily.  The number of times any given client needs to do a full
validation is very, very small (and, in my opinion, it is only done when
an administrator is present).  I question how much time storing an extra
client-side copy of each server-side file is actually going to save.  I
have gigabytes of data from source control on my machine.  Space might
be cheap, but having to go through bureaucratic mumbo jumbo to get hard
drive upgrades is time consuming in itself.

That being said, I like the idea of having the option in this case.
Being able to svn status or svn revert (something Perforce CAN'T do)
while not connected to the server is a very cool operation.  But for
those individuals who don't have the hard drive space to handle it, it
would be nice to offer the ability to turn the client side storage
on/off (post 1.0, of course).

> So, IMO, your tests are really missing the whole picture. 
> Sure, Perforce can win on key command sequences. However, in 
> the end, the overall performance of Subversion is likely 
> better, taking into account the less costly maint. of a 
> working copy. For large repositories, this is critical.

No, I don't believe my tests are missing the whole picture.  I have
proven that:

* Adding files is significantly faster in Perforce than in Subversion.
* Retrieving new files is significantly faster in Perforce than in
Subversion.
* Branching files is significantly faster in Perforce than in Subversion.
Branching happens far less frequently than the other operations, but it
still takes a long, long time.
* Merging files across branches is way faster in Perforce.  I've been
benchmarking this, and the merge and commit is just a few seconds in
Perforce for a small number of files.  The merge itself in Subversion is
very quick, but EVERY time I commit, it takes OVER 30 seconds (that's
when I quit looking at the clock and just hope it will finish).
* From an administration standpoint, server disk space is far less in
Perforce.

So what performance in Subversion are you talking about?  My most used
operations are retrieving updates to files, editing files, and merging
files across branches.  Oh, and for commits, I tend to revert unchanged
files so they don't get checked in with everything else.  Since Perforce
knows exactly which files it needs to diff (and starts with the
timestamp), even this is still way faster than Subversion determining
what needs to be checked in, except when every file in the repository is
checked out.

Do others use different operations more commonly than retrieving files
and editing files?  If that is what Subversion is for, then by all
means, I've got it wrong.

But before making a claim that the overall performance of Subversion is
likely better, back it up with hard statistics.  I was SO EXCITED about
Subversion.  I was super excited when I could jump in, compile it, and
run it.  But all my excitement waned when I actually started the time
benchmarks.  Even worse, I did some benchmarks with CVS (and how I hate
that software) earlier today (at the request of a CVS user), and CVS was
WAY faster than Subversion, too.

Again, I realize Subversion is pre-alpha.  I am merely bringing this to
the attention of the primary developers of Subversion.  You want to be
the next CVS, but try telling a CVS user, "Sure, we're better.  CVS
might be able to do a cvs import of the data in Josh's test case in 18
seconds, but don't mind that (and the fact that Subversion took 4.5
MINUTES).  We have atomic commits!  CVS doesn't!"

For what it's worth, I haven't given up on Subversion.  Other than the
potential improvements being made to CVSNT that might sway me when they
work, it is the only free alternative to Perforce that has the potential
to be as powerful.  I'm now poking through code trying to understand
where the bottlenecks are.  I have high hopes for the product, and I
will continue to spend the necessary time investigating it.

Thanks,
Joshua Jensen



Re: Perforce/Subversion Timing Statistics #2

Posted by Ben Collins <bc...@debian.org>.
On Sat, May 25, 2002 at 12:28:59PM -0600, Joshua Jensen wrote:
> This test handles 10x as many files in the repository.  It is an example
> of how the numbers scale with growing lists of files.
> 
> All operations are performed using the following directory using a BRAND
> NEW repository on an Athlon 1.4ghz Windows XP Professional machine with
> 7200 rpm hard drives and 512 mb RAM and 98% CPU time available.  The
> Perforce statistics are using the free Perforce 2-user server available
> on their website.

One advantage (or disadvantage, depending on your perspective) is that
Perforce retains all working copy metadata on the server end. So the
server "knows" where the client sits. If you edit a file, you tell the
server. So everything can be deduced without any transfer of data from
the client to the server.
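
To make the idea concrete, here is a toy Python model of a server that keeps per-client state. It is only a sketch of the concept, not Perforce's actual data model: the `Server` class and its `have` and `opened` tables are names invented for the example.

```python
class Server:
    """Toy server that remembers what each client has and has opened."""

    def __init__(self):
        self.head = {}    # path -> latest revision number
        self.have = {}    # client -> {path: revision last synced}
        self.opened = {}  # client -> set of paths opened for edit

    def submit(self, path):
        """Record a new head revision for path."""
        self.head[path] = self.head.get(path, 0) + 1

    def edit(self, client, path):
        """The client announces that it is about to modify path."""
        self.opened.setdefault(client, set()).add(path)

    def pending(self, client):
        """Files the client may have changed, known without any scan."""
        return sorted(self.opened.get(client, ()))

    def sync(self, client):
        """Return the paths the client is missing and mark them as sent.

        Nothing is read from the client's disk: the answer comes from
        the server's own record of what it previously sent.
        """
        have = self.have.setdefault(client, {})
        stale = sorted(p for p, rev in self.head.items() if have.get(p) != rev)
        for p in stale:
            have[p] = self.head[p]
        return stale
```

The trade-off raised in the next paragraph shows up here too: if the client's disk drifts from the `have` table, the server's answers are wrong until the two are reconciled.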

This is good for obvious performance gains. Not to mention that the
server can gain more statistics and information on the clients (for
administration purposes, this is a win). However, it is extremely
costly for the client to validate itself. If you need to compare your
working copy against what the server thinks you have, you are looking at
a good bit of time and cpu consumption.

So, IMO, your tests are really missing the whole picture. Sure, Perforce
can win on key command sequences. However, in the end, the overall
performance of Subversion is likely better, taking into account the less
costly maint. of a working copy. For large repositories, this is
critical.

-- 
Debian     - http://www.debian.org/
Linux 1394 - http://linux1394.sourceforge.net/
Subversion - http://subversion.tigris.org/
Deqo       - http://www.deqo.com/
