Posted to dev@subversion.apache.org by Joshua Jensen <jj...@workspacewhiz.com> on 2002/05/30 06:41:22 UTC

CVS/Subversion/Perforce Timing Statistics (Take 3)

This just gets weirder and weirder...

First, my versions of everything:

CVS 1.11.1.3 (Build 57d)
Subversion (Build 2038)
Perforce 2002.1

All repositories were LOCAL, although some tests accessed them via
'localhost'.  Anything herein referred to as network means localhost.

Machine = AMD Athlon 1.4 GHz with 512 MB DDR RAM, oodles of hard drive
space.

So others can duplicate my tests, I used the following source code
archive: http://www.magic-software.com/ZipFiles/MSWindows1p3Free.zip

It was unzipped onto my G: drive, which is a separate hard drive from
the repository, and then duplicated so that there are two copies, named
as follows:

G:\Dirs
G:\Dirs\MagicSoftware1
G:\Dirs\MagicSoftware2

There are 1,264 files in 116 directories, comprising 8,574,128 bytes.

Instructions are below.

I MUST be doing something wrong, because the Perforce timings are not
very good.

------------------
Timings
------------------
Subversion Import (Network)  : 12:26 (that's 12 MINUTES, 26 seconds)
Subversion Import (Local)    :  3:20
Subversion Checkout (Network): 20:27 (this one is weird... see below)
Subversion Checkout (Local)  :  8:03 (same here)
Subversion Update (Network)  :  0:04 (sigh... a good number!)
Subversion Update (Local)    :  0:04

CVS Import (Network)         :  0:57
CVS Import (Local)           :  0:18
CVS Checkout (Network)       :  0:50
CVS Checkout (Local)         :  0:31
CVS Update (Network)         :  0:14
CVS Update (Local)           :  0:06

Perforce Import (Network)    :  0:41
Perforce Checkout (Network)  :  0:20
Perforce Update (Network)    :  INSTANT
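
To put the table in relative terms, here is a small sketch (not part of
the original measurements) that converts the M:SS values to seconds and
expresses them as ratios against CVS.  The names TIMINGS, to_seconds and
ratio are made up for this illustration.  The checkout rows are left out
because the table itself flags the Subversion checkout numbers as weird,
and Perforce update is left out because "INSTANT" gives nothing to divide.

```python
# Import and update timings copied from the table above ("M:SS").
TIMINGS = {
    ("Subversion", "import", "network"): "12:26",
    ("Subversion", "import", "local"): "3:20",
    ("Subversion", "update", "network"): "0:04",
    ("Subversion", "update", "local"): "0:04",
    ("CVS", "import", "network"): "0:57",
    ("CVS", "import", "local"): "0:18",
    ("CVS", "update", "network"): "0:14",
    ("CVS", "update", "local"): "0:06",
    ("Perforce", "import", "network"): "0:41",
}


def to_seconds(mmss):
    """Convert an "M:SS" string to a number of seconds."""
    minutes, seconds = mmss.strip().split(":")
    return int(minutes) * 60 + int(seconds)


def ratio(tool, operation, mode, baseline="CVS"):
    """Return how many times longer `tool` took than `baseline`."""
    return (to_seconds(TIMINGS[(tool, operation, mode)])
            / to_seconds(TIMINGS[(baseline, operation, mode)]))


for operation in ("import", "update"):
    for mode in ("network", "local"):
        print(f"Subversion/CVS {operation} ({mode}): "
              f"{ratio('Subversion', operation, mode):.2f}x")

print(f"Perforce/CVS import (network): "
      f"{ratio('Perforce', 'import', 'network'):.2f}x")
```

On these numbers the Subversion import takes roughly 13 times (network)
and 11 times (local) as long as the CVS import, while the Subversion
update is faster than the CVS update in both modes.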

----------------------------------
Creation of the repositories
----------------------------------
Subversion:
-----------
[E:\]svnadmin create e:\svn

Apache was set up according to the provided Subversion instructions.
The latest build was used.

CVS:
----
The repository was created with CVSNT's service, although it probably
equated to:
cvs -d e:\cvs\test init

A password was created with:

set cvsroot=:ntserver:MyComputerName:/TEST
cvs passwd -a "Joshua Jensen"

set cvsroot=:pserver:Joshua Jensen@MyComputerName:/TEST

Perforce:
---------
A repository was set up according to this page:
http://www.perforce.com/perforce/technotes/note035.html

The section called "Procedural example for creating a second Perforce NT
service" was used.

The repository resides at e:\pf


----------------------------------
Importing into a fresh repository
----------------------------------
Subversion**********
Dirs has to exist as E:\svntest\Dirs for this to work.
Network [E:\svntest]svn import http://localhost/svn Dirs -m "Test"
Local   [E:\svntest]svn import file:///svn Dirs -m "Test"

CVS*****************
Dirs has to exist as E:\cvstest\Dirs for this to work.
Network [E:\cvstest]cvs import -m "test" Dirs vendor release
Local   [E:\cvstest]cvs -d e:\cvs import -m "test" Dirs vendor release

Perforce************
Dirs has to exist as E:\pftest\Dirs for this to work.
Network [E:\pftest]dir /b /s /a-d | p4 -x - add


------------------------------------
Checking out from a CLEAN directory
------------------------------------
Subversion**********
Network [E:\svntest]svn co http://localhost/svn Dirs
Local   [E:\svntest]svn co file:///svn Dirs

Oddly enough, Subversion 2038 actually checks Dirs out TWICE and does
NOT properly populate the .svn directory with the client side clones of
the local files.  Any ideas?

CVS*****************
Network [E:\svntest]cvs co Dirs
Local   [E:\svntest]cvs -d e:\cvs\test co Dirs

Perforce************
Network [E:\pftest]p4 sync Dirs/...
        [E:\pftest]p4 edit Dirs/...

The additional 'p4 edit' simulates Subversion's and CVS's checkout,
although most Perforce users wouldn't do this.


--------------------------------------------
Updating from the above retrieved directory
--------------------------------------------
Subversion**********
Network [E:\svntest\svn]svn update
Local   [E:\svntest\svn]svn update

CVS*****************
Network [E:\cvstest\Dirs]cvs update
Local   [E:\cvstest\Dirs]cvs -d e:\cvs\test update


Perforce************
Network [E:\pftest]p4 sync Dirs/...

(And possibly a p4 resolve, but there are no changes at this point)

--------------------------------------------------------------------

Josh


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: CVS/Subversion/Perforce Timing Statistics (Take 3)

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> > Oddly enough, Subversion 2038 actually checks Dirs out TWICE and does
> > NOT properly populate the .svn directory with the client side clones
> > of the local files.  Any ideas?
> 
> Yes.  'svn co' checks out each argument, and you've given it two args.
> 
> You need to pass the -d (--destination) flag to checkout, just like
> CVS's -d flag:
> 
>    svn co http://localhost/svn -d Dirs
>    
> If you leave off the -d argument, it just creates a working copy
> subdir named after the basename of the URL.

Oops... that's what two hours of late night testing does to you.  I knew
that, too.  I'll rerun the Subversion checkout timings and repost the
results.

<whack> Bad me.

Josh



Re: CVS/Subversion/Perforce Timing Statistics (Take 3)

Posted by Ben Collins-Sussman <su...@collab.net>.
"Joshua Jensen" <jj...@workspacewhiz.com> writes:

> Subversion**********
> Network [E:\svntest]svn co http://localhost/svn Dirs
> Local   [E:\svntest]svn co file:///svn Dirs
> 
> Oddly enough, Subversion 2038 actually checks Dirs out TWICE and does
> NOT properly populate the .svn directory with the client side clones of
> the local files.  Any ideas?

Yes.  'svn co' checks out each argument, and you've given it two args.

You need to pass the -d (--destination) flag to checkout, just like
CVS's -d flag:

   svn co http://localhost/svn -d Dirs
   
If you leave off the -d argument, it just creates a working copy
subdir named after the basename of the URL.


Re: CVS/Subversion/Perforce Timing Statistics (Take 3)

Posted by mark benedetto king <bk...@answerfriend.com>.
On Thu, May 30, 2002 at 02:11:18AM -0700, Greg Stein wrote:
> Certainly, there are some spot-improvements that could be done right now
> (e.g. rather than open/close .svn/README to test for an admin dir, we could
> just stat() the file). Finding those spots will need a different type of
> analysis tho. Finding larger types of fixes would probably be best in
> another month or so.
> 

A week or so ago, I straced an "svn co" of the svn trunk, and compiled
some numbers. 177 of 224 wall-clock seconds were spent in select(), waiting
on the server.  The next highest was write(), at 1 wall-clock second.

The total amount of time spent in all system calls was 186 seconds.

That means that making the server infinitely fast would reduce wall-clock
time to 47 seconds, only 9 of which would be in write/open/close/unlink/mkdir
etc.

To summarize: less than 20% of the client's own time (9 of 47 seconds)
was syscall-bound in my test, and the total time spent in the client was
less than 21% of the total time for the co (47 of 224 seconds).  To me
this means that client optimizations should focus on the roughly 80% of
client time that is not syscall-related.  I'll do some poking around
with gprof to see where this might be.
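
For anyone who wants to check the arithmetic, the breakdown can be
reproduced with a few lines.  This is only a sketch: the variable names
are invented here, and it assumes (as stated above) that all of the
select() time was spent waiting on the server.

```python
# Figures quoted from the strace run above, in wall-clock seconds.
total_wall = 224      # whole "svn co"
select_wait = 177     # time in select(), assumed to be waiting on the server
all_syscalls = 186    # time in all system calls, select() included

# Time that would remain if the server were infinitely fast.
client_wall = total_wall - select_wait

# Client-side syscalls other than select() (write/open/close/unlink/mkdir...).
client_syscalls = all_syscalls - select_wait

# Client time that is not spent in any system call.
client_non_syscall = client_wall - client_syscalls

syscall_share = client_syscalls / client_wall   # share of client time in syscalls
client_share = client_wall / total_wall         # share of total time in the client

print(f"client wall-clock time : {client_wall} s")
print(f"client syscall time    : {client_syscalls} s ({syscall_share:.1%})")
print(f"client non-syscall time: {client_non_syscall} s")
print(f"client share of total  : {client_share:.1%}")
```

The non-syscall remainder (38 of the 47 client seconds) is the part that
gprof, rather than strace, would have to explain.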

Soon, I'll try with a local httpd in order to reduce the effects of network
throughput and latency.  I suspect this will cut the 80:20 server:client
ratio down.

I also want to measure using ra_local, but that (obviously) makes separating
wc profiling work from ra profiling work tricky.  The client/server boundary
provides a nice wall that gprof cannot cross. :-)

Another interesting bit of data from strace:  we are reading and rereading
and rerereading .svn/entries files like *crazy*.  The open/read is cheap
because the fs caches, but I wonder what the performance implications of
parsing them again and again are.  A few of the worst offenders:

svn/subversion/libsvn_fs/.svn/entries: 438 times
svn/subversion/clients/cmdline/.svn/entries: 380 times
svn/subversion/bindings/java/jni/.svn/entries: 378 times
svn/subversion/clients/win32/WinSVN/.svn/entries: 315 times

There are 88 .svn/entries files, and they are opened (and I presume
subsequently read) a total of 8153 times.
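
Counts like these can be pulled out of a saved strace log with a short
script.  The following is a sketch, not the script actually used for the
numbers above: the helper name count_opens, the regular expression and
the sample lines are assumptions, and real strace output may carry PID
or timestamp prefixes depending on the options used.

```python
import re
from collections import Counter

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...) calls and
# captures the quoted path.
OPEN_CALL = re.compile(r'\bopen(?:at)?\((?:[^",]+,\s*)?"([^"]+)"')


def count_opens(lines, suffix=".svn/entries"):
    """Count open()/openat() calls per path for paths ending in `suffix`."""
    counts = Counter()
    for line in lines:
        match = OPEN_CALL.search(line)
        if match and match.group(1).endswith(suffix):
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines standing in for a real strace log.
sample = [
    'open("svn/subversion/libsvn_fs/.svn/entries", O_RDONLY) = 3',
    'read(3, "...", 4096) = 512',
    'open("svn/subversion/libsvn_fs/.svn/entries", O_RDONLY) = 3',
    'openat(AT_FDCWD, "svn/subversion/clients/cmdline/.svn/entries", O_RDONLY) = 4',
    'open("svn/subversion/libsvn_fs/.svn/README", O_RDONLY) = 5',
]

counts = count_opens(sample)
for path, opens in counts.most_common():
    print(f"{path}: {opens} times")

# Average from the totals reported above: 8153 opens over 88 files.
average_opens = 8153 / 88
print(f"average opens per entries file: {average_opens:.1f}")
```

With a real log, passing the open file object to count_opens() instead
of the sample list should work the same way, since it only iterates
over lines.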

I suppose I need to take a look at the checkout process itself to
understand why they're being read and reread so many times.

--ben



RE: CVS/Subversion/Perforce Timing Statistics (Take 3)

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> People have said this before, but let me make it perfectly clear:
> 
>     The numbers are bad. Plain and simple. Nothing that you've done
>     wrong on your part. Why?
>     
>     YOU'RE TESTING PRE-ALPHA SOFTWARE

I said this same thing in my earlier timing emails.  A good software
engineer will always keep optimization in mind when developing but, as
the saying goes, "Premature optimization is the root of all evil." :)
Knowing that there are some major time fluctuations in Subversion, the
developers of the project can keep an eye out for anything that doesn't
seem quite right.  Extensive profiling will need to be done at a later
point, but just the knowledge that there is a problem is usually
sufficient to get the ball rolling.

> [ of course, take all this with the caveat of "hey, it's open source.
>   I can spend my time doing whatever interests me..."  :-) ]

:)

Josh



Re: CVS/Subversion/Perforce Timing Statistics (Take 3)

Posted by Greg Stein <gs...@lyra.org>.
On Thu, May 30, 2002 at 01:10:45AM -0600, Joshua Jensen wrote:
> > I MUST be doing something wrong, because the Perforce timings 
> > are not very good.
> 
> Heh... I meant to say Subversion timings.  Two hours of time tests do
> bad things to your mind...

People have said this before, but let me make it perfectly clear:

    The numbers are bad. Plain and simple. Nothing that you've done wrong on
    your part. Why?
    
    YOU'RE TESTING PRE-ALPHA SOFTWARE


We have done *very* little performance analysis and tuning. Kirby Bohling
has written some great tools to let us concentrate on some of our
client-side syscall usage, but we haven't used them extensively yet. Why?
Because we aren't done with the feature set yet. It would be silly to work
on the performance now. About six weeks ago, we completely revamped the
commit model. We're in the process of revamping the repository structure.
Any of that work could immediately alter the entire performance model of the
software, so it is useless to try to optimize before it is done.

I don't want to discourage your timing. I'd love to see more of it. But it
seems rather invalid to make comparisons between pre-alpha software and
software that has been completed and refined for years.

Certainly, there are some spot-improvements that could be done right now
(e.g. rather than open/close .svn/README to test for an admin dir, we could
just stat() the file). Finding those spots will need a different type of
analysis tho. Finding larger types of fixes would probably be best in
another month or so.
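
To illustrate the kind of spot-improvement meant here, the sketch below
contrasts the two ways of testing for an admin dir.  It is Python rather
than the C that Subversion is written in, and the function names are
hypothetical, not Subversion's.  The point is only that the open/close
variant costs two system calls where the stat variant costs one.

```python
import os
import tempfile


def has_admin_dir_by_open(path):
    """Test for .svn/README by opening and closing it (two syscalls)."""
    try:
        with open(os.path.join(path, ".svn", "README")):
            return True
    except OSError:
        return False


def has_admin_dir_by_stat(path):
    """Test for .svn/README with a single stat() call."""
    try:
        os.stat(os.path.join(path, ".svn", "README"))
        return True
    except OSError:
        return False


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    versioned = os.path.join(root, "versioned")
    unversioned = os.path.join(root, "unversioned")
    os.makedirs(os.path.join(versioned, ".svn"))
    os.makedirs(unversioned)
    with open(os.path.join(versioned, ".svn", "README"), "w") as readme:
        readme.write("admin area\n")

    results = {
        "versioned": (has_admin_dir_by_open(versioned),
                      has_admin_dir_by_stat(versioned)),
        "unversioned": (has_admin_dir_by_open(unversioned),
                        has_admin_dir_by_stat(unversioned)),
    }
    print(results)
```

One difference worth keeping in mind: open() also checks that the file
is readable, while stat() only checks that it exists, so the two tests
are not strictly equivalent.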

[ of course, take all this with the caveat of "hey, it's open source. I can
  spend my time doing whatever interests me..."  :-) ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


RE: CVS/Subversion/Perforce Timing Statistics (Take 3)

Posted by Joshua Jensen <jj...@workspacewhiz.com>.
> I MUST be doing something wrong, because the Perforce timings 
> are not very good.

Heh... I meant to say Subversion timings.  Two hours of time tests do
bad things to your mind...

Josh

