Posted to users@subversion.apache.org by Steve Seremeth <su...@seremeth.com> on 2005/03/15 21:34:45 UTC
Best Practices: Performance on Large Repositories?
Hello -
We have a repository of about 11000 files totalling about 1200 MB. A
commit or a status done at the trunk dir level takes as much as 30
minutes whether using the svn command line (Windows XP or AIX) or
Tortoise -- and seems to use very little CPU. Also, the server (RH ES 3
on 2.4 GHz Xeon using Apache/Berkeley DB) just idles most of the time
and seems to be doing nothing during any of these actions.
What is the best way to improve performance? Since the server is idling
I'm assuming it's not a software problem on the repository server. Are
there benchmarks somewhere that we can look at to see how far off our
setup is or is there anything we can do to semi-officially test
performance against known good data?
We end up doing a lot of "time svn commit trunk".
Any advice would be greatly appreciated. If we're doing something
wrong, that would be welcome news. If we are not doing something wrong,
perhaps relative performance information should be posted in the FAQ?
TIA -
Steve
P.S. I know we would be doing better if we were working on smaller
pieces of the repository at any one time. No one interested in "cvs
modules"-style functionality?
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Re: Best Practices: Performance on Large Repositories?
Posted by Eric Gillespie <ep...@pretzelnet.org>.
Steve Seremeth <su...@seremeth.com> writes:
> We have a repository of about 11000 files totalling about 1200 MB.
The repository is perfectly happy with data sets far larger than
that. The issue is with large working copies. Crawling a
working copy is very expensive.
> A commit or a status done at the trunk dir level takes as much
> as 30 minutes whether using the svn command line (Windows XP or
> AIX)
We have working copies of 70,000 files, and it "only" takes 10 -
15 minutes to crawl them. Maybe the difference is accounted for
by different operating systems (Linux, NetBSD, and FreeBSD here).
> What is the best way to improve performance?
Don't let svn crawl your working copy; tell it what to commit
('svn commit foo/bar/baz.c' vs. 'svn commit').
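A minimal sketch of that advice, in Python: build the 'svn commit'
command line from an explicit list of paths, so the client is handed
its targets instead of walking the whole tree. The helper name
build_commit_command and the example path are hypothetical; only
'svn commit', its '-m' option, and path targets are taken from the svn
command line itself.

```python
def build_commit_command(paths, message):
    """Return an argv list that commits only the named paths.

    Hypothetical helper: it only assembles the command line and does not
    run svn, so it can be inspected before anything is committed.
    """
    paths = list(paths)
    if not paths:
        # An empty target list would make svn fall back to crawling the
        # current directory, which is exactly what we want to avoid.
        raise ValueError("refusing to build a commit with no explicit targets")
    return ["svn", "commit", "-m", message, *paths]
```

The resulting list can be passed to subprocess.run() as-is; for example,
build_commit_command(["foo/bar/baz.c"], "Fix baz") names a single file
rather than the top of the working copy.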
--
Eric Gillespie <*> epg@pretzelnet.org
Re: Best Practices: Performance on Large Repositories?
Posted by Steve Seremeth <su...@seremeth.com>.
Followup: this issue has been somewhat resolved.
It seems that when not much else is running on a Windows system that is
using TortoiseSVN or the svn command line interface, we see major
performance improvements. Sounds obvious, but please allow me to
elaborate.
We had been seeing 20-30 minute times to "check for modifications" (i.e.
status) when run against the top of our working copy tree of about
11,000 version-controlled files (plus another 44,000 svn control files)
on certain systems. We thought everyone was feeling this pain here, but
that was not the case -- only a few of us. Also, the amount of data in
the working copy is closer to 2 GB on disk instead of the 1200 MB we
thought we had.
Ok - so back to the details (or what we _do_ know). We would regularly
kick off a "Check for modifications" at the top of the tree and see
TortoiseProc just linger around in Task Manager's process list not using
much cpu _at all_. We would see similar things with svn.exe when run at
the command line. When sorting on cpu use, oftentimes the only thing
in the process list taking most (ok, almost all) of the cpu was "System
Idle Process". Very frustrating.
On one system, we killed a vmware instance that was lingering around and
got status checks generally down to about 12 seconds a shot -- VERY
liveable performance. We still have a couple systems experiencing very
slow Tortoise/Subversion performance, but clearly it's some other
process on those boxes causing the issue. We have seen McAfee AV
causing pain from time to time, but we have not concluded where the real
devil is. Some of us have Windows Indexing on and some of us have it
off, with mixed performance degradation.
So while we're not positive what is causing our problems, we are
starting to decide:
1. It's not Subversion related (although it's clear TortoiseSVN doesn't
help with screen draws in Windows Explorer).
2. Subversion is fine performance-wise even with our fairly large
repository.
Is there a way to run some sort of debug build under windows to see what
the slowness is (like wait i/o, etc.)? Maybe if it could generate a log
that had timestamps for each part of a transaction?
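Short of a debug build, one rough external measurement is to compare
wall-clock time with the CPU time the command actually consumed. The
Python sketch below is only that, a sketch: time_command is a
hypothetical helper, and it relies on os.times(), whose child-process
fields are always zero on Windows, so the CPU figure is only meaningful
on Unix-like clients (our AIX boxes, say). If wall time is large and
CPU time is small, the process spent most of its time waiting on
something else (disk, network, or another process).

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv and return (wall_seconds, child_cpu_seconds, exit_code).

    child_cpu_seconds is user + system CPU time charged to child
    processes while the command ran. On Windows os.times() reports these
    fields as zero, so only the wall-clock figure is useful there.
    """
    before = os.times()
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    wall = time.perf_counter() - start
    after = os.times()
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return wall, cpu, result.returncode
```

For instance, time_command(["svn", "status", "trunk"]) would time a
status run against a working copy directory named trunk (the directory
name is just the one from our own setup).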
Thanks for everyone's input.
Regards,
Steve
Re: Best Practices: Performance on Large Repositories?
Posted by Erik Huelsmann <e....@gmx.net>.
> Hello -
>
> We have a repository of about 11000 files totalling about 1200 MB. A
> commit or a status done at the trunk dir level takes as much as 30
> minutes whether using the svn command line (Windows XP or AIX) or
> Tortoise -- and seems to use very little CPU. Also, the server (RH ES 3
> on 2.4 Ghz Xeon using apache/berkeley db) just idles most of the time
> and seems to be doing nothing during any of these actions.
>
> What is the best way to improve performance? Since the server is idling
> I'm assuming it's not a software problem on the repository server. Are
> there benchmarks somewhere that we can look at to see how far off our
> setup is or is there anything we can do to semi-officially test
> performance against known good data?
>
> We end up doing a lot of "time svn commit trunk".
>
> Any advice would be greatly appreciated. If we're doing something
> wrong, that would be welcome news. If we are not doing something wrong,
> perhaps relative performance information should be posted in the FAQ?
The time required to do a commit also depends heavily on the
performance of the filesystem on which your working copy is located.
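One way to get a feel for that, independent of Subversion, is to time a
plain walk over the working copy that stats every entry. The Python
sketch below is an assumption-laden stand-in, not what svn itself does
(svn also reads its own control files and may compare file contents);
the function name crawl is hypothetical. If this alone takes minutes,
the filesystem (or something interfering with it) is the likely
bottleneck.

```python
import os
import time


def crawl(root):
    """Stat every file and directory under root.

    Returns (entry_count, seconds). The root directory itself is not
    counted; entries that vanish or cannot be read are skipped.
    """
    start = time.perf_counter()
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                # lstat: do not follow symlinks, just touch the metadata.
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue
            count += 1
    return count, time.perf_counter() - start
```

Running crawl() on the same tree on a fast and a slow machine gives a
crude baseline to compare against the svn status times.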
bye,
Erik.