Posted to users@subversion.apache.org by Johan Corveleyn <jo...@uz.kuleuven.ac.be> on 2009/04/29 14:28:24 UTC

svn log slowness

I know that this isn't earth-shattering news, and it's not really a show-stopper, but it's bugging me: "svn log" is slow. 

We're in the process of migrating from CVS to SVN (1.5.4 on Solaris 10), and this is one of the issues that caught my attention. Don't get me wrong, I really like SVN, and I'm convinced it's a Good Thing for us to migrate. It's just one of those things that might annoy some people, especially since we request the log of some files quite often (more specifically, via "Show history" in IntelliJ).

Now, before you all bombard me with "use svnserve, it's much faster" or "make sure SVNPathAuthz is off in your Apache config": yes, I've read the mailing list archives, and no, that doesn't help. I have tried both svnserve and mod_dav_svn, both with SVNPathAuthz on and off (and with/without SSL). I have also tried local access via the file:// protocol. There are some small differences, but nothing remotely serious.

More concretely: I'm asking for the log of a particular file that we change a lot in development. It's an XML file of about 2 MB that's had about 5500 changes (!) over the years. It's been there since r1210, and had its last change in r95848. The file is still changed several times a day. Asking for the log of a file with a small number of changes (say < 100) is OK; the problem only shows up when the file has thousands of changes.

----
Some info about my repo:
SVN 1.5.4 on Solaris 10 (used package from sunfreeware.com)
Server hardware: Sun SPARC-Enterprise-T5120 with 32 processors (I don't think that matters in this test, since we'll exercise only 1 processor)
FSFS backend mounted via NFS from a NetApp device (supposedly very high-end equipment)
~95000 revisions
~70000 files
~3.5 GB disk usage by the repository ("du -rsh <repo>")

----
Some statistics from a couple of tests (all executed on the SVN server itself with its svn command line client, or via the SlikSVN command line client from my remote Windows machine; it doesn't make much of a difference):
1) file://
3m45s ("warmup", I'm guessing some of the IO gets cached hereafter)
1m32s
1m25s
1m24s

2) http (SVNPathAuthz on) 
(executed shortly after file:// test, so we see no warmup effect)
1m36s
1m38s

3) https (SVNPathAuthz off) 
(executed shortly after http:// test, so we see no warmup effect)
1m24s
1m23s

4) svn+ssh
3m30s (warmup again, I tested this an hour later)
1m26s
1m26s

5) CVS log for the same file (for comparison)
5.5s
3.4s
4.2s
1.5s
3.4s

Notes: 
- In each test, I redirected all output to /dev/null, to eliminate any performance impact of writing to stdout.
- About the "warmup": I guess this is because the operation is mainly IO bound on the server. The first time all the revisions (or revprops?) need to be read from disk. The second time, they are (partly) in disk cache.
- I'm guessing that the SVNPathAuthz accounts for the extra 12-15 seconds in the case of the http test (compared to the https test without SVNPathAuthz).
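
The measurement procedure described in these notes (run the same command several times, throw away its output, and compare the first "warmup" run with the later ones) could be scripted roughly as follows. This is only a sketch, not the script that produced the numbers above; the function name `time_command` is made up for illustration.

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times with its output discarded; return seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard stdout and stderr, similar to redirecting to /dev/null.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        timings.append(time.perf_counter() - start)
    return timings
```

For example, `time_command(["svn", "log", "http://svn.example.com/repo/trunk/big.xml"])` (a hypothetical URL) would return three wall-clock timings, of which the first may include the warmup cost.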

----
So the SVN record is 1m23s, over https (no SVNPathAuthz), and only after the "warmup" (i.e. if you're lucky enough that someone else has requested the same log less than an hour ago). If you get the "warmup" hit, you have to wait at least 3.5 minutes to get the full log. Compare that to the 5 seconds max that we were used to with CVS. Just to reiterate: that doesn't mean we're ditching SVN in favor of CVS (quite the contrary), but it still hurts :(.
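
To put the gap in numbers, the timings listed above can be converted to seconds and compared. The snippet below is just a worked calculation on three of those figures; the helper `to_seconds` is a made-up name, and it accepts either a decimal comma or a decimal point.

```python
import re


def to_seconds(text):
    """Convert a duration such as '1m23s', '3m45s' or '5,5s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:[.,]\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2).replace(",", "."))
    return minutes * 60 + seconds


best_svn = to_seconds("1m23s")   # fastest svn log run (https, after warmup)
cold_svn = to_seconds("3m45s")   # slowest "warmup" run (file://)
worst_cvs = to_seconds("5.5s")   # slowest cvs log run

print(f"warm svn vs. slowest cvs: {best_svn / worst_cvs:.0f}x")  # prints 15x
print(f"cold svn vs. slowest cvs: {cold_svn / worst_cvs:.0f}x")  # prints 41x
```

So even the best SVN run is roughly 15 times slower than the worst CVS run, and a cold run is roughly 40 times slower.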

Are these normal numbers? Is anyone else seeing these sorts of figures for files with thousands of changes in a large repo? Is there anything that I can do about this? Any suggestions on further diagnosis or options I can try?

I guess I can still try with BDB, or FSFS with the repo on a local disk (the latter is not really an option for us in the longer term) to eliminate some potential bottlenecks. Or test with 1.6.1 and packed shards (or memcached maybe; does anyone have any experience with that?). However, I feel that these will give me only minor improvements at best (never the factor of 50 that I would like). My gut feeling is that this is really a limitation of the way SVN currently works. Compared to CVS, which just has to extract and send a part of the RCS file, I guess SVN has to crawl the entire repository to get all the info.

I seem to remember reading a discussion about caching, inside the SVN repository, some of the metadata about which files each revision affects, to speed up operations like this (sorry, I can't find it again, but I think it was in the comments of some bug report). However, I think the idea of caching was rejected. If such caching would solve this issue, I would really like to see the feature appear in SVN. Or are there other dev ideas that could improve this?

Oh, and one more thing: some "workarounds" I've kind of eliminated:
- Using "--limit 100" and the like: not really an option for me, because we use it via the IntelliJ IDE (its subversion plugin uses SVNKit), so we have no direct access to the commands IntelliJ executes.
- I guess we could rename (svn mv) the current file to "file.old.2009" and start with a clean history with a new file. But this would be a pain in the beginning (you'll have to look sometimes at the log of file.old.2009 to get the right info; and after a year the new file will again have 1000 revisions, so its "svn log" will start to slow down again).


Thanks for reading, and sorry for the long post (I felt I could just as well give most details immediately).

Regards,
Johan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1980568

To unsubscribe from this discussion, e-mail: [users-unsubscribe@subversion.tigris.org].


Re: svn log slowness

Posted by Stefan Sperling <st...@elego.de>.
On Sun, May 10, 2009 at 12:42:19AM +0200, Johan Corveleyn wrote:
> > > Is there any specific reason why the CollabNet binary for Solaris
> > > doesn't include BDB support? Is it just because FSFS is the default,
> > > and the easiest for maintenance?
> > 
> > I don't think you want to use a BDB repository mounted over NFS.
> > FSFS has been designed for that, but BDB is supposed to have problems
> > with it.
> 
> Ok, thanks for reminding me. But apart from that, is there a specific
> reason why BDB support is not built into the subversion binaries for
> Solaris (don't know about binaries for other platforms)?

No idea. You might be better off asking the people who build those
binaries directly if you don't get an answer on this list.

Stefan

Re: svn log slowness

Posted by Mark Phippard <ma...@gmail.com>.
On Sat, May 9, 2009 at 6:42 PM, Johan Corveleyn
<jo...@uz.kuleuven.ac.be> wrote:
>> > Is there any specific reason why the CollabNet binary for Solaris
>> > doesn't include BDB support? Is it just because FSFS is the default,
>> > and the easiest for maintenance?
>>
>> I don't think you want to use a BDB repository mounted over NFS.
>> FSFS has been designed for that, but BDB is supposed to have problems
>> with it.
>
> Ok, thanks for reminding me. But apart from that, is there a specific reason why BDB support is not built into the subversion binaries for Solaris (don't know about binaries for other platforms)?

None of the CollabNet-certified binaries include BDB support.  It is
not just Solaris.  This is primarily for support reasons.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2190010


RE: svn log slowness

Posted by Johan Corveleyn <jo...@uz.kuleuven.ac.be>.
> > Is there any specific reason why the CollabNet binary for Solaris
> > doesn't include BDB support? Is it just because FSFS is the default,
> > and the easiest for maintenance?
> 
> I don't think you want to use a BDB repository mounted over NFS.
> FSFS has been designed for that, but BDB is supposed to have problems
> with it.

Ok, thanks for reminding me. But apart from that, is there a specific reason why BDB support is not built into the subversion binaries for Solaris (don't know about binaries for other platforms)?

Johan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2153596



Re: svn log slowness

Posted by Stefan Sperling <st...@elego.de>.
On Sat, May 09, 2009 at 09:56:49PM +0200, Johan Corveleyn wrote:
> In any case, it made me think that packing revprop shards (as was
> discussed here some time ago) could make a big difference for svn log
> with FSFS. Switching to BDB might also help a lot, but unfortunately
> that's also not really possible for us currently, because the
> available Solaris binaries of SVN don't include support for BDB (we've
> checked CollabNet and sunfreeware). And we don't really have the
> experience nor time to build it from source.
> 
> Is there any specific reason why the CollabNet binary for Solaris
> doesn't include BDB support? Is it just because FSFS is the default,
> and the easiest for maintenance?

I don't think you want to use a BDB repository mounted over NFS.
FSFS has been designed for that, but BDB is supposed to have problems
with it.

See http://svnbook.red-bean.com/en/1.5/svn.reposadmin.planning.html#svn.reposadmin.basics.backends

Stefan

RE: svn log slowness

Posted by Johan Corveleyn <jo...@uz.kuleuven.ac.be>.
>Is it possible that your server is I/O bound? Logs are stored in
>revprops, one per revision. So if you're pulling the history for a
>large range of revisions, you're doing a lot of access on many small
>files - some filesystems are very poor in this usage scenario.

Just recording some more info for posterity: yes, it seems to be largely IO bound. This became apparent during some experiments in which we put the FSFS repository on a local disk instead of on the NFS-mounted NetApp. It cut the svn log time by more than 50%:

For our big file with 5500 revisions over https (SVNPathAuthz off):
- NFS-mounted NetApp: ~3m20s
- local disk: ~1m30s
Still a long way from the 3-5 seconds we had with CVS, but it's already a big improvement.

Unfortunately, this is not a viable production option for us (the repository needs to be on the NetApp, to have all the high-availability and backup options available). But it prompted us to further investigate IO optimization options, configuration of the NetApp, NFS mount options, ... I'll post here if we make any progress with these.

In any case, it made me think that packing revprop shards (as was discussed here some time ago) could make a big difference for svn log with FSFS. Switching to BDB might also help a lot, but unfortunately that's also not really possible for us currently, because the available Solaris binaries of SVN don't include support for BDB (we've checked CollabNet and sunfreeware). And we have neither the experience nor the time to build it from source.

Is there any specific reason why the CollabNet binary for Solaris doesn't include BDB support? Is it just because FSFS is the default, and the easiest for maintenance?

Regards,
Johan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2151012



Re: svn log slowness

Posted by Staffan Tjernstrom <su...@staffan.tjernstrom.name>.
On Wed, Apr 29, 2009 at 12:17:34PM -0400, Andy Levy wrote:
>On Wed, Apr 29, 2009 at 11:53, Johan Corveleyn
><jo...@uz.kuleuven.ac.be> wrote:
>> Thanks for the suggestions. Our clients are on Windows.
>>
>> I've played with TortoiseSVN, and it solves this problem very nicely (because it defaults to only getting the last 100 entries of the log (--limit 100), and offers buttons to get the next/previous 100). I don't know about the caching though (I'll check it out, also for SmartSVN).
>>
>> However, in our usual workflow this doesn't help much, as we use IntelliJ (Java IDE), with its integrated Version Control support (with SVN plugin through SVNKit). I guess I could try to contribute to IntelliJ (to the SVN plugin) to implement similar functionality as TortoiseSVN (i.e. --limit 100 and offer buttons to get more), or to load the history in the background and show it while it comes in. There are actually two IntelliJ issues open for this "improvement": http://www.jetbrains.net/jira/browse/IDEA-11092 and http://www.jetbrains.net/jira/browse/IDEA-22245. But I'd really, ultimately like this to be solved by the SVN server (all other solutions are patchwork).
>>
>> I guess the options are (in order of my preference):
>> - SVN server could be faster in retrieving this info
>> - SVN could implement caching on the client-side in its metadata (in wc-ng?)
>> - Maybe SVNKit could support some kind of caching client-side somewhere (??)
>> - Ultimately, the GUI client could take care of this, like TortoiseSVN or SmartSVN, and like maybe IntelliJ's SVN plugin should
>
>Is it possible that your server is I/O bound? Logs are stored in
>revprops, one per revision. So if you're pulling the history for a
>large range of revisions, you're doing a lot of access on many small
>files - some filesystems are very poor in this usage scenario.
>
Maybe unrelated, but I have seen (elsewhere) reports that depending on the version of SVNKit being
used, performance can be very sluggish for large repositories. For instance,
we recently upgraded our FishEye instance, and browsing performance improved
by at least an order of magnitude.


Re: svn log slowness

Posted by Andy Levy <an...@gmail.com>.
On Wed, Apr 29, 2009 at 11:53, Johan Corveleyn
<jo...@uz.kuleuven.ac.be> wrote:
> Thanks for the suggestions. Our clients are on Windows.
>
> I've played with TortoiseSVN, and it solves this problem very nicely (because it defaults to only getting the last 100 entries of the log (--limit 100), and offers buttons to get the next/previous 100). I don't know about the caching though (I'll check it out, also for SmartSVN).
>
> However, in our usual workflow this doesn't help much, as we use IntelliJ (Java IDE), with its integrated Version Control support (with SVN plugin through SVNKit). I guess I could try to contribute to IntelliJ (to the SVN plugin) to implement similar functionality as TortoiseSVN (i.e. --limit 100 and offer buttons to get more), or to load the history in the background and show it while it comes in. There are actually two IntelliJ issues open for this "improvement": http://www.jetbrains.net/jira/browse/IDEA-11092 and http://www.jetbrains.net/jira/browse/IDEA-22245. But I'd really, ultimately like this to be solved by the SVN server (all other solutions are patchwork).
>
> I guess the options are (in order of my preference):
> - SVN server could be faster in retrieving this info
> - SVN could implement caching on the client-side in its metadata (in wc-ng?)
> - Maybe SVNKit could support some kind of caching client-side somewhere (??)
> - Ultimately, the GUI client could take care of this, like TortoiseSVN or SmartSVN, and like maybe IntelliJ's SVN plugin should

Is it possible that your server is I/O bound? Logs are stored in
revprops, one per revision. So if you're pulling the history for a
large range of revisions, you're doing a lot of access on many small
files - some filesystems are very poor in this usage scenario.
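
One rough way to check whether reading many small files is the bottleneck is to time a pass over the repository's revprop files, once cold and once again right afterwards. A minimal sketch, assuming an FSFS repository that keeps revision properties as individual files under `db/revprops`; the function name and the path in the usage note are made up for illustration.

```python
import os
import time


def time_small_file_reads(root):
    """Read every file under root; return (file_count, total_bytes, seconds)."""
    count = 0
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as handle:
                total += len(handle.read())
            count += 1
    return count, total, time.perf_counter() - start
```

Calling `time_small_file_reads("/path/to/repo/db/revprops")` twice in a row gives a cold and a warm figure. A large difference between the two, or a large absolute time for a small byte total, would point at the filesystem rather than at Subversion itself.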

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1981936



RE: svn log slowness

Posted by Johan Corveleyn <jo...@uz.kuleuven.ac.be>.
Thanks for the suggestions. Our clients are on Windows. 

I've played with TortoiseSVN, and it solves this problem very nicely (because it defaults to only getting the last 100 entries of the log (--limit 100), and offers buttons to get the next/previous 100). I don't know about the caching though (I'll check it out, also for SmartSVN).

However, in our usual workflow this doesn't help much, as we use IntelliJ (Java IDE), with its integrated Version Control support (with SVN plugin through SVNKit). I guess I could try to contribute to IntelliJ (to the SVN plugin) to implement similar functionality as TortoiseSVN (i.e. --limit 100 and offer buttons to get more), or to load the history in the background and show it while it comes in. There are actually two IntelliJ issues open for this "improvement": http://www.jetbrains.net/jira/browse/IDEA-11092 and http://www.jetbrains.net/jira/browse/IDEA-22245. But I'd really, ultimately like this to be solved by the SVN server (all other solutions are patchwork).

I guess the options are (in order of my preference):
- SVN server could be faster in retrieving this info
- SVN could implement caching on the client-side in its metadata (in wc-ng?)
- Maybe SVNKit could support some kind of caching client-side somewhere (??)
- Ultimately, the GUI client could take care of this, like TortoiseSVN or SmartSVN, and like maybe IntelliJ's SVN plugin should

Come to think of it, this is exactly in order of lower level to higher level solutions.
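
For reference, the "last 100, then the next 100" behaviour can be approximated on the command line with `svn log --xml --limit N -r START:1`, where each page starts just below the oldest revision of the previous page. The following is only a sketch of that idea, not how TortoiseSVN or the IntelliJ plugin actually does it; `page_command` and `parse_page` are made-up names.

```python
import xml.etree.ElementTree as ET


def page_command(target, start="HEAD", limit=100):
    """Build an `svn log` command for one page of history, newest first."""
    return ["svn", "log", "--xml", "--limit", str(limit), "-r", f"{start}:1", target]


def parse_page(xml_text):
    """Return ([(revision, message), ...], next_start); next_start is None at the end."""
    entries = [
        (int(entry.get("revision")), entry.findtext("msg", default=""))
        for entry in ET.fromstring(xml_text).iter("logentry")
    ]
    if not entries:
        return entries, None
    oldest = min(revision for revision, _ in entries)
    # The next page begins one revision below the oldest entry seen so far.
    next_start = oldest - 1 if oldest > 1 else None
    return entries, next_start
```

A caller would run the command from `page_command`, pass its output to `parse_page`, show the entries, and only request the next page (with `start=next_start`) when the user asks for more.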

Regards,
Johan

-----Original message-----
From: Tyler Roscoe [mailto:tyler@cryptio.net]
Sent: Wednesday 29 April 2009 17:23
To: Bob Archer
CC: Johan Corveleyn; users@subversion.tigris.org
Subject: Re: svn log slowness

On Wed, Apr 29, 2009 at 10:35:58AM -0400, Bob Archer wrote:
> I hope you don't consider this "bombardment". But, if you are using
> Windows clients you might want to look into using the TortoiseSVN
> client. It does log caching which really does speed things up if you
> are looking at the logs a lot.
> 
> If you are not on Windows... never mind.

SmartSVN has similar log caching functionality, and it's written in
Java, so it runs on "all" platforms.

tyler

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1981651



Re: svn log slowness

Posted by Tyler Roscoe <ty...@cryptio.net>.
On Wed, Apr 29, 2009 at 10:35:58AM -0400, Bob Archer wrote:
> I hope you don't consider this "bombardment". But, if you are using
> Windows clients you might want to look into using the TortoiseSVN
> client. It does log caching which really does speed things up if you
> are looking at the logs a lot.
> 
> If you are not on Windows... never mind.

SmartSVN has similar log caching functionality, and it's written in
Java, so it runs on "all" platforms.

tyler

RE: svn log slowness

Posted by Bob Archer <Bo...@infor.com>.
> I know that this isn't earth-shattering news, and it's not really a show-
> stopper, but it's bugging me: "svn log" is slow.
> 
> Now, before you all bombard me with "use svnserve, it's much faster" or
> "make sure SVNPathAuthz is off in your Apache config": yes, I've read the

I hope you don't consider this "bombardment". But, if you are using Windows clients you might want to look into using the TortoiseSVN client. It does log caching which really does speed things up if you are looking at the logs a lot.

If you are not on Windows... never mind.
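
The general idea behind such a log cache can be sketched in a few lines: keep the entries that were already fetched and only ask the server for revisions newer than the newest cached one. This is an illustration of the principle only, not TortoiseSVN's actual implementation; the class `LogCache` and the `fetch_newer` callback are hypothetical.

```python
class LogCache:
    """Keep log entries already fetched, so only newer revisions are requested."""

    def __init__(self):
        self._entries = {}  # revision number -> log message

    def newest_cached(self):
        """Highest cached revision, or 0 when the cache is empty."""
        return max(self._entries, default=0)

    def update(self, fetch_newer):
        """Call fetch_newer(first_missing_revision) and merge what it returns."""
        for revision, message in fetch_newer(self.newest_cached() + 1):
            self._entries[revision] = message

    def history(self):
        """All cached entries, newest first."""
        return sorted(self._entries.items(), reverse=True)
```

Here `fetch_newer` stands for whatever talks to the server and returns `(revision, message)` pairs from the given revision onwards. Since log messages can be edited after the commit, a real cache would also need some way to refresh old entries.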

BOb

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1980748
