Posted to dev@subversion.apache.org by "D.J. Heap" <dj...@shadyvale.net> on 2005/05/08 15:57:58 UTC
Quick analysis of svn log --limit
I'm not terribly familiar with the code in libsvn_repos, but there are two
places that seem to be the source of slowness even with --limit turned on:
subversion/libsvn_repos/log.c:svn_repos_get_logs3 -
1. When passing one path such as /trunk (or / which mod_dav_svn always
seems to do even if no deeper path was specified) it performs the
svn_repos_history2 call and spends a lot of time there gathering all revs
for the path (which is nearly all the revs in the repo for /trunk and *is*
all revs in the repo for /). So, svn_repos_history needs to be revved to
take a limit parameter?
2. Further down in the for loop to process the revs, there is a check to
see if the rev was one identified by the previous svn_repos_history2 call
(line 348). This check is a brute force linear search. While probably not
a huge issue, it seems like it could be changed to use a binary search since
the revs array has been sorted above? It could make a noticeable difference
on repos with many thousands of revisions.
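To illustrate point 2, here is a minimal sketch of a binary-search membership
test. It is written in Python rather than the C of libsvn_repos, and it
assumes the revs array is sorted in ascending order; the function name
rev_in_sorted is hypothetical and not part of any Subversion API.

```python
from bisect import bisect_left


def rev_in_sorted(revs, rev):
    """Return True if rev occurs in revs.

    Assumption: revs is a list of revision numbers sorted in ascending
    order. The lookup costs O(log n) comparisons instead of the O(n)
    of a linear scan.
    """
    i = bisect_left(revs, rev)  # leftmost position where rev could be inserted
    return i < len(revs) and revs[i] == rev
```

If the array were sorted youngest-first instead, the same idea applies with
the comparison reversed.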
It seems like #1 is the major perf killer since it is essentially reading
almost all revs in the common case of 'svn log --limit 10' on a working copy
of trunk.
This is confirmed by the fact that 'svn log --limit 10
svn://localhost/svn-repo' returns nearly instantly, whereas 'svn log
--limit 10 svn://localhost/svn-repo/trunk' takes about 10 seconds for me.
The first case is an empty path in svn_repos_get_logs3 and so skips the
svn_repos_history2 call. Curiously, mod_dav_svn always passes at least '/'
in the path and so never bypasses the svn_repos_history2 call.
DJ
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: Quick analysis of svn log --limit
Posted by Chia-liang Kao <cl...@clkao.org>.
Peter N. Lundblad <peter <at> famlundblad.se> writes:
> > A potentially better change would be to rewrite svn_repos_get_logs3 to
> > not use svn_repos_history at all. The new design would use the
> > svn_fs_history interface directly. It would create a history object for
> > each provided path, and would step through them in parallel to produce a
> > sequence of revisions, stopping when it hits the limit. I think that
> > would require the least amount of I/O work.
> >
> That's the solution I've had in mind for some time... I like it. You need
> a pair of iteration pools per path, don't you?
I think you can change repos_history to respect the return value
from the callback, and store the limit in the baton. This seems
more general and is what I had:
http://cpansearch.bulknews.net/markup/SVK-0.994/lib/SVK/Util.pm#l784
Cheers,
CLK
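
The callback idea above can be sketched roughly as follows. This is an
illustration in Python, not the Subversion C API: LimitBaton,
collect_with_limit and walk_history are hypothetical names, and a boolean
return value stands in for whatever stop signal the real callback would use.

```python
class LimitBaton:
    """Hypothetical baton: carries the limit and the revisions collected."""

    def __init__(self, limit):
        self.limit = limit
        self.revs = []


def collect_with_limit(baton, path, rev):
    """History callback: record rev, return False once the limit is reached."""
    baton.revs.append(rev)
    return len(baton.revs) < baton.limit


def walk_history(history, callback, baton):
    """Stand-in for the history walker.

    history is any iterable of (path, rev) pairs, youngest first. The walk
    stops as soon as the callback returns False, so older history is never
    read.
    """
    for path, rev in history:
        if not callback(baton, path, rev):
            break
```

The walker itself knows nothing about limits; any caller can stop the walk
early for its own reasons, which is what makes this variant more general.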
---------------------------------------------------------------------
Re: Quick analysis of svn log --limit
Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sun, 8 May 2005, Greg Hudson wrote:
> A potentially better change would be to rewrite svn_repos_get_logs3 to
> not use svn_repos_history at all. The new design would use the
> svn_fs_history interface directly. It would create a history object for
> each provided path, and would step through them in parallel to produce a
> sequence of revisions, stopping when it hits the limit. I think that
> would require the least amount of I/O work.
>
That's the solution I've had in mind for some time... I like it. You need
a pair of iteration pools per path, don't you?
Best,
//Peter
---------------------------------------------------------------------
Re: Quick analysis of svn log --limit
Posted by Greg Hudson <gh...@MIT.EDU>.
On Sun, 2005-05-08 at 09:57 -0600, D.J. Heap wrote:
> subversion/libsvn_repos/log.c:svn_repos_get_logs3 -
> 1. When passing one path such as /trunk (or / which mod_dav_svn always
> seems to do even if no deeper path was specified) it performs the
> svn_repos_history2 call and spends a lot of time there gathering all revs
> for the path (which is nearly all the revs in the repo for /trunk and *is*
> all revs in the repo for /). So, svn_repos_history needs to be revved to
> take a limit parameter?
That would probably be the simplest change, although it's not as
efficient as it could be. (If I provide fifty paths and one of them
is /trunk, and the limit is 10, there's no reason to chase down ten
revisions of /trunk/this/file/hasnt/been/changed/in/ages.)
A potentially better change would be to rewrite svn_repos_get_logs3 to
not use svn_repos_history at all. The new design would use the
svn_fs_history interface directly. It would create a history object for
each provided path, and would step through them in parallel to produce a
sequence of revisions, stopping when it hits the limit. I think that
would require the least amount of I/O work.
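
A rough sketch of the parallel-stepping design described above, in Python
rather than C. It assumes each per-path history can be modeled as an
iterable that yields revision numbers youngest-first; merged_revisions is a
hypothetical name, and nothing here calls the real svn_fs_history interface.

```python
import heapq


def merged_revisions(histories, limit):
    """Merge per-path histories into one youngest-first revision sequence.

    Assumption: each element of histories yields revision numbers in
    descending order. heapq.merge is lazy, so a path whose next change is
    very old has only its first entry read before the limit is reached.
    """
    result = []
    for rev in heapq.merge(*histories, reverse=True):
        if result and result[-1] == rev:
            continue  # the same revision touched more than one path
        result.append(rev)
        if len(result) == limit:
            break
    return result
```

For example, merged_revisions([[10, 7, 3], [9, 7, 2]], 4) gives
[10, 9, 7, 3]: revision 7 appears once even though both paths changed in it,
and revision 2 is never reported.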
---------------------------------------------------------------------
Re: Quick analysis of svn log --limit
Posted by Garrett Rooney <ro...@electricjellyfish.net>.
D.J. Heap wrote:
> Garrett Rooney wrote:
> [snip]
>
>>
>> That seems likely to fix the problem, or if we don't want to rev
>> svn_repos_history2 (perhaps so that such a change could be merged into
>> a subsequent 1.2.x release) we could also change svn_repos_get_logs3
>> to call svn_repos_history2 iteratively with progressively further back
>> ranges of revisions until it finds enough revisions to satisfy the
>> limit argument.
>>
>> -garrett
>
>
> Yes, both your and Greg's ideas would remove the need for an interface
> change and significantly improve efficiency. It seems like Greg's
> method would be somewhat more efficient although a slightly more complex
> change?
Yes, that seems to be the case.
-garrett
---------------------------------------------------------------------
Re: Quick analysis of svn log --limit
Posted by "D.J. Heap" <dj...@shadyvale.net>.
Garrett Rooney wrote:
[snip]
>
> That seems likely to fix the problem, or if we don't want to rev
> svn_repos_history2 (perhaps so that such a change could be merged into a
> subsequent 1.2.x release) we could also change svn_repos_get_logs3 to
> call svn_repos_history2 iteratively with progressively further back
> ranges of revisions until it finds enough revisions to satisfy the limit
> argument.
>
> -garrett
Yes, both your and Greg's ideas would remove the need for an interface
change and significantly improve efficiency. It seems like Greg's method
would be somewhat more efficient although a slightly more complex change?
I will work on this, but it will likely not be ready for a couple of days --
I have some experimenting/learning to do. So if someone else wants to get
it done sooner, feel free.
DJ
---------------------------------------------------------------------
Re: Quick analysis of svn log --limit
Posted by Garrett Rooney <ro...@electricjellyfish.net>.
D.J. Heap wrote:
> I'm not terribly familiar with the code in libsvn_repos, but there are
> two places that seem to be the source of slowness even with --limit
> turned on:
>
> subversion/libsvn_repos/log.c:svn_repos_get_logs3 -
> 1. When passing one path such as /trunk (or / which mod_dav_svn
> always seems to do even if no deeper path was specified) it performs the
> svn_repos_history2 call and spends a lot of time there gathering all
> revs for the path (which is nearly all the revs in the repo for /trunk
> and *is* all revs in the repo for /). So, svn_repos_history needs to be
> revved to take a limit parameter?
That seems likely to fix the problem, or if we don't want to rev
svn_repos_history2 (perhaps so that such a change could be merged into a
subsequent 1.2.x release) we could also change svn_repos_get_logs3 to
call svn_repos_history2 iteratively with progressively further back
ranges of revisions until it finds enough revisions to satisfy the limit
argument.
-garrett
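
The iterative approach above might look something like this sketch. It is in
Python rather than C, and changed_in_range is a hypothetical stand-in for a
ranged history call; the initial window size and the doubling policy are
assumptions, not something settled in this thread.

```python
def limited_history(changed_in_range, youngest, limit, window=16):
    """Collect up to limit revisions, youngest first, one range at a time.

    changed_in_range(lo, hi) is assumed to return the revisions in
    [lo, hi] that touched the path, in any order. Each pass looks at an
    older range, and the range doubles in size so that rarely changed
    paths do not need too many calls.
    """
    found = []
    hi = youngest
    while hi >= 1 and len(found) < limit:
        lo = max(1, hi - window + 1)  # revision 0 changes no paths
        found.extend(sorted(changed_in_range(lo, hi), reverse=True))
        hi = lo - 1
        window *= 2
    return found[:limit]
```

For a frequently changed path such as /trunk, the first window usually
already holds enough revisions, so the rest of the history is never scanned.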
---------------------------------------------------------------------