You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@alice-dsl.de> on 2010/10/01 08:50:58 UTC

Re: Some tips on profiling

  On 30.09.2010 04:28, Ramkumar Ramachandra wrote:
> Hi Stefan,
>
> Stefan Fuhrmann writes:
>> My measurements seem to support what Philip wrote:
>> The expensive part is run on the server. Even with my
>> optimized server, the svnrdump CPU usage is less than
>> the time taken by the server. Some numbers (hot file
>> cache):
>>
>> svnadmin dump
>>      1.7 trunk 70s real  66s user 4s system
>>      perf-branch 30s real 28s user 2s system
>>
>> 1.7 trunk svnrdump
>>      ra-local 88s real 81s user 7s system
>>      svn: (1.7 trunk) 99s real 6s user 4s system
>>      svn: (perf-branch, cold)  72s real 5s user 6s system
>>      svn: (perf-branch, hot)  17s real 5s user 5s system
>>
>> Thus, svnrdump is slower only for ra-local where it is
>> of no particular use in the first place. To really speed
>> things up, the caching infrastructure from the performance
>> branch should be merged into /trunk.
> Wow, that looks awesome. Yeah, Daniel suggested that I run svnrdump
> against your perf branch yesterday, but I wasn't able to get your
> branch to build:
> subversion/libsvn_subr/svn_file_handle_cache.c:890: error: 'svn_file_handle_cache_t' has no member named 'mutex'
You obviously don't have APR threads enabled.
Thanks for finding this. Fixed in r1003430.
> What exactly are the changes that "hot" introduces- can you point me
> to the specific revision numbers that we intend to merge?
The server caches almost everything. So, the
first ("cold") run demonstrates the worst-case,
the second run ("hot") the best case. Real
world performance depends on server memory
and load.

For the merge part: please review the "integrate-membuffer"
branch (only 3 patches). You may also review and
merge any of the patches listed in /branches/performance/STATUS.

>>> @Stefan: Thoughts on hacking APR hashtables directly?
>>>
>> Are you sure?! Which operation is the most expensive one
>> and how often is it called? Who calls it and why?
> My bad. I got muddled up when using ra_local: when I saw lots of MD5
> functions, I assumed it had to do something with the hashtable since
> the checksum was caculated by server-side. The profiler profiled both
> server-side and client-side. Over svn://, I got a much cleaner
> output. The new results indicate:
> 1. Server-client IO is the major bottleneck, as Philip and you pointed
>     out.
> 2. On the client side, the major bottlneck is the IO during textdelta
>     application. David (CC'ed now) and I are trying some experiments
>     with a single temporary file. Thoughts?
For the MD5 stuff, try r986459 (performance branch).
It should eliminate 1 of the 3 MD5 calculations.

As for the temp files, there are some improvements
listed in /branches/performance/STATUS. These would
also reduce your "system" CPU time.

-- Stefan^2.

Re: Some tips on profiling

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Sun, Oct 24, 2010 at 20:55:04 +0200:
> On 04.10.2010 16:58, Ramkumar Ramachandra wrote:
>> Hi Stefan,
> Sorry for the late reply. This has been the first weekend
> that I didn't spend in office for some time.
>> Stefan Fuhrmann writes:
>>>> I enabled it, but there's still some issue:
>>>> subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
>>>>
>>> It builds here. Did you run autogen.sh before ./configure?
>> Yep, I did. I tried it several times again; same issue. Is the
>> build.conf broken?
> Hm. Works here (LINUX ggc 4.3.3 and Win32 MS VisualStudio 2010).

Is it linking against the installed libraries by accident?  This can
happen on Debian an older Subversion (that doesn't have that function)
is on the search path (e.g., by virtue of being the --prefix directory).

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> Hm. Works here (LINUX ggc 4.3.3 and Win32 MS VisualStudio 2010).

I'll check this -- must be some issue at my end; saw Daniel's response
too. Thanks.

> That would be awesome! Any weekend would do ;)

:)

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
On 04.10.2010 16:58, Ramkumar Ramachandra wrote:
> Hi Stefan,
Sorry for the late reply. This has been the first weekend
that I didn't spend in office for some time.
> Stefan Fuhrmann writes:
>>> I enabled it, but there's still some issue:
>>> subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
>>>
>> It builds here. Did you run autogen.sh before ./configure?
> Yep, I did. I tried it several times again; same issue. Is the
> build.conf broken?
Hm. Works here (LINUX ggc 4.3.3 and Win32 MS VisualStudio 2010).
>
>> As soon as a larger number of patches got reviewed and merged,
>> I will prepare further changes for integration. So far, nobody had
>> free cycles to do the reviews.
> I'm being stretched really thin myself- I sometimes have to sacrifice
> several hours of sleep to keep up :| I'll try my best but I can't
> promise. Also, there's the additional overhead of having to wait for
> approvals- if I can't pull it off, I'll request a full committer to
> take over.
>
>>> I had the chance to check them out and apply them just now. They work
>>> as expected. I'll submit some (uneducated) patch reviews to the list
>>> and request a merge. I don't have access to the areas your patches
>>> touch.
>> I really appreciate that. It would be great if someone had the time
>> to review the 3 commits to the membuffer cache integration branch.
>> The review should not require too much context knowledge. An
>> in-depth review will take a full day or so (like for an average sized
>> C++ class).
> Thanks for the estimate- Instead of jumping between classes and
> attempting to review it bit-by-bit, I'll try to allocate a Saturday or
> Sunday to this task.
That would be awesome! Any weekend would do ;)

-- Stefan^2.

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> >I enabled it, but there's still some issue:
> >subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
> >
> It builds here. Did you run autogen.sh before ./configure?

Yep, I did. I tried it several times again; same issue. Is the
build.conf broken?

> >>For the MD5 stuff, try r986459 (performance branch).
> >>It should eliminate 1 of the 3 MD5 calculations.
> >Why doesn't STATUS document this and everything else consistently?
> >
> Because there is no simple mapping rev->feature / improvement.
> In particular, there are a number of intermediate steps that were
> replaced by new code later on. There is no point in reviewing
> these earlier revisions but the newer ones can't be reviewed and
> merged on their own. Hence the integration branch for the first
> major change.

Ah, I saw that.

> As soon as a larger number of patches got reviewed and merged,
> I will prepare further changes for integration. So far, nobody had
> free cycles to do the reviews.

I'm being stretched really thin myself- I sometimes have to sacrifice
several hours of sleep to keep up :| I'll try my best but I can't
promise. Also, there's the additional overhead of having to wait for
approvals- if I can't pull it off, I'll request a full committer to
take over.

> >I had the chance to check them out and apply them just now. They work
> >as expected. I'll submit some (uneducated) patch reviews to the list
> >and request a merge. I don't have access to the areas your patches
> >touch.
> I really appreciate that. It would be great if someone had the time
> to review the 3 commits to the membuffer cache integration branch.
> The review should not require too much context knowledge. An
> in-depth review will take a full day or so (like for an average sized
> C++ class).

Thanks for the estimate- Instead of jumping between classes and
attempting to review it bit-by-bit, I'll try to allocate a Saturday or
Sunday to this task.

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
  On 01.10.2010 15:56, Ramkumar Ramachandra wrote:
> Hi Stefan,
>
> Stefan Fuhrmann writes:
>> You obviously don't have APR threads enabled.
>> Thanks for finding this. Fixed in r1003430.
> I enabled it, but there's still some issue:
> subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
>
It builds here. Did you run autogen.sh before ./configure?
>> The server caches almost everything. So, the
>> first ("cold") run demonstrates the worst-case,
>> the second run ("hot") the best case. Real
>> world performance depends on server memory
>> and load.
> Ah. Thanks for the clarification.
>
>> For the merge part: please review the "integrate-membuffer"
>> branch (only 3 patches). You may also review and
>> merge any of the patches listed in /branches/performance/STATUS.
> integrate-cache-membuffer, you mean? Thanks! Your emails contains
> references exactly to the patches I'm looking for :)
>
>> For the MD5 stuff, try r986459 (performance branch).
>> It should eliminate 1 of the 3 MD5 calculations.
> Why doesn't STATUS document this and everything else consistently?
>
Because there is no simple mapping rev->feature / improvement.
In particular, there are a number of intermediate steps that were
replaced by new code later on. There is no point in reviewing
these earlier revisions but the newer ones can't be reviewed and
merged on their own. Hence the integration branch for the first
major change.

As soon as a larger number of patches got reviewed and merged,
I will prepare further changes for integration. So far, nobody had
free cycles to do the reviews.
>> As for the temp files, there are some improvements
>> listed in /branches/performance/STATUS. These would
>> also reduce your "system" CPU time.
> I had the chance to check them out and apply them just now. They work
> as expected. I'll submit some (uneducated) patch reviews to the list
> and request a merge. I don't have access to the areas your patches
> touch.
I really appreciate that. It would be great if someone had the time
to review the 3 commits to the membuffer cache integration branch.
The review should not require too much context knowledge. An
in-depth review will take a full day or so (like for an average sized
C++ class).

-- Stefan^2.

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> You obviously don't have APR threads enabled.
> Thanks for finding this. Fixed in r1003430.

I enabled it, but there's still some issue:
subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'

> The server caches almost everything. So, the
> first ("cold") run demonstrates the worst-case,
> the second run ("hot") the best case. Real
> world performance depends on server memory
> and load.

Ah. Thanks for the clarification.

> For the merge part: please review the "integrate-membuffer"
> branch (only 3 patches). You may also review and
> merge any of the patches listed in /branches/performance/STATUS.

integrate-cache-membuffer, you mean? Thanks! Your emails contains
references exactly to the patches I'm looking for :)

> For the MD5 stuff, try r986459 (performance branch).
> It should eliminate 1 of the 3 MD5 calculations.

Why doesn't STATUS document this and everything else consistently?

> As for the temp files, there are some improvements
> listed in /branches/performance/STATUS. These would
> also reduce your "system" CPU time.

I had the chance to check them out and apply them just now. They work
as expected. I'll submit some (uneducated) patch reviews to the list
and request a merge. I don't have access to the areas your patches
touch.

-- Ram