Posted to dev@subversion.apache.org by Ramkumar Ramachandra <ar...@gmail.com> on 2010/09/27 19:44:10 UTC

Some tips on profiling

Hi Stefan,

Could you tell me which tools you use to profile the various
applications in trunk? I'm looking to profile svnrdump to fix some
perf issues, but OProfile doesn't seem to work for me.

Thanks.

-- Ram

Re: Some tips on profiling

Posted by Daniel Näslund <da...@longitudo.com>.
On Thu, Sep 30, 2010 at 07:58:45AM +0530, Ramkumar Ramachandra wrote:
> Stefan Fuhrmann writes:

> > At least the results are much more useful when there is
> > a tool like Kcachegrind that allows easy navigation through
> > the huge amount of information that was gathered.
> 
> Yep! The visualizer is quite awesome :)

On a tangential note, I spotted OprofileUI [1] the other day when
googling for a graphical frontend to OProfile. I haven't tested it
myself.

[1] http://labs.o-hand.com/oprofileui/ 

Daniel

Re: Some tips on profiling

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Sun, Oct 24, 2010 at 20:55:04 +0200:
> On 04.10.2010 16:58, Ramkumar Ramachandra wrote:
>> Hi Stefan,
> Sorry for the late reply. This is the first weekend in some time that
> I haven't spent in the office.
>> Stefan Fuhrmann writes:
>>>> I enabled it, but there's still some issue:
>>>> subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
>>>>
>>> It builds here. Did you run autogen.sh before ./configure?
>> Yep, I did. I tried it several times again; same issue. Is the
>> build.conf broken?
> Hm. Works here (Linux gcc 4.3.3 and Win32 MS VisualStudio 2010).

Is it linking against the installed libraries by accident?  This can
happen on Debian when an older Subversion (that doesn't have that function)
is on the search path (e.g., by virtue of being the --prefix directory).
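
One quick way to check (the paths below are only illustrative, and I'm
assuming the new symbol lives in libsvn_fs, a typical libtool build tree,
and a Debian-style install location):

# does the freshly built library export the new symbol?
nm -D subversion/libsvn_fs/.libs/libsvn_fs-1.so | grep svn_fs_get_cache_config

# does the installed copy that might be shadowing it?
nm -D /usr/lib/libsvn_fs-1.so.1 | grep svn_fs_get_cache_config

If the first command prints the symbol and the second one doesn't, the
link is most likely resolving against the installed copy.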

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> Hm. Works here (Linux gcc 4.3.3 and Win32 MS VisualStudio 2010).

I'll check this -- must be some issue at my end; saw Daniel's response
too. Thanks.

> That would be awesome! Any weekend would do ;)

:)

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
On 04.10.2010 16:58, Ramkumar Ramachandra wrote:
> Hi Stefan,
Sorry for the late reply. This is the first weekend in some time that
I haven't spent in the office.
> Stefan Fuhrmann writes:
>>> I enabled it, but there's still some issue:
>>> subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
>>>
>> It builds here. Did you run autogen.sh before ./configure?
> Yep, I did. I tried it several times again; same issue. Is the
> build.conf broken?
Hm. Works here (Linux gcc 4.3.3 and Win32 MS VisualStudio 2010).
>
>> As soon as a larger number of patches have been reviewed and merged,
>> I will prepare further changes for integration. So far, nobody had
>> free cycles to do the reviews.
> I'm being stretched really thin myself- I sometimes have to sacrifice
> several hours of sleep to keep up :| I'll try my best but I can't
> promise. Also, there's the additional overhead of having to wait for
> approvals- if I can't pull it off, I'll request a full committer to
> take over.
>
>>> I had the chance to check them out and apply them just now. They work
>>> as expected. I'll submit some (uneducated) patch reviews to the list
>>> and request a merge. I don't have access to the areas your patches
>>> touch.
>> I really appreciate that. It would be great if someone had the time
>> to review the 3 commits to the membuffer cache integration branch.
>> The review should not require too much context knowledge. An
>> in-depth review will take a full day or so (like for an average sized
>> C++ class).
> Thanks for the estimate- Instead of jumping between classes and
> attempting to review it bit-by-bit, I'll try to allocate a Saturday or
> Sunday to this task.
That would be awesome! Any weekend would do ;)

-- Stefan^2.

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> >I enabled it, but there's still some issue:
> >subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
> >
> It builds here. Did you run autogen.sh before ./configure?

Yep, I did. I tried it several times again; same issue. Is the
build.conf broken?

> >>For the MD5 stuff, try r986459 (performance branch).
> >>It should eliminate 1 of the 3 MD5 calculations.
> >Why doesn't STATUS document this and everything else consistently?
> >
> Because there is no simple mapping rev->feature / improvement.
> In particular, there are a number of intermediate steps that were
> replaced by new code later on. There is no point in reviewing
> these earlier revisions but the newer ones can't be reviewed and
> merged on their own. Hence the integration branch for the first
> major change.

Ah, I saw that.

> As soon as a larger number of patches have been reviewed and merged,
> I will prepare further changes for integration. So far, nobody had
> free cycles to do the reviews.

I'm being stretched really thin myself- I sometimes have to sacrifice
several hours of sleep to keep up :| I'll try my best but I can't
promise. Also, there's the additional overhead of having to wait for
approvals- if I can't pull it off, I'll request a full committer to
take over.

> >I had the chance to check them out and apply them just now. They work
> >as expected. I'll submit some (uneducated) patch reviews to the list
> >and request a merge. I don't have access to the areas your patches
> >touch.
> I really appreciate that. It would be great if someone had the time
> to review the 3 commits to the membuffer cache integration branch.
> The review should not require too much context knowledge. An
> in-depth review will take a full day or so (like for an average sized
> C++ class).

Thanks for the estimate- Instead of jumping between classes and
attempting to review it bit-by-bit, I'll try to allocate a Saturday or
Sunday to this task.

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
  On 01.10.2010 15:56, Ramkumar Ramachandra wrote:
> Hi Stefan,
>
> Stefan Fuhrmann writes:
>> You obviously don't have APR threads enabled.
>> Thanks for finding this. Fixed in r1003430.
> I enabled it, but there's still some issue:
> subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'
>
It builds here. Did you run autogen.sh before ./configure?
>> The server caches almost everything. So, the
>> first ("cold") run demonstrates the worst-case,
>> the second run ("hot") the best case. Real
>> world performance depends on server memory
>> and load.
> Ah. Thanks for the clarification.
>
>> For the merge part: please review the "integrate-membuffer"
>> branch (only 3 patches). You may also review and
>> merge any of the patches listed in /branches/performance/STATUS.
> integrate-cache-membuffer, you mean? Thanks! Your email contains
> references to exactly the patches I'm looking for :)
>
>> For the MD5 stuff, try r986459 (performance branch).
>> It should eliminate 1 of the 3 MD5 calculations.
> Why doesn't STATUS document this and everything else consistently?
>
Because there is no simple mapping rev->feature / improvement.
In particular, there are a number of intermediate steps that were
replaced by new code later on. There is no point in reviewing
these earlier revisions but the newer ones can't be reviewed and
merged on their own. Hence the integration branch for the first
major change.

As soon as a larger number of patches have been reviewed and merged,
I will prepare further changes for integration. So far, nobody had
free cycles to do the reviews.
>> As for the temp files, there are some improvements
>> listed in /branches/performance/STATUS. These would
>> also reduce your "system" CPU time.
> I had the chance to check them out and apply them just now. They work
> as expected. I'll submit some (uneducated) patch reviews to the list
> and request a merge. I don't have access to the areas your patches
> touch.
I really appreciate that. It would be great if someone had the time
to review the 3 commits to the membuffer cache integration branch.
The review should not require too much context knowledge. An
in-depth review will take a full day or so (like for an average sized
C++ class).

-- Stefan^2.

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> You obviously don't have APR threads enabled.
> Thanks for finding this. Fixed in r1003430.

I enabled it, but there's still some issue:
subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config'

> The server caches almost everything. So, the
> first ("cold") run demonstrates the worst-case,
> the second run ("hot") the best case. Real
> world performance depends on server memory
> and load.

Ah. Thanks for the clarification.

> For the merge part: please review the "integrate-membuffer"
> branch (only 3 patches). You may also review and
> merge any of the patches listed in /branches/performance/STATUS.

integrate-cache-membuffer, you mean? Thanks! Your email contains
references to exactly the patches I'm looking for :)

> For the MD5 stuff, try r986459 (performance branch).
> It should eliminate 1 of the 3 MD5 calculations.

Why doesn't STATUS document this and everything else consistently?

> As for the temp files, there are some improvements
> listed in /branches/performance/STATUS. These would
> also reduce your "system" CPU time.

I had the chance to check them out and apply them just now. They work
as expected. I'll submit some (uneducated) patch reviews to the list
and request a merge. I don't have access to the areas your patches
touch.

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
  On 30.09.2010 04:28, Ramkumar Ramachandra wrote:
> Hi Stefan,
>
> Stefan Fuhrmann writes:
>> My measurements seem to support what Philip wrote:
>> The expensive part is run on the server. Even with my
>> optimized server, the svnrdump CPU usage is less than
>> the time taken by the server. Some numbers (hot file
>> cache):
>>
>> svnadmin dump
>>      1.7 trunk 70s real  66s user 4s system
>>      perf-branch 30s real 28s user 2s system
>>
>> 1.7 trunk svnrdump
>>      ra-local 88s real 81s user 7s system
>>      svn: (1.7 trunk) 99s real 6s user 4s system
>>      svn: (perf-branch, cold)  72s real 5s user 6s system
>>      svn: (perf-branch, hot)  17s real 5s user 5s system
>>
>> Thus, svnrdump is slower only for ra-local where it is
>> of no particular use in the first place. To really speed
>> things up, the caching infrastructure from the performance
>> branch should be merged into /trunk.
> Wow, that looks awesome. Yeah, Daniel suggested that I run svnrdump
> against your perf branch yesterday, but I wasn't able to get your
> branch to build:
> subversion/libsvn_subr/svn_file_handle_cache.c:890: error: 'svn_file_handle_cache_t' has no member named 'mutex'
You obviously don't have APR threads enabled.
Thanks for finding this. Fixed in r1003430.
> What exactly are the changes that "hot" introduces- can you point me
> to the specific revision numbers that we intend to merge?
The server caches almost everything. So, the
first ("cold") run demonstrates the worst-case,
the second run ("hot") the best case. Real
world performance depends on server memory
and load.

For the merge part: please review the "integrate-membuffer"
branch (only 3 patches). You may also review and
merge any of the patches listed in /branches/performance/STATUS.
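
(Assuming the usual ASF repository layout and the full branch name, the
commits on that branch can be listed with something like:)

# show only the changes made on the integration branch itself
svn log -v --stop-on-copy \
    https://svn.apache.org/repos/asf/subversion/branches/integrate-cache-membuffer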

>>> @Stefan: Thoughts on hacking APR hashtables directly?
>>>
>> Are you sure?! Which operation is the most expensive one
>> and how often is it called? Who calls it and why?
> My bad. I got muddled up when using ra_local: when I saw lots of MD5
> functions, I assumed it had to do something with the hashtable since
> the checksum was calculated server-side. The profiler profiled both
> server-side and client-side. Over svn://, I got a much cleaner
> output. The new results indicate:
> 1. Server-client IO is the major bottleneck, as Philip and you pointed
>     out.
> 2. On the client side, the major bottleneck is the IO during textdelta
>     application. David (CC'ed now) and I are trying some experiments
>     with a single temporary file. Thoughts?
For the MD5 stuff, try r986459 (performance branch).
It should eliminate 1 of the 3 MD5 calculations.
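
(Again assuming the usual ASF layout, the change can be pulled up as a
plain diff for review:)

svn diff -c 986459 \
    https://svn.apache.org/repos/asf/subversion/branches/performance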

As for the temp files, there are some improvements
listed in /branches/performance/STATUS. These would
also reduce your "system" CPU time.

-- Stefan^2.

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> My measurements seem to support what Philip wrote:
> The expensive part is run on the server. Even with my
> optimized server, the svnrdump CPU usage is less than
> the time taken by the server. Some numbers (hot file
> cache):
> 
> svnadmin dump
>     1.7 trunk 70s real  66s user 4s system
>     perf-branch 30s real 28s user 2s system
> 
> 1.7 trunk svnrdump
>     ra-local 88s real 81s user 7s system
>     svn: (1.7 trunk) 99s real 6s user 4s system
>     svn: (perf-branch, cold)  72s real 5s user 6s system
>     svn: (perf-branch, hot)  17s real 5s user 5s system
> 
> Thus, svnrdump is slower only for ra-local where it is
> of no particular use in the first place. To really speed
> things up, the caching infrastructure from the performance
> branch should be merged into /trunk.

Wow, that looks awesome. Yeah, Daniel suggested that I run svnrdump
against your perf branch yesterday, but I wasn't able to get your
branch to build:
subversion/libsvn_subr/svn_file_handle_cache.c:890: error: 'svn_file_handle_cache_t' has no member named 'mutex'

What exactly are the changes that "hot" introduces- can you point me
to the specific revision numbers that we intend to merge?

> >@Stefan: Thoughts on hacking APR hashtables directly?
> >
> Are you sure?! Which operation is the most expensive one
> and how often is it called? Who calls it and why?

My bad. I got muddled up when using ra_local: when I saw lots of MD5
functions, I assumed it had to do something with the hashtable since
the checksum was calculated server-side. The profiler profiled both
server-side and client-side. Over svn://, I got a much cleaner
output. The new results indicate:
1. Server-client IO is the major bottleneck, as Philip and you pointed
   out.
2. On the client side, the major bottleneck is the IO during textdelta
   application. David (CC'ed now) and I are trying some experiments
   with a single temporary file. Thoughts?

> At least the results are much more useful when there is
> a tool like Kcachegrind that allows easy navigation through
> the huge amount of information that was gathered.

Yep! The visualizer is quite awesome :)

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
  On 29.09.2010 10:45, Ramkumar Ramachandra wrote:
> Hi Philip,
>
> Philip Martin writes:
>> The performance of svnrdump is likely to be dominated by IO from the
>> repository, network or disk depending on the RA layer.  strace is a
>> useful tool to see opens/reads/writes.  You can see what order the
>> calls occur, how many there are, how big they are and how long they
>> take.
> Ah, thanks for the tip.
My measurements seem to support what Philip wrote:
The expensive part is run on the server. Even with my
optimized server, the svnrdump CPU usage is less than
the time taken by the server. Some numbers (hot file
cache):

svnadmin dump
     1.7 trunk 70s real  66s user 4s system
     perf-branch 30s real 28s user 2s system

1.7 trunk svnrdump
     ra-local 88s real 81s user 7s system
     svn: (1.7 trunk) 99s real 6s user 4s system
     svn: (perf-branch, cold)  72s real 5s user 6s system
     svn: (perf-branch, hot)  17s real 5s user 5s system

Thus, svnrdump is slower only for ra-local where it is
of no particular use in the first place. To really speed
things up, the caching infrastructure from the performance
branch should be merged into /trunk.

>> Valgrind/Callgrind is good and doesn't require you to instrument the
>> code, but it does help to build with debug information.  It does
>> impose a massive runtime overhead.
> I don't mind -- I'm mostly using some remote machines to gather the
> profiling data :)
>
>> This is what I get when dumping 1000 revisions from a local mirror of
>> the Subversion repository over ra_neon:
>>
>> CPU: Core 2, speed 1200 MHz (estimated)
>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
>> samples  %        app name                 symbol name
>> 4738     41.1893  no-vmlinux               (no symbols)
>> 1037      9.0150  libxml2.so.2.6.32        (no symbols)
>> 700       6.0854  libneon.so.27.1.2        (no symbols)
>> 238       2.0690  libc-2.7.so              _int_malloc
>> 228       1.9821  libc-2.7.so              memcpy
>> 221       1.9212  libc-2.7.so              memset
>> 217       1.8865  libc-2.7.so              strlen
>> 191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
>> 180       1.5648  libc-2.7.so              vfprintf
>> 171       1.4866  libc-2.7.so              strcmp
>> 153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
>> 134       1.1649  libapr-1.so.0.2.12       apr_vformatter
>> 130       1.1301  libapr-1.so.0.2.12       apr_palloc
>>
>> That's on my Debian desktop.  At the recent Apache Retreat I tried to
>> demonstrate OProfile on my Ubuntu laptop and could not get it to work
>> properly, probably because I forgot about -fno-omit-frame-pointer.
> Ah, now I see why it didn't work for me. The data from Callgrind is
> very interesting- it seems to suggest that APR hashtables are
> prohibitively expensive.
>
> @Stefan: Thoughts on hacking APR hashtables directly?
>
Are you sure?! Which operation is the most expensive one
and how often is it called? Who calls it and why?

>> Finally there is traditional gprof.  It's a long time since I used it
>> so I don't remember the details.  You instrument the code at compile
>> time using CFLAGS=-pg.  If an instrumented function foo calls into a
>> library bar that is not instrumented then bar is invisible, all you
>> see is how long foo took to execute.
> Yes, I used gprof initially. Callgrind is WAY more useful.
>
At least the results are much more useful when there is
a tool like Kcachegrind that allows easy navigation through
the huge amount of information that was gathered.

-- Stefan^2.

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Philip,

Philip Martin writes:
> The performance of svnrdump is likely to be dominated by IO from the
> repository, network or disk depending on the RA layer.  strace is a
> useful tool to see opens/reads/writes.  You can see what order the
> calls occur, how many there are, how big they are and how long they
> take.

Ah, thanks for the tip.

> Valgrind/Callgrind is good and doesn't require you to instrument the
> code, but it does help to build with debug information.  It does
> impose a massive runtime overhead.

I don't mind -- I'm mostly using some remote machines to gather the
profiling data :)

> This is what I get when dumping 1000 revisions from a local mirror of
> the Subversion repository over ra_neon:
> 
> CPU: Core 2, speed 1200 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
> samples  %        app name                 symbol name
> 4738     41.1893  no-vmlinux               (no symbols)
> 1037      9.0150  libxml2.so.2.6.32        (no symbols)
> 700       6.0854  libneon.so.27.1.2        (no symbols)
> 238       2.0690  libc-2.7.so              _int_malloc
> 228       1.9821  libc-2.7.so              memcpy
> 221       1.9212  libc-2.7.so              memset
> 217       1.8865  libc-2.7.so              strlen
> 191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
> 180       1.5648  libc-2.7.so              vfprintf
> 171       1.4866  libc-2.7.so              strcmp
> 153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
> 134       1.1649  libapr-1.so.0.2.12       apr_vformatter
> 130       1.1301  libapr-1.so.0.2.12       apr_palloc
> 
> That's on my Debian desktop.  At the recent Apache Retreat I tried to
> demonstrate OProfile on my Ubuntu laptop and could not get it to work
> properly, probably because I forgot about -fno-omit-frame-pointer.

Ah, now I see why it didn't work for me. The data from Callgrind is
very interesting- it seems to suggest that APR hashtables are
prohibitively expensive.

@Stefan: Thoughts on hacking APR hashtables directly?

> Finally there is traditional gprof.  It's a long time since I used it
> so I don't remember the details.  You instrument the code at compile
> time using CFLAGS=-pg.  If an instrumented function foo calls into a
> library bar that is not instrumented then bar is invisible, all you
> see is how long foo took to execute.

Yes, I used gprof initially. Callgrind is WAY more useful.

-- Ram

Re: Some tips on profiling

Posted by Philip Martin <ph...@wandisco.com>.
Stefan Fuhrmann <st...@alice-dsl.de> writes:

>  On 27.09.2010 21:44, Ramkumar Ramachandra wrote:
>> Could you tell me which tools you use to profile the various
>> applications in trunk? I'm looking to profile svnrdump to fix some
>> perf issues, but OProfile doesn't seem to work for me.
>
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

The performance of svnrdump is likely to be dominated by IO from the
repository, network or disk depending on the RA layer.  strace is a
useful tool to see opens/reads/writes.  You can see what order the
calls occur, how many there are, how big they are and how long they
take.
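
For example, a quick first pass could look like this (the flags are stock
strace; the command and repository path are only illustrative):

# summarise syscall counts and the time spent in each
strace -c -f subversion/svnrdump/svnrdump dump file:///path/to/repo > /dev/null

# or log individual calls with per-call timing and timestamps
strace -f -T -tt -e trace=open,read,write,lseek -o strace.log \
    subversion/svnrdump/svnrdump dump file:///path/to/repo > /dev/null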

Valgrind/Callgrind is good and doesn't require you to instrument the
code, but it does help to build with debug information.  It does
impose a massive runtime overhead.

OProfile will give you CPU usage with far lower overhead than
valgrind/callgrind.  Like valgrind/callgrind you don't need to
instrument the code but it works better with debug information and
with modern gcc if you use -O2 then you need -fno-omit-frame-pointer
for the callgraph stuff to work.  I use it like so:

opcontrol --init
opcontrol --no-vmlinux --separate=library --callgraph=32
opcontrol --start
opcontrol --reset
subversion/svnrdump/svnrdump ...
opcontrol --stop
opcontrol --dump
opreport --merge all -l image:/path/to/lt-svnrdump

This is what I get when dumping 1000 revisions from a local mirror of
the Subversion repository over ra_neon:

CPU: Core 2, speed 1200 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        app name                 symbol name
4738     41.1893  no-vmlinux               (no symbols)
1037      9.0150  libxml2.so.2.6.32        (no symbols)
700       6.0854  libneon.so.27.1.2        (no symbols)
238       2.0690  libc-2.7.so              _int_malloc
228       1.9821  libc-2.7.so              memcpy
221       1.9212  libc-2.7.so              memset
217       1.8865  libc-2.7.so              strlen
191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
180       1.5648  libc-2.7.so              vfprintf
171       1.4866  libc-2.7.so              strcmp
153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
134       1.1649  libapr-1.so.0.2.12       apr_vformatter
130       1.1301  libapr-1.so.0.2.12       apr_palloc

That's on my Debian desktop.  At the recent Apache Retreat I tried to
demonstrate OProfile on my Ubuntu laptop and could not get it to work
properly, probably because I forgot about -fno-omit-frame-pointer.
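
(For completeness: the flag can simply be passed at configure time; a
sketch of the usual in-tree build:)

./autogen.sh
./configure CFLAGS="-g -O2 -fno-omit-frame-pointer"
make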

Finally there is traditional gprof.  It's a long time since I used it
so I don't remember the details.  You instrument the code at compile
time using CFLAGS=-pg.  If an instrumented function foo calls into a
library bar that is not instrumented then bar is invisible, all you
see is how long foo took to execute.
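
Roughly, the workflow looks like this (the binary path is illustrative;
gmon.out is written into whatever directory the program was run from):

# build with profiling instrumentation
./configure CFLAGS="-g -pg"
make

# run the program; it writes gmon.out on exit
subversion/svnrdump/svnrdump dump file:///path/to/repo > /dev/null

# flat profile plus call graph, covering the instrumented code only
gprof subversion/svnrdump/.libs/lt-svnrdump gmon.out | less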

-- 
Philip

Re: Some tips on profiling

Posted by Ramkumar Ramachandra <ar...@gmail.com>.
Hi Stefan,

Stefan Fuhrmann writes:
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

Ah, they're brilliant! Just what I was looking for :D I'm yet to
interpret the output and use it to optimize svnrdump though- I
might need some more help there; are you available on #svn-dev or
something sometime? Will keep you updated on this either way.

> Under Windows, I'm using the VisualStudio sampling profiler. The
> measurements are pretty accurate and the overhead is low. It does not
> tell me how often a certain code path has been executed. Due to the low
> overhead, it is well-suited for long-running (>1 min) operations.

Interesting. I didn't know Visual Studio had a profiler. Then again,
I've not used Windows for several years now.

> Also, I find a simple "time <command>" very useful to get a first impression
> whether my code is bound by user-runtime, I/O wait or system runtime.

Ah, I want to know how to interpret this correctly. For equivalent
operations
svnrdump says: 7.49s user 1.34s system 91% cpu 9.682 total
svnadmin says: 4.84s user 0.44s system 97% cpu 5.417 total

> To collect data on I/O activity, compression rates and other not readily
> available information, I use simple static counters and printf() their
> values at the end of each client call, for instance.

I see.

> Hope that helps.

A ton! Thanks a lot :)

-- Ram

Re: Some tips on profiling

Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
  On 27.09.2010 21:44, Ramkumar Ramachandra wrote:
> Hi Stefan,
>
> Could you tell me which tools you use to profile the various
> applications in trunk? I'm looking to profile svnrdump to fix some
> perf issues, but OProfile doesn't seem to work for me.
>
> Thanks.
>
> -- Ram

Hi Ram,

Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
That gives me a good idea of what code gets executed too often, how
often a jump (loop or condition) has been taken etc. It will not show the
non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
execution by a factor of 100 (YMMV).
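
A typical run, just for reference (the command and URL are illustrative):

# record instruction counts, call counts and jump statistics
valgrind --tool=callgrind --collect-jumps=yes \
    svnrdump dump svn://localhost/repo > /dev/null

# browse the resulting callgrind.out.<pid> file interactively
kcachegrind callgrind.out.*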

Under Windows, I'm using the VisualStudio sampling profiler. The
measurements are pretty accurate and the overhead is low. It does not tell
me how often a certain code path has been executed. Due to the low
overhead, it is well-suited for long-running (>1 min) operations.

Also, I find a simple "time <command>" very useful to get a first impression
whether my code is bound by user-runtime, I/O wait or system runtime.
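
For instance (command and URL illustrative; the interesting part is how
the three numbers relate to each other):

# real >> user + sys  -> mostly waiting on I/O or the network
# user dominates      -> CPU-bound in our own code
# sys is large        -> lots of kernel work, e.g. temp file handling
time svnrdump dump svn://localhost/repo > /dev/null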

To collect data on I/O activity, compression rates and other not readily
available information, I use simple static counters and printf() their
values at the end of each client call, for instance.

Hope that helps.

-- Stefan^2.