Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@wandisco.com> on 2014/07/14 01:54:28 UTC

Performance Results on Windows

After almost a week of continuous measurements and 4320
individual data points, the results are now in. Summary:

* on cold disks, 'svn log -v' is 2x..3x as fast with packed f7 as with
  packed f6 in the default config, and up to 8x with advanced options
* on cold disks, 'svn export' with packed f7 is 40%..50% faster in
  the default config and 2x as fast with advanced options
* ra_serf has an anomaly making "fast" config up to 100x slower
  than ra_svn - independent of repo format

Details on how I set up the repositories and measurements, plus
the rationale behind them, can be found in our wiki:

https://wiki.apache.org/subversion/MeasuringRepositoryPerformance

Tools and test script have been committed in r1610264. The system
setup is close to what Ivan used in his tests, except that it has
4GB of RAM to accommodate the much larger BSD repo as well.
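
For anyone who wants to reproduce a rough version of these timings without
the full test script, a minimal harness could look like the sketch below
(Python purely for illustration; the repository URL is a placeholder, and
the real tools from r1610264 do considerably more, e.g. cache-state control):

```python
import subprocess
import time

def time_command(cmd, runs=3):
    """Run cmd several times and return per-run wall-clock seconds.

    The first run approximates a 'cold' measurement (caches not yet
    primed by this command); later runs approximate the 'hot' case.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return timings

# Placeholder URL -- substitute your own test repository:
# print(time_command(["svn", "log", "-v", "-q", "http://localhost/repo"]))
```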

Original measurement log and detailed results can be found in the
.zip file attached. The spreadsheet has been upgraded and now
displays f6/f7 because "+400%" is easier to understand than "-80%".
It now also highlights data cells with particularly high variation.
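
As a side note, the two readings of the same change are easy to confuse;
a tiny, purely illustrative sanity check of why an 80% runtime reduction
equals an f6/f7 ratio of "+400%":

```python
def speedup_percent(t_old, t_new):
    """Speed gain as a percentage of the old speed (ratio t_old / t_new)."""
    return (t_old / t_new - 1.0) * 100.0

def runtime_change_percent(t_old, t_new):
    """The same change expressed as a runtime delta."""
    return (t_new / t_old - 1.0) * 100.0

# f7 finishing in 2s where f6 took 10s:
print(speedup_percent(10.0, 2.0))         # 400.0 -> the "+400%" reading
print(runtime_change_percent(10.0, 2.0))  # about -80.0 -> the "-80%" reading
```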

I'm currently waiting for the 'svnadmin dump' results to come in.
Those will take about 3 days but the expectation is that packed f7
may be slightly slower here due to the "unusual" access pattern.
After that, I'll try to isolate the trigger for the ra_serf anomaly.

-- Stefan^2.

Re: Performance Results on Windows

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Mon, Jul 21, 2014 at 3:26 AM, Stefan Fuhrmann
<st...@wandisco.com> wrote:
> So, it turned out to be a problem in the Subversion libs and
> got fixed by r1611379. When multiple connections to the same
> repo were opened concurrently (ra_serf over loopback), our file
> API workarounds for Windows (retry for up to 10-ish seconds)
> would interact badly with the initialization serialization code we
> use for the "revprop caching" feature.

Good to hear that you sorted this out and that ra_serf wasn't directly
at fault!  =P  -- justin

Re: Performance Results on Windows

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, Jul 15, 2014 at 4:09 AM, Justin Erenkrantz
<ju...@erenkrantz.com> wrote:
> On Mon, Jul 14, 2014 at 9:09 AM, Stefan Fuhrmann
> <st...@wandisco.com> wrote:
>> On the same machine actually (which may be a contributing factor).
>> The client is svn-bench that simply handles the editor drive but
>> discards incoming file contents etc.
>
> We've historically seen oddity in loopback scenarios on Windows - the
> loopback network drivers on Windows have always seemed a bit shaky to
> me.  If/when you get a chance, it's worth seeing if you see it happen
> with a remote box.  -- justin

So, it turned out to be a problem in the Subversion libs and
got fixed by r1611379. When multiple connections to the same
repo were opened concurrently (ra_serf over loopback), our file
API workarounds for Windows (retry for up to 10-ish seconds)
would interact badly with the initialization serialization code we
use for the "revprop caching" feature.

As a result, the feature initialization would fail (which hurts
performance but is not a major problem), and it would also waste 10+
seconds before giving up. This fully explains why runtimes would
fluctuate between e.g. 21, 31 and 42 seconds.
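
To illustrate the failure mode (a scaled-down toy model, not the actual
Subversion code): one connection holds the serialized initialization "too
long", while a second connection's bounded retry loop, standing in for the
Windows file-API workaround, burns its whole budget and then gives up:

```python
import threading
import time

RETRY_BUDGET = 0.2   # stands in for the ~10 s Windows file-API retry window
HOLD_TIME = 1.0      # first connection holds the init lock for "too long"

init_lock = threading.Lock()

def first_connection():
    # Long-running, serialized feature initialization.
    with init_lock:
        time.sleep(HOLD_TIME)

def second_connection(result):
    # Bounded retry loop, like the file-open workaround on Windows.
    start = time.monotonic()
    while time.monotonic() - start < RETRY_BUDGET:
        if init_lock.acquire(blocking=False):
            init_lock.release()
            result["initialized"] = True
            return
        time.sleep(0.01)
    result["initialized"] = False  # gave up: init fails AND time was wasted

result = {}
t1 = threading.Thread(target=first_connection)
t2 = threading.Thread(target=second_connection, args=(result,))
t1.start()
time.sleep(0.02)  # make sure the first connection already holds the lock
t2.start()
t2.join()
t1.join()
print(result)  # {'initialized': False}
```

Depending on how many connections hit that window, each run wastes zero, one
or several full retry budgets, which matches the observed stepped runtimes.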

I'm currently running a few extra tests and will post the final
results once they come in.

-- Stefan^2.

Re: Performance Results on Windows

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Mon, Jul 14, 2014 at 9:09 AM, Stefan Fuhrmann
<st...@wandisco.com> wrote:
> On the same machine actually (which may be a contributing factor).
> The client is svn-bench that simply handles the editor drive but
> discards incoming file contents etc.

We've historically seen oddity in loopback scenarios on Windows - the
loopback network drivers on Windows have always seemed a bit shaky to
me.  If/when you get a chance, it's worth seeing if you see it happen
with a remote box.  -- justin

Re: Performance Results on Windows

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Mon, Jul 14, 2014 at 12:46 PM, Justin Erenkrantz
<ju...@erenkrantz.com> wrote:
> On Sun, Jul 13, 2014 at 7:54 PM, Stefan Fuhrmann
> <st...@wandisco.com> wrote:
>> * ra_serf has an anomaly making "fast" config up to 100x slower
>>   than ra_svn - independent of repo format
> ...
>> After that, I'll try to isolate the trigger for the ra_serf anomaly.
>
> Huh.  Very curious to read more about this when you get to it.

I'm curious as well. It's completely unclear what causes it, as every
individual aspect (OS, cache size, enabled features, RA layer, client
speed, cache state etc.) works well, but somehow the combination
is a problem.

No ad hoc hypothesis seems particularly convincing. Maybe something
to do with timing / connection management (when to open additional
connections etc.).

> Was the client also Windows?

On the same machine actually (which may be a contributing factor).
The client is svn-bench that simply handles the editor drive but
discards incoming file contents etc.

-- Stefan^2.

Re: Performance Results on Windows

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Sun, Jul 13, 2014 at 7:54 PM, Stefan Fuhrmann
<st...@wandisco.com> wrote:
> * ra_serf has an anomaly making "fast" config up to 100x slower
>   than ra_svn - independent of repo format
...
> After that, I'll try to isolate the trigger for the ra_serf anomaly.

Huh.  Very curious to read more about this when you get to it.

Was the client also Windows?

Thanks.  -- justin

Re: Performance Results on Windows

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Aug 13, 2014 at 11:03 PM, Mark Phippard <ma...@gmail.com> wrote:

> On Wed, Aug 13, 2014 at 4:45 PM, Stefan Fuhrmann <
> stefan.fuhrmann@wandisco.com> wrote:
>
>>> It seems that CollabNET and other hosting providers possibly have one
>>> of the worst configurations for the log-addressing feature. Multiple
>>> users access a number of gigabyte-sized repositories over HTTP (so
>>> there is not enough memory for enormous caches).
>>
>>
>> Well, the key would be GB-sized projects (data
>> transferred during export).
>>
>> I don't know how I feel about getting CollabNet involved.
>> My concern is that they simply won't have the time to
>> do it because we are not talking about "please, would
>> you run those 2 commands for me?" but rather a two
>> week effort.
>>
>>
> Not to speak for Ivan, but I think he is simply saying that sites that are
> hosting a significant number of SVN repositories are unlikely to be able to
> benefit from some of the things like the large cache sizes.
>

And that is perfectly fine. I only wanted to make sure that
those caches are actually too small to cover most of the
data. E.g. the SVN project itself is 100MB in a 50GB repo
and can easily be cached.


> I am not against making SVN faster in special controlled situations where
> you can fine tune a server for performance.  Just please do not make it
> slower than it already is for the rest of us that cannot do that.
>

Agreed. Measurements so far indicate that the cold setup
shows no performance regressions. CPU overhead is visible
only when the OS caches are (completely) hot.


> To give an idea from a recent server I am dealing with, there are
> approximately 22K repositories with about 5.5 TB of data.  They are served
> via Apache with prefork MPM.  There is no room for adding some massive RAM
> cache to this, and I doubt it would help anyway.
>

Format 7 is designed to speed up *non-cached*
data access. The recent fine-tuning ensured that it
copes with the small-ish default caches for its temporary
state (e.g. you don't want to read the same directory
over and over again).

To benefit from format 7, you only need to pack your
repositories. Non-packed repos don't see any significant
difference with default settings (<5% slower due to more
data being read for checkout, ~20% faster for log).


> FWIW, if the new format is primarily faster in specific situations, then
> why don't we just not make it the default format and instead make you
> specify a specific option when creating the repository to use the format?
>  Then people can choose to opt-in to the format if they have an environment
> where it will be useful and their server will be tuned accordingly?
>

The specific situation is "packed repo and not all data in
cache". Configuring larger caches etc. will help f7 repos
more than f6 - but that is an added benefit.

People can already choose their repository format version.
Not making format 7 the default is a possibility but does
not change the range of choices that admins have. And since
we don't require people to upgrade their repositories, they
have full control over if, when and where they roll out the
new format.

-- Stefan^2.

Re: Performance Results on Windows

Posted by Mark Phippard <ma...@gmail.com>.
On Wed, Aug 13, 2014 at 4:45 PM, Stefan Fuhrmann <
stefan.fuhrmann@wandisco.com> wrote:

>> It seems that CollabNET and other hosting providers possibly have one
>> of the worst configurations for the log-addressing feature. Multiple
>> users access a number of gigabyte-sized repositories over HTTP (so
>> there is not enough memory for enormous caches).
>
>
> Well, the key would be GB-sized projects (data
> transferred during export).
>
> I don't know how I feel about getting CollabNet involved.
> My concern is that they simply won't have the time to
> do it because we are not talking about "please, would
> you run those 2 commands for me?" but rather a two
> week effort.
>
>
Not to speak for Ivan, but I think he is simply saying that sites that are
hosting a significant number of SVN repositories are unlikely to be able to
benefit from some of the things like the large cache sizes.

I am not against making SVN faster in special controlled situations where
you can fine tune a server for performance.  Just please do not make it
slower than it already is for the rest of us that cannot do that.

To give an idea from a recent server I am dealing with, there are
approximately 22K repositories with about 5.5 TB of data.  They are served
via Apache with prefork MPM.  There is no room for adding some massive RAM
cache to this, and I doubt it would help anyway.

FWIW, if the new format is primarily faster in specific situations, then
why don't we just not make it the default format and instead make you
specify a specific option when creating the repository to use the format?
 Then people can choose to opt-in to the format if they have an environment
where it will be useful and their server will be tuned accordingly?

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Performance Results on Windows

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Aug 13, 2014 at 8:21 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 23 July 2014 17:19, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > Updated and final results:
> >
> > * svnadmin dump results now included, f6 / f7 is +/-20%
> >
> > * fixed anomaly with ra_serf, results consistent with previous findings
> >   'null-export' tests have been rerun and old results been replaced
> >
> > * added a page on how 'null-export' reacts on cache configuration
> >   to get a better picture of how cache size, ra layer and block-read
> >   interact when caches are hot. Data for small or cold caches has
> >   already been covered in other tests.
> >
> I just want to note that your conclusions don't mention the fact that
> there is performance degradation in many cases:
>

Thank you for taking the time to look at the detailed results.


> 1. 'svn log -v' for bsd-nopack repository over svn:// in 'FAST'
> configuration is 51% slower
>

That is basically the rarest system state that you may be in:

* It is a repeated log request (OS caches are hot)
* The servers have been properly configured (caches on and large enough)
* Yet, the SVN caches are cold because you just restarted the server

How often do you restart your server application?

In the "fast" config, you will rarely hit the "only OS
is hot" case on Windows. There is only one large
server process (i.e. the OS caches are not much larger),
and if data can't be served from its caches, it will
often not be found in the OS caches either.

If you take a look at the absolute execution times,
you will see that even at a 90% OS hit rate, the
cold read latency will dominate the total execution
time.

Finally, people who care enough to run a "fast" config
can also be expected to pack their repos when we
tell them that it will boost their performance. They
will then end up with 3.5s (f7 packed) instead of 10.2s
(f6 non-packed) in your edge-case system state.


> 2. 'svn export' for ruby-nopack repository over file://  in 'FAST'
> configuration is 23% slower
>

I assume that you are referring to the "working copy
on server" test case. As you can see, there is a more
than 20% variation (red marker) in the individual
measurements. Simply compare the "hot OS" and
"hot SVN" values - they should be identical for file://.
The undisturbed null-export shows something
like a 5..10% performance loss in the hot case for
basically all configurations and combinations.

Again, f7 becomes faster even over file:// and with
no specific cache settings when you pack your repo
and don't have perfect cache hit rates.


>
> So I ask for unbiased performance tests. As far as I remember, you
> advertised the 2x-10x performance improvement at the hackathon in
> Berlin (which was supposedly already achieved at the time of the
> presentation).


Yes, as a rule of thumb, 2x speed for export and
something like 10x for log -v. I achieved these in
my configurations (packed repo, Linux, local disks
in server). And it is clear that these only apply to
cold reads. SVN cached data is format agnostic.
Looking at the Windows results, they are in line
with those rules of thumb.

And there is now also a svnadmin verify quick-check
option that uncovers external corruption on f7 repos
about 100x faster than a full check (fast disk array
required, YMMV).

I also assumed that Windows might benefit from
things like block-read more than Unix, as we save
on fopen() operations. Later measurements suggested
that block-read mainly improves cold read
throughput and that OS file API overhead is not
an issue on Windows.


> Then we have found (and fixed) several cases of
> performance degradation. But you didn't tell us about these cases. So
> I consider your performance measurements as biased.
>

That is not what I remember. A big discussion started
once I said "repos need to be packed and servers
properly configured". And that exactly implies that all
other cases may perform worse than before.

During the course of the discussion the following
requirements were added:

* There shall be no significant penalty for non-packed f7 vs. f6.
* There shall be no significant penalty for f7 vs. f6 over file://.
* There shall be no significant penalty for f7 vs. f6 in default
  server configurations (mainly cache settings).
* A 20% performance penalty is "significant".

These, plus Windows-specific issues that were found
since then, are what I fixed in the weeks following the
Berlin hackathon. I suspect that those changes hardly
improved the results of my original configuration
(merging index and revision data does not make
a great difference for packed repos).

However, those changes extended the range of
benefiting configurations to virtually every scenario
where we read data from disk. Getting to 2x speed
still requires a specific server setup. People who
care less either don't get hurt (non-packed) or get
a freebie (packed) when upgrading to f7.

> I'm still getting bad numbers in my tests.


That may well be. Have you read my Wiki page yet?
https://wiki.apache.org/subversion/MeasuringRepositoryPerformance

Without taking specific measures, you are pretty
much bound to produce either unfair or unrealistic
test data. I spent a few days just to create the tools
(links into our repo are in the wiki) and methodology
that will work in less controlled environments than
my Linux home server setup.


> But obviously, my tests can
> be considered as a biased too because I am strongly against this
> feature for many reasons. That's why we need an unbiased performance
> test.
>

Well, start with the wiki page and tell me whether
you agree with the methodology. I might be missing
important aspects.


> I think that the unbiased tester should pay attention to ensure both
> the following:
> 1) there is *significant* performance improvement for some realistic cases
>

50% is significant. Cold read is significant.
Hot scenarios are client or network limited
(exception: log over 1Gb network).

> 2) there are no performance degradation in *all the typical Subversion
> usage configurations* (including the worst ones).
>

"No degradation, never" is certainly not a reasonable
requirement. In Berlin, people seemed fine with a
20% penalty *in some cases*. The variation between
individual test runs alone is often higher than that.

I mean, we *know* that f7 repos are 2..5% larger
due to the index data. It is perfectly reasonable to
assume that their performance "baseline" is worse
than f6 by roughly the same amount and some of
the "hot OS" runs show exactly that.

The key is to get significantly faster in many typical
scenarios.


> It seems that CollabNET and other hosting providers possibly have one
> of the worst configurations for the log-addressing feature. Multiple
> users access a number of gigabyte-sized repositories over HTTP (so
> there is not enough memory for enormous caches).


Well, the key would be GB-sized projects (data
transferred during export).

I don't know how I feel about getting CollabNet involved.
My concern is that they simply won't have the time to
do it because we are not talking about "please, would
you run those 2 commands for me?" but rather a two
week effort.


>  Authorization is enabled
> (my tests show that this is important).


Authz is only relevant to the degree that it turns
"log" into "log -v". And it adds a format-independent
constant (reading the file) and proportional overhead
(checking paths) to all ops. The larger the authz file,
the more "blurred" the results will be.


> Apache httpd uses prefork MPM
> (that should eliminate the caches).
>

Only the "hot" cases. It does not eliminate the impact
of e.g. revprop caching during log. The impact of ra_serf
using ~3 processes during export is hard to predict but
some caching should be beneficial (DAG node cache,
in particular).


> Also I'd like to note that your method to achieve the 'Cold' state on
> Windows is totally wrong: you're basically allocating *all* available
> memory with the 'ClearMemory' tool, starving the OS of all resources [1].
>

Is it unfair? It should hurt all repo configurations equally.

How do you know it is starving the OS of all resources?
I simply allocate all free and cache *RAM*. I'm not even
forcing anything to be swapped out or evicted from the
file cache.
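
For reference, the core idea of such a tool can be sketched in a few lines
(a toy, cross-platform stand-in for ClearMemory.cpp: it allocates and
touches a fixed amount instead of querying the OS for the free/cache RAM
figure, which the real tool does via the Windows API):

```python
def consume_memory(n_bytes, chunk=64 * 1024 * 1024):
    """Allocate and touch roughly n_bytes, pressuring the OS to evict
    file-cache pages; dropping the returned list frees it all again."""
    blocks = []
    remaining = n_bytes
    while remaining > 0:
        size = min(chunk, remaining)
        block = bytearray(size)
        # Write one byte per 4 KB page so the pages are actually committed.
        block[::4096] = b"\x01" * len(block[::4096])
        blocks.append(block)
        remaining -= size
    return blocks

# e.g. consume_memory(256 * 1024 * 1024) to displace ~256 MB of cache
```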


> It's an abnormal situation for an operating system. So your 'Cold' state
> results are irrelevant.
>

Well, the alternative is to make sure that between
test cycles (i.e. when we come back to the first repo),
enough data has been read to make the caches cool
down far enough.

Since the total repo size is something like 50GB, I can
rerun the cold tests with an undefined initial cache state.
The results may simply be noisier.

-- Stefan^2.

Re: Performance Results on Windows

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 23 July 2014 17:19, Stefan Fuhrmann <st...@wandisco.com> wrote:
> Updated and final results:
>
> * svnadmin dump results now included, f6 / f7 is +/-20%
>
> * fixed anomaly with ra_serf, results consistent with previous findings
>   'null-export' tests have been rerun and old results been replaced
>
> * added a page on how 'null-export' reacts on cache configuration
>   to get a better picture of how cache size, ra layer and block-read
>   interact when caches are hot. Data for small or cold caches has
>   already been covered in other tests.
>
I just want to note that your conclusions don't mention the fact that
there is performance degradation in many cases:
1. 'svn log -v' for bsd-nopack repository over svn:// in 'FAST'
configuration is 51% slower
2. 'svn export' for ruby-nopack repository over file://  in 'FAST'
configuration is 23% slower

So I ask for unbiased performance tests. As far as I remember, you
advertised the 2x-10x performance improvement at the hackathon in
Berlin (which was supposedly already achieved at the time of the
presentation). Then we have found (and fixed) several cases of
performance degradation. But you didn't tell us about these cases. So
I consider your performance measurements as biased.

I'm still getting bad numbers in my tests. But obviously, my tests can
be considered biased too because I am strongly against this
feature for many reasons. That's why we need an unbiased performance
test.

I think that an unbiased tester should pay attention to ensure both
of the following:
1) there is a *significant* performance improvement for some realistic cases
2) there is no performance degradation in *all the typical Subversion
usage configurations* (including the worst ones).

It seems that CollabNET and other hosting providers possibly have one
of the worst configurations for the log-addressing feature. Multiple
users access a number of gigabyte-sized repositories over HTTP (so
there is not enough memory for enormous caches).  Authorization is
enabled (my tests show that this is important). Apache httpd uses
prefork MPM (which should eliminate the caches).

Also I'd like to note that your method to achieve the 'Cold' state on
Windows is totally wrong: you're basically allocating *all* available
memory with the 'ClearMemory' tool, starving the OS of all resources [1].
It's an abnormal situation for an operating system. So your 'Cold' state
results are irrelevant.

[1] https://svn.apache.org/repos/asf/subversion/trunk/tools/dev/benchmarks/RepoPerf/ClearMemory.cpp

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: Performance Results on Windows

Posted by Stefan Fuhrmann <st...@wandisco.com>.
Updated and final results:

* svnadmin dump results now included, f6 / f7 is +/-20%

* fixed anomaly with ra_serf, results consistent with previous findings;
  'null-export' tests have been rerun and the old results replaced

* added a page on how 'null-export' reacts to cache configuration,
  to get a better picture of how cache size, RA layer and block-read
  interact when caches are hot. Data for small or cold caches has
  already been covered in other tests.

-- Stefan^2.

On Mon, Jul 14, 2014 at 1:54 AM, Stefan Fuhrmann
<st...@wandisco.com> wrote:
> After almost a week of continuous measurements and 4320
> individual data points, the results are now in. Summary:
>
> * on cold disks, 'svn log -v' is 2x..3x as fast with packed f7 as with
>   packed f6 in the default config, and up to 8x with advanced options
> * on cold disks, 'svn export' with packed f7 is 40%..50% faster in
>   the default config and 2x as fast with advanced options
> * ra_serf has an anomaly making "fast" config up to 100x slower
>   than ra_svn - independent of repo format
>
> Details to how I set up the repositories and measurements plus
> the rationale behind it can be found in our wiki:
>
> https://wiki.apache.org/subversion/MeasuringRepositoryPerformance
>
> Tools and test script have been committed in r1610264. System
> setup is close to what Ivan used in his tests, except for having
> 4GB to accommodate the much larger BSD repo as well.
>
> Original measurement log and detailed results can be found in the
> .zip file attached. The spreadsheet has been upgraded and now
> displays f6/f7 because "+400%" is easier to understand than "-80%".
> It now also highlights data cells with particularly high variation.
>
> I'm currently waiting for the 'svnadmin dump' results to come in.
> Those will take about 3 days but the expectation is that packed f7
> may be slightly slower here due to the "unusual" access pattern.
> After that, I'll try to isolate the trigger for the ra_serf anomaly.
>
> -- Stefan^2.