Posted to dev@subversion.apache.org by Evgeny Kotkov <ev...@visualsvn.com> on 2017/12/05 21:05:24 UTC

Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Julian Foad <ju...@apache.org> writes:

> After any issues raised in this discussion are resolved, we feel we should
> go ahead and produce RC1 as soon as possible.

I think that I am seeing a 1.10 regression in terms of httpd's memory
usage during large imports.

In my environment, when I `svn import` an unpacked version of Boost
(https://www.boost.org/) that consists of around 60,000 files, I see that
the memory consumption by the server process rapidly grows up to 1.5 GB.
The environment is a Windows 8.1 x64 machine with httpd 2.4.29 configured
to use short-circuit authz and HTTPS.  Apparently, this behavior is specific
to 1.10, as I cannot reproduce it with 1.9 binaries.

 (I haven't investigated the issue any further at this time, and it may well
  be reproducible in other configurations.)


Thanks,
Evgeny Kotkov

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 05:15:46PM +0300, Evgeny Kotkov wrote:
> Unless I am missing something, this might be worth considering before the
> 1.10 GA release.

Evgeny, you were entirely right about calling this out as a release blocker.
I am sorry for having suggested otherwise.

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 07:21:15PM +0100, Stefan Sperling wrote:
> I am just questioning the usefulness of halting the presses and restarting
> the soak for another month for something that isn't a security / data
> corruption issue. I anticipate that problems of similar severity to this
> one will still be discovered after we release 1.10.0, regardless of
> whether we release 1.10.0 at end of March or later.
> 
> Though maybe my idea of the impact of this bug is wrong?
> If this really makes some repositories entirely unusable with authz enabled
> then of course it should be considered a blocker. Is it this severe?

I have misread our flowchart at:
http://subversion.apache.org/docs/community-guide/svn-soak-management.png

It seems that for this kind of issue we'd extend the soak by just one week
instead of four? I wouldn't mind a one-week extension for this kind of bug
fix; one week seems reasonable.

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Branko Čibej <br...@apache.org>.
On 02.03.2018 19:21, Stefan Sperling wrote:
> I am just questioning the usefulness of halting the presses and restarting
> the soak for another month for something that isn't a security / data
> corruption issue.

It's a potential DoS that only needs read access. Falls under security
by my definition.

-- Brane


Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 09:02:02PM +0300, Evgeny Kotkov wrote:
> Stefan Sperling <st...@elego.de> writes:
> 
> > I'd rather ship 1.10.0 at the prospected release date followed closely
> > by 1.10.1 to fix bugs such as these, than delay general access to 1.10.0
> > even further.
> 
> While I do not have significant objections against such a plan, I find the
> idea of shipping a performance feature that causes a massive slowdown
> instead of an improvement somewhat controversial.  (In other words,
> I am -0 for that.)
> 
> > You may not have realized this, but I have been waiting for 1.10.0 to
> > happen for *over a year* https://svn.haxx.se/dev/archive-2017-01/0043.shtml
> > For all this time, I have wanted the conflict resolver to get real world
> > exposure because I need feedback from users out there to improve it.
> > I kept mostly quiet because I didn't want to push too hard for this
> > release all by myself because of the relatively high share of burden
> > this would imply. So I waited for activity from the community to make
> > it happen as a true collective effort.
> 
> Not too sure about how this is connected to the soak period and to the
> release process — speaking of which, I would say that your e-mail may
> discourage people from reporting issues during the soak period.

I am not trying to discourage people from reporting and fixing problems.
I am sorry if what I wrote could be interpreted in this way.

I am just questioning the usefulness of halting the presses and restarting
the soak for another month for something that isn't a security / data
corruption issue. I anticipate that problems of similar severity to this
one will still be discovered after we release 1.10.0, regardless of
whether we release 1.10.0 at end of March or later.

Though maybe my idea of the impact of this bug is wrong?
If this really makes some repositories entirely unusable with authz enabled
then of course it should be considered a blocker. Is it this severe?

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Stefan Sperling <st...@elego.de> writes:

> I'd rather ship 1.10.0 at the prospected release date followed closely
> by 1.10.1 to fix bugs such as these, than delay general access to 1.10.0
> even further.

While I do not have significant objections against such a plan, I find the
idea of shipping a performance feature that causes a massive slowdown
instead of an improvement somewhat controversial.  (In other words,
I am -0 for that.)

> You may not have realized this, but I have been waiting for 1.10.0 to
> happen for *over a year* https://svn.haxx.se/dev/archive-2017-01/0043.shtml
> For all this time, I have wanted the conflict resolver to get real world
> exposure because I need feedback from users out there to improve it.
> I kept mostly quiet because I didn't want to push too hard for this
> release all by myself because of the relatively high share of burden
> this would imply. So I waited for activity from the community to make
> it happen as a true collective effort.

Not too sure about how this is connected to the soak period and to the
release process — speaking of which, I would say that your e-mail may
discourage people from reporting issues during the soak period.

> If this one bug really bothers you enough to hold the planned release back
> it makes me wonder why you didn't push for a fix much earlier. We have had
> plenty of time.

I haven't been and am not pushing for a fix.  Rather, I have just included
the additional information about the problem with a comment that it might be
worth looking into before the GA release.

Moreover, I reported the issue the very moment I found it, with an edge-case
reproduction.  When I was asked to bisect for a specific revision, I probably
should have stated that I wouldn't have time to do that, but I thought I would
be able to find some.  When I stumbled across the issue again, I found the
revision and a simple reproduction -- but, as it seems, this wasn't the most
appropriate time to include these new details.

Putting all that aside, I wouldn't say that it is productive to discuss issues
in such a way.  In my opinion, we should be doing it the other way around, by
actually encouraging reports of various problems during the soak period.


Regards,
Evgeny Kotkov

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 05:15:46PM +0300, Evgeny Kotkov wrote:
> Unless I am missing something, this might be worth considering before the
> 1.10 GA release.

Not about the actual bug, just a meta comment on the process:

I'd rather ship 1.10.0 at the prospected release date followed closely
by 1.10.1 to fix bugs such as these, than delay general access to 1.10.0
even further.

You may not have realized this, but I have been waiting for 1.10.0 to
happen for *over a year* https://svn.haxx.se/dev/archive-2017-01/0043.shtml
For all this time, I have wanted the conflict resolver to get real world
exposure because I need feedback from users out there to improve it.
I kept mostly quiet because I didn't want to push too hard for this
release all by myself because of the relatively high share of burden
this would imply. So I waited for activity from the community to make
it happen as a true collective effort.
I was glad when Julian volunteered to drive the process.

If this one bug really bothers you enough to hold the planned release back
it makes me wonder why you didn't push for a fix much earlier. We have had
plenty of time. I don't mean to disrespect the fact that you may not have
had time recently -- we are all constrained for time these days.
But I also believe we shouldn't panic over every bug that slips into this
.0 release. It is OK to ship 1.10.0 with known bugs that aren't very
serious security / data corruption issues. There's a section in the
release notes which lists known issues. I would prefer to document this
memory consumption problem there and patch it like any other regular bug.

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> All of those figures are the first run after starting Apache, i.e. when
> the OS cache and the Subversion cache is cold.  With a hot Subversion

I meant to write

 "the OS cache is hot and the Subversion cache is cold"

> cache the values are:
>
>             1.9: 6.0s
>    patched 1.10: 5.5s
>   reverted 1.10: 3.3s
>
> Again, 1.10 would be nearly twice as fast, but now rereading the authz
> removes most of that gain.
>
> -- 
> Philip

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 06:01:23PM +0100, Branko Čibej wrote:
> On 03.03.2018 17:44, Stefan Sperling wrote:
> > On Sat, Mar 03, 2018 at 04:32:35PM +0000, Philip Martin wrote:
> >> Stefan Sperling <st...@elego.de> writes:
> >>
> >>> Which leads me to believe that r1778923 may have been based on wrong
> >>> assumptions about performance. The new authz is not fast enough to
> >>> significantly reduce per-request overhead.
> >> My testing so far was with a very small authz file -- only a handful of
> >> rules and aliases.  If I add a few hundred trivial rules to the file then
> >> 1.11 becomes significantly slower than 1.9 while reverting is still much
> >> faster:
> >>
> >>             1.9:  4.3s
> >>      trunk 1.11: 14.6s
> >>   reverted 1.11:  1.9s
> > Thanks for testing and confirming this.
> >
> > I think our best course of action is to revert the change on trunk
> > and in 1.10.x. Could you do that? (I could do it, too. I'm just asking
> > you since you've probably already prepared it in a local copy.)
> 
> So if I understand this debate correctly: The authz code is so much
> faster now that parsing the authz file and performing the authz lookups
> beats calculating its MD5 checksum?

No, the file is only parsed/checksummed once. What's making it faster
than 1.9 is that rule lookups happen on a cached copy of a parsed
representation which is more efficient than the representation used
in 1.9 (which uses svn_config_enumerate_sections2() to walk the ruleset).

Re-reading the file and calculating its checksum is still slow and
doing it per request really hurts if the file has a few hundred rules.
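
For illustration, a minimal sketch (not the actual libsvn_repos authz code) of
the section walk that svn_config_enumerate_sections2() implies: every lookup
over a plain svn_config_t has to visit each rule section via a callback, which
is the work the cached, pre-parsed representation avoids.

/* Illustrative only: count the [sections] of a parsed authz config by
 * walking it with svn_config_enumerate_sections2(), the 1.9-style access
 * pattern mentioned above. */
#include "svn_config.h"

struct count_baton { int sections; };

/* Called once per section; returning TRUE continues the enumeration. */
static svn_boolean_t
count_section(const char *name, void *baton, apr_pool_t *scratch_pool)
{
  struct count_baton *cb = baton;
  cb->sections++;
  return TRUE;
}

static int
count_authz_sections(svn_config_t *authz_cfg, apr_pool_t *pool)
{
  struct count_baton cb = { 0 };
  svn_config_enumerate_sections2(authz_cfg, count_section, &cb, pool);
  return cb.sections;
}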

Re: Potential regression: high server-side memory consumption during import

Posted by Julian Foad <ju...@apache.org>.
Philip Martin wrote:
> Julian Foad writes:
>> Can someone file an issue in the tracker please. I need to refer to it
>> in discussion with potential rc1 trials.
> 
> https://issues.apache.org/jira/browse/SVN-4723

Thanks for that and SVN-4722.

- Julian

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Julian Foad <ju...@apache.org> writes:

> Can someone file an issue in the tracker please. I need to refer to it
> in discussion with potential rc1 trials.

https://issues.apache.org/jira/browse/SVN-4723

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Julian Foad <ju...@apache.org>.
Can someone file an issue in the tracker please. I need to refer to it 
in discussion with potential rc1 trials.

- Julian

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 06:28:48PM +0000, Philip Martin wrote:
> Branko Čibej <br...@apache.org> writes:
> 
> > In other words ... if we wanted to make authz changes have immediate
> > effect, we'd need a better (faster, or at least non-blocking) way to
> > determine that the rules changed than reading the authz file, even if
> 
> Relatively easy to do for in-repository authz.  We could make it a
> feature of in-repository authz and simply tell users to use that if they
> want the faster effect.

Very good point. Nice idea!

Re: Potential regression: high server-side memory consumption during import

Posted by Branko Čibej <br...@apache.org>.
On 03.03.2018 19:15, James McCoy wrote:
> On Sat, Mar 03, 2018 at 06:54:06PM +0100, Branko Čibej wrote:
>> P.S.: Running tests now with the patched 1.10.x, will vote on the
>> backport as soon as that's done. If it's approved, I believe we have to
>> move our expected release date from 28th March to 4th April?
> According to the flowchart posted elsewhere, the soak period only need
> be extended if this had been found in the final week of the soak.
>
> From our docs on releasing:
>
>     The stabilization period begins with a release candidate tarball with
>     the version A.B.0-rc1. Further release candidate tarballs may be made as
>     blocking bugs are fixed;
>     ...
>     Under some circumstances, the stabilization period will be extended:
>
>     * If a potentially destabilizing change must be made in order to fix a
>       bug, the entire four-week stabilization period is restarted.
>       ...
>     * If a critical bugfix is made during the final week of the
>       stabilization period, the final week is restarted. The final A.B.0
>       release is always identical to the release candidate made one week
>       before (with the exceptions discussed below).
>
> I don't see anything about this bug that would require extending the
> soak.

You're right: 28th March it is.

-- Brane


Re: Potential regression: high server-side memory consumption during import

Posted by James McCoy <ja...@jamessan.com>.
On Sat, Mar 03, 2018 at 06:54:06PM +0100, Branko Čibej wrote:
> P.S.: Running tests now with the patched 1.10.x, will vote on the
> backport as soon as that's done. If it's approved, I believe we have to
> move our expected release date from 28th March to 4th April?

According to the flowchart posted elsewhere, the soak period only need
be extended if this had been found in the final week of the soak.

From our docs on releasing:

    The stabilization period begins with a release candidate tarball with
    the version A.B.0-rc1. Further release candidate tarballs may be made as
    blocking bugs are fixed;
    ...
    Under some circumstances, the stabilization period will be extended:

    * If a potentially destabilizing change must be made in order to fix a
      bug, the entire four-week stabilization period is restarted.
      ...
    * If a critical bugfix is made during the final week of the
      stabilization period, the final week is restarted. The final A.B.0
      release is always identical to the release candidate made one week
      before (with the exceptions discussed below).

I don't see anything about this bug that would require extending the
soak.

Cheers,
-- 
James
GPG Key: 4096R/91BF BF4D 6956 BD5D F7B7  2D23 DFE6 91AE 331B A3DB

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Branko Čibej <br...@apache.org> writes:

> In other words ... if we wanted to make authz changes have immediate
> effect, we'd need a better (faster, or at least non-blocking) way to
> determine that the rules changed than reading the authz file, even if

Relatively easy to do for in-repository authz.  We could make it a
feature of in-repository authz and simply tell users to use that if they
want the faster effect.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Branko Čibej <br...@apache.org>.
On 03.03.2018 18:46, Philip Martin wrote:
> Branko Čibej <br...@apache.org> writes:
>
>> So if I understand this debate correctly: The authz code is so much
>> faster now that parsing the authz file and performing the authz lookups
>> beats calculating its MD5 checksum?
> More that reading/checksumming is still too slow to be done repeatedly.
>
> 1.9 reads the file once, per connection, and then does authz lookups on
> lots of paths.  The authz rules are fixed for the duration of the
> connection.
>
> 1.10 was reading and performing the checksum repeatedly as well as doing
> the authz lookups on lots of paths.  The authz rules can change during
> the connection lifetime. The authz lookups are faster than 1.9 but not
> enough to offset the repeated reading/checksumming.
>
> 1.11 goes back to reading the file once, and still does the same authz
> lookups.  The authz rules once again remain fixed for the duration of
> the connection.

Yes, I see the backport proposal now. Makes sense.

In other words ... if we wanted to make authz changes have immediate
effect, we'd need a better (faster, or at least non-blocking) way to
determine that the rules changed than reading the authz file, even if
just to verify its hash without actually parsing it. But that can be
done properly at a later date without causing a regression relative to
1.9 behaviour.

-- Brane

P.S.: Running tests now with the patched 1.10.x, will vote on the
backport as soon as that's done. If it's approved, I believe we have to
move our expected release date from 28th March to 4th April?


Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Branko Čibej <br...@apache.org> writes:

> So if I understand this debate correctly: The authz code is so much
> faster now that parsing the authz file and performing the authz lookups
> beats calculating its MD5 checksum?

More that reading/checksumming is still too slow to be done repeatedly.

1.9 reads the file once, per connection, and then does authz lookups on
lots of paths.  The authz rules are fixed for the duration of the
connection.

1.10 was reading and performing the checksum repeatedly as well as doing
the authz lookups on lots of paths.  The authz rules can change during
the connection lifetime. The authz lookups are faster than 1.9 but not
enough to offset the repeated reading/checksumming.

1.11 goes back to reading the file once, and still does the same authz
lookups.  The authz rules once again remain fixed for the duration of
the connection.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Branko Čibej <br...@apache.org>.
On 03.03.2018 17:44, Stefan Sperling wrote:
> On Sat, Mar 03, 2018 at 04:32:35PM +0000, Philip Martin wrote:
>> Stefan Sperling <st...@elego.de> writes:
>>
>>> Which leads me to believe that r1778923 may have been based on wrong
>>> assumptions about performance. The new authz is not fast enough to
>>> significantly reduce per-request overhead.
>> My testing so far was with a very small authz file -- only a handful of
>> rules and aliases.  If I add a few hundred trivial rules to the file then
>> 1.11 becomes significantly slower than 1.9 while reverting is still much
>> faster:
>>
>>             1.9:  4.3s
>>      trunk 1.11: 14.6s
>>   reverted 1.11:  1.9s
> Thanks for testing and confirming this.
>
> I think our best course of action is to revert the change on trunk
> and in 1.10.x. Could you do that? (I could do it, too. I'm just asking
> you since you've probably already prepared it in a local copy.)

So if I understand this debate correctly: The authz code is so much
faster now that parsing the authz file and performing the authz lookups
beats calculating its MD5 checksum?

-- Brane


Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> r1825778 contains a revert of a change not in 1.10, so doesn't merge
> cleanly to 1.10.  The reverse merge of -c-r1779188,-1778923 from trunk
> does merge cleanly to 1.10.

However, r1825736 is already nominated, so I just need to add r1825778
to that :-)
-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> Stefan Sperling <st...@elego.de> writes:
>
>> I think our best course of action is to revert the change on trunk
>> and in 1.10.x. Could you do that? (I could do it, too. I'm just asking
>> you since you've probably already prepared it in a local copy.)
>
> OK, r1825778.
>
> Hmm, it doesn't merge cleanly into 1.10...

r1825778 contains a revert of a change not in 1.10, so doesn't merge
cleanly to 1.10.  The reverse merge of -c-r1779188,-1778923 from trunk
does merge cleanly to 1.10.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Sperling <st...@elego.de> writes:

> I think our best course of action is to revert the change on trunk
> and in 1.10.x. Could you do that? (I could do it, too. I'm just asking
> you since you've probably already prepared it in a local copy.)

OK, r1825778.

Hmm, it doesn't merge cleanly into 1.10...

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 04:32:35PM +0000, Philip Martin wrote:
> Stefan Sperling <st...@elego.de> writes:
> 
> > Which leads me to believe that r1778923 may have been based on wrong
> > assumptions about performance. The new authz is not fast enough to
> > significantly reduce per-request overhead.
> 
> My testing so far was with a very small authz file -- only a handful of
> rules and aliases.  If I add a few hundred trivial rules to the file then
> 1.11 becomes significantly slower than 1.9 while reverting is still much
> faster:
> 
>             1.9:  4.3s
>      trunk 1.11: 14.6s
>   reverted 1.11:  1.9s

Thanks for testing and confirming this.

I think our best course of action is to revert the change on trunk
and in 1.10.x. Could you do that? (I could do it, too. I'm just asking
you since you've probably already prepared it in a local copy.)

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Sperling <st...@elego.de> writes:

> Which leads me to believe that r1778923 may have been based on wrong
> assumptions about performance. The new authz is not fast enough to
> significantly reduce per-request overhead.

My testing so far was with a very small authz file -- only a handful of
rules and aliases.  If I add a few hundred trivial rules to the file then
1.11 becomes significantly slower than 1.9 while reverting is still much
faster:

            1.9:  4.3s
     trunk 1.11: 14.6s
  reverted 1.11:  1.9s

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 04:27:36PM +0100, Stefan Sperling wrote:
> Right now I have no idea how to make the current code in trunk any
> faster without reverting r1778923.

Thinking about this some more, I think reverting r1778923 is the
only way to make it faster.

The authz cache key is based on the checksum of the contents of the
authz rules file. This means the entire file needs to be read and
checksummed in order to perform a cache lookup.
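
As an aside, a rough sketch of why a content-addressed key costs a full read
plus checksum per lookup (this is not the actual construct_authz_key() code,
just the general shape):

/* Illustrative only: building a cache key from the MD5 of the rules file.
 * The whole file must be read and checksummed before the cache can even
 * be consulted, which is the per-lookup overhead described above. */
#include "svn_checksum.h"
#include "svn_error.h"
#include "svn_io.h"
#include "svn_string.h"

static svn_error_t *
make_authz_cache_key(const char **key, const char *authz_path,
                     apr_pool_t *pool)
{
  svn_stringbuf_t *contents;
  svn_checksum_t *checksum;

  /* Read the entire rules file... */
  SVN_ERR(svn_stringbuf_from_file2(&contents, authz_path, pool));

  /* ...and checksum it before any cache lookup is possible. */
  SVN_ERR(svn_checksum(&checksum, svn_checksum_md5,
                       contents->data, contents->len, pool));

  *key = svn_checksum_to_cstring_display(checksum, pool);
  return SVN_NO_ERROR;
}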

It looks like this is where the overhead we're seeing comes from
relative to trunk with r1778923 reverted: The pre-r1778923 code
only reads and calculates the checksum of the file once per connection.

The cache provides a small benefit at connection setup because the
authz model doesn't need to be recalculated if it was already cached due to
another connection. But it doesn't really help on a per-request basis
because the overhead involved in doing a cache lookup is too large.

Which leads me to believe that r1778923 may have been based on wrong
assumptions about performance. The new authz is not fast enough to
significantly reduce per-request overhead.

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 03:10:46PM +0100, Stefan Sperling wrote:
> Regardless, could you try one more test run with this diff against
> trunk to see if it has any impact already?

My new diff doesn't help; it brings the memory problem back.

I have done some more digging with APR POOL DEBUG.
It is the connection pool which is getting too large:

POOL DEBUG: [98085/19516226785696] PCALLOC (    319481/   2339944/  19247334) 0x11bf5c548a00 "transaction" <subversion/libsvn_repos/authz.c:1646> (10974/10974/1)
POOL DEBUG: [98085/19516226785696] PCALLOC (    319505/   2340523/  19247913) 0x11bf5c548a00 "transaction" <subversion/libsvn_repos/authz.c:166>
(10975/10975/1)

The first allocation is the svn_authz_t returned from svn_repos_authz_read3().
The second one is the result of construct_authz_key().

So the problem is that svn_repos_authz_read3() is allocating a new svn_authz_t
in the result_pool. We cannot store this object in the connection pool without
running out of memory with many requests. Which means the patch I already
committed is the "right" fix if we want to keep calling svn_repos_authz_read3()
for every request.

The authz cache already uses a separate pool, the 'authz_pool', to store
a parsed representation of the authz config. This pool seems fine.

Right now I have no idea how to make the current code in trunk any
faster without reverting r1778923.

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Sperling <st...@elego.de> writes:

> Regardless, could you try one more test run with this diff against
> trunk to see if it has any impact already?

That's slower than current trunk:

  6.2s
  4.1s
  4.5s
  4.7s
  4.7s
  4.6s
  4.7s

and memory use is back to several GB.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 01:06:49PM +0000, Philip Martin wrote:
> Stefan Sperling <st...@elego.de> writes:
> 
> > I may have found a way to store an svn_repos_t object on the connection.
> > Does it help? (authz_tests pass)
> 
> No significant improvement.
> 
> I am timing
> 
>   time svn log http://localhost:8888/repos/asf/subversion \
>      --username pm --password mp > /dev/null
> 
> and I realised my earlier timings had the apache LogLevel set to Debug
> which means apache spends a significant time logging.  Lowering the
> LogLevel to Crit improves the speed.  Timings for 7 runs after starting
> apache:
> 
>          trunk      patched     reverted
>           1.11         1.11         1.11        1.9
> 
>           5.2s         5.1s         3.1s        7.9s
>           3.4s         3.4s         1.3s        4.5s
>           3.7s         3.8s         1.7s        4.5s
>           3.9s         3.9s         1.8s        4.5s
>           4.0s         4.0s         1.9s        4.5s
>           3.9s         3.9s         1.8s        4.5s
>           3.8s         3.8s         1.8s        4.5s
> 
> Now that apache is logging less the performance hit is even clearer.
> 1.11 could be 2 to 3 times faster than 1.9 but the authz effect means it
> is only about 10% faster.
> 
> Note an odd effect in the above numbers.  The second run for 1.11 is
> always the fastest.  The first run in each case is the slowest, as the
> Subversion cache is cold.  The second run is faster, as the Subversion
> cache is hot.  In 1.9 subsequent runs are comparable to the second, but
> in 1.11 the subsequent runs are a bit slower than the second.  I don't
> know what causes this effect.

I think I see why my previous patch didn't affect performance.

The cache lookup happens in authz_read() with svn_object_pool__lookup().
The authz config cache is allocated in an object pool which uses the
result_pool passed to svn_repos_authz_read3(). An item is evicted
from an object pool when the pool the item was allocated in gets
cleaned up (see add_object_ref() in libsvn_subr/object_pool.c).
My commit r1825736 reduced the lifetime of the result pool to the
lifetime of the request, which apparently means the cached authz
config is removed from the object pool during request pool cleanup,
and doesn't survive throughout the connection.
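
To illustrate the lifetime effect (plain APR, nothing Subversion-specific; a
sketch under the assumption that the cached object's lifetime is tied to
whichever pool is passed as result_pool):

/* Anything whose lifetime is tied to the request subpool is torn down when
 * that subpool goes away, so it cannot outlive a single request; an object
 * tied to the connection pool survives until the connection closes. */
#include <stdio.h>
#include <apr_general.h>
#include <apr_pools.h>

static apr_status_t evicted(void *data)
{
  printf("cleanup fired: %s evicted\n", (const char *) data);
  return APR_SUCCESS;
}

int main(void)
{
  apr_pool_t *conn_pool, *req_pool;

  apr_initialize();
  apr_pool_create(&conn_pool, NULL);      /* lives for the whole connection */
  apr_pool_create(&req_pool, conn_pool);  /* lives for one request only */

  apr_pool_cleanup_register(req_pool, "per-request object",
                            evicted, apr_pool_cleanup_null);
  apr_pool_cleanup_register(conn_pool, "per-connection object",
                            evicted, apr_pool_cleanup_null);

  apr_pool_destroy(req_pool);   /* end of request: first cleanup fires */
  apr_pool_destroy(conn_pool);  /* end of connection: second cleanup fires */
  apr_terminate();
  return 0;
}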

So it looks like we'll have to pass the connection pool as result
pool instead. And this should now be safe since only one allocation
should occur if we pass an svn_repos_t as well?

We will still allocate one svn_authz_t object per request on the
connection. But that struct is small; it contains only a few pointers.
This problem could be addressed with an API change to svn_repos_authz_read3()
which adds a third pool argument for use by the object pool cache.

It seems the intention of r1778923 was to use this object pool cache
but the config cache lookup didn't work because a NULL pointer was passed
for the svn_repos_t argument. I don't fully understand why. Did we get a
new "authz_id' in authz_read() on every request?

There might be another bug to fix here: It looks like get_repos_config()
in libsvn_repos/config_file.c is missing a cache lookup:

  /* Fetch checksum and see whether we already have a matching config */
  SVN_ERR(svn_fs_file_checksum(checksum, svn_checksum_md5, root, fs_path,
                               TRUE, access->pool));

Given the comment, I would expect a cache lookup based on the checksum
here, but the code below just does this:

  *stream = representation_stream(root, fs_path, access->pool);

  return SVN_NO_ERROR;
}


Regardless, could you try one more test run with this diff against
trunk to see if it has any impact already?

Index: subversion/mod_authz_svn/mod_authz_svn.c
===================================================================
--- subversion/mod_authz_svn/mod_authz_svn.c	(revision 1825762)
+++ subversion/mod_authz_svn/mod_authz_svn.c	(working copy)
@@ -399,7 +399,6 @@ resolve_repos_relative_url(const char **path, cons
  */
 static svn_authz_t *
 get_access_conf(request_rec *r, authz_svn_config_rec *conf,
-                apr_pool_t *result_pool,
                 apr_pool_t *scratch_pool)
 {
   const char *access_file;
@@ -409,6 +408,7 @@ get_access_conf(request_rec *r, authz_svn_config_r
   svn_authz_t *access_conf = NULL;
   svn_error_t *svn_err = SVN_NO_ERROR;
   dav_error *dav_err;
+  svn_repos_t *repos;
 
   dav_err = dav_svn_get_repos_path2(r, conf->base_path, &repos_path, scratch_pool);
   if (dav_err)
@@ -465,10 +465,25 @@ get_access_conf(request_rec *r, authz_svn_config_r
                     "Path to groups file is %s", groups_file);
     }
 
+  /* Store an svn_repos_t on this connection to allow authz config caching. */
+  repos = ap_get_module_config(r->connection->conn_config, &authz_svn_module);
+  if (repos == NULL)
+    {
+      svn_err = svn_repos_open3(&repos, repos_path, NULL, r->connection->pool,
+                                scratch_pool);
+      if (svn_err)
+        {
+          log_svn_error(APLOG_MARK, r, "Failed to open repository:",
+                        svn_err, scratch_pool);
+          return NULL;
+        }
+      ap_set_module_config(r->connection->conn_config, &authz_svn_module, repos);
+    }
+
   svn_err = svn_repos_authz_read3(&access_conf,
                                   access_file, groups_file,
-                                  TRUE, NULL,
-                                  result_pool, scratch_pool);
+                                  TRUE, repos,
+                                  r->connection->pool, scratch_pool);
 
   if (svn_err)
     {
@@ -688,7 +703,7 @@ req_check_access(request_rec *r,
     }
 
   /* Retrieve/cache authorization file */
-  access_conf = get_access_conf(r,conf, r->pool, r->pool);
+  access_conf = get_access_conf(r,conf, r->pool);
   if (access_conf == NULL)
     return DECLINED;
 
@@ -805,7 +820,7 @@ subreq_bypass2(request_rec *r,
     }
 
   /* Retrieve authorization file */
-  access_conf = get_access_conf(r, conf, scratch_pool, scratch_pool);
+  access_conf = get_access_conf(r, conf, scratch_pool);
   if (access_conf == NULL)
     return HTTP_FORBIDDEN;
 

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> Here are the cache stats before/after a run once the timing has
> stabilised:
>
> gets : 11770766, 4388248 hits (37.28%)
> sets : 7309448 (99.01% of misses)
> failures: 0
> used : 777 MB (94.56%) of 822 MB data cache / 958 MB total cache memory
> 2225228 entries (99.45%) of 2237592 total
>
> gets : 12914318, 4814238 hits (37.28%)
> sets : 8019703 (99.01% of misses)
> failures: 0
> used : 778 MB (94.69%) of 822 MB data cache / 958 MB total cache memory
> 2229936 entries (99.66%) of 2237592 total

Note how the number of sets goes up; that means each run is getting cache
misses that lead to sets.  In contrast, 1.9 sees no increase in sets after the
first run, so everything is found in the cache and the hits percentage
gradually increases towards 100%.  So 1.11 is faster than 1.9 even
though 1.11 seems to fail to cache everything.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> Note an odd effect in the above numbers.  The second run for 1.11 is
> always the fastest.  The first run in each case is the slowest, as the
> Subversion cache is cold.  The second run is faster, as the Subversion
> cache is hot.  In 1.9 subsequent runs are comparable to the second, but
> in 1.11 the subsequent runs are a bit slower than the second.  I don't
> know what causes this effect.

This is a cache size effect.  The early runs can write to the cache
without having to evict; later runs can only write if they first evict.
I happened to choose a cache size that allowed the first two runs to do
very little eviction, while the third and subsequent runs need to evict more.
If I make the cache bigger I can postpone the eviction, and the third run
becomes as fast as the second.

Here are the cache stats before/after a run once the timing has
stabilised:

gets : 11770766, 4388248 hits (37.28%)
sets : 7309448 (99.01% of misses)
failures: 0
used : 777 MB (94.56%) of 822 MB data cache / 958 MB total cache memory
2225228 entries (99.45%) of 2237592 total
58607 buckets with 7 entries
64670 buckets with 6 entries
58592 buckets with 5 entries
48780 buckets with 4 entries
38212 buckets with 3 entries
27996 buckets with 2 entries
22422 buckets with 1 entries
377 buckets with 0 entries

gets : 12914318, 4814238 hits (37.28%)
sets : 8019703 (99.01% of misses)
failures: 0
used : 778 MB (94.69%) of 822 MB data cache / 958 MB total cache memory
2229936 entries (99.66%) of 2237592 total
59173 buckets with 7 entries
65009 buckets with 6 entries
58773 buckets with 5 entries
48692 buckets with 4 entries
37862 buckets with 3 entries
27838 buckets with 2 entries
21926 buckets with 1 entries
383 buckets with 0 entries

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Sperling <st...@elego.de> writes:

> I may have found a way to store an svn_repos_t object on the connection.
> Does it help? (authz_tests pass)

No significant improvement.

I am timing

  time svn log http://localhost:8888/repos/asf/subversion \
     --username pm --password mp > /dev/null

and I realised my earlier timings had the apache LogLevel set to Debug
which means apache spends a significant time logging.  Lowering the
LogLevel to Crit improves the speed.  Timings for 7 runs after starting
apache:

         trunk      patched     reverted
          1.11         1.11         1.11        1.9

          5.2s         5.1s         3.1s        7.9s
          3.4s         3.4s         1.3s        4.5s
          3.7s         3.8s         1.7s        4.5s
          3.9s         3.9s         1.8s        4.5s
          4.0s         4.0s         1.9s        4.5s
          3.9s         3.9s         1.8s        4.5s
          3.8s         3.8s         1.8s        4.5s

Now that apache is logging less the performance hit is even clearer.
1.11 could be 2 to 3 times faster than 1.9 but the authz effect means it
is only about 10% faster.

Note an odd effect in the above numbers.  The second run for 1.11 is
always the fastest.  The first run in each case is the slowest, as the
Subversion cache is cold.  The second run is faster, as the Subversion
cache is hot.  In 1.9 subsequent runs are comparable to the second, but
in 1.11 the subsequent runs are a bit slower than the second.  I don't
know what causes this effect.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Mar 03, 2018 at 11:43:31AM +0100, Stefan Sperling wrote:
> On Fri, Mar 02, 2018 at 09:40:28PM +0000, Philip Martin wrote:
> > Philip Martin <ph...@codematters.co.uk> writes:
> > 
> > > Again, 1.10 would be nearly twice as fast, but now rereading the authz
> > > removes most of that gain.
> > 
> > I think I see the underlying problem: the authz code now incorporates a
> > cache based on the md5 checksum of the rules, so when the rules are
> > unchanged the cached value can be reused.  This cache relies on the
> > caller being able to pass an svn_repos_t to svn_repos_authz_read3() and,
> > while svnserve does indeed pass such a pointer, mod_authz_svn is passing
> > NULL.  That means mod_authz_svn does not take advantage of the new authz
> > cache.
> > 
> > Stefan's pool patch helps, but I believe the authz rereading in
> > mod_authz_svn should be reverted from 1.10 unless we can make it take
> > advantage of the new authz cache.
> 
> Yes, the problem seems to be that mod_authz_svn has no way of storing
> per-connection state at present. The memory usage problem happened
> because it kept adding new access_conf objects to the per-connection
> pool since it has no way of knowing whether a previous request had
> already created the same object.
> 
> We could store per-connection data by using members of r->conn, such
> as r->conn->id or r->notes to index into a cache of svn_repos_t stored
> in r->conn's pool. But where could we store a pointer to this cache
> so that it could be read back from the request structure r?
> 
> Maybe we could use a global cache of svn_repos_t objects which live
> throughout the lifetime of mod_authz_svn and are shared across connections?
> It probably won't grow out of bounds as the number of repositories is
> relatively constant. But it's unclear how to ensure it stays in sync
> with the on-disk state. Hmm...

I may have found a way to store an svn_repos_t object on the connection.
Does it help? (authz_tests pass)

Index: subversion/mod_authz_svn/mod_authz_svn.c
===================================================================
--- subversion/mod_authz_svn/mod_authz_svn.c	(revision 1825762)
+++ subversion/mod_authz_svn/mod_authz_svn.c	(working copy)
@@ -409,6 +409,7 @@ get_access_conf(request_rec *r, authz_svn_config_r
   svn_authz_t *access_conf = NULL;
   svn_error_t *svn_err = SVN_NO_ERROR;
   dav_error *dav_err;
+  svn_repos_t *repos;
 
   dav_err = dav_svn_get_repos_path2(r, conf->base_path, &repos_path, scratch_pool);
   if (dav_err)
@@ -465,9 +466,24 @@ get_access_conf(request_rec *r, authz_svn_config_r
                     "Path to groups file is %s", groups_file);
     }
 
+  /* Store an svn_repos_t on this connection to allow authz config caching. */
+  repos = ap_get_module_config(r->connection->conn_config, &authz_svn_module);
+  if (repos == NULL)
+    {
+      svn_err = svn_repos_open3(&repos, repos_path, NULL, r->connection->pool,
+                                scratch_pool);
+      if (svn_err)
+        {
+          log_svn_error(APLOG_MARK, r, "Failed to open repository:",
+                        svn_err, scratch_pool);
+          return NULL;
+        }
+      ap_set_module_config(r->connection->conn_config, &authz_svn_module, repos);
+    }
+
   svn_err = svn_repos_authz_read3(&access_conf,
                                   access_file, groups_file,
-                                  TRUE, NULL,
+                                  TRUE, repos,
                                   result_pool, scratch_pool);
 
   if (svn_err)

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 09:40:28PM +0000, Philip Martin wrote:
> Philip Martin <ph...@codematters.co.uk> writes:
> 
> > Again, 1.10 would be nearly twice as fast, but now rereading the authz
> > removes most of that gain.
> 
> I think I see the underlying problem: the authz code now incorporates a
> cache based on the md5 checksum of the rules, so when the rules are
> unchanged the cached value can be reused.  This cache relies on the
> caller being able to pass an svn_repos_t to svn_repos_authz_read3() and,
> while svnserve does indeed pass such a pointer, mod_authz_svn is passing
> NULL.  That means mod_authz_svn does not take advantage of the new authz
> cache.
> 
> Stefan's pool patch helps, but I believe the authz rereading in
> mod_authz_svn should be reverted from 1.10 unless we can make it take
> advantage of the new authz cache.

Yes, the problem seems to be that mod_authz_svn has no way of storing
per-connection state at present. The memory usage problem happened
because it kept adding new access_conf objects to the per-connection
pool since it has no way of knowing whether a previous request had
already created the same object.

We could store per-connection data by using members of r->conn, such
as r->conn->id or r->notes to index into a cache of svn_repos_t stored
in r->conn's pool. But where could we store a pointer to this cache
so that it could be read back from the request structure r?

Maybe we could use a global cache of svn_repos_t objects which live
throughout the lifetime of mod_authz_svn and are shared across connections?
It probably won't grow out of bounds as the number of repositories is
relatively constant. But it's unclear how to ensure it stays in sync
with the on-disk state. Hmm...
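
(For reference, the generic httpd pattern for per-connection state, which the
patches quoted earlier in this thread use: the module name and struct below
are hypothetical, and a real module would also define example_module with its
hooks.)

/* Sketch: stash per-connection state in the connection's config vector on
 * the first request and fetch it back on later requests over the same
 * connection. */
#include "httpd.h"
#include "http_config.h"

module AP_MODULE_DECLARE_DATA example_module;

typedef struct conn_state_t {
  const char *repos_path;  /* whatever per-connection state is needed */
} conn_state_t;

static conn_state_t *
get_conn_state(request_rec *r)
{
  conn_state_t *state;

  state = ap_get_module_config(r->connection->conn_config, &example_module);
  if (state == NULL)
    {
      /* First request on this connection: allocate in the connection pool
         so the state survives until the connection is closed. */
      state = apr_pcalloc(r->connection->pool, sizeof(*state));
      ap_set_module_config(r->connection->conn_config, &example_module,
                           state);
    }
  return state;
}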

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> Again, 1.10 would be nearly twice as fast, but now rereading the authz
> removes most of that gain.

I think I see the underlying problem: the authz code now incorporates a
cache based on the md5 checksum of the rules, so when the rules are
unchanged the cached value can be reused.  This cache relies on the
caller being able to pass an svn_repos_t to svn_repos_authz_read3() and,
while svnserve does indeed pass such a pointer, mod_authz_svn is passing
NULL.  That means mod_authz_svn does not take advantage of the new authz
cache.

Stefan's pool patch helps, but I believe the authz rereading in
mod_authz_svn should be reverted from 1.10 unless we can make it take
advantage of the new authz cache.
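
(A hedged sketch of the call-shape difference, based only on the call sites
visible in the diffs in this thread; the path below is hypothetical:)

/* With a non-NULL svn_repos_t the checksum-keyed authz cache can be used;
 * with NULL -- as mod_authz_svn currently passes -- the cache is bypassed
 * and the rules end up being re-read for each call. */
#include "svn_repos.h"

static svn_error_t *
load_authz(svn_authz_t **authz,
           svn_repos_t *repos,  /* NULL disables the authz cache */
           apr_pool_t *result_pool, apr_pool_t *scratch_pool)
{
  return svn_repos_authz_read3(authz,
                               "/srv/svn/conf/authz",  /* hypothetical path */
                               NULL,  /* no separate groups file */
                               TRUE, repos,
                               result_pool, scratch_pool);
}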

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Sperling <st...@elego.de> writes:

> On Fri, Mar 02, 2018 at 06:20:48PM +0000, Philip Martin wrote:
>> 
>> Yes, that solves the memory use problem.
>
> Nice, I'll commit it then. This might not be a final fix but at least
> it's a step forward.
>
>> There is still a time penalty:
>> 
>> Your patch:
>> 
>>  $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null
>> 
>>  real	0m7.695s
>>  user	0m0.429s
>>  sys	0m0.283s
>> 
>>  apache httpd cpu: 7.30
>> 
>> Revert r1779188,1778923
>> 
>>  $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null
>> 
>>  real	0m5.102s
>>  user	0m0.460s
>>  sys	0m0.250s
>> 
>>  apache httpd cpu: 5.10
>
> If I understand it correctly, that's because 1778923 disabled caching
> on a per-connection basis? I don't really understand the log message.

For comparison, 1.9 gives

 $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null

 real	0m10.044s
 user	0m0.404s
 sys	0m0.064s

 apache httpd cpu: 9.98

1.10 would be nearly twice as fast as 1.9 but rereading the authz
removes over half of the gain.

All of those figures are the first run after starting Apache, i.e. when
the OS cache and the Subversion cache is cold.  With a hot Subversion
cache the values are:

            1.9: 6.0s
   patched 1.10: 5.5s
  reverted 1.10: 3.3s

Again, 1.10 would be nearly twice as fast, but now rereading the authz
removes most of that gain.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 06:20:48PM +0000, Philip Martin wrote:
> Stefan Sperling <st...@elego.de> writes:
> 
> > Hmmm. Does this help? The authz_tests pass with it.
> >
> > Index: subversion/mod_authz_svn/mod_authz_svn.c
> > ===================================================================
> > --- subversion/mod_authz_svn/mod_authz_svn.c	(revision 1825730)
> > +++ subversion/mod_authz_svn/mod_authz_svn.c	(working copy)
> > @@ -395,10 +395,11 @@ resolve_repos_relative_url(const char **path, cons
> >  }
> >  
> >  /*
> > - * Get the, possibly cached, svn_authz_t for this request.
> > + * Get the svn_authz_t for this request.
> >   */
> >  static svn_authz_t *
> >  get_access_conf(request_rec *r, authz_svn_config_rec *conf,
> > +                apr_pool_t *result_pool,
> >                  apr_pool_t *scratch_pool)
> >  {
> >    const char *access_file;
> > @@ -467,7 +468,7 @@ get_access_conf(request_rec *r, authz_svn_config_r
> >    svn_err = svn_repos_authz_read3(&access_conf,
> >                                    access_file, groups_file,
> >                                    TRUE, NULL,
> > -                                  r->connection->pool, scratch_pool);
> > +                                  result_pool, scratch_pool);
> >  
> >    if (svn_err)
> >      {
> > @@ -687,7 +688,7 @@ req_check_access(request_rec *r,
> >      }
> >  
> >    /* Retrieve/cache authorization file */
> > -  access_conf = get_access_conf(r,conf, r->pool);
> > +  access_conf = get_access_conf(r,conf, r->pool, r->pool);
> >    if (access_conf == NULL)
> >      return DECLINED;
> >  
> > @@ -804,7 +805,7 @@ subreq_bypass2(request_rec *r,
> >      }
> >  
> >    /* Retrieve authorization file */
> > -  access_conf = get_access_conf(r, conf, scratch_pool);
> > +  access_conf = get_access_conf(r, conf, scratch_pool, scratch_pool);
> >    if (access_conf == NULL)
> >      return HTTP_FORBIDDEN;
> 
> Yes, that solves the memory use problem.

Nice, I'll commit it then. This might not be a final fix but at least
it's a step forward.

> There is still a time penalty:
> 
> Your patch:
> 
>  $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null
> 
>  real	0m7.695s
>  user	0m0.429s
>  sys	0m0.283s
> 
>  apache httpd cpu: 7.30
> 
> Revert r1779188,1778923
> 
>  $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null
> 
>  real	0m5.102s
>  user	0m0.460s
>  sys	0m0.250s
> 
>  apache httpd cpu: 5.10

If I understand it correctly, that's because 1778923 disabled caching
on a per-connection basis? I don't really understand the log message.

Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Sperling <st...@elego.de> writes:

> Hmmm. Does this help? The authz_tests pass with it.
>
> Index: subversion/mod_authz_svn/mod_authz_svn.c
> ===================================================================
> --- subversion/mod_authz_svn/mod_authz_svn.c	(revision 1825730)
> +++ subversion/mod_authz_svn/mod_authz_svn.c	(working copy)
> @@ -395,10 +395,11 @@ resolve_repos_relative_url(const char **path, cons
>  }
>  
>  /*
> - * Get the, possibly cached, svn_authz_t for this request.
> + * Get the svn_authz_t for this request.
>   */
>  static svn_authz_t *
>  get_access_conf(request_rec *r, authz_svn_config_rec *conf,
> +                apr_pool_t *result_pool,
>                  apr_pool_t *scratch_pool)
>  {
>    const char *access_file;
> @@ -467,7 +468,7 @@ get_access_conf(request_rec *r, authz_svn_config_r
>    svn_err = svn_repos_authz_read3(&access_conf,
>                                    access_file, groups_file,
>                                    TRUE, NULL,
> -                                  r->connection->pool, scratch_pool);
> +                                  result_pool, scratch_pool);
>  
>    if (svn_err)
>      {
> @@ -687,7 +688,7 @@ req_check_access(request_rec *r,
>      }
>  
>    /* Retrieve/cache authorization file */
> -  access_conf = get_access_conf(r,conf, r->pool);
> +  access_conf = get_access_conf(r,conf, r->pool, r->pool);
>    if (access_conf == NULL)
>      return DECLINED;
>  
> @@ -804,7 +805,7 @@ subreq_bypass2(request_rec *r,
>      }
>  
>    /* Retrieve authorization file */
> -  access_conf = get_access_conf(r, conf, scratch_pool);
> +  access_conf = get_access_conf(r, conf, scratch_pool, scratch_pool);
>    if (access_conf == NULL)
>      return HTTP_FORBIDDEN;

Yes, that solves the memory use problem.  There is still a time penalty:

Your patch:

 $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null

 real	0m7.695s
 user	0m0.429s
 sys	0m0.283s

 apache httpd cpu: 7.30

Revert r1779188,1778923

 $ time svn log -q http://localhost:8888/repo/asf/subversion > /dev/null

 real	0m5.102s
 user	0m0.460s
 sys	0m0.250s

 apache httpd cpu: 5.10

-- 
Philip

Re: Potential regression: high server-side memory consumption during import

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 02, 2018 at 03:56:40PM +0000, Philip Martin wrote:
> Evgeny Kotkov <ev...@visualsvn.com> writes:
> 
> > Perhaps, a simpler reproduction script would be to issue an 'svn log' for
> > a medium-sized repository.  In my environment, doing so for the trunk of
> > TortoiseSVN's repository with 25,000 revisions causes the httpd process
> to consume up to 1 GB of RAM while processing the request.  Overall,
> > the log takes around 11 seconds instead of 2, compared to 1.9.7.
> 
> I can reproduce the problem in my environment using my mirror of
> Subversion.  Running log with authz configured causes apache to use
> several GB of memory, that's for about 81,500 revisions.  Reverting
> r1779188 and r1778923 solves the memory growth problem.

Hmmm. Does this help? The authz_tests pass with it.

Index: subversion/mod_authz_svn/mod_authz_svn.c
===================================================================
--- subversion/mod_authz_svn/mod_authz_svn.c	(revision 1825730)
+++ subversion/mod_authz_svn/mod_authz_svn.c	(working copy)
@@ -395,10 +395,11 @@ resolve_repos_relative_url(const char **path, cons
 }
 
 /*
- * Get the, possibly cached, svn_authz_t for this request.
+ * Get the svn_authz_t for this request.
  */
 static svn_authz_t *
 get_access_conf(request_rec *r, authz_svn_config_rec *conf,
+                apr_pool_t *result_pool,
                 apr_pool_t *scratch_pool)
 {
   const char *access_file;
@@ -467,7 +468,7 @@ get_access_conf(request_rec *r, authz_svn_config_r
   svn_err = svn_repos_authz_read3(&access_conf,
                                   access_file, groups_file,
                                   TRUE, NULL,
-                                  r->connection->pool, scratch_pool);
+                                  result_pool, scratch_pool);
 
   if (svn_err)
     {
@@ -687,7 +688,7 @@ req_check_access(request_rec *r,
     }
 
   /* Retrieve/cache authorization file */
-  access_conf = get_access_conf(r,conf, r->pool);
+  access_conf = get_access_conf(r,conf, r->pool, r->pool);
   if (access_conf == NULL)
     return DECLINED;
 
@@ -804,7 +805,7 @@ subreq_bypass2(request_rec *r,
     }
 
   /* Retrieve authorization file */
-  access_conf = get_access_conf(r, conf, scratch_pool);
+  access_conf = get_access_conf(r, conf, scratch_pool, scratch_pool);
   if (access_conf == NULL)
     return HTTP_FORBIDDEN;
 


Re: Potential regression: high server-side memory consumption during import

Posted by Philip Martin <ph...@codematters.co.uk>.
Evgeny Kotkov <ev...@visualsvn.com> writes:

> Perhaps, a simpler reproduction script would be to issue an 'svn log' for
> a medium-sized repository.  In my environment, doing so for the trunk of
> TortoiseSVN's repository with 25,000 revisions causes the httpd process
> to consume up to 1 GB of RAM while processing the request.  Overall,
> the log takes around 11 seconds instead of 2, compared to 1.9.7.

I can reproduce the problem in my environment using my mirror of
Subversion.  Running log with authz configured causes apache to use
several GB of memory, that's for about 81,500 revisions.  Reverting
r1779188 and r1778923 solves the memory growth problem.

-- 
Philip

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Stefan Fuhrmann <st...@apache.org> writes:

> Would it be possible for you to bisect this to find the offending revision?
> My random guess would be that in the context of mod_dav_svn, we might use
> an unsuitable pool for authz caching.

While looking through the various 1.10-related topics, I remembered this
issue.  I have been able to narrow it down in my environment to
https://svn.apache.org/r1778923 (Ensure that even long-lived DAV
connections use up-to-date authz.)

Perhaps, a simpler reproduction script would be to issue an 'svn log' for
a medium-sized repository.  In my environment, doing so for the trunk of
TortoiseSVN's repository with 25,000 revisions causes the httpd process
to consume up to 1 GB of RAM while processing the request.  Overall,
the log takes around 11 seconds instead of 2, compared to 1.9.7.

Unless I am missing something, this might be worth considering before the
1.10 GA release.


Thanks,
Evgeny Kotkov

Re: Potential regression: high server-side memory consumption during import (was: Subversion 1.10 RC1?)

Posted by Stefan Fuhrmann <st...@apache.org>.
On 05.12.2017 22:05, Evgeny Kotkov wrote:
> Julian Foad <ju...@apache.org> writes:
> 
>> After any issues raised in this discussion are resolved, we feel we should
>> go ahead and produce RC1 as soon as possible.
> 
> I think that I am seeing a 1.10 regression in terms of httpd's memory
> usage during large imports.
> 
> In my environment, when I `svn import` an unpacked version of Boost
> (https://www.boost.org/) that consists of around 60,000 files, I see that
> the memory consumption by the server process rapidly grows up to 1.5 GB.
> The environment is a Windows 8.1 x64 machine with httpd 2.4.29 configured
> to use short-circuit authz and HTTPS.  Apparently, this behavior is specific
> to 1.10, as I cannot reproduce it with 1.9 binaries.
> 
>   (I haven't investigated the issue any further at this time, and it may
>    well be reproducible in other configurations.)

Would it be possible for you to bisect this to
find the offending revision?  My random guess
would be that in the context of mod_dav_svn, we
might use an unsuitable pool for authz caching.

-- Stefan^2.