You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Evgeny Kotkov <ev...@visualsvn.com> on 2015/09/01 20:26:11 UTC

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Stefan Fuhrmann <st...@wandisco.com> writes:

> Yes. This is exactly why we can only use it when we have reasonable control
> over the stream's usage, i.e. we can use it in our CL tools because all the
> code that will be run is under our control. But we cannot make e.g.
> svn_stream_for_stdin() use it by default.

[...]

> The best solution seems to be to allow for explicit resource management as
> we do with other potentially "expensive" objects. r1700305 implements that.

I have several concerns about these changes (r1698359 and r1700305):

1) The new public API — svn_stream_wrap_buffered_read() — is dangerous to
   use, and wrong usage causes unbounded memory consumption in seemingly
   harmless situations.

   A svn_stream_mark() mark placed on this stream causes all data that's read
   afterwards to be stored in memory, even if the data was processed and
   immediately discarded.  Lifetime of the buffered data is bound to the
   lifetime of the pool where the mark object lives.  Hence, avoiding memory
   usage issues now requires either precise pool management or explicit control
   of where and when the svn_stream_mark_t objects are added and removed.

   For example, prior to r1700305, svnadmin load-revprops was actually causing
   unbounded memory usage.  Executing this command with a dump containing large
   revisions (e.g., .iso files) was quickly consuming all the available RAM on
   the machine.  This behavior was caused by a svn_stream_mark_t allocated in
   a long-living pool that was preventing deallocation of the buffered data.

   As of r1700305, the problem is mitigated by calling svn_stream_remove_mark()
   within the svn_stream_readline() implementation.  However, the new stream
   is generic, and every potential user of the stream API should be aware of
   these caveats, because she could in fact be working with the new stream.

   Place a mark somewhere in the middle of what svn_repos_parse_dumpstream3()
   does — and you could be already consuming unbounded amounts of memory.
   As another example, the pools are not being cleaned up / destroyed during
   error flow.  Say, we successfully place a stream mark, but receive an error
   afterwards, and, coincidentally, the caller thinks that it's okay to keep
   reading from the stream after this specific error.  Again, the memory usage
   is potentially unbounded.

2) The new API is backward incompatible.

   The problem is not only about how we handle this in our codebase.  Existing
   API users have no idea about the fact that having svn_stream_mark_t objects
   could be "expensive", and about the existence of svn_stream_remove_mark().

   Calling svn_stream_mark() can have a severe impact on the behavior of the
   stream, and can trigger memory usage issues if used improperly.  The new
   stream is a generic svn_stream_t, but instances of this stream cannot cross
   the API boundaries, because if they do, the users would be required to
   adapt their code in order to avoid the possibility of unbounded memory
   consumption.

3) The behavior of pool-based reference counting is unpredictable.

   The new svn_stream_wrap_buffered_read() stream performs reference counting
   on a per-pool basis, and its behavior depends of whether the corresponding
   pools were cleared / destroyed, or not.  This behavior is unpredictable and
   can lead to hard-to-diagnose problems that only happen under non-default
   circumstances or with certain data patterns.

   Furthermore, we have a history of dealing with unbounded memory usage that
   happened due to pool-based refcounting in the first-level FSFS DAG cache
   (CVE-2015-0202 [1]).  As far as I recall, at some point a long-living pool
   was causing positive reference count and preventing deallocation of the
   cached objects.  As I see it, the new stream does the same sort of reference
   counting for pools holding svn_stream_mark_t objects, and, similarly, if
   one of those objects is allocated in a long-living pool, it prevents the
   deallocation of the data buffered by the stream.

Given the above, I am -1 on this optimization for svnadmin load-revprops that
was implemented in r1698359 and r1700305.  Please revert these changes.

As for the problem itself, if the way we currently process the input during
svnadmin load and load-revprops is causing a noticeable overhead, I think that
we should introduce -F (--file) option to both of these commands:

  svnadmin load /path/to/repos -F (--file) /path/to/dump

  svnadmin load-revprops /path/to/repos -F (--file) /path/to/dump

As long as file streams support both svn_stream_seek() and svn_stream_mark(),
this should avoid byte-by-byte processing of the input and get rid of the
associated overhead.

[1] https://subversion.apache.org/security/CVE-2015-0202-advisory.txt


Regards,
Evgeny Kotkov

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Branko Čibej <br...@wandisco.com>.
On 01.09.2015 20:26, Evgeny Kotkov wrote:
> Stefan Fuhrmann <st...@wandisco.com> writes:
>
>> Yes. This is exactly why we can only use it when we have reasonable control
>> over the stream's usage, i.e. we can use it in our CL tools because all the
>> code that will be run is under our control. But we cannot make e.g.
>> svn_stream_for_stdin() use it by default.
> [...]
>
>> The best solution seems to be to allow for explicit resource management as
>> we do with other potentially "expensive" objects. r1700305 implements that.
> I have several concerns about these changes (r1698359 and r1700305):

FWIW: I agree with Evgeny's analysis and conclusions. There surely must
be a way to get reasonable performance from a generic stream without the
really flaky memory management that these changes bring.

One approach might be a similar buffered-stream wrapper that supports
mark/seek, but where the caller provides a (fixed-size) buffer and/or
buffer management callbacks. Something like that would make the
buffering explicit to the API consumer, although things might still
become tricky if such a stream is used in a generic stream context.
Perhaps such a buffered stream should be a completely different type of
object.

> As for the problem itself, if the way we currently process the input during
> svnadmin load and load-revprops is causing a noticeable overhead, I think that
> we should introduce -F (--file) option to both of these commands:
>
>   svnadmin load /path/to/repos -F (--file) /path/to/dump
>
>   svnadmin load-revprops /path/to/repos -F (--file) /path/to/dump
>
> As long as file streams support both svn_stream_seek() and svn_stream_mark(),
> this should avoid byte-by-byte processing of the input and get rid of the
> associated overhead.

This would not solve the common case where users dump/load without
incurring the possibly huge, or even unmanageable overhead of creating
an intermediate dumpfile.

-- Brane


Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ivan Zhakov wrote on Wed, Sep 02, 2015 at 18:39:38 +0300:
> On 2 September 2015 at 17:01, Johan Corveleyn <jc...@gmail.com> wrote:
> > On Wed, Sep 2, 2015 at 2:43 PM, Evgeny Kotkov
> > <ev...@visualsvn.com> wrote:
> >> Stefan Fuhrmann <st...@wandisco.com> writes:
> >>
> >>>> Given the above, I am -1 on this optimization for svnadmin load-revprops
> >>>> that was implemented in r1698359 and r1700305.  Please revert these
> >>>> changes.
> >>>
> >>> Thinking about alternative solutions I found that simply having a buffering
> >>> wrapper without mark support would still eliminate the OS overhead and parts
> >>> of the internal overhead. That would address all the points you have made
> >>> above while still providing a decent improvement.
> >>>
> >>> IOW, revert r1700305 and rework / reduce / simplify the code changed
> >>> by r1698359.
> >>
> >> I stated my veto and provided the justification that covers both r1700305
> >> and r1698359.  So, could you please revert both of them?  Reworking and
> >> adding changes on top of it is going to increase the mess and will be harder
> >> to review.
> >
> > ISTR that in this community we try to treat a veto as a signal that
> > further discussion is needed, or that more work is needed to resolve
> > the question / issue. You may ask / suggest a revert if you think
> > that's best, but it's mainly up for discussion how to best resolve
> > things (and other avenues, such as further commits, may also resolve
> > the issue).
> >
> You're right, but the problem is that the discussion doesn't happen.
> Instead of this new code is committed (usually broken again) [1].
> 

Could you please refrain from describing anyone's code as "usually
broken"?  That's neither concrete nor specific to the issue at hand
(r1698359 and r1700305).

Thanks,

Daniel

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ivan Zhakov wrote on Wed, Sep 02, 2015 at 18:39:38 +0300:
> On 2 September 2015 at 17:01, Johan Corveleyn <jc...@gmail.com> wrote:
> > On Wed, Sep 2, 2015 at 2:43 PM, Evgeny Kotkov
> > <ev...@visualsvn.com> wrote:
> >> Stefan Fuhrmann <st...@wandisco.com> writes:
> >>
> >>>> Given the above, I am -1 on this optimization for svnadmin load-revprops
> >>>> that was implemented in r1698359 and r1700305.  Please revert these
> >>>> changes.
> >>>
> >>> Thinking about alternative solutions I found that simply having a buffering
> >>> wrapper without mark support would still eliminate the OS overhead and parts
> >>> of the internal overhead. That would address all the points you have made
> >>> above while still providing a decent improvement.
> >>>
> >>> IOW, revert r1700305 and rework / reduce / simplify the code changed
> >>> by r1698359.
> >>
> >> I stated my veto and provided the justification that covers both r1700305
> >> and r1698359.  So, could you please revert both of them?  Reworking and
> >> adding changes on top of it is going to increase the mess and will be harder
> >> to review.
> >
> > ISTR that in this community we try to treat a veto as a signal that
> > further discussion is needed, or that more work is needed to resolve
> > the question / issue. You may ask / suggest a revert if you think
> > that's best, but it's mainly up for discussion how to best resolve
> > things (and other avenues, such as further commits, may also resolve
> > the issue).
> >
> You're right, but the problem is that the discussion doesn't happen.
> Instead of this new code is committed (usually broken again) [1].
> 
> Let summarize what happened from my point of view:
> 1. An optimization was implemented in r1698359
> 2. Evgeny raised his concerns about possible unbounded memory usage
> 3. Some fix was applied in r1700305 (no discussion happened)
> 4. Evgeny vetoed both of them.
> 5. The original optimization was reworked (no discussion happened)

Vetoes do not require the vetoed code to be reverted.

Vetoes do not require the vetoed codepath to be frozen.

Vetoes require the community to resolve the technical concernes
underlying them before the next release.

Rewriting the code is a form of discussion.  That discussion is
continuing on the r1700799 thread which you linked.  Let's wait for
Stefan2 to reply on that thread to Evgeny's technical concerns with the
rewrite.

> 6. Another problem was found in the new code [1]
> 7. ???

What's the question?  Do what you always do whenever a problem is found
in the code: review the code, reach consensus on a fix, implement it;
iteratively as needed.  That's exactly what the r1700799 thread is.

Cheers,

Daniel

P.S. I haven't reviewed the technical content of the commits in
question.

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 2 September 2015 at 17:01, Johan Corveleyn <jc...@gmail.com> wrote:
> On Wed, Sep 2, 2015 at 2:43 PM, Evgeny Kotkov
> <ev...@visualsvn.com> wrote:
>> Stefan Fuhrmann <st...@wandisco.com> writes:
>>
>>>> Given the above, I am -1 on this optimization for svnadmin load-revprops
>>>> that was implemented in r1698359 and r1700305.  Please revert these
>>>> changes.
>>>
>>> Thinking about alternative solutions I found that simply having a buffering
>>> wrapper without mark support would still eliminate the OS overhead and parts
>>> of the internal overhead. That would address all the points you have made
>>> above while still providing a decent improvement.
>>>
>>> IOW, revert r1700305 and rework / reduce / simplify the code changed
>>> by r1698359.
>>
>> I stated my veto and provided the justification that covers both r1700305
>> and r1698359.  So, could you please revert both of them?  Reworking and
>> adding changes on top of it is going to increase the mess and will be harder
>> to review.
>
> ISTR that in this community we try to treat a veto as a signal that
> further discussion is needed, or that more work is needed to resolve
> the question / issue. You may ask / suggest a revert if you think
> that's best, but it's mainly up for discussion how to best resolve
> things (and other avenues, such as further commits, may also resolve
> the issue).
>
You're right, but the problem is that the discussion doesn't happen.
Instead of this new code is committed (usually broken again) [1].

Let summarize what happened from my point of view:
1. An optimization was implemented in r1698359
2. Evgeny raised his concerns about possible unbounded memory usage
3. Some fix was applied in r1700305 (no discussion happened)
4. Evgeny vetoed both of them.
5. The original optimization was reworked (no discussion happened)
6. Another problem was found in the new code [1]
7. ???

This reminds me of the story we had with L1 DAG node cache -- see
"Unbounded memory usage and DoS possibility in mod_dav_svn / FSFS DAG
node cache" in private@s.a.o.

[1] http://svn.haxx.se/dev/archive-2015-09/0010.shtml

-- 
Ivan Zhakov

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Sep 02, 2015 at 04:01:08PM +0200, Johan Corveleyn wrote:
> ISTR that in this community we try to treat a veto as a signal that
> further discussion is needed, or that more work is needed to resolve
> the question / issue. You may ask / suggest a revert if you think
> that's best, but it's mainly up for discussion how to best resolve
> things (and other avenues, such as further commits, may also resolve
> the issue).
> 
> Of course, in some circumstances there is more pressure to revert
> quickly (e.g. when the build is broken, or other developer's work is
> blocked for other reasons, or ...). But usually we try to work our way
> towards consensus without "forcing" a revert.
> 
> On the other hand, it might have been better for Stefan to wait for
> your reaction first, and discuss the issue and a possible solution,
> before making further changes ...
> It's not like this is such an urgent matter that it cannot wait for a
> couple of days ...
> 
> Just my 2 cents,
> -- 
> Johan

+1

Thank you for writing this!

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Johan Corveleyn <jc...@gmail.com>.
On Wed, Sep 2, 2015 at 2:43 PM, Evgeny Kotkov
<ev...@visualsvn.com> wrote:
> Stefan Fuhrmann <st...@wandisco.com> writes:
>
>>> Given the above, I am -1 on this optimization for svnadmin load-revprops
>>> that was implemented in r1698359 and r1700305.  Please revert these
>>> changes.
>>
>> Thinking about alternative solutions I found that simply having a buffering
>> wrapper without mark support would still eliminate the OS overhead and parts
>> of the internal overhead. That would address all the points you have made
>> above while still providing a decent improvement.
>>
>> IOW, revert r1700305 and rework / reduce / simplify the code changed
>> by r1698359.
>
> I stated my veto and provided the justification that covers both r1700305
> and r1698359.  So, could you please revert both of them?  Reworking and
> adding changes on top of it is going to increase the mess and will be harder
> to review.

ISTR that in this community we try to treat a veto as a signal that
further discussion is needed, or that more work is needed to resolve
the question / issue. You may ask / suggest a revert if you think
that's best, but it's mainly up for discussion how to best resolve
things (and other avenues, such as further commits, may also resolve
the issue).

Of course, in some circumstances there is more pressure to revert
quickly (e.g. when the build is broken, or other developer's work is
blocked for other reasons, or ...). But usually we try to work our way
towards consensus without "forcing" a revert.

On the other hand, it might have been better for Stefan to wait for
your reaction first, and discuss the issue and a possible solution,
before making further changes ...
It's not like this is such an urgent matter that it cannot wait for a
couple of days ...

Just my 2 cents,
-- 
Johan

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Sep 2, 2015 at 5:11 PM, Evgeny Kotkov <ev...@visualsvn.com>
wrote:

> Branko Čibej <br...@wandisco.com> writes:
>
> > Uh, sorry, you don't get to dictate how the veto gets resolved. Reworking
> > what's on trunk is as valid as reverting. If you don't intend to work on
> the
> > solution, you may as well just let Stefan do that whichever way works
> best
> > for him.
>
> I certainly intend to work on the solution for the problems that I stated
> in my e-mails, but it's a quite difficult task given the amount of changes
> that target these problems, but happen without any prior discussion.


Fair enough. My position on that has two different aspects. First, I think
that the initial commit was well-contained (embracing the stream design
pattern, no server code touched, i.e. no risk of CVEs, explicitly selective
in its application). There was no reason to believe any released
functionality
would be impaired - and it wasn't. So, nothing prompted me to reach out
for feedback beforehand.

The second thing is that I rather prefer exploring a solution in terms of
writing code over speculating about it. Once a patch is available, I find it
much more convenient to review it in the code base than to fiddle through
patch contributions in e-mail. If the necessary change seems complex or
risky, open a branch. This is particularly true in later stages of a release
cycle.

The exploring part becomes particularly attractive when "urgency" is
indicated by raising an issue late Friday night before a long weekend
(Monday was a holiday in the UK) or holidays in general. I then like
to get these things of my plate while at the same time quickly finding
a partner for discussion is difficult. So, that makes just trying to come
up with a fix very attractive.

I am aware that others may prefer e-mail discussions with patches
going back and forth and that's perfectly fine. It's just not my preference.
I'm also pretty sure that you do not purposefully post issues to ruin other
people's time off but it important to know that it does have an effect
on my mode of work. And this is the only reason why I even bring it up.


>   This
> has now happened twice — once with r1700305, and the second time with
> r1700799, and I believe that both of these changes contain (other)
> problems.
>

r1700305 is a prime example of what I explained above.

r1700799 is very different. You called my code a mess and asked
for changes to be combined for easy review. That's what you've got.

What other problems do you suspect in r1700799?

>
> The reason why I asked if Stefan could revert the old changes beforehand is
> because otherwise it becomes easy to get lost for anyone who tries to get
> the big picture of what's happening, what we are trying to fix, and how do
> we do that.
>

Honestly, I found the initial "Please revert." rude. An implicit but
nonetheless absolute demand. It can easily be read as "I wont
listen to anything you say until you did as I just told you!"  A "I
think we should revert this and look for an alternative solution"
would probably have been much closer to your actual intentions.
I say closer because those are my words not yours.

The second time, you gave at least a rationale (ease of review).
After a few moments of figuring out that all what was needed was
simply a double revert, it has been a matter of minutes to actually
do it - without losing any functionality from my POV. So, I took
that as an annoying but fair request.

After being all explainy in this mail, I'd like to close with something
constructive. Maybe, it would help to prefix "I think stuff is broken"
mails with a simple 1-liner like "[important but not urgent]" or even
"[quick revert needed]". Not as a tag in the subject but just to set
the tone. Then it becomes obvious a solution is needed right away
or if it can easily wait for a week or two.

-- Stefan^2.

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@wandisco.com> writes:

> Uh, sorry, you don't get to dictate how the veto gets resolved. Reworking
> what's on trunk is as valid as reverting. If you don't intend to work on the
> solution, you may as well just let Stefan do that whichever way works best
> for him.

I certainly intend to work on the solution for the problems that I stated
in my e-mails, but it's a quite difficult task given the amount of changes
that target these problems, but happen without any prior discussion.  This
has now happened twice — once with r1700305, and the second time with
r1700799, and I believe that both of these changes contain (other) problems.

The reason why I asked if Stefan could revert the old changes beforehand is
because otherwise it becomes easy to get lost for anyone who tries to get
the big picture of what's happening, what we are trying to fix, and how do
we do that.


Regards,
Evgeny Kotkov

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Branko Čibej <br...@wandisco.com>.
On 2 Sep 2015 2:44 pm, "Evgeny Kotkov" <ev...@visualsvn.com> wrote:

> I stated my veto and provided the justification that covers both r1700305
> and r1698359.  So, could you please revert both of them?  Reworking and
> adding changes on top of it is going to increase the mess and will be
harder
> to review.

Uh, sorry, you don't get to dictate how the veto gets resolved. Reworking
what's on trunk is as valid as reverting. If you don't intend to work on
the solution, you may as well just let Stefan do that whichever way works
best for him.

-- Brane

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Stefan Fuhrmann <st...@wandisco.com> writes:

>> Given the above, I am -1 on this optimization for svnadmin load-revprops
>> that was implemented in r1698359 and r1700305.  Please revert these
>> changes.
>
> Thinking about alternative solutions I found that simply having a buffering
> wrapper without mark support would still eliminate the OS overhead and parts
> of the internal overhead. That would address all the points you have made
> above while still providing a decent improvement.
>
> IOW, revert r1700305 and rework / reduce / simplify the code changed
> by r1698359.

I stated my veto and provided the justification that covers both r1700305
and r1698359.  So, could you please revert both of them?  Reworking and
adding changes on top of it is going to increase the mess and will be harder
to review.

> I wanted to add -F for a while now because it makes debugging much easier
> (when using IDEs). Never got around it, though.
>
> So, feel free to add -F support to load and load-revprops.

Frankly, I don't have the numbers that would prove the noticeable performance
benefit from replacing byte-by-byte parsing with chunk-based parsing.  So, this
is not on my current priority list.


Regards,
Evgeny Kotkov

Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, Sep 1, 2015 at 7:26 PM, Evgeny Kotkov <ev...@visualsvn.com>
wrote:

> Stefan Fuhrmann <st...@wandisco.com> writes:
>
> > Yes. This is exactly why we can only use it when we have reasonable
> control
> > over the stream's usage, i.e. we can use it in our CL tools because all
> the
> > code that will be run is under our control. But we cannot make e.g.
> > svn_stream_for_stdin() use it by default.
>
> [...]
>
> > The best solution seems to be to allow for explicit resource management
> as
> > we do with other potentially "expensive" objects. r1700305 implements
> that.
>
> I have several concerns about these changes (r1698359 and r1700305):
>
> 1) The new public API — svn_stream_wrap_buffered_read() — is dangerous to
>    use, and wrong usage causes unbounded memory consumption in seemingly
>    harmless situations. [...]
>

This is also true for svn_stringbuf_t, property lists and changed paths
lists.

2) The new API is backward incompatible.
>
>    The problem is not only about how we handle this in our codebase.
> Existing
>    API users have no idea about the fact that having svn_stream_mark_t
> objects
>    could be "expensive", and about the existence of
> svn_stream_remove_mark().

[...]
>

Some of them already *are* expensive. See translated_stream_mark()
and to some degree stream_mark() in jni_io_stream.cpp.


> 3) The behavior of pool-based reference counting is unpredictable.
>
[...]


It is no more unpredictable than open file handles, right?
You have to close or clean up all of them before you can
e.g. delete the file in Windows.


> Given the above, I am -1 on this optimization for svnadmin load-revprops
> that
> was implemented in r1698359 and r1700305.  Please revert these changes.
>

Thinking about alternative solutions I found that simply
having a buffering wrapper without mark support would
still eliminate the OS overhead and parts of the internal
overhead. That would address all the points you have
made above while still providing a decent improvement.

IOW, revert r1700305 and rework / reduce / simplify the
code changed by r1698359.

As for the problem itself, if the way we currently process the input during
> svnadmin load and load-revprops is causing a noticeable overhead, I think
> that
> we should introduce -F (--file) option to both of these commands:
>
>   svnadmin load /path/to/repos -F (--file) /path/to/dump
>
>   svnadmin load-revprops /path/to/repos -F (--file) /path/to/dump
>

I wanted to add -F for a while now because it makes
debugging much easier (when using IDEs). Never got
around it, though.

So, feel free to add -F support to load and load-revprops.

-- Stefan^2.