You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@wandisco.com> on 2014/05/05 01:13:59 UTC

Re: svn commit: r1590405 - in /subversion/trunk: build.conf subversion/include/private/svn_subr_private.h subversion/libsvn_repos/log.c subversion/libsvn_subr/bit_array.c

On Tue, Apr 29, 2014 at 4:29 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 29 April 2014 17:54, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > On Mon, Apr 28, 2014 at 8:11 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> >>
> >> eOn 27 April 2014 19:27,  <st...@apache.org> wrote:
> >> > Author: stefan2
> >> > Date: Sun Apr 27 15:27:46 2014
> >> > New Revision: 1590405
> >> >
> >> > URL: http://svn.apache.org/r1590405
> >> > Log:
> >> > More 'svn log -g' memory usage reduction.  We use a hash to keep track
> >> > of all revisions reported so far, i.e. easily a million.
> >> >
>
 > * Some system provided APR (1.5+ in particular) uses mmap

> >   to allocate memory. I.e. for every block, e.g. 8k, there is a
> >   separate mmap call. The Linux default is 65530 (sic!) mmap
> >   regions per process. Slowly allocating pools can trigger OOM
> >   errors after only 512MB actual memory usage (sum across
> >   all threads). I already prepared a patch for that.
> >
> Ouch, I didn't know that. I was thinking that MMAP APR pool allocator
> is experimental and is not enabled by default.
>

It is not enabled by default, I guess but the
package responsible decided to enable it anyway.


>
> >> > We introduce a simple packed bit array data structure to replace
> >> > the hash.  For repos < 100M revs, the initialization overhead is less
> >> > than 1ms and will amortize as soon as more than 1% of all revs are
> >> > reported.
> >> >
> >>
> >> It may be worth implement the same trick like we done with
> >> membuffer_cache: use array of bit arrays for every 100k of revisions
> >> for example and initialize them lazy. I mean:
> >> [0...99999] - bit array 0
> >> [100000....199999] -- bit array 1
> >> ...
> >>
> >> It should be easy to implement.
> >
> >
> > I gave it a try and it turned out not too horribly complex.
> > See r1590982.
> Great!
>
> But it may be worth to keep original svn_bit_array and add new
> svn_sparse_bit_array() with array of svn_bit_array() objects So things
> will be separated in two micro layers.
>

I think the new implementation is flexible enough
for a wide range of future usages. I'd prefer having
a single implementation for now as it is simply less
code that could break.

-- Stefan^2.

Re: svn commit: r1590405 - in /subversion/trunk: build.conf subversion/include/private/svn_subr_private.h subversion/libsvn_repos/log.c subversion/libsvn_subr/bit_array.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Mon, May 5, 2014 at 11:50 AM, Bert Huijben <be...@qqmail.nl> wrote:

>
> > -----Original Message-----
> > From: Ivan Zhakov [mailto:ivan@visualsvn.com]
> > Sent: maandag 5 mei 2014 10:26
> > To: Stefan Fuhrmann
> > Cc: Subversion Development; Stefan Fuhrman
> > Subject: Re: svn commit: r1590405 - in /subversion/trunk: build.conf
> > subversion/include/private/svn_subr_private.h
> > subversion/libsvn_repos/log.c subversion/libsvn_subr/bit_array.c
> >
> > On 5 May 2014 03:13, Stefan Fuhrmann <st...@wandisco.com>
> > wrote:
> > >
> > >
> > >
> > > On Tue, Apr 29, 2014 at 4:29 PM, Ivan Zhakov <iv...@visualsvn.com>
> wrote:
> > >>
> > >> On 29 April 2014 17:54, Stefan Fuhrmann
> > <st...@wandisco.com>
> > >> wrote:
> > >> > On Mon, Apr 28, 2014 at 8:11 AM, Ivan Zhakov <iv...@visualsvn.com>
> > wrote:
> > >> >>
> > >> >> eOn 27 April 2014 19:27,  <st...@apache.org> wrote:
> > >> >> > Author: stefan2
> > >> >> > Date: Sun Apr 27 15:27:46 2014
> > >> >> > New Revision: 1590405
> > >> >> >
> > >> >> > URL: http://svn.apache.org/r1590405
> > >> >> > Log:
> > >> >> > More 'svn log -g' memory usage reduction.  We use a hash to keep
> > >> >> > track
> > >> >> > of all revisions reported so far, i.e. easily a million.
> > >> >> >
> > >
> > >  > * Some system provided APR (1.5+ in particular) uses mmap
> > >
> > >> >   to allocate memory. I.e. for every block, e.g. 8k, there is a
> > >> >   separate mmap call. The Linux default is 65530 (sic!) mmap
> > >> >   regions per process. Slowly allocating pools can trigger OOM
> > >> >   errors after only 512MB actual memory usage (sum across
> > >> >   all threads). I already prepared a patch for that.
> > >> >
> > >> Ouch, I didn't know that. I was thinking that MMAP APR pool allocator
> > >> is experimental and is not enabled by default.
> > >
> > >
> > > It is not enabled by default, I guess but the
> > > package responsible decided to enable it anyway.
> > >
> > Thanks for information.
>
> In that case I think we should try to fix APR to remove this limitation,
> instead of rewriting our own code to cope with this.
> When it uses mmap as allocator it should request bigger chunks from the OS.
>
> I don't think mmap() will be faster than the default allocator for such
> small chunks... I would expect quite the opposite, unless code is using big
> chunks.
>

I already have two patches for it. Just hadn't found the time
to submit them. (I just arrived in NYC for svnlive 2014).

-- Stefan^2.

RE: svn commit: r1590405 - in /subversion/trunk: build.conf subversion/include/private/svn_subr_private.h subversion/libsvn_repos/log.c subversion/libsvn_subr/bit_array.c

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Ivan Zhakov [mailto:ivan@visualsvn.com]
> Sent: maandag 5 mei 2014 10:26
> To: Stefan Fuhrmann
> Cc: Subversion Development; Stefan Fuhrman
> Subject: Re: svn commit: r1590405 - in /subversion/trunk: build.conf
> subversion/include/private/svn_subr_private.h
> subversion/libsvn_repos/log.c subversion/libsvn_subr/bit_array.c
> 
> On 5 May 2014 03:13, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> >
> >
> >
> > On Tue, Apr 29, 2014 at 4:29 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> >>
> >> On 29 April 2014 17:54, Stefan Fuhrmann
> <st...@wandisco.com>
> >> wrote:
> >> > On Mon, Apr 28, 2014 at 8:11 AM, Ivan Zhakov <iv...@visualsvn.com>
> wrote:
> >> >>
> >> >> eOn 27 April 2014 19:27,  <st...@apache.org> wrote:
> >> >> > Author: stefan2
> >> >> > Date: Sun Apr 27 15:27:46 2014
> >> >> > New Revision: 1590405
> >> >> >
> >> >> > URL: http://svn.apache.org/r1590405
> >> >> > Log:
> >> >> > More 'svn log -g' memory usage reduction.  We use a hash to keep
> >> >> > track
> >> >> > of all revisions reported so far, i.e. easily a million.
> >> >> >
> >
> >  > * Some system provided APR (1.5+ in particular) uses mmap
> >
> >> >   to allocate memory. I.e. for every block, e.g. 8k, there is a
> >> >   separate mmap call. The Linux default is 65530 (sic!) mmap
> >> >   regions per process. Slowly allocating pools can trigger OOM
> >> >   errors after only 512MB actual memory usage (sum across
> >> >   all threads). I already prepared a patch for that.
> >> >
> >> Ouch, I didn't know that. I was thinking that MMAP APR pool allocator
> >> is experimental and is not enabled by default.
> >
> >
> > It is not enabled by default, I guess but the
> > package responsible decided to enable it anyway.
> >
> Thanks for information.

In that case I think we should try to fix APR to remove this limitation, instead of rewriting our own code to cope with this. 
When it uses mmap as allocator it should request bigger chunks from the OS.

I don't think mmap() will be faster than the default allocator for such small chunks... I would expect quite the opposite, unless code is using big chunks.

	Bert 


Re: svn commit: r1590405 - in /subversion/trunk: build.conf subversion/include/private/svn_subr_private.h subversion/libsvn_repos/log.c subversion/libsvn_subr/bit_array.c

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 5 May 2014 03:13, Stefan Fuhrmann <st...@wandisco.com> wrote:
>
>
>
> On Tue, Apr 29, 2014 at 4:29 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>
>> On 29 April 2014 17:54, Stefan Fuhrmann <st...@wandisco.com>
>> wrote:
>> > On Mon, Apr 28, 2014 at 8:11 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>> >>
>> >> eOn 27 April 2014 19:27,  <st...@apache.org> wrote:
>> >> > Author: stefan2
>> >> > Date: Sun Apr 27 15:27:46 2014
>> >> > New Revision: 1590405
>> >> >
>> >> > URL: http://svn.apache.org/r1590405
>> >> > Log:
>> >> > More 'svn log -g' memory usage reduction.  We use a hash to keep
>> >> > track
>> >> > of all revisions reported so far, i.e. easily a million.
>> >> >
>
>  > * Some system provided APR (1.5+ in particular) uses mmap
>
>> >   to allocate memory. I.e. for every block, e.g. 8k, there is a
>> >   separate mmap call. The Linux default is 65530 (sic!) mmap
>> >   regions per process. Slowly allocating pools can trigger OOM
>> >   errors after only 512MB actual memory usage (sum across
>> >   all threads). I already prepared a patch for that.
>> >
>> Ouch, I didn't know that. I was thinking that MMAP APR pool allocator
>> is experimental and is not enabled by default.
>
>
> It is not enabled by default, I guess but the
> package responsible decided to enable it anyway.
>
Thanks for information.

>> >> > We introduce a simple packed bit array data structure to replace
>> >> > the hash.  For repos < 100M revs, the initialization overhead is less
>> >> > than 1ms and will amortize as soon as more than 1% of all revs are
>> >> > reported.
>> >> >
>> >>
>> >> It may be worth implement the same trick like we done with
>> >> membuffer_cache: use array of bit arrays for every 100k of revisions
>> >> for example and initialize them lazy. I mean:
>> >> [0...99999] - bit array 0
>> >> [100000....199999] -- bit array 1
>> >> ...
>> >>
>> >> It should be easy to implement.
>> >
>> >
>> > I gave it a try and it turned out not too horribly complex.
>> > See r1590982.
>> Great!
>>
>> But it may be worth to keep original svn_bit_array and add new
>> svn_sparse_bit_array() with array of svn_bit_array() objects So things
>> will be separated in two micro layers.
>
>
> I think the new implementation is flexible enough
> for a wide range of future usages. I'd prefer having
> a single implementation for now as it is simply less
> code that could break.
>
Ok. It was just suggestion. Code looks great now.


-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com