You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Eric Gillespie <ep...@pretzelnet.org> on 2004/10/26 03:23:27 UTC

r11556 causes enormous fsfs memory use

Importing NetBSD src (822 MB, 89482 files, 10902 directories)
into an fsfs repository with svnserve 1.1.1, svnserve grows to
use 124 MB.  The user who initially reported this to me saw it
bump into a 128 MB resource limit.

I tracked this down to r11556.  Undo that from a 1.1.1 working
copy, and the memory usage decreases back to 1.1.0 levels which,
while bad[1], were at least acceptable.

Hopefully someone more familiar with this UTF-8 caching stuff (hi
lundblad and dionosos! :) will beat me to investigating further.
If not, i'll look into it tomorrow.

[1] Even with 1.1.0, there is something fishy going on with
    svnserve memory usage:

1.0.5 bdb:
32608 egillesp  25   0 28228  26M  2084 D    45.5  5.3   5:25   0 svnserve
32552 egillesp  15   0  5120 5064  1364 S     0.0  0.9   1:22   0 svn

1.1.0 bdb:
  384 egillesp  25   0 29348  26M  2272 D    31.3  5.3   5:38   0 svnserve
  328 egillesp  15   0  5972 5956  1588 S     0.0  1.1   1:22   0 svn

1.1.0 fsfs:
  871 egillesp  16   0 20632  19M  1072 R    10.3  3.9   7:38   0 svnserve
  815 egillesp  25   0  4852 4852   704 S     0.0  0.9   2:04   0 svn
  871 egillesp  17   0 66396  64M  1072 D    34.9 12.8   7:40   0 svnserve
  815 egillesp  25   0  4852 4852   704 S     0.0  0.9   2:04   0 svn

bdb has reasonable usage throughout its lifetime.  fsfs does
through most of its life, but then spikes to 64MB at the end.  Is
that the finalize-a-transaction-into-a-rev step?  Does it have to
be so costly?  With 1.1.1, we see the same pattern, except the
spike is from 23 MB to 124 MB:

21164 egillesp  25   0 24016  23M  1092 S    47.7  4.6   1:33   0 svnserve
21107 egillesp  25   0 12436  11M  1008 R    19.1  2.3   0:48   0 svn
[...]
21164 egillesp  16   0 55232  49M  1116 R     9.1  9.9   7:43   0 svnserve
21107 egillesp  25   0 32792  28M   704 S     0.0  5.6   2:00   0 svn
21164 egillesp  16   0 66820  61M  1064 R    12.7 12.2   7:43   0 svnserve
21107 egillesp  25   0 32792  28M   704 S     0.0  5.6   2:00   0 svn
21164 egillesp  15   0 98.0M  93M  1060 R    24.2 18.6   7:44   0 svnserve
21107 egillesp  25   0 32792  28M   704 S     0.0  5.6   2:00   0 svn
21164 egillesp  16   0  129M 124M  1128 D    22.3 24.8   7:46   0 svnserve
21107 egillesp  25   0 32768  28M   680 S     0.0  5.6   2:00   0 svn
21164 egillesp  15   0  1540 1528  1180 R    12.3  0.2   7:46   0 svnserve
21107 egillesp  15   0 30352  25M   900 D     0.0  5.1   2:00   0 svn

-- 
Eric Gillespie <*> epg@pretzelnet.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Tue, 2 Nov 2004, Peter N. Lundblad wrote:

> On Sun, 31 Oct 2004, Eric Gillespie wrote:
>
> > "Peter N. Lundblad" <pe...@famlundblad.se> writes:
> >
> > > According to my tests, it looks like I pluggged the leak introduced by the
> > > new utf8 caching in r11673. If you want to try it out, Eric, it would be
> > > nice.
> > >
> > You may well have plugged the leak, but this did not have any
> > noticeable effect on the large fsfs import.  Probably it did, but
> > whatever savings it gave still left the process spiralling past
> > 128 MB, where it hits my resource limit.  I have posted two
> > patches that take care of that.
> >
> Yes, that was my observation too. It at least got smaller. I'll do my 4
> linux kernels import and see how memory usage is after your fixes, too.
>
Before you're changes, importing the four kernels peaked at about 27MB at
the end. Afterwards it was about 16MB.

Nice work!
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Sun, 31 Oct 2004, Eric Gillespie wrote:

> "Peter N. Lundblad" <pe...@famlundblad.se> writes:
>
> > According to my tests, it looks like I pluggged the leak introduced by the
> > new utf8 caching in r11673. If you want to try it out, Eric, it would be
> > nice.
> >
> You may well have plugged the leak, but this did not have any
> noticeable effect on the large fsfs import.  Probably it did, but
> whatever savings it gave still left the process spiralling past
> 128 MB, where it hits my resource limit.  I have posted two
> patches that take care of that.
>
Yes, that was my observation too. It at least got smaller. I'll do my 4
linux kernels import and see how memory usage is after your fixes, too.

Thanks,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by Eric Gillespie <ep...@pretzelnet.org>.

"Peter N. Lundblad" <pe...@famlundblad.se> writes:

> According to my tests, it looks like I pluggged the leak introduced by the
> new utf8 caching in r11673. If you want to try it out, Eric, it would be
> nice.
> 
> Prpposed for inclusion in 1.1.2.

You may well have plugged the leak, but this did not have any
noticeable effect on the large fsfs import.  Probably it did, but
whatever savings it gave still left the process spiralling past
128 MB, where it hits my resource limit.  I have posted two
patches that take care of that.

Thanks, Peter.

--  
Eric Gillespie <*> epg@pretzelnet.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Tue, 26 Oct 2004, Peter N. Lundblad wrote:

> On Mon, 25 Oct 2004, Eric Gillespie wrote:
>
> > Importing NetBSD src (822 MB, 89482 files, 10902 directories)
> > into an fsfs repository with svnserve 1.1.1, svnserve grows to
> > use 124 MB.  The user who initially reported this to me saw it
> > bump into a 128 MB resource limit.
> >
> > I tracked this down to r11556.  Undo that from a 1.1.1 working
> > copy, and the memory usage decreases back to 1.1.0 levels which,
> > while bad[1], were at least acceptable.
> >
> It apperas that apr_hash_set() allocates memory for each call that isn't a
> delete, even if there is an entry in the table already. We are using
> apr_hash_set to get and put back a translation handle for each translation
> (we don't want to do the translation while holding a global lock). So
> there we have an allocation per translation, which, ofcourse is
> unacceptable.
>
According to my tests, it looks like I pluggged the leak introduced by the
new utf8 caching in r11673. If you want to try it out, Eric, it would be
nice.

Prpposed for inclusion in 1.1.2.

Thanks,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Wed, 27 Oct 2004, [UTF-8] Branko �^Libej wrote:

> Peter N. Lundblad wrote:
>
> >It apperas that apr_hash_set() allocates memory for each call that isn't a
> >delete, even if there is an entry in the table already.
> >
> Then that's an APR bug. Sure, we should work around it in SVN, but it
> needs fixing in APR, too.
>
Well, since setting the value to NULL (which we often do in this case) is
equivalent to deleting the entry, it might be valid to allocate memory
when the value changes from NULL. Might be a philosophical question...

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by Branko Čibej <br...@xbc.nu>.

Julian Foad wrote:

> Branko Čibej wrote:
>
>> Peter N. Lundblad wrote:
>>
>>> It apperas that apr_hash_set() allocates memory for each call that 
>>> isn't a
>>> delete, even if there is an entry in the table already.
>>>
>> Then that's an APR bug. Sure, we should work around it in SVN, but it 
>> needs fixing in APR, too.
>
>
> Is this still thought to be a bug in APR?

Well, if APR allocates memory when it doesn't have to, I'd say it is.

-- Brane



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] APR hash table memory use

Posted by Branko Čibej <br...@xbc.nu>.

Branko Čibej wrote:

> Thom May wrote:
>
>> Committed to HEAD, thanks.
>> -Thom
>>
>> * Julian Foad (julianfoad@btopenworld.com) wrote :
>>  
>>
>>> The Subversion project encountered a usage pattern of APR hash 
>>> tables that caused many megabytes of memory to be occupied by a 
>>> small hash table after a sequence of many thousands of (insert, 
>>> delete) pairs.  They have subsequently changed the usage pattern to 
>>> avoid this.
>>>   
>>
> And I wanted to merge this to the 0.9 branch, now it turns out I don't 
> know my password because I've always used ssh access...

Doh! Figured it out. Merged to the 0.9 branch in r65594.

Sorry for the noise.
(Although I do think that a small post to the list about the conversion 
and what to do about access wouldn't have hurt; not all of us are at the 
hackathon...)

-- Brane

Re: [PATCH] APR hash table memory use

Posted by Branko Čibej <br...@xbc.nu>.

Thom May wrote:

>Committed to HEAD, thanks.
>-Thom
>
>* Julian Foad (julianfoad@btopenworld.com) wrote :
>  
>
>>The Subversion project encountered a usage pattern of APR hash tables that 
>>caused many megabytes of memory to be occupied by a small hash table after 
>>a sequence of many thousands of (insert, delete) pairs.  They have 
>>subsequently changed the usage pattern to avoid this.
>>    
>>
And I wanted to merge this to the 0.9 branch, now it turns out I don't 
know my password because I've always used ssh access...

 
-- Brane



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] APR hash table memory use

Posted by Branko Čibej <br...@xbc.nu>.

Thom May wrote:

>Committed to HEAD, thanks.
>-Thom
>
>* Julian Foad (julianfoad@btopenworld.com) wrote :
>  
>
>>The Subversion project encountered a usage pattern of APR hash tables that 
>>caused many megabytes of memory to be occupied by a small hash table after 
>>a sequence of many thousands of (insert, delete) pairs.  They have 
>>subsequently changed the usage pattern to avoid this.
>>    
>>
And I wanted to merge this to the 0.9 branch, now it turns out I don't 
know my password because I've always used ssh access...

 
-- Brane

Re: [PATCH] APR hash table memory use

Posted by Thom May <th...@planetarytramp.net>.

Committed to HEAD, thanks.
-Thom

* Julian Foad (julianfoad@btopenworld.com) wrote :
> The Subversion project encountered a usage pattern of APR hash tables that 
> caused many megabytes of memory to be occupied by a small hash table after 
> a sequence of many thousands of (insert, delete) pairs.  They have 
> subsequently changed the usage pattern to avoid this.

Re: [PATCH] APR hash table memory use

Posted by Sander Striker <st...@apache.org>.

On Tue, 2004-11-02 at 17:22, Julian Foad wrote:
> Prevent unbounded memory use during repeated operations on a hash table.
> 
> The hash table was allocating new memory for each new insertion of an entry,
> and not reclaiming it on deletions, so repetition of (insert, delete) caused
> the memory use to keep growing.  This fix causes the memory freed by a deletion
> to be reused by a subsequent insertion, so that the memory used by the hash
> table is proportional to the maximum number of entries that it has ever held
> simultaneously, rather than the number of insertions that have been performed.

+1.

Sander

Re: [PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

> Julian Foad wrote:
>> Prevent unbounded memory use during repeated operations on a hash table.

Sander Striker wrote:
> +1.

Branko Čibej wrote:
> This looks good. I vote we apply it on both HEAD and the 0.9.x branch.

Can I do anything further to get this applied?

- Julian

Re: [PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

Branko Čibej wrote:
> Julian Foad wrote:
>> Prevent unbounded memory use during repeated operations on a hash table.
[...]
> This looks good. I vote we apply it on both HEAD and the 0.9.x branch.

Thanks.  I have now checked that it passes Subversion's "make check", and, in a 
debugger, that the new code is actually executed.

I would be very happy if someone would do an overall memory usage check with a 
very large operation (e.g. in Subversion without the patch that avoids this usage).

- Julian

Re: [PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

> Julian Foad wrote:
>> Prevent unbounded memory use during repeated operations on a hash table.

Sander Striker wrote:
> +1.

Branko Čibej wrote:
> This looks good. I vote we apply it on both HEAD and the 0.9.x branch.

Can I do anything further to get this applied?

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

Branko Čibej wrote:
> Julian Foad wrote:
>> Prevent unbounded memory use during repeated operations on a hash table.
[...]
> This looks good. I vote we apply it on both HEAD and the 0.9.x branch.

Thanks.  I have now checked that it passes Subversion's "make check", and, in a 
debugger, that the new code is actually executed.

I would be very happy if someone would do an overall memory usage check with a 
very large operation (e.g. in Subversion without the patch that avoids this usage).

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] APR hash table memory use

Posted by Branko Čibej <br...@xbc.nu>.

Julian Foad wrote:

>Prevent unbounded memory use during repeated operations on a hash table.
>
>The hash table was allocating new memory for each new insertion of an entry,
>and not reclaiming it on deletions, so repetition of (insert, delete) caused
>the memory use to keep growing.  This fix causes the memory freed by a deletion
>to be reused by a subsequent insertion, so that the memory used by the hash
>table is proportional to the maximum number of entries that it has ever held
>simultaneously, rather than the number of insertions that have been performed.
>
>* apr/tables/apr_hash.c:
>  (apr_hash_t): Add new field "free", a free-list.
>  (apr_hash_make, apr_hash_copy, apr_hash_merge): Initialise the free-list.
>  (find_entry): Use an entry from the free-list if there is one.
>  (apr_hash_set): Return a deleted entry to the free-list.
>  
>
This looks good. I vote we apply it on both HEAD and the 0.9.x branch.

-- Brane

Re: [PATCH] APR hash table memory use

Posted by Thom May <th...@planetarytramp.net>.

Committed to HEAD, thanks.
-Thom

* Julian Foad (julianfoad@btopenworld.com) wrote :
> The Subversion project encountered a usage pattern of APR hash tables that 
> caused many megabytes of memory to be occupied by a small hash table after 
> a sequence of many thousands of (insert, delete) pairs.  They have 
> subsequently changed the usage pattern to avoid this.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] APR hash table memory use

Posted by Sander Striker <st...@apache.org>.

On Tue, 2004-11-02 at 17:22, Julian Foad wrote:
> Prevent unbounded memory use during repeated operations on a hash table.
> 
> The hash table was allocating new memory for each new insertion of an entry,
> and not reclaiming it on deletions, so repetition of (insert, delete) caused
> the memory use to keep growing.  This fix causes the memory freed by a deletion
> to be reused by a subsequent insertion, so that the memory used by the hash
> table is proportional to the maximum number of entries that it has ever held
> simultaneously, rather than the number of insertions that have been performed.

+1.

Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] APR hash table memory use

Posted by Branko Čibej <br...@xbc.nu>.

Julian Foad wrote:

>Prevent unbounded memory use during repeated operations on a hash table.
>
>The hash table was allocating new memory for each new insertion of an entry,
>and not reclaiming it on deletions, so repetition of (insert, delete) caused
>the memory use to keep growing.  This fix causes the memory freed by a deletion
>to be reused by a subsequent insertion, so that the memory used by the hash
>table is proportional to the maximum number of entries that it has ever held
>simultaneously, rather than the number of insertions that have been performed.
>
>* apr/tables/apr_hash.c:
>  (apr_hash_t): Add new field "free", a free-list.
>  (apr_hash_make, apr_hash_copy, apr_hash_merge): Initialise the free-list.
>  (find_entry): Use an entry from the free-list if there is one.
>  (apr_hash_set): Return a deleted entry to the free-list.
>  
>
This looks good. I vote we apply it on both HEAD and the 0.9.x branch.

-- Brane



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

[PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

The Subversion project encountered a usage pattern of APR hash tables that 
caused many megabytes of memory to be occupied by a small hash table after a 
sequence of many thousands of (insert, delete) pairs.  They have subsequently 
changed the usage pattern to avoid this.

The cause of the problem is that every time apr_hash_set(..., non-NULL) adds a 
new entry, it allocates some new pool memory for that entry and links it into 
one of its linked lists.  When apr_hash_set(..., NULL) deletes an entry, it 
unlinks it but never re-uses that memory.

So this is an inefficiency in APR's implementation of hash tables.  It is not 
exactly a "leak", since it is contained with a pool, but it is unbounded memory 
use caused by repeated simple actions - not a good property for a container to 
have.

It would be possible to fix this in APR if people thought it worthwhile, e.g. 
by using a "free-list".  (Maintain, in the hash table, a list of free entries, 
initially empty.  When deleting an entry, move it to the free-list.  When a new 
entry is wanted, take one from the free-list if available, else allocate it 
from the pool.)  This would make the memory used by a hash table proportional 
to the maximum number of entries that it had ever held simultaneously, rather 
than the number of insertions that had ever been performed.

In fact, using a free-list seems to be quite easy, with virtually no speed 
penalty.  A patch is attached, prepared in the manner of a Subversion patch.

(Please copy replies to dev@subversion.tigris.org, as I am not subscribed to 
dev@apr.apache.org)

- Julian

[PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

The Subversion project encountered a usage pattern of APR hash tables that 
caused many megabytes of memory to be occupied by a small hash table after a 
sequence of many thousands of (insert, delete) pairs.  They have subsequently 
changed the usage pattern to avoid this.

The cause of the problem is that every time apr_hash_set(..., non-NULL) adds a 
new entry, it allocates some new pool memory for that entry and links it into 
one of its linked lists.  When apr_hash_set(..., NULL) deletes an entry, it 
unlinks it but never re-uses that memory.

So this is an inefficiency in APR's implementation of hash tables.  It is not 
exactly a "leak", since it is contained with a pool, but it is unbounded memory 
use caused by repeated simple actions - not a good property for a container to 
have.

It would be possible to fix this in APR if people thought it worthwhile, e.g. 
by using a "free-list".  (Maintain, in the hash table, a list of free entries, 
initially empty.  When deleting an entry, move it to the free-list.  When a new 
entry is wanted, take one from the free-list if available, else allocate it 
from the pool.)  This would make the memory used by a hash table proportional 
to the maximum number of entries that it had ever held simultaneously, rather 
than the number of insertions that had ever been performed.

In fact, using a free-list seems to be quite easy, with virtually no speed 
penalty.  A patch is attached, prepared in the manner of a Subversion patch.

(Please copy replies to dev@subversion.tigris.org, as I am not subscribed to 
dev@apr.apache.org)

- Julian

[PATCH] APR hash table memory use

Posted by Julian Foad <ju...@btopenworld.com>.

Julian Foad wrote:
> It would be possible to fix this in APR if people thought it worthwhile, 
[...]

Well, in fact, it seems to be quite easy.  Patch attached, prepared in the 
manner of a Subversion patch.  Does anyone here care to review or test it?

> It seems to me that the APR developers ought to be interested in doing 
> this. Will I make the effort to take this to their mailing list?  Don't 
> know.

Having done a patch, I probably will.

- Julian

APR hash table memory use [was: r11556 causes enormous fsfs memory use]

Posted by Julian Foad <ju...@btopenworld.com>.

Peter N. Lundblad wrote:
> Let's see if we a) mean the same by insert, remove, ... and b) read the
> code the same way:-)
> 
> 1. hash_set(..., non-NULL): entry not found, allocate.
> 2. hash_set(..., NULL) (remove): entry found, remove from linked list
> 3. hash_set(..., non-NULL): entry not found, allocate
> ...
> 
> If you do this, the hash pool will grow for each set. (The above is for
> the same key). We can discuss whose leak it was, but it isn't anymore,
> since I eliminated that usage.

Thanks for clarifying.  I understand now:

Every time apr_hash_set(..., non-NULL) adds a new entry, it allocates some pool 
memory for that entry and links it into one of its linked lists.  When 
apr_hash_set(..., NULL) deletes an entry, it unlinks it but does not give that 
memory back to the pool.

So this is an inefficiency in APR's implementation of hash tables.  It is not 
exactly a "leak", since it is contained with a pool, but it is unbounded memory 
use caused by repeated simple actions - not a good property for a container to 
have.

It would be possible to fix this in APR if people thought it worthwhile, e.g. 
by using a "free" list.  (Maintain, in the hash table, a list of free entries, 
initially empty.  When deleting an entry, move it to the "free" list.  When a 
new entry is wanted, take one from the "free" list if available, else allocate 
it from the pool.)  This would make the memory used by a hash table 
proportional to the maximum number of entries that it had ever held 
simultaneously, rather than the number of insertions that had ever been performed.

It seems to me that the APR developers ought to be interested in doing this. 
Will I make the effort to take this to their mailing list?  Don't know.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Tue, 2 Nov 2004, Julian Foad wrote:

> Peter N. Lundblad wrote:
> > On Tue, 2 Nov 2004, Julian Foad wrote:
> >>Branko Čibej wrote:
> >>Is this still thought to be a bug in APR?
> >
> > Actually, when I read the code mroe carefully, it allocates a new entry
> > when inserting a new entry. This isn't a bug. Our use in this case was to
> > insert, remove, insert, remove etc. This caused the leak.
>
>  From my reading of the implementation of apr_hash_set(), a sequence of insert,
> remove, insert, remove etc. would not cause the hash table to expand (except
> perhaps once, on the very first insertion).
>
Let's see if we a) mean the same by insert, remove, ... and b) read the
code the same way:-)

1. hash_set(..., non-NULL): entry not found, allocate.
2. hash_set(..., NULL) (remove): entry found, remove from linked list
3. hash_set(..., non-NULL): entry not found, allocate
...

If you do this, the hash pool will grow for each set. (The above is for
the same key). We can discuss whose leak it was, but it isn't anymore,
since I eliminated that usage.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by Julian Foad <ju...@btopenworld.com>.

Peter N. Lundblad wrote:
> On Tue, 2 Nov 2004, Julian Foad wrote:
>>Branko Čibej wrote:
>>>Peter N. Lundblad wrote:
>>>>It apperas that apr_hash_set() allocates memory for each call that isn't a
>>>>delete, even if there is an entry in the table already.
>>>
>>>Then that's an APR bug. Sure, we should work around it in SVN, but it
>>>needs fixing in APR, too.
>>
>>Is this still thought to be a bug in APR?
> 
> Actually, when I read the code mroe carefully, it allocates a new entry
> when inserting a new entry. This isn't a bug. Our use in this case was to
> insert, remove, insert, remove etc. This caused the leak.

 From my reading of the implementation of apr_hash_set(), a sequence of insert, 
remove, insert, remove etc. would not cause the hash table to expand (except 
perhaps once, on the very first insertion).

I therefore assume the leak was in Subversion, and not in APR.

 > I think it would be too much to require APR to cache deleted entries.

That would be an unreasonable expectation - but I don't quite see the 
relevance.  Note that APR hash tables are responsible only for storing pointers 
to your data, not for allocating memory for your data.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Tue, 2 Nov 2004, Julian Foad wrote:

> Branko Čibej wrote:
> > Peter N. Lundblad wrote:
> >> It apperas that apr_hash_set() allocates memory for each call that isn't a
> >> delete, even if there is an entry in the table already.
> >>
> > Then that's an APR bug. Sure, we should work around it in SVN, but it
> > needs fixing in APR, too.
>
> Is this still thought to be a bug in APR?
>
Actually, when I read the code mroe carefully, it allocates a new entry
when inserting a new entry. This isn't a bug. Our use in this case was to
insert, remove, insert, remove etc. This caused the leak. I think it would
be too much to require APR to cache deleted entries.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by Julian Foad <ju...@btopenworld.com>.

Branko Čibej wrote:
> Peter N. Lundblad wrote:
>> It apperas that apr_hash_set() allocates memory for each call that isn't a
>> delete, even if there is an entry in the table already.
>>
> Then that's an APR bug. Sure, we should work around it in SVN, but it 
> needs fixing in APR, too.

Is this still thought to be a bug in APR?

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by Branko Čibej <br...@xbc.nu>.

Peter N. Lundblad wrote:

>On Mon, 25 Oct 2004, Eric Gillespie wrote:
>
>  
>
>>Importing NetBSD src (822 MB, 89482 files, 10902 directories)
>>into an fsfs repository with svnserve 1.1.1, svnserve grows to
>>use 124 MB.  The user who initially reported this to me saw it
>>bump into a 128 MB resource limit.
>>
>>I tracked this down to r11556.  Undo that from a 1.1.1 working
>>copy, and the memory usage decreases back to 1.1.0 levels which,
>>while bad[1], were at least acceptable.
>>
>>    
>>
>Thank you for tracking it that far. I'm pretty sure I found the reason for
>this.
>
>It apperas that apr_hash_set() allocates memory for each call that isn't a
>delete, even if there is an entry in the table already.
>
Then that's an APR bug. Sure, we should work around it in SVN, but it 
needs fixing in APR, too.

-- Brane



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: r11556 causes enormous fsfs memory use

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Mon, 25 Oct 2004, Eric Gillespie wrote:

> Importing NetBSD src (822 MB, 89482 files, 10902 directories)
> into an fsfs repository with svnserve 1.1.1, svnserve grows to
> use 124 MB.  The user who initially reported this to me saw it
> bump into a 128 MB resource limit.
>
> I tracked this down to r11556.  Undo that from a 1.1.1 working
> copy, and the memory usage decreases back to 1.1.0 levels which,
> while bad[1], were at least acceptable.
>
Thank you for tracking it that far. I'm pretty sure I found the reason for
this.

It apperas that apr_hash_set() allocates memory for each call that isn't a
delete, even if there is an entry in the table already. We are using
apr_hash_set to get and put back a translation handle for each translation
(we don't want to do the translation while holding a global lock). So
there we have an allocation per translation, which, ofcourse is
unacceptable.

I think we can avoid this by another level of indirection:-) Storing a
pointer to a pointer to the first entry in the hash table instead of the
pointer itself. This wouldn't be removed, but just modified.

I don't think I can work on this until the weekend.

Thanks for an interesting evening, Erik, with importing four linux kernels
a few times:-)
Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org