Posted to oak-issues@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2016/01/21 11:59:39 UTC

[jira] [Comment Edited] (OAK-3889) SegmentMk StringCache memory leak

    [ https://issues.apache.org/jira/browse/OAK-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110422#comment-15110422 ] 

Alex Parvulescu edited comment on OAK-3889 at 1/21/16 10:59 AM:
----------------------------------------------------------------

I was able to test this out and see that the cache size does go down a bit thanks to the key objects not being shared anymore, but the patch doesn't account for the entire growth of the cache size (it now stands at _only_ double the configured size: 510mb+).
I believe the remaining growth also comes from a poor estimation of the size of a cache item.

We currently estimate it manually using
{code}100 + s.length() * 2{code}

which for an input like _'2009-11-14-002'_ yields _128_,

but here's where things get interesting:

!CacheLIRS-Entry-heap.png|width=600!

The screenshot shows said string used as a cache key, and the heap size stands at _204_.
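(As a cross-check that doesn't need a full heap dump, something like the JOL snippet below could be used to compare the manual formula against the actual object layout. This is only a sketch with a stand-in key class holding the _msb_, _lsb_ and _offset_ fields plus the string itself, not the actual Oak code.)

{code}
// Sketch only: cross-check the '100 + s.length() * 2' estimate against the
// real layout using OpenJDK JOL (org.openjdk.jol:jol-core). 'Key' is a
// stand-in for the cache key, with the msb/lsb/offset fields mentioned in
// this issue plus a reference to the cached string.
import org.openjdk.jol.info.GraphLayout;

public class EntrySizeCheck {

    static final class Key {
        final long msb, lsb;
        final int offset;
        final String value;
        Key(long msb, long lsb, int offset, String value) {
            this.msb = msb; this.lsb = lsb; this.offset = offset; this.value = value;
        }
    }

    public static void main(String[] args) {
        String s = "2009-11-14-002";
        System.out.println("estimated: " + (100 + s.length() * 2));   // 128
        // deep size of key + String + backing char[], as JOL sees it:
        System.out.println("actual:    "
                + GraphLayout.parseInstance(new Key(1L, 2L, 0, s)).totalSize());
    }
}
{code}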

Moreover, if you look at the cache segment's _maxMemory_ and _usedMemory_, it is clear that the cache mechanics behave as if the cache were properly sized, even though it actually stands at double that size.
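To put rough numbers on the gap, treating the 204 bytes from the heap dump as representative: a cache limited to 256mb that weighs every entry at 128 bytes will keep roughly 256mb * 204 / 128 ≈ 408mb of actual heap before it starts evicting, and whatever the weigher never sees at all (the cache's own entry wrappers, hash table slots) presumably covers the rest of the distance to the observed 510mb+.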

I'm not sure what could be done here. [~tmueller], [~mduerig], should we bump up the estimation to a more realistic value?
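For illustration only, bumping the estimation could take the shape of a more pessimistic weigher along the lines of the sketch below (written against Guava's Weigher interface, with assumed 64-bit overhead constants rather than measured values, and a generic key type instead of the real StringCacheEntry):

{code}
// Sketch only: a weigher that also charges for the key object's own footprint
// (object header, msb/lsb longs, offset int) instead of just the string value.
// KEY_OVERHEAD and STRING_OVERHEAD are assumed 64-bit/compressed-oops numbers,
// not measured, and K stands in for the real key class.
import com.google.common.cache.Weigher;

public class EntryWeigher<K> implements Weigher<K, String> {

    private static final int KEY_OVERHEAD = 48;     // header + 2 longs + int + value ref, padded (assumed)
    private static final int STRING_OVERHEAD = 56;  // String object + char[] header, padded (assumed)

    @Override
    public int weigh(K key, String value) {
        return KEY_OVERHEAD + STRING_OVERHEAD + value.length() * 2;
    }
}
{code}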

[~mreutegg] does documentmk account for the cache key overhead when estimating the size of an entry, or does it leave it to the cache to handle this?




> SegmentMk StringCache memory leak
> ---------------------------------
>
>                 Key: OAK-3889
>                 URL: https://issues.apache.org/jira/browse/OAK-3889
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segmentmk
>            Reporter: Alex Parvulescu
>         Attachments: CacheLIRS-Entry-heap.png, OAK-3889-v2.patch, StringCache.java.patch
>
>
> The StringCache is made of 2 components, a FastCache and a LIRS cache. Both caches use the same key object, 'StringCacheEntry', with the distinction that the FastCache keeps the string value itself together with the key, while the LIRS cache should only contain the _msb_, _lsb_ and _offset_.
> Sharing the same key leads to issues when a value qualifies for both caches, as the string value then ends up contained in the LIRS cache as well, effectively blowing up the cache's size. [0]
> On a test I ran, I noticed the LIRS cache going up to 800mb even though it was configured at 256mb.
> [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/StringCache.java#L86



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)