Posted to oak-dev@jackrabbit.apache.org by Chetan Mehrotra <ch...@gmail.com> on 2013/07/08 12:54:02 UTC

Using DirectMemory as second level cache

Hi,

I tried to integrate Apache DirectMemory [1] as a second level cache (L2).
It uses Kryo [2] for serializing/deserializing objects to and from the off
heap memory managed by DirectMemory. Initial tests based on this show
quite a bit of saving in terms of memory used.
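
To give an idea of the round trip involved, below is a minimal Kryo
sketch. The Doc class, the sample values and the rough timing are made
up for illustration; the real cache values are the MongoMK document
objects:

    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;

    import java.io.ByteArrayOutputStream;
    import java.util.HashMap;
    import java.util.Map;

    public class KryoRoundTrip {

        // Stand-in for a cached document; not the actual Oak/MongoMK type.
        public static class Doc {
            public String id;
            public Map<String, String> props = new HashMap<String, String>();
        }

        public static void main(String[] args) {
            Kryo kryo = new Kryo();
            kryo.register(Doc.class);

            Doc doc = new Doc();
            doc.id = "/content/foo";
            doc.props.put("jcr:primaryType", "nt:unstructured");

            // Serialize to a compact byte[] (this is what would live off heap).
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            Output output = new Output(bos);
            kryo.writeObject(output, doc);
            output.close();
            byte[] bytes = bos.toByteArray();

            // Deserialize on a cache hit and time it roughly.
            long start = System.nanoTime();
            Input input = new Input(bytes);
            Doc copy = kryo.readObject(input, Doc.class);
            input.close();
            long elapsed = System.nanoTime() - start;

            System.out.println(bytes.length + " bytes, deserialized '"
                    + copy.id + "' in " + elapsed + " ns");
        }
    }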

The stats below (sum of all MongoMK related caches) were taken with the
L2 cache enabled, after restarting a CQ instance and doing some basic
access.

On Heap
    No of entries    - 138288
    Memory Taken - 217.3 MB
Off Heap
    No of entries    - 113012
    Memory Taken - 42.8 MB
    Hit Count 1257
    Total Load Time 80253030
    Avg Load Time 63844.8926014320

If I completely serialize the Document cache (the biggest of the 3):
- Document Cache - On Heap: 134.7 MB for 43935 entries
- Document Cache - Off Heap: 27.5 MB for 43935 entries; it takes 555.5 ms
to deserialize all 43935 entries

So far I can make the following observations:
-- Storing cache objects in serialized form provides more compact storage
-- Deserialization cost is low
-- Using off heap memory would be helpful compared to on heap memory
-- Even if we do not use DirectMemory we can look into storing the
data in serialized form
-- The L2 cache can be added as an optional feature; Oak Core can still
be used in the absence of an L2 cache (a rough sketch of the layering
follows below)
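
To make the last two points concrete, here is a rough sketch of the
layering, assuming a hypothetical OffHeapStore interface and a
placeholder loadFromBackend() method (DirectMemory plus Kryo, or any
other mechanism, would plug in behind OffHeapStore). L1 stays a Guava
LoadingCache; entries evicted from L1 are demoted to L2 in serialized
form, an L1 miss checks L2 before going to the backend, and with no L2
configured the behaviour is unchanged:

    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheLoader;
    import com.google.common.cache.LoadingCache;
    import com.google.common.cache.RemovalListener;
    import com.google.common.cache.RemovalNotification;

    import java.nio.charset.Charset;
    import java.util.concurrent.ExecutionException;

    public class TwoLevelCacheSketch {

        private static final Charset UTF8 = Charset.forName("UTF-8");

        // Hypothetical seam for the off-heap store; DirectMemory (or any
        // other mechanism) would implement this, typically with Kryo doing
        // the (de)serialization behind put/get.
        public interface OffHeapStore {
            byte[] get(String key);
            void put(String key, byte[] serialized);
        }

        private final LoadingCache<String, String> l1;

        public TwoLevelCacheSketch(final OffHeapStore l2) {
            this.l1 = CacheBuilder.newBuilder()
                    .maximumSize(10000)
                    // On eviction from L1, demote the entry to L2 in serialized form.
                    .removalListener(new RemovalListener<String, String>() {
                        public void onRemoval(RemovalNotification<String, String> n) {
                            if (l2 != null && n.getValue() != null) {
                                l2.put(n.getKey(), n.getValue().getBytes(UTF8));
                            }
                        }
                    })
                    .build(new CacheLoader<String, String>() {
                        public String load(String key) {
                            // L1 miss: try the serialized L2 copy before the backend.
                            if (l2 != null) {
                                byte[] bytes = l2.get(key);
                                if (bytes != null) {
                                    return new String(bytes, UTF8);
                                }
                            }
                            return loadFromBackend(key);
                        }
                    });
        }

        // Placeholder for the real (expensive) MongoDB lookup.
        private String loadFromBackend(String key) {
            return "document-for-" + key;
        }

        public String get(String key) throws ExecutionException {
            return l1.get(key);
        }
    }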

Caveats
-- The DirectMemory project is still in a very early phase and I had to
use the latest snapshot version with some local fixes to get it working

The required changes can be seen at [3].

So would it be worthwhile to add this as an optional feature in Oak?
If users require it they can enable the L2 cache; otherwise we only use
the Guava cache as the L1 cache.

Thoughts?

Chetan Mehrotra

[1] http://directmemory.apache.org/
[2] https://code.google.com/p/kryo/
[3] https://github.com/chetanmeh/jackrabbit-oak/compare/apache:ab955ffd40e23a9f8aee87f9076bc045b643e35d...offheap

Re: Using DirectMemory as second level cache

Posted by Chetan Mehrotra <ch...@gmail.com>.
I have created OAK-891 to track this.

@Tommaso - Thanks for looking into this. I have a couple of issues with
the current DirectMemory implementation; I will open JIRA issues and
follow up there.

Chetan Mehrotra


On Tue, Jul 9, 2013 at 3:15 PM, Tommaso Teofili <te...@adobe.com> wrote:
>
> On 08/Jul/2013, at 12:54, Chetan Mehrotra wrote:
>
>> [...]
>>
>> Thoughts?
>
> I think it makes sense, probably as an optional feature with a pluggable implementation (it could be DM or some other mechanism).
> Regarding DM I think the community would be happy to support our use case (and I'd be happy to help on that).
>
> Regards,
> Tommaso

Re: Using DirectMemory as second level cache

Posted by Tommaso Teofili <te...@adobe.com>.
On 08/Jul/2013, at 12:54, Chetan Mehrotra wrote:

> Hi,
> 
> [...]
> 
> So would it be worthwhile to add this as an optional feature in Oak?
> If users require it they can enable the L2 cache; otherwise we only use
> the Guava cache as the L1 cache.
> 
> Thoughts?

I think it makes sense, probably as an optional feature with a pluggable implementation (it could be DM or some other mechanism).
Regarding DM I think the community would be happy to support our use case (and I'd be happy to help on that).

Regards,
Tommaso



Re: Using DirectMemory as second level cache

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Mon, Jul 8, 2013 at 1:54 PM, Chetan Mehrotra
<ch...@gmail.com> wrote:
> Initial tests based on this show quite a bit of saving in terms of
> memory used

This suggests that we could/should also look at optimizing the size of
objects cached in the JVM heap. Your numbers suggest that each
Document cache entry takes a few kilobytes of memory on average, which
is quite a lot. The average SegmentMK entry is an order of magnitude
smaller.
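
For illustration, one way to make the per-entry footprint explicit is a
weight-bounded Guava cache; the weigh() estimate below is just a
placeholder, not the actual accounting used in Oak:

    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.Weigher;

    public class WeighedCacheSketch {

        public static void main(String[] args) {
            // Bound the cache by estimated bytes instead of entry count, so
            // the average entry size becomes an explicit, tunable quantity.
            Cache<String, String> cache = CacheBuilder.newBuilder()
                    .maximumWeight(64L * 1024 * 1024) // ~64 MB budget
                    .weigher(new Weigher<String, String>() {
                        public int weigh(String key, String value) {
                            // Placeholder estimate: 2 bytes per char plus fixed overhead.
                            return 2 * (key.length() + value.length()) + 64;
                        }
                    })
                    .recordStats()
                    .build();

            cache.put("/content/foo", "some cached document state");
            System.out.println(cache.stats());
        }
    }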

> Thoughts?

In general I think such an L2 cache makes a lot of sense (TarMK uses
memory mapped files for the same effect). That said, I think we should
also pay attention to how the L1 cache could be used more effectively
and not just let an L2 cache postpone the issue to a bigger scale.

BR,

Jukka Zitting