Posted to dev@directmemory.apache.org by Ashish <pa...@gmail.com> on 2011/10/16 09:31:32 UTC

Will OffHeapMemoryBuffer get fragmented over time?

Folks,

Will the offHeapMemoryBuffer get fragmented over time? Say, after a
couple of thousand get/remove operations, will the off-heap buffer start
having holes in it?

-- 
thanks
ashish

Re: Will OffHeapMemoryBuffer get fragmented over time?

Posted by Ashish <pa...@gmail.com>.
On Sun, Oct 16, 2011 at 6:56 PM, Raffaele P. Guidi
<ra...@gmail.com> wrote:
> I suggest making the defragmentation strategy another configurable aspect
> and keeping the first option in any case, because it's both the simplest to
> achieve and the one with the least runtime overhead. I would call it the
> aggressive strategy ;-)

Agree :)

Let's get it working, benchmark it, and then plan refinements if needed.

cheers
ashish

Re: Will OffHeapMemoryBuffer get fragmented over time?

Posted by "Raffaele P. Guidi" <ra...@gmail.com>.
I suggest making the defragmentation strategy another configurable aspect
and keeping the first option in any case, because it's both the simplest to
achieve and the one with the least runtime overhead. I would call it the
aggressive strategy ;-)
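
A minimal sketch of what a configurable strategy could look like in Java.
All names here (DefragmentationStrategy, ManagedBuffer, BuiltInStrategy) are
hypothetical and only illustrate the idea; none of them is DirectMemory's
actual API:

```java
// Hypothetical sketch: these types are illustrative, not DirectMemory classes.

/** Minimal view of a buffer that a strategy can act on (assumed for this sketch). */
interface ManagedBuffer {
    void clear();            // drop every entry in one logical step
    void compactIntoSpare(); // copy live entries into an empty spare buffer
}

/** Pluggable reaction to a buffer that has been judged too fragmented. */
interface DefragmentationStrategy {
    void defragment(ManagedBuffer buffer);
}

/** Two built-in choices that a configuration value could select by name. */
enum BuiltInStrategy implements DefragmentationStrategy {
    AGGRESSIVE {
        @Override
        public void defragment(ManagedBuffer buffer) {
            buffer.clear(); // cheapest option, loses all cached entries
        }
    },
    COPYING {
        @Override
        public void defragment(ManagedBuffer buffer) {
            buffer.compactIntoSpare(); // slower, keeps live entries
        }
    };
}
```

A configuration string could then be resolved with
BuiltInStrategy.valueOf("AGGRESSIVE"), with AGGRESSIVE as the default.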

On Sunday, October 16, 2011, Ashish <pa...@gmail.com> wrote:
> On Sun, Oct 16, 2011 at 5:57 PM, Daniel Manzke
> <da...@googlemail.com> wrote:
>> How about the way Hadoop works? Make the size of the slices
>> configurable and assume that the values fit into one. Or small ones go
>> into one slice together, but those are marked as droppable :)
>>
>> Bye,
>> Daniel
>
> I think this is what memcached does. This would be important, as for a
> small number of entries people might not use off-heap at all. They would go
> for it to store millions of entries without having to worry about GC,
> and that's where our Memory Manager implementation would matter a lot.
>
> We can also take ideas from HFile, the way HBase stores key-value
> pairs there. The important distinction would be to give our
> implementation a Map view, as we won't be using scans to retrieve
> data, and may never store keys in lexicographical order.
>
> cheers
> ashish
>

Re: Will OffHeapMemoryBuffer get fragmented over time?

Posted by Ashish <pa...@gmail.com>.
On Sun, Oct 16, 2011 at 5:57 PM, Daniel Manzke
<da...@googlemail.com> wrote:
> How about the way Hadoop works? Make the size of the slices
> configurable and assume that the values fit into one. Or small ones go
> into one slice together, but those are marked as droppable :)
>
> Bye,
> Daniel

I think this is what memcached does. This would be important, as for a
small number of entries people might not use off-heap at all. They would go
for it to store millions of entries without having to worry about GC,
and that's where our Memory Manager implementation would matter a lot.

We can also take ideas from HFile, the way HBase stores key-value
pairs there. The important distinction would be to give our
implementation a Map view, as we won't be using scans to retrieve
data, and may never store keys in lexicographical order.
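
A rough sketch of such a Map view in Java: an on-heap hash index from key to
(offset, length), with the values themselves in a direct ByteBuffer. The
class name OffHeapMapView and its methods are hypothetical, and the
append-only allocation is an assumption made to keep the example short:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: OffHeapMapView is an illustrative name, not a DirectMemory class.
final class OffHeapMapView {
    private final ByteBuffer buffer;                          // values live off-heap
    private final Map<String, int[]> index = new HashMap<>(); // key -> {offset, length}
    private int nextFree = 0;                                 // append-only write position

    OffHeapMapView(int capacity) {
        this.buffer = ByteBuffer.allocateDirect(capacity);
    }

    /** Appends the value off-heap and records where it is; false when it does not fit. */
    boolean put(String key, byte[] value) {
        if (value.length > buffer.capacity() - nextFree) {
            return false;
        }
        ByteBuffer view = buffer.duplicate(); // independent position, shared content
        view.position(nextFree);
        view.put(value);
        index.put(key, new int[] {nextFree, value.length});
        nextFree += value.length;
        return true;
    }

    /** Hash lookup by key: no scan and no key ordering involved; null when absent. */
    byte[] get(String key) {
        int[] location = index.get(key);
        if (location == null) {
            return null;
        }
        byte[] out = new byte[location[1]];
        ByteBuffer view = buffer.duplicate();
        view.position(location[0]);
        view.get(out);
        return out;
    }

    /** Forgets the key; its bytes stay behind as a hole until the buffer is cleared or compacted. */
    boolean remove(String key) {
        return index.remove(key) != null;
    }
}
```

Only the small index entries stay on the heap here, so the GC never has to
walk the stored values.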

cheers
ashish

Re: Will OffHeapMemoryBuffer get fragmented over time?

Posted by Daniel Manzke <da...@googlemail.com>.
How about the way Hadoop works? Make the size of the slices
configurable and assume that the values fit into one. Or small ones go
into one slice together, but those are marked as droppable :)
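
A minimal sketch of the fixed-size slice idea in Java. SliceAllocator and its
methods are hypothetical names, and the sketch assumes one value per slice
(it does not cover packing several small values into one slice):

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: SliceAllocator is an illustrative name, not a DirectMemory or Hadoop class.
final class SliceAllocator {
    private final ByteBuffer buffer;
    private final int sliceSize; // the configurable slice size
    private final Deque<Integer> freeSlices = new ArrayDeque<>();

    SliceAllocator(int sliceSize, int sliceCount) {
        this.sliceSize = sliceSize;
        this.buffer = ByteBuffer.allocateDirect(sliceSize * sliceCount);
        for (int i = 0; i < sliceCount; i++) {
            freeSlices.add(i);
        }
    }

    /** Stores the value in a free slice; returns the slice index, or -1 if it is too big or no slice is free. */
    int put(byte[] value) {
        if (value.length > sliceSize || freeSlices.isEmpty()) {
            return -1;
        }
        int slice = freeSlices.poll();
        ByteBuffer view = buffer.duplicate(); // independent position, shared content
        view.position(slice * sliceSize);
        view.put(value);
        return slice;
    }

    /** Returns a slice to the free list (the caller must not free the same slice twice). */
    void free(int slice) {
        freeSlices.push(slice);
    }

    int freeSliceCount() {
        return freeSlices.size();
    }
}
```

Because every slice has the same size, any freed slice can hold any later
value that fits, so holes of unusable size cannot build up. The price is the
unused tail of each slice when a value is smaller than the slice.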

Bye,
Daniel

On Sunday, 16 October 2011, Ashish <pa...@gmail.com> wrote:
> On Sun, Oct 16, 2011 at 2:50 PM, Raffaele P. Guidi
> <ra...@gmail.com> wrote:
>> It will, definitely. I had two solutions ready in my mind (that rely on
>> having more than one buffer active):
>>
>>   1. *Simplest and fastest* but with *some drawbacks*: when
>>   buffer.isTooFragmented() then simply buffer.clear() - you lose
>>   everything, but - hey, it's a cache, not a db
>
> IMHO, we can't assume that since it's a cache, we can clear any buffer
> at will.
> Cache entries must always be evicted based on what is configured by the
> user. Also, we need a very efficient way of
> finding out when a buffer is too fragmented.
>
> Take a use case: we put an entry a few hundred KB in size, and a lot
> of entries which are a few KB in size.
> So how would the implementation work in this scenario? I am just
> thinking out loud; we may already have this working, and I may not be
> aware of it :)
>
>>   2. *Less simple, slower, fewer drawbacks*: when
>>   buffer.isTooFragmented() mark the buffer as readOnly and then foreach (ptr
>>   in buffer) copy ptr.content into emptyBuffer and update ptr accordingly
>>
>> where *isTooFragmented == (number_of_empty_pointers / total_pointers) >
>> desirable_quota*
>>
>> The first one could be accomplished during a put() operation (buffer.clear
>> is a logical operation that takes almost no time) while the second should
>> be taken care of by the background thread. Those quick & dirty solutions
>> could of course be replaced with real defragmentation algorithms - maybe
>> taken from various malloc() implementations, which are the original
>> inspiration:
>> http://en.wikipedia.org/wiki/Malloc#Implementations
>
> Let's experiment with different strategies and see which works best. I
> have yet to take a deep dive into the Pointer implementation :)
>
>>
>> Besides that: I think this was filed in the GitHub issue tracker as an
>> enhancement together with some more - I think I should re-file them in
>> JIRA.
>
> Missed that; actually, this came up when I was implementing the sample.
>
>>
>> Ciao,
>>    R
>>
>> On Sun, Oct 16, 2011 at 9:31 AM, Ashish <pa...@gmail.com> wrote:
>>
>>> Folks,
>>>
>>> Will the offHeapMemoryBuffer get fragmented over time? Say, after a
>>> couple of thousand get/remove operations, will the off-heap buffer start
>>> having holes in it?
>>>
>>> --
>>> thanks
>>> ashish
>>>
>>
>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>

-- 
Best Regards

Daniel Manzke

Re: Will OffHeapMemoryBuffer get fragmented over time?

Posted by Ashish <pa...@gmail.com>.
On Sun, Oct 16, 2011 at 2:50 PM, Raffaele P. Guidi
<ra...@gmail.com> wrote:
> It will, definitely. I had two solutions ready in my mind (that rely on
> having more than one buffer active):
>
>   1. *Simplest and fastest* but with *some drawbacks*: when
>   buffer.isTooFragmented() then simply buffer.clear() - you lose
>   everything, but - hey, it's a cache, not a db

IMHO, we can't assume that since it's a cache, we can clear any buffer
at will.
Cache entries must always be evicted based on what is configured by the
user. Also, we need a very efficient way of
finding out when a buffer is too fragmented.

Take a use case: we put an entry a few hundred KB in size, and a lot
of entries which are a few KB in size.
So how would the implementation work in this scenario? I am just
thinking out loud; we may already have this working, and I may not be
aware of it :)
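
To make the mixed-size scenario concrete, here is a small Java sketch of how
holes appear in a single buffer. HoleDemo is a hypothetical first-fit
free-list written only for illustration; it is an assumption, not how
DirectMemory tracks free space, and it deliberately does not merge
neighbouring holes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: HoleDemo only models free space, it stores no data.
final class HoleDemo {
    /** Free regions as {start, length} pairs, kept in address order. */
    private final List<int[]> holes = new ArrayList<>();

    HoleDemo(int capacity) {
        holes.add(new int[] {0, capacity});
    }

    /** First fit: returns the start offset, or -1 when no single hole is large enough. */
    int allocate(int size) {
        for (int i = 0; i < holes.size(); i++) {
            int[] hole = holes.get(i);
            if (hole[1] >= size) {
                int start = hole[0];
                hole[0] += size;
                hole[1] -= size;
                if (hole[1] == 0) {
                    holes.remove(i); // hole fully consumed
                }
                return start;
            }
        }
        return -1;
    }

    /** Returns a region to the free list, in address order, without merging neighbours. */
    void free(int start, int size) {
        int i = 0;
        while (i < holes.size() && holes.get(i)[0] < start) {
            i++;
        }
        holes.add(i, new int[] {start, size});
    }

    /** Sum of all hole lengths, whether or not they are contiguous. */
    int totalFree() {
        int sum = 0;
        for (int[] hole : holes) {
            sum += hole[1];
        }
        return sum;
    }
}
```

Fill a 64-byte HoleDemo with eight 8-byte entries and free every other one:
totalFree() reports 32, yet allocate(16) fails because no single hole is
larger than 8 bytes. The same effect at scale is what a large entry would run
into after many small entries have come and gone.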

>   2. *Less simple, slower, fewer drawbacks*: when
>   buffer.isTooFragmented() mark the buffer as readOnly and then foreach (ptr
>   in buffer) copy ptr.content into emptyBuffer and update ptr accordingly
>
> where *isTooFragmented == (number_of_empty_pointers / total_pointers) >
> desirable_quota*
>
> The first one could be accomplished during a put() operation (buffer.clear
> is a logical operation that takes almost no time) while the second should
> be taken care of by the background thread. Those quick & dirty solutions
> could of course be replaced with real defragmentation algorithms - maybe
> taken from various malloc() implementations, which are the original
> inspiration:
> http://en.wikipedia.org/wiki/Malloc#Implementations

Let's experiment with different strategies and see which works best. I
have yet to take a deep dive into the Pointer implementation :)

>
> Besides that: I think this was filed in the GitHub issue tracker as an
> enhancement together with some more - I think I should re-file them in JIRA.

Missed that; actually, this came up when I was implementing the sample.

>
> Ciao,
>    R
>
> On Sun, Oct 16, 2011 at 9:31 AM, Ashish <pa...@gmail.com> wrote:
>
>> Folks,
>>
>> Will the offHeapMemoryBuffer get fragmented over time? Say, after a
>> couple of thousand get/remove operations, will the off-heap buffer start
>> having holes in it?
>>
>> --
>> thanks
>> ashish
>>
>



-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal

Re: Will OffHeapMemoryBuffer get fragmented over time?

Posted by "Raffaele P. Guidi" <ra...@gmail.com>.
It will, definitely. I had two solutions ready in my mind (that rely on
having more than one buffer active):

   1. *Simplest and fastest* but with *some drawbacks*: when
   buffer.isTooFragmented() then simply buffer.clear() - you lose
   everything, but - hey, it's a cache, not a db
   2. *Less simple, slower, fewer drawbacks*: when
   buffer.isTooFragmented() mark the buffer as readOnly and then foreach (ptr
   in buffer) copy ptr.content into emptyBuffer and update ptr accordingly

where *isTooFragmented == (number_of_empty_pointers / total_pointers) >
desirable_quota*
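
A minimal Java sketch of the check above. FragmentationCheck and its
parameter names are hypothetical; only the ratio-against-quota rule comes
from the formula, and how the two pointer counts are maintained is left open:

```java
// Hypothetical sketch: FragmentationCheck is an illustrative name, not a DirectMemory class.
final class FragmentationCheck {
    private final double desirableQuota; // e.g. 0.5: at most half of the pointers may be empty

    FragmentationCheck(double desirableQuota) {
        if (desirableQuota < 0.0 || desirableQuota > 1.0) {
            throw new IllegalArgumentException("quota must be between 0 and 1");
        }
        this.desirableQuota = desirableQuota;
    }

    /** Share of empty (freed) pointers among all pointers; 0 when there are no pointers at all. */
    static double ratio(int emptyPointers, int totalPointers) {
        if (emptyPointers < 0 || emptyPointers > totalPointers) {
            throw new IllegalArgumentException("need 0 <= emptyPointers <= totalPointers");
        }
        return totalPointers == 0 ? 0.0 : (double) emptyPointers / totalPointers;
    }

    /** True when the share of empty pointers is strictly above the quota. */
    boolean isTooFragmented(int emptyPointers, int totalPointers) {
        return ratio(emptyPointers, totalPointers) > desirableQuota;
    }
}
```

If the buffer keeps both counters up to date on every put and remove, the
check is two integer reads and one division, which would be cheap enough to
run inside put().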

The first one could be accomplished during a put() operation (buffer.clear
is a logical operation that takes almost no time) while the second should
be taken care of by the background thread. Those quick & dirty solutions
could of course be replaced with real defragmentation algorithms - maybe
taken from various malloc() implementations, which are the original
inspiration:
http://en.wikipedia.org/wiki/Malloc#Implementations
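
A sketch of the second, copying option using java.nio.ByteBuffer. The Ptr
class and the compact() method are hypothetical stand-ins for the real
pointer type, and the sketch assumes that no writer touches the source buffer
while the copy runs:

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical sketch: CompactingCopy and Ptr are illustrative names, not DirectMemory classes.
final class CompactingCopy {
    /** A live entry: where it starts in its buffer and how many bytes it occupies. */
    static final class Ptr {
        int start;
        final int length;

        Ptr(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Copies every live entry from source into target back to back and rewrites
     * each pointer to its new offset. Returns the number of bytes used in target.
     */
    static int compact(ByteBuffer source, List<Ptr> livePointers, ByteBuffer target) {
        // Read-only view of the fragmented buffer; this protects against writes
        // through this view only, not through other references to the source.
        ByteBuffer readOnly = source.asReadOnlyBuffer();
        target.clear(); // position 0, limit = capacity; old content is simply overwritten
        for (Ptr ptr : livePointers) {
            ByteBuffer entry = readOnly.duplicate();
            entry.limit(ptr.start + ptr.length);
            entry.position(ptr.start);
            int newStart = target.position();
            target.put(entry);    // bulk copy of this entry's bytes
            ptr.start = newStart; // update the pointer to the new location
        }
        return target.position();
    }
}
```

After the copy, the source buffer holds nothing that is still referenced and
can be cleared and kept as the next empty buffer.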

Besides that: I think this was filed in the GitHub issue tracker as an
enhancement together with some more - I think I should re-file them in JIRA.

Ciao,
    R

On Sun, Oct 16, 2011 at 9:31 AM, Ashish <pa...@gmail.com> wrote:

> Folks,
>
> Will the offHeapMemoryBuffer get fragmented over time? Say, after a
> couple of thousand get/remove operations, will the off-heap buffer start
> having holes in it?
>
> --
> thanks
> ashish
>