Posted to dev@hbase.apache.org by Bryan Duxbury <br...@rapleaf.com> on 2008/04/01 05:22:59 UTC

Serving from memory

Quick poll for us devs - if you had to guess, how long do you think  
it would take for the in-memory option of HBase to actually be  
implemented to work reliably?

-Bryan

Re: Serving from memory

Posted by Bryan Duxbury <br...@rapleaf.com>.
Seems about right.

On Apr 1, 2008, at 9:41 AM, stack wrote:

> Chatting on list, it was thought that an InMemoryMapFile would not be
> too hard to do: a couple of days, maybe.  In HBaseMapFile, we could
> read all of the data into memory, into arrays (since it's already
> sorted), as we do now when reading in the index.  The
> HBaseMapFile.Reader would be modified to get entries from memory
> rather than from disk.
>
> It would be ugly, since flags would have to pass down through
> multiple levels of inheritance (BloomFilterMapFile, HalfMapFile),
> but that should get cleaned up when we do our own MapFile.
>
> But it was thought that, without more intelligent rebalancing of
> regions across the cluster so they are evenly distributed, our
> current lumpy assignment would make it easy for a regionserver to
> have its memory overstrained: it'd go down with an OOME, its regions
> would be redistributed just as unevenly, another server would go
> down, and then a downward spiral.  It was thought this had to be
> addressed first.
>
> (Is that a fair summary, Bryan?)
>
> St.Ack
>
>
> Bryan Duxbury wrote:
>> I'm not thinking of hot cell caching. I'm talking about going the  
>> whole way and putting all the data in-memory. So yes, store file  
>> contents would be loaded into memory, though not the memcache,  
>> because that would get really complicated, I think.  
>> InMemoryStoreFile would really be what I was going for, I'd guess.  
>> The table wouldn't be read-only. Writes would go through to disk  
>> but reads would come straight from memory.
>>
>> On Mar 31, 2008, at 8:42 PM, stack wrote:
>>
>>> A Reference-cache of hot cells would take a day at the outside  
>>> I'd guess.  The bulk of the work is done.
>>>
>>> If you're talking about something else, let's discuss.  What would  
>>> it look like?  Store MapFiles would be floated in memory or  
>>> copied to MemCache?  We'd need a special In-Memory MapFile?  We'd  
>>> do a bulk memcopy from HDFS up into mem and then you'd serve from  
>>> there?  Would the table have to be read-only?
>>>
>>> St.Ack
>>>
>>>
>>> Bryan Duxbury wrote:
>>>> Quick poll for us devs - if you had to guess, how long do you  
>>>> think it would take for the in-memory option of HBase to  
>>>> actually be implemented to work reliably?
>>>>
>>>> -Bryan
>>>
>>
>


Re: Serving from memory

Posted by stack <st...@duboce.net>.
Chatting on list, it was thought that an InMemoryMapFile would not be too 
hard to do: a couple of days, maybe.  In HBaseMapFile, we could read all 
of the data into memory, into arrays (since it's already sorted), as we do 
now when reading in the index.  The HBaseMapFile.Reader would be modified 
to get entries from memory rather than from disk.
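
To make the idea concrete, here is a minimal sketch of holding already-sorted entries in parallel arrays and looking them up with a binary search. It is not the HBaseMapFile.Reader code: the class name InMemorySortedEntries, the String keys, and the closestAtOrAfter method are all hypothetical, and it assumes keys are unique.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not an HBase or Hadoop class.
// Holds entries that were already sorted on disk in two parallel arrays.
final class InMemorySortedEntries {
    private final String[] keys;   // ascending, same order as in the file
    private final byte[][] values; // values[i] belongs to keys[i]

    // Assumes the caller passes entries in strictly ascending key order.
    InMemorySortedEntries(List<Map.Entry<String, byte[]>> sortedEntries) {
        int n = sortedEntries.size();
        keys = new String[n];
        values = new byte[n][];
        for (int i = 0; i < n; i++) {
            Map.Entry<String, byte[]> e = sortedEntries.get(i);
            if (i > 0 && keys[i - 1].compareTo(e.getKey()) >= 0) {
                throw new IllegalArgumentException("entries must be sorted by unique key");
            }
            keys[i] = e.getKey();
            values[i] = e.getValue();
        }
    }

    // Exact lookup served entirely from memory; null if the key is absent.
    byte[] get(String key) {
        int idx = Arrays.binarySearch(keys, key);
        return idx >= 0 ? values[idx] : null;
    }

    // First stored key that is >= the given key, or null if there is none.
    String closestAtOrAfter(String key) {
        int idx = Arrays.binarySearch(keys, key);
        if (idx < 0) {
            idx = -idx - 1; // binarySearch encodes the insertion point
        }
        return idx < keys.length ? keys[idx] : null;
    }
}
```

Because the data is sorted once at load time, no tree or hash structure is needed; the cost is that the whole file must fit in the heap.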

It would be ugly, since flags would have to pass down through multiple 
levels of inheritance (BloomFilterMapFile, HalfMapFile), but that should 
get cleaned up when we do our own MapFile.

But it was thought that, without more intelligent rebalancing of regions 
across the cluster so they are evenly distributed, our current lumpy 
assignment would make it easy for a regionserver to have its memory 
overstrained: it'd go down with an OOME, its regions would be 
redistributed just as unevenly, another server would go down, and then a 
downward spiral.  It was thought this had to be addressed first.
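
As an illustration of what a more even distribution could mean, here is a minimal sketch of a greedy largest-first assignment: regions are sorted by size and each is placed on the currently least-loaded server. This is a generic heuristic chosen for the example, not how HBase assigns regions; RegionBalancer, assign, and loads are hypothetical names, and region size in bytes is assumed to be a reasonable stand-in for memory use.

```java
import java.util.Arrays;

// Hypothetical sketch, not the HBase master's assignment logic.
final class RegionBalancer {

    // Returns assignment[i] = index of the server chosen for region i.
    static int[] assign(long[] regionBytes, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("need at least one server");
        }
        Integer[] order = new Integer[regionBytes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Place the biggest regions first so small ones can fill the gaps.
        Arrays.sort(order, (a, b) -> Long.compare(regionBytes[b], regionBytes[a]));

        long[] load = new long[serverCount];
        int[] assignment = new int[regionBytes.length];
        for (int region : order) {
            int target = 0;
            for (int s = 1; s < serverCount; s++) {
                if (load[s] < load[target]) {
                    target = s; // least-loaded server so far
                }
            }
            assignment[region] = target;
            load[target] += regionBytes[region];
        }
        return assignment;
    }

    // Total bytes per server for a given assignment.
    static long[] loads(long[] regionBytes, int[] assignment, int serverCount) {
        long[] load = new long[serverCount];
        for (int i = 0; i < regionBytes.length; i++) {
            load[assignment[i]] += regionBytes[i];
        }
        return load;
    }
}
```

Comparing the largest entry of loads(...) against each server's heap budget before loading anything into memory would be one way to notice the overstrain described above before it turns into an OOME.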

(Is that a fair summary, Bryan?)

St.Ack


Bryan Duxbury wrote:
> I'm not thinking of hot cell caching. I'm talking about going the 
> whole way and putting all the data in-memory. So yes, store file 
> contents would be loaded into memory, though not the memcache, because 
> that would get really complicated, I think. InMemoryStoreFile would 
> really be what I was going for, I'd guess. The table wouldn't be 
> read-only. Writes would go through to disk but reads would come 
> straight from memory.
>
> On Mar 31, 2008, at 8:42 PM, stack wrote:
>
>> A Reference-cache of hot cells would take a day at the outside I'd 
>> guess.  The bulk of the work is done.
>>
>> If you're talking about something else, let's discuss.  What would it 
>> look like?  Store MapFiles would be floated in memory or copied to 
>> MemCache?  We'd need a special In-Memory MapFile?  We'd do a bulk 
>> memcopy from HDFS up into mem and then you'd serve from there?  Would 
>> the table have to be read-only?
>>
>> St.Ack
>>
>>
>> Bryan Duxbury wrote:
>>> Quick poll for us devs - if you had to guess, how long do you think 
>>> it would take for the in-memory option of HBase to actually be 
>>> implemented to work reliably?
>>>
>>> -Bryan
>>
>


Re: Serving from memory

Posted by Bryan Duxbury <br...@rapleaf.com>.
I'm not thinking of hot cell caching. I'm talking about going the  
whole way and putting all the data in-memory. So yes, store file  
contents would be loaded into memory, though not the memcache,  
because that would get really complicated, I think. InMemoryStoreFile  
would really be what I was going for, I'd guess. The table wouldn't  
be read-only. Writes would go through to disk but reads would come  
straight from memory.
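
A minimal sketch of that read/write split, under heavy simplification: every write is appended to a file on disk before the in-memory copy is updated, and reads never touch the disk. WriteThroughStore is a hypothetical name, the tab-separated line format is invented for the example, and a local file stands in for HDFS.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch, not InMemoryStoreFile or any HBase class.
final class WriteThroughStore {
    private final Path file;
    private final ConcurrentSkipListMap<String, String> inMemory =
            new ConcurrentSkipListMap<>();

    // Loads the whole file into memory up front; later lines win.
    WriteThroughStore(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                if (tab > 0) {
                    inMemory.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
        }
    }

    // Write goes through to disk first, then becomes visible to readers.
    synchronized void put(String key, String value) throws IOException {
        if (key.isEmpty() || key.indexOf('\t') >= 0 || key.indexOf('\n') >= 0
                || value.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("key/value not representable in this format");
        }
        byte[] record = (key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8);
        Files.write(file, record, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        inMemory.put(key, value);
    }

    // Reads come straight from memory.
    String get(String key) {
        return inMemory.get(key);
    }
}
```

Writing to disk before updating the map means a failed write never leaves readers seeing data that was not persisted.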

On Mar 31, 2008, at 8:42 PM, stack wrote:

> A Reference-cache of hot cells would take a day at the outside I'd  
> guess.  The bulk of the work is done.
>
> If you're talking about something else, let's discuss.  What would  
> it look like?  Store MapFiles would be floated in memory or copied  
> to MemCache?  We'd need a special In-Memory MapFile?  We'd do a  
> bulk memcopy from HDFS up into mem and then you'd serve from  
> there?  Would the table have to be read-only?
>
> St.Ack
>
>
> Bryan Duxbury wrote:
>> Quick poll for us devs - if you had to guess, how long do you  
>> think it would take for the in-memory option of HBase to actually  
>> be implemented to work reliably?
>>
>> -Bryan
>


Re: Serving from memory

Posted by stack <st...@duboce.net>.
A Reference-cache of hot cells would take a day at the outside I'd 
guess.  The bulk of the work is done.

If you're talking about something else, let's discuss.  What would it 
look like?  Store MapFiles would be floated in memory or copied to 
MemCache?  We'd need a special In-Memory MapFile?  We'd do a bulk 
memcopy from HDFS up into mem and then you'd serve from there?  Would 
the table have to be read-only?

St.Ack


Bryan Duxbury wrote:
> Quick poll for us devs - if you had to guess, how long do you think it 
> would take for the in-memory option of HBase to actually be 
> implemented to work reliably?
>
> -Bryan