Posted to issues@hbase.apache.org by "Li Pi (JIRA)" <ji...@apache.org> on 2011/06/22 19:22:48 UTC

[jira] [Created] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Attach memcached as secondary block cache to regionserver
---------------------------------------------------------

                 Key: HBASE-4018
                 URL: https://issues.apache.org/jira/browse/HBASE-4018
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Li Pi
            Assignee: Li Pi


Currently, block caches are limited by heap size, which is limited by garbage collection times in Java.

We can get around this by using memcached w/JNI as a secondary block cache. This should be faster than the Linux file system's caching, and would quickly give us access to a high-quality, slab-allocated cache.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067925#comment-13067925 ] 

Li Pi commented on HBASE-4018:
------------------------------

I went straight to direct byte buffers instead. See HBASE-4027. Closing this for now.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054194#comment-13054194 ] 

Jason Rutherglen commented on HBASE-4018:
-----------------------------------------

I understand the problem you're trying to solve here a little better now, i.e., the block cache and the GC.  Perhaps JNA [1] can also be used for this use case; e.g., [2] enables direct creation and destruction of an array (unlike direct byte buffers, which don't allow 'direct' destruction).

1. https://github.com/twall/jna

2. https://github.com/twall/jna/blob/master/src/com/sun/jna/Memory.java



        

[jira] [Assigned] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Pi reassigned HBASE-4018:
----------------------------

    Assignee: Li Pi


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054156#comment-13054156 ] 

Jason Rutherglen commented on HBASE-4018:
-----------------------------------------

bq. fs cache will always be compressed

That's likely where the slowdown occurs.  I agree the values should be compressed; in many cases the CPU overhead dwarfs (or should) the extra RAM consumption from uncompressing into heap space.  Right now in HBase there's effectively a page fault when a block isn't in the cache, e.g., it then loads from disk or network and decompresses into RAM while [likely] also evicting existing pages/blocks.  That seems likely to be problematic.

CPU should be cheaper than RAM, especially for HBase, which logically should be IO bound.  This is also true of search, e.g., posting lists are compressed using vint or PFOR instead of laying all the ints out on disk.  Search then becomes CPU bound from iterating over multiple posting lists.  HBase is iterating over one effective "list", though the compression algorithm likely consumes far more CPU.  Perhaps it's easily offset with a less intensive comp algorithm.
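
As a rough illustration of the vint idea mentioned above, here is a minimal Java sketch of the common variable-length scheme that stores 7 payload bits per byte and uses the high bit as a "more bytes follow" flag. The class and method names (VInt, encode, decode) are hypothetical, and this is not taken from any search library's source; it only shows why small integers cost fewer bytes at the price of extra CPU work per value.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of 7-bit variable-length integer coding.
final class VInt {
    /** Encodes each int using 1 to 5 bytes, low-order 7 bits first. */
    static byte[] encode(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            // While more than 7 significant bits remain, emit 7 bits
            // and set the high bit to signal that another byte follows.
            while ((v & ~0x7F) != 0) {
                out.write((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out.write(v); // final byte, high bit clear
        }
        return out.toByteArray();
    }

    /** Decodes exactly {@code count} ints from {@code bytes}. */
    static int[] decode(byte[] bytes, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int v = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                v |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            values[i] = v;
        }
        return values;
    }
}
```

For example, the four values {3, 5, 200, 70000} encode to 7 bytes rather than the 16 bytes needed for four fixed-width ints, and every read pays for the shift-and-mask loop.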

bq. What if some user uses the node, runs a package manager to update things, or uses scp to get things off the server? the fs cache is likely to get screwed.

The fs cache becoming invalid in the examples given would be few and far between.  More worrisome is the block/page fault issue that I'm assuming can happen frequently at the moment.  I guess one could always set the block cache to be quite small, and make the block sizes on the small side as well, effectively shifting the problem back to the system IO cache.

I think we need to benchmark.  Also, running yet another process on an HBase node sounds scary.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053445#comment-13053445 ] 

Li Pi commented on HBASE-4018:
------------------------------

Memcached would run on the same server, hence JNI rather than TCP. I currently have it working over TCP, but that's slower.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053646#comment-13053646 ] 

Todd Lipcon commented on HBASE-4018:
------------------------------------

I don't imagine the Java client will support domain sockets, since they don't exist in Java.

I agree it's worth looking at all these options in parallel and doing some shootouts to at least understand the performance differences.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053418#comment-13053418 ] 

Jason Rutherglen commented on HBASE-4018:
-----------------------------------------

Does this mean a cache on another server?  

bq. This should be faster than the linux file system's caching

Why is that?


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053486#comment-13053486 ] 

Jonathan Gray commented on HBASE-4018:
--------------------------------------

bq. Optimal solution would be building a slab allocated block cache within java. Use reference counting for a zero copy solution. This is difficult to implement and debug though.

I'm working on this.  I think implementing both directions is worthwhile and we can run good comparisons (including against linux fs cache + local datanodes).

bq. It would seem best to move in the direction of local HDFS file access and allow plugging in the block cache as a point of comparison / legacy.

I think it's best to move in all directions and do comparisons.  I've already seen performance differences between fs cache and the actual hbase block cache.  There's also compressed vs. decompressed (fs cache will always be compressed).


        

[jira] [Assigned] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Pi reassigned HBASE-4018:
----------------------------

    Assignee:     (was: Li Pi)


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054172#comment-13054172 ] 

Jonathan Gray commented on HBASE-4018:
--------------------------------------

bq. in many cases the CPU overhead dwarfs (or should) the extra RAM consumption from uncompressing into heap space.

This is not necessarily the case.  Many applications see a 4-5X compression ratio, which means being able to increase your cache capacity by that much.  Some applications can also be CPU bound, or they might be IO bound, or they might actually be IO bound because they are RAM bound (can't fit working set in memory).  It's hard to generalize here, I think.

bq. Perhaps it's easily offset with a less intensive comp algorithm.

That's one of the major motivations for an hbase-specific "prefix" compression algorithm


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053998#comment-13053998 ] 

Li Pi commented on HBASE-4018:
------------------------------

Java has a JNI domain sockets library - though at that point you might as well go JNI -> Memcached directly. But my C-fu is weaker than expected, and this is taking me longer than it should.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053628#comment-13053628 ] 

Li Pi commented on HBASE-4018:
------------------------------

To add to that, the fs cache isn't the best thing to rely on. What if some user uses the node, runs a package manager to update things, or uses scp to get things off the server? the fs cache is likely to get screwed.

I found out that memcached supports domain sockets, so I'm now working on implementing this over domain sockets.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053454#comment-13053454 ] 

Jason Rutherglen commented on HBASE-4018:
-----------------------------------------

bq. Even still, making a copy out of in-process memory should be faster than linux fs caching.

Why's that?  As a point of reference, Lucene relies on the filesystem cache and uses Java's memory-mapping capability to deliver very fast results.  It would seem best to move in the direction of local HDFS file access and allow plugging in the block cache as a point of comparison / legacy.
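
To make the memory-map approach concrete, here is a minimal Java sketch that reads a byte range from a local file through a read-only mapping, so the data is served from the OS page cache rather than a Java-side cache. MappedRead and readRange are hypothetical names, and copying the range into a byte[] is only for demonstration; this is not how Lucene or HBase structure their readers.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: read part of a file via a memory mapping.
final class MappedRead {
    /**
     * Maps [offset, offset + length) of the file read-only and copies
     * it out. The range must lie within the file.
     */
    static byte[] readRange(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping is backed by the OS page cache; pages are
            // faulted in on first access and stay cached by the kernel.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] out = new byte[length];
            map.get(out);
            return out;
        }
    }
}
```

A real reader would typically keep the mapping open and read from it in place instead of copying, which is where most of the benefit comes from.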


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054180#comment-13054180 ] 

Jason Rutherglen commented on HBASE-4018:
-----------------------------------------

bq. Some applications can also be CPU bound

The main user of CPU with HBase should be [de]compression?  In just browsing the BigTable paper, they mention caching individual key-values for applications that require random reads.  If an application is more scan oriented, then the block cache makes sense for the duration of the scan of that block.  The paper also goes on to describe compression per-row vs. per-block.

bq. That's one of the major motivations for an hbase-specific "prefix" compression algorithm

However, that's only for keys, which is a separate discussion.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053446#comment-13053446 ] 

Jonathan Gray commented on HBASE-4018:
--------------------------------------

The perf gain over FS caching would be smaller when using short-circuited local reads.  But anything that bypasses the DataNode is great for random read perf.

Even still, making a copy out of in-process memory should be faster than linux fs caching.


        

[jira] [Commented] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053453#comment-13053453 ] 

Li Pi commented on HBASE-4018:
------------------------------

Optimal solution would be building a slab allocated block cache within java. Use reference counting for a zero copy solution. This is difficult to implement and debug though.

Memcached is already well debugged and optimized.
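
The reference-counting idea above could look roughly like the following Java sketch: the cache holds one reference to a block, each reader takes another and gets a read-only view of the same memory (no copy), and the memory is handed back only when the last reference is dropped. RefCountedBlock and its methods are hypothetical names, and the onFree callback stands in for whatever would return the slot to an allocator; this is an assumption-laden illustration, not code from HBase.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a reference-counted, zero-copy cache block.
final class RefCountedBlock {
    private final ByteBuffer data;
    private final Runnable onFree;
    // Starts at 1: the cache itself owns the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCountedBlock(ByteBuffer data, Runnable onFree) {
        this.data = data;
        this.onFree = onFree;
    }

    /** Takes a reference and returns a read-only view sharing the same memory. */
    ByteBuffer retain() {
        for (;;) {
            int n = refs.get();
            if (n == 0) {
                throw new IllegalStateException("block already freed");
            }
            if (refs.compareAndSet(n, n + 1)) {
                return data.asReadOnlyBuffer();
            }
        }
    }

    /** Drops one reference; the last release runs the free callback. */
    void release() {
        int n = refs.decrementAndGet();
        if (n == 0) {
            onFree.run();
        } else if (n < 0) {
            throw new IllegalStateException("released more often than retained");
        }
    }

    int refCount() {
        return refs.get();
    }
}
```

The difficulty mentioned above shows up in exactly this contract: every retain() must be paired with one release(), and a single missed or doubled call either leaks the slot or frees memory that a reader is still using.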


        

[jira] [Resolved] (HBASE-4018) Attach memcached as secondary block cache to regionserver

Posted by "Li Pi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Pi resolved HBASE-4018.
--------------------------

    Resolution: Not A Problem

Wrote a slab allocator in Java using direct ByteBuffers instead of using memcached. See HBASE-4027.
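
For readers unfamiliar with the approach, a slab allocator over direct byte buffers can be sketched as follows: one off-heap region is allocated up front and carved into fixed-size slots, which are handed out and recycled without creating garbage on the Java heap. DirectSlab and its methods are hypothetical names chosen for this illustration; the actual implementation is in HBASE-4027 and may differ substantially.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical sketch of a fixed-size slab over one direct buffer.
final class DirectSlab {
    private final int slotSize;
    private final ByteBuffer backing; // single off-heap allocation
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    DirectSlab(int slotSize, int slotCount) {
        this.slotSize = slotSize;
        this.backing = ByteBuffer.allocateDirect(Math.multiplyExact(slotSize, slotCount));
        for (int i = 0; i < slotCount; i++) {
            // Each slot is a slice: a view onto its own region of the
            // backing buffer, with no extra off-heap allocation.
            ByteBuffer view = backing.duplicate();
            view.position(i * slotSize);
            view.limit((i + 1) * slotSize);
            free.push(view.slice());
        }
    }

    /** Returns an empty slot, or null when the slab is exhausted. */
    synchronized ByteBuffer allocate() {
        ByteBuffer slot = free.poll();
        if (slot != null) {
            slot.clear(); // reset position and limit; contents are not zeroed
        }
        return slot;
    }

    /** Returns a slot to the free list so it can be reused. */
    synchronized void release(ByteBuffer slot) {
        if (!slot.isDirect() || slot.capacity() != slotSize) {
            throw new IllegalArgumentException("buffer does not belong to this slab");
        }
        free.push(slot);
    }

    synchronized int freeSlots() {
        return free.size();
    }
}
```

Because the backing memory lives outside the Java heap and is never reallocated, cache size is no longer tied to garbage collection pause times, which was the concern that motivated this issue.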
