Posted to issues@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2010/11/30 08:09:11 UTC

[jira] Created: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Add option to cache blocks on hfile write and evict blocks on hfile close
-------------------------------------------------------------------------

                 Key: HBASE-3287
                 URL: https://issues.apache.org/jira/browse/HBASE-3287
             Project: HBase
          Issue Type: Improvement
          Components: io, regionserver
    Affects Versions: 0.90.0
            Reporter: Jonathan Gray
            Assignee: Jonathan Gray
             Fix For: 0.92.0


This issue is about adding configuration options to add blocks to the block cache when files are written and to remove them from the block cache when files are closed.  For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity.

The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files.

The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed.
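
To make the two options concrete, here is a minimal Java sketch of the intended behavior. It is a toy model under stated assumptions, not the patch itself: ToyBlockCache, ToyFileWriter, ToyFileReader and the "<fileName>@<offset>" cache key are hypothetical names invented for illustration, and none of them is HBase's BlockCache or HFile API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy in-memory cache (hypothetical; not HBase's BlockCache). */
class ToyBlockCache {
    // Assumed key scheme for illustration: "<fileName>@<offset>".
    private final Map<String, byte[]> blocks = new LinkedHashMap<>();

    void cacheBlock(String fileName, long offset, byte[] data) {
        blocks.put(fileName + "@" + offset, data);
    }

    /** Evicts every block of the given file and returns how many were removed. */
    int evictBlocksOf(String fileName) {
        int before = blocks.size();
        blocks.keySet().removeIf(key -> key.startsWith(fileName + "@"));
        return before - blocks.size();
    }

    int size() {
        return blocks.size();
    }
}

/** Toy writer: optionally pre-caches each block as it is written. */
class ToyFileWriter {
    private final ToyBlockCache cache;
    private final String fileName;
    private final boolean cacheBlocksOnWrite;
    private long offset = 0;

    ToyFileWriter(ToyBlockCache cache, String fileName, boolean cacheBlocksOnWrite) {
        this.cache = cache;
        this.fileName = fileName;
        this.cacheBlocksOnWrite = cacheBlocksOnWrite;
    }

    void writeBlock(byte[] data) {
        // A real writer would also append the block to the file here.
        if (cacheBlocksOnWrite) {
            cache.cacheBlock(fileName, offset, data);
        }
        offset += data.length;
    }
}

/** Toy reader: optionally evicts the file's blocks when it is closed. */
class ToyFileReader {
    private final ToyBlockCache cache;
    private final String fileName;
    private final boolean evictBlocksOnClose;

    ToyFileReader(ToyBlockCache cache, String fileName, boolean evictBlocksOnClose) {
        this.cache = cache;
        this.fileName = fileName;
        this.evictBlocksOnClose = evictBlocksOnClose;
    }

    void close() {
        if (evictBlocksOnClose) {
            cache.evictBlocksOf(fileName);
        }
    }
}
```

In this model, enabling cache-on-write means a file produced by a flush or compaction is already warm in the cache when the first read arrives, and enabling evict-on-close means blocks of a file that has been closed stop occupying cache capacity.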

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray resolved HBASE-3287.
----------------------------------

      Resolution: Fixed
    Release Note: Introduces two new configuration parameters: hbase.rs.cacheblocksonwrite (default: false) which will pre-cache all blocks of a file into the block cache as it is written, and hbase.rs.evictblocksonclose (default: true) which will evict all blocks of a file from the block cache when a file is closed on a RS.
    Hadoop Flags: [Reviewed]

Committed to trunk.  I'm going to be using this in prod atop 0.90, but since it's really a new feature I've left it out of the 0.90 tree for now.

Filed HBASE-3288 to make this configurable at the family level (something I don't need yet but we should support eventually).



[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965309#action_12965309 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: stack@duboce.net

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2009
-----------------------------------------------------------

Ship it!


Looks good to me.  Some comments below.


branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java
<http://review.cloudera.org/r/1261/#comment6343>

    This looks like a useful addition.



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.cloudera.org/r/1261/#comment6346>

    Why the flush?



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.cloudera.org/r/1261/#comment6344>

    Does this create a new byte array?



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.cloudera.org/r/1261/#comment6345>

    I wonder if we have to have the full path here?  Anything less could cause clashes?  But a small optimization would be to strip the hbase.root at least?



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.cloudera.org/r/1261/#comment6347>

    Can you presize the BAOS?  What's the default?  4k?  If so, and our default block size is 64k, that'd be a bit of expensive array resizing going on?  Just guessing.



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.cloudera.org/r/1261/#comment6348>

    Surround with if debug?


- stack
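
On the BAOS question raised in the review above: java.io.ByteArrayOutputStream documents an initial capacity of 32 bytes for its no-argument constructor, so filling a 64 KB block from the default size forces the internal buffer to grow several times. The sketch below counts those growths with and without presizing. CapacityTrackingStream and BaosPresizeDemo are hypothetical helper names, the 64 KB figure is the default block size mentioned in the comment, and the 1 KB write size is an arbitrary choice for illustration.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper: counts how often the internal buffer is reallocated. */
class CapacityTrackingStream extends ByteArrayOutputStream {
    int growths = 0;

    CapacityTrackingStream() {
        super(); // documented default capacity: 32 bytes
    }

    CapacityTrackingStream(int initialCapacity) {
        super(initialCapacity);
    }

    @Override
    public synchronized void write(byte[] b, int off, int len) {
        int before = buf.length; // 'buf' is a protected field of the superclass
        super.write(b, off, len);
        if (buf.length != before) {
            growths++;
        }
    }

    int capacity() {
        return buf.length;
    }
}

class BaosPresizeDemo {
    static final int BLOCK_SIZE = 64 * 1024;

    /** Writes one block's worth of data in 1 KB chunks and reports buffer growths. */
    static int growthsFor(CapacityTrackingStream out) {
        byte[] chunk = new byte[1024];
        for (int written = 0; written < BLOCK_SIZE; written += chunk.length) {
            out.write(chunk, 0, chunk.length);
        }
        return out.growths;
    }

    public static void main(String[] args) {
        System.out.println("default-sized growths: " + growthsFor(new CapacityTrackingStream()));
        System.out.println("presized growths:      " + growthsFor(new CapacityTrackingStream(BLOCK_SIZE)));
    }
}
```

Each growth allocates a larger array and copies the existing bytes, so passing the expected block size to the constructor avoids that work when the final size is known in advance.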







[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965487#action_12965487 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: "Jonathan Gray" <jg...@apache.org>


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.  > branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765
bq.  > <http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765>
bq.  >
bq.  >     why would you not want to evict blocks from the cache on close?
bq.  
bq.  stack wrote:
bq.      I think this is a good point.  It's different behavior, but it's behavior we should have always had?  One less option too.
bq.  
bq.  Ryan Rawson wrote:
bq.      I'm still confused why we are adding config for something that we should always be doing.  While we'll never be zero conf, I am not seeing the reason why we'd want to keep things in the LRU.
bq.      
bq.      It would make more sense not to evict on a split, but evict every other time, since a split will probably reopen the same hfiles and need those blocks again.

I think it makes sense to have undocumented configuration parameters.  The default behavior is then "the way" but having a config option checked in the code at least gives the opportunity to turn something on/off without making a code change and redeploying completely.  In the unit test, I'm turning it on/off with the config parameter so I can verify it works as expected.

And although I've changed the default to true, I'm not convinced that it always makes sense in all cases.

Ryan came up with the example of the split, though that would override the config parameter.  But I think there could be other situations where you don't want to evict as well.

In any case, I want to keep it configurable so I can turn it on/off between test runs and see what, if any, difference these optimizations make and IMO there's very little cost associated with using conf.getBoolean("some.undocumented.thing", true) vs. a hard-coded true (if there's any possibility you might want to change the behavior).


- Jonathan
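
The pattern argued for above, reading a flag from configuration with a built-in default rather than hard-coding it, can be sketched as follows. Hadoop's Configuration class provides getBoolean(name, defaultValue); to keep this example self-contained, java.util.Properties stands in for it, and the CacheFlags class and its getBoolean helper are hypothetical. The property names and defaults are the ones given in this issue's release note.

```java
import java.util.Properties;

/** Hypothetical holder for the two flags; Properties stands in for Hadoop's Configuration. */
class CacheFlags {
    static final String CACHE_ON_WRITE_KEY = "hbase.rs.cacheblocksonwrite";
    static final String EVICT_ON_CLOSE_KEY = "hbase.rs.evictblocksonclose";

    final boolean cacheBlocksOnWrite;
    final boolean evictBlocksOnClose;

    CacheFlags(Properties conf) {
        // Defaults per the release note: cache-on-write off, evict-on-close on.
        this.cacheBlocksOnWrite = getBoolean(conf, CACHE_ON_WRITE_KEY, false);
        this.evictBlocksOnClose = getBoolean(conf, EVICT_ON_CLOSE_KEY, true);
    }

    /** Returns the default when the key is unset or is not "true"/"false". */
    static boolean getBoolean(Properties conf, String key, boolean defaultValue) {
        String raw = conf.getProperty(key);
        if (raw == null) {
            return defaultValue;
        }
        String value = raw.trim();
        if (value.equalsIgnoreCase("true")) {
            return true;
        }
        if (value.equalsIgnoreCase("false")) {
            return false;
        }
        return defaultValue;
    }
}
```

A test or an operator can then flip either behavior by setting the property, for example conf.setProperty("hbase.rs.evictblocksonclose", "false"), without a code change.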


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
-----------------------------------------------------------







[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965310#action_12965310 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: "Ryan Rawson" <ry...@gmail.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
-----------------------------------------------------------



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
<http://review.cloudera.org/r/1261/#comment6349>

    why would you not want to evict blocks from the cache on close?


- Ryan







[jira] [Commented] (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081432#comment-13081432 ] 

stack commented on HBASE-3287:
------------------------------

@Yi This issue looks like it's old enough for the patch herein to have made 0.90. Are you not finding it there?  (I believe issues were found with the way we cached blocks on write subsequent to the application of this patch: the wrong name was used when a file was added to the cache. In trunk it's been redone.)


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "Yi Liang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081372#comment-13081372 ] 

Yi Liang commented on HBASE-3287:
---------------------------------

Is there a patch for 0.90.3? 


[jira] [Commented] (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082703#comment-13082703 ] 

stack commented on HBASE-3287:
------------------------------

bq. So what should we do if we want to enable this option?

Someone (you?) should fix what's broken.


[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965455#action_12965455 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: "Ryan Rawson" <ry...@gmail.com>


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.  > branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765
bq.  > <http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765>
bq.  >
bq.  >     why would you not want to evict blocks from the cache on close?
bq.  
bq.  stack wrote:
bq.      I think this is a good point.  It's different behavior, but it's behavior we should have always had?  One less option too.

I'm still confused why we are adding config for something that we should always be doing.  While we'll never be zero conf, I am not seeing the reason why we'd want to keep things in the LRU.

It would make more sense not to evict on a split, but evict every other time, since a split will probably reopen the same hfiles and need those blocks again.


- Ryan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
-----------------------------------------------------------







[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965311#action_12965311 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: stack@duboce.net


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.  > branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765
bq.  > <http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765>
bq.  >
bq.  >     why would you not want to evict blocks from the cache on close?

I think this is a good point.  It's different behavior, but it's behavior we should have always had?  One less option too.


- stack


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
-----------------------------------------------------------







[jira] [Commented] (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "Yi Liang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082080#comment-13082080 ] 

Yi Liang commented on HBASE-3287:
---------------------------------

@stack You mean there's something wrong with this patch? So what should we do if we want to enable this option? Use the code from trunk? IMO a correct patch against 0.90 would be very useful until 0.92.0 is released, as this feature is so critical for online services.


[jira] Updated: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-3287:
---------------------------------

    Issue Type: New Feature  (was: Improvement)



[jira] Updated: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-3287:
---------------------------------

    Attachment: HBASE-3287-FINAL-trunk.patch

Final patch against trunk.  The change from the last review is that EvictOnClose is now on by default.  CacheOnWrite is still off by default.



[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965489#action_12965489 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: "Jonathan Gray" <jg...@apache.org>


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.  > branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765
bq.  > <http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765>
bq.  >
bq.  >     why would you not want to evict blocks from the cache on close?
bq.  
bq.  stack wrote:
bq.      I think this is a good point.  It's different behavior, but it's behavior we should have always had?  One less option too.
bq.  
bq.  Ryan Rawson wrote:
bq.      I'm still confused why we are adding config for something that we should always be doing.  While we'll never be zero conf, I am not seeing the reason why we'd want to keep things in the LRU.
bq.      
bq.      It would make more sense not to evict on a split, but evict every other time, since a split will probably reopen the same hfiles and need those blocks again.
bq.  
bq.  Jonathan Gray wrote:
bq.      I think it makes sense to have undocumented configuration parameters.  The default behavior is then "the way" but having a config option checked in the code at least gives the opportunity to turn something on/off without making a code change and redeploying completely.  In the unit test, I'm turning it on/off with the config parameter so I can verify it works as expected.
bq.      
bq.      And although I've changed the default to true, I'm not convinced that it always makes sense in all cases.
bq.      
bq.      Ryan came up with the example of the split, though that would override the config parameter.  But I think there could be other situations where you don't want to evict as well.
bq.      
bq.      In any case, I want to keep it configurable so I can turn it on/off between test runs and see what, if any, difference these optimizations make and IMO there's very little cost associated with using conf.getBoolean("some.undocumented.thing", true) vs. a hard-coded true (if there's any possibility you might want to change the behavior).

Filed HBASE-3289 to disable eviction on close of parent files during a split.  I looked at the code and it's a fairly significant change since we'll need to pass a boolean in to all of the close() methods (there are several levels of them).

Also, figuring out when we do want to evict these blocks (once both children have closed the file) is tricky.


- Jonathan
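
One possible way to approach the "once both children have closed the file" condition is reference counting: track how many readers have a file open and evict only when the last one closes. This is just a sketch of that idea under stated assumptions, not what HBASE-3289 does; SharedFileTracker and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker: counts open readers per file so eviction happens once. */
class SharedFileTracker {
    private final Map<String, Integer> openReaders = new HashMap<>();

    /** Records that one more reader has opened the file. */
    synchronized void open(String fileName) {
        openReaders.merge(fileName, 1, Integer::sum);
    }

    /**
     * Records that one reader has closed the file.
     * Returns true only for the last reader, i.e. when blocks may be evicted.
     */
    synchronized boolean closeAndShouldEvict(String fileName) {
        Integer count = openReaders.get(fileName);
        if (count == null) {
            throw new IllegalStateException("file is not open: " + fileName);
        }
        if (count == 1) {
            openReaders.remove(fileName);
            return true;
        }
        openReaders.put(fileName, count - 1);
        return false;
    }
}
```

With two daughter regions referencing the same parent file, the first close would return false and leave the blocks cached, and the second would return true. The trade-off is that every open and close path has to go through the tracker, which is the same plumbing cost noted above for passing a boolean through the close() methods.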


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
-----------------------------------------------------------





> Add option to cache blocks on hfile write and evict blocks on hfile close
> -------------------------------------------------------------------------
>
>                 Key: HBASE-3287
>                 URL: https://issues.apache.org/jira/browse/HBASE-3287
>             Project: HBase
>          Issue Type: New Feature
>          Components: io, regionserver
>    Affects Versions: 0.90.0
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3287-FINAL-trunk.patch
>
>
> This issue is about adding configuration options to add/remove from the block cache when creating/closing files.  For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity.
> The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files.
> The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed.



[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965134#action_12965134 ] 

HBase Review Board commented on HBASE-3287:
-------------------------------------------

Message from: "Jonathan Gray" <jg...@apache.org>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/
-----------------------------------------------------------

Review request for hbase, stack and khemani.


Summary
-------

This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity.

The first option, hbase.rs.cacheblocksonwrite, will make it so we pre-cache blocks as we are writing out new files.

The second option, hbase.rs.evictblocksonclose, will make it so we evict blocks when files are closed.
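
As a rough illustration of how the two options might be read, here is a minimal sketch that uses a plain Map as a stand-in for the HBase Configuration object. The class, field, and method names are hypothetical and not taken from the patch; only the two property names come from this issue, and the defaults (false and true) follow the release note on the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the two settings; not code from the patch.
final class BlockCacheSettingsSketch {
    static final String CACHE_ON_WRITE_KEY = "hbase.rs.cacheblocksonwrite";
    static final String EVICT_ON_CLOSE_KEY = "hbase.rs.evictblocksonclose";

    final boolean cacheBlocksOnWrite;
    final boolean evictBlocksOnClose;

    private BlockCacheSettingsSketch(boolean cacheOnWrite, boolean evictOnClose) {
        this.cacheBlocksOnWrite = cacheOnWrite;
        this.evictBlocksOnClose = evictOnClose;
    }

    // Map-based stand-in for conf.getBoolean(key, defaultValue).
    static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
        String raw = conf.get(key);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    // Defaults assumed from the release note: cache-on-write off, evict-on-close on.
    static BlockCacheSettingsSketch fromConf(Map<String, String> conf) {
        return new BlockCacheSettingsSketch(
            getBoolean(conf, CACHE_ON_WRITE_KEY, false),
            getBoolean(conf, EVICT_ON_CLOSE_KEY, true));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(CACHE_ON_WRITE_KEY, "true");  // opt in to pre-caching on write
        BlockCacheSettingsSketch settings = fromConf(conf);
        System.out.println("cacheBlocksOnWrite=" + settings.cacheBlocksOnWrite);
        System.out.println("evictBlocksOnClose=" + settings.evictBlocksOnClose);
    }
}
```

Reading both flags once and passing them around as booleans keeps them easy to toggle between test runs.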


This addresses bug HBASE-3287.
    http://issues.apache.org/jira/browse/HBASE-3287


Diffs
-----

  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1040422 

Diff: http://review.cloudera.org/r/1261/diff


Testing
-------

Added a unit test to TestStoreFile.  That passes.

Need to do perf testing on a cluster.


Thanks,

Jonathan



