Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2010/11/09 23:42:22 UTC
[jira] Created: (HBASE-3209) New Compaction Heuristic
New Compaction Heuristic
------------------------
Key: HBASE-3209
URL: https://issues.apache.org/jira/browse/HBASE-3209
Project: HBase
Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
We have a whole bunch of compaction awesome in our internal 0.89 branch. Porting this to 0.90:
1) don't unconditionally compact 4 files. have a min threshold
2) intelligently upgrade minors to majors
3) new compaction algo (derived in HBASE-2462)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3209) New Compaction Heuristic
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930376#action_12930376 ]
HBase Review Board commented on HBASE-3209:
-------------------------------------------
Message from: stack@duboce.net
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1192/#review1881
-----------------------------------------------------------
Ship it!
k... let me commit.
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.cloudera.org/r/1192/#comment6110>
Whitespace here -- I can fix on commit.
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.cloudera.org/r/1192/#comment6113>
Very cute
- stack
[jira] Commented: (HBASE-3209) New Compaction Heuristic
Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930332#action_12930332 ]
Nicolas Spiegelberg commented on HBASE-3209:
--------------------------------------------
Results from our cluster:
Multiput Latency: 25 ms => 3 ms avg
Sync Latency: 12 ms => 1.5 ms avg
Compaction Queue: 3 => 0.08 avg
Compaction Time: now 1-30 sec (note: our compaction time ods chart is off, looked manually at logs)
Read Latency: 6 ms => 9 ms
Files / Store: 2 => 2.6
Note that the minor read latency regression can be fixed by lowering compactionThreshold from 3 to 2. We just didn't need the improvement.
[jira] Resolved: (HBASE-3209) New Compaction Heuristic
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-3209.
--------------------------
Resolution: Fixed
Fix Version/s: 0.90.0
Hadoop Flags: [Reviewed]
Thanks for the patch Nicolas. Committed.
[jira] Commented: (HBASE-3209) New Compaction Heuristic
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930352#action_12930352 ]
HBase Review Board commented on HBASE-3209:
-------------------------------------------
Message from: "Nicolas" <ns...@facebook.com>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1192/#review1878
-----------------------------------------------------------
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.cloudera.org/r/1192/#comment6095>
I know. I'm a whitespace ASBO. sorry :(
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.cloudera.org/r/1192/#comment6092>
we can work on fine-tuning this. +50% was a safe cover for us and worked fine because we pre-split regions. Really, you probably don't need the pad because the summation algorithm will operate on at least 3 storefiles, so (assuming roughly even flush sizes) you get +100% pad from that.
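One reading of the pad remark above can be checked with a few lines of arithmetic. The sketch below is in Python rather than the Java of Store.java, and it assumes that every store file is exactly one flush in size and that the oldest file is compared against the combined size of the other files in the selection. The function names and the 64 MB flush size are hypothetical, chosen only for illustration.

```python
def explicit_pad_percent(pad_factor: float = 1.5) -> float:
    """Headroom from a fixed multiplier on the flush size (1.5 means +50%)."""
    return (pad_factor - 1.0) * 100.0


def implicit_pad_percent(num_files: int, flush_size: float = 64.0) -> float:
    """Headroom that the summation itself provides.

    With num_files equally sized files, the oldest file (one flush) is
    compared against the sum of the remaining num_files - 1 files.
    """
    others = (num_files - 1) * flush_size
    return (others - flush_size) / flush_size * 100.0
```

Under these assumptions `explicit_pad_percent()` gives 50.0 and `implicit_pad_percent(3)` gives 100.0, so the three-file minimum alone already supplies more headroom than the explicit pad.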
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.cloudera.org/r/1192/#comment6091>
s/are meet/have met/
- Nicolas
[jira] Commented: (HBASE-3209) New Compaction Heuristic
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930347#action_12930347 ]
HBase Review Board commented on HBASE-3209:
-------------------------------------------
Message from: "Nicolas" <ns...@facebook.com>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1192/
-----------------------------------------------------------
Review request for hbase.
Summary
-------
We have a whole bunch of compaction awesome in our internal 0.89 branch. Porting this to 0.90:
1) don't unconditionally compact 4 files. have a min threshold
2) intelligently upgrade minors to majors
3) new compaction algo (derived in HBASE-2462)
This addresses bug HBASE-3209.
http://issues.apache.org/jira/browse/HBASE-3209
Diffs
-----
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1033278
Diff: http://review.cloudera.org/r/1192/diff
Testing
-------
Has been running on our primary cluster for the past couple weeks.
Thanks,
Nicolas
[jira] Commented: (HBASE-3209) New Compaction Heuristic
Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930363#action_12930363 ]
Nicolas Spiegelberg commented on HBASE-3209:
--------------------------------------------
St^Ack_: nspiegelberg: what config. would I set so it favored less files and kept the old read performance?
[3:36pm] nspiegelberg: you have 2 options
[3:37pm] nspiegelberg: 1) set compactionThreshold == 2
[3:38pm] nspiegelberg: 2) make minCompactSize configurable and set it high
[3:39pm] nspiegelberg: basically, before this algo, we would unconditionally compact 4 files, but the compactionThreshold == 3
[3:40pm] nspiegelberg: this means that we would never use the compaction algorithm unless our cluster was stressed out
[3:41pm] jdcryans: it used to not be like that tho
[3:42pm] jdcryans: it's a hack that we compact everything
[3:42pm] nspiegelberg: the only downside to the current algorithm is that sum(storefiles) doesn't take dedupe into account, which can have a snowball effect of compacting too aggressively during load. this can be mitigated by lowering hbase.hstore.compaction.max
[3:43pm] nspiegelberg: in reality, this hasn't proved to be an issue for us. lowering the max compact files will fix it. we can also add on some simple dedupe heuristics to fix this issue
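The knobs mentioned in the chat above can be illustrated with a small model. This is not the code from the patch, which is Java in Store.java. It is a Python sketch that assumes a size-ratio selection rule: older files are skipped while they are larger than a multiple of the combined size of the newer files behind them. The function name, the parameter names, and the ratio of 1.2 are assumptions made for illustration. Only compactionThreshold, minCompactSize, and hbase.hstore.compaction.max are named in the thread.

```python
from typing import List


def select_for_compaction(
    sizes: List[int],           # store file sizes, oldest first (any unit)
    min_files: int = 3,         # stands in for compactionThreshold
    max_files: int = 10,        # stands in for hbase.hstore.compaction.max
    min_compact_size: int = 0,  # stands in for minCompactSize
    ratio: float = 1.2,         # hypothetical size ratio, not from the thread
) -> List[int]:
    """Return the indices of the files to compact, or [] if none qualify."""
    start = 0
    # Skip old files that dwarf the newer files queued behind them.
    while len(sizes) - start >= min_files:
        # Newer files that would share a compaction with sizes[start].
        window = sizes[start + 1 : start + max_files]
        if sizes[start] <= max(min_compact_size, ratio * sum(window)):
            break
        start += 1
    # Never take more than max_files in one compaction.
    end = min(len(sizes), start + max_files)
    if end - start < min_files:
        return []  # too few files left: no compaction this round
    return list(range(start, end))
```

With the defaults, `select_for_compaction([1000, 100, 50, 40, 30])` skips the large oldest file and selects the four newer ones. Passing `min_compact_size=2000` pulls the oldest file in as well, and `select_for_compaction([64, 64], min_files=2)` compacts a two-file store that `min_files=3` would leave alone. Lowering `max_files` caps how many files a single compaction can take, which is the mitigation suggested for the snowball effect.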