Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/01/17 23:09:43 UTC
[jira] Created: (HBASE-3450) Per-CF Flushes
Per-CF Flushes
--------------
Key: HBASE-3450
URL: https://issues.apache.org/jira/browse/HBASE-3450
Project: HBase
Issue Type: Improvement
Components: regionserver, wal
Affects Versions: 0.90.1, 0.92.0
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
When load is not evenly distributed across the column families in a region, per-column-family flushes will reduce network IO by helping the compaction algorithm minimize its need for unconditional selection. This issue is about refactoring the flush algorithm to move from HRegion granularity to Store granularity.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3450) Per-CF Flushes
Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982895#action_12982895 ]
Nicolas Spiegelberg commented on HBASE-3450:
--------------------------------------------
Some interesting stats. We did some rough calculations internally to see what effect an uneven distribution of data into column families was having on our network IO. Our data distribution for 3 column families was 1:1:20. When we looked at the flush:minor-compaction ratio for each of the store files, the large column family had a 1:2 ratio but the small CFs both had a 1:20 ratio! We are looking at roughly a 10% network IO decrease if we can bring those other 2 CFs down to a 1:2 ratio as well.
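To illustrate why the small column families end up with so many tiny flush files, here is a minimal sketch of the two flush policies under the 1:1:20 distribution described above. It is a toy model, not HBase code: the function name `simulate`, the threshold of 220 units, and the round count are all hypothetical, and compaction itself is not modelled. It only counts how many files each policy flushes per column family and how large they are.

```python
def simulate(weights, rounds, threshold, per_cf):
    """Return, for each column family, the list of flushed file sizes.

    weights   -- units written to each CF per round (hypothetical, e.g. 1:1:20)
    threshold -- flush trigger in the same units (hypothetical value)
    per_cf    -- True: flush a CF when its own memstore hits the threshold
                 False: flush every CF when the region total hits the threshold
    """
    memstore = [0] * len(weights)
    files = [[] for _ in weights]
    for _ in range(rounds):
        # Each round, every CF receives data in proportion to its weight.
        for cf, w in enumerate(weights):
            memstore[cf] += w
        if per_cf:
            # Store-granularity flush: each CF is judged on its own size.
            for cf in range(len(weights)):
                if memstore[cf] >= threshold:
                    files[cf].append(memstore[cf])
                    memstore[cf] = 0
        elif sum(memstore) >= threshold:
            # Region-granularity flush: all CFs are flushed together,
            # however little data the small ones hold.
            for cf in range(len(weights)):
                if memstore[cf] > 0:
                    files[cf].append(memstore[cf])
                    memstore[cf] = 0
    return files


weights = [1, 1, 20]
region_level = simulate(weights, rounds=2200, threshold=220, per_cf=False)
per_store = simulate(weights, rounds=2200, threshold=220, per_cf=True)

for name, result in (("region-level", region_level), ("per-CF", per_store)):
    counts = [len(f) for f in result]
    sizes = [f[0] for f in result]
    print(name, "file counts:", counts, "file sizes:", sizes)
```

In this model the region-level policy makes the two small CFs flush as often as the large one, producing files one twentieth the size, while the per-CF policy flushes them far less often at full size. Fewer, larger files for the small CFs is what would give the compaction algorithm less to rewrite.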
[jira] Resolved: (HBASE-3450) Per-CF Flushes
Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicolas Spiegelberg resolved HBASE-3450.
----------------------------------------
Resolution: Duplicate
Duplicate of HBASE-3149
[jira] Commented: (HBASE-3450) Per-CF Flushes
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983303#action_12983303 ]
Jean-Daniel Cryans commented on HBASE-3450:
-------------------------------------------
Should we close HBASE-3149 then?
[jira] Commented: (HBASE-3450) Per-CF Flushes
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983029#action_12983029 ]
stack commented on HBASE-3450:
------------------------------
Great stuff Nicolas. Bring it on.