Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/01/17 23:09:43 UTC

[jira] Created: (HBASE-3450) Per-CF Flushes

Per-CF Flushes
--------------

                 Key: HBASE-3450
                 URL: https://issues.apache.org/jira/browse/HBASE-3450
             Project: HBase
          Issue Type: Improvement
          Components: regionserver, wal
    Affects Versions: 0.90.1, 0.92.0
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg
            Priority: Minor


When load is not evenly distributed across the column families in a region, per-column-family flushes will reduce network IO by helping the compaction algorithm minimize its need for unconditional selection.  This issue is about refactoring the flush algorithm to move from HRegion granularity to Store granularity.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-3450) Per-CF Flushes

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982895#action_12982895 ] 

Nicolas Spiegelberg commented on HBASE-3450:
--------------------------------------------

Some interesting stats.  We did some rough internal calculations to see what effect an uneven distribution of data across column families was having on our network IO.  Our data distribution across 3 column families was 1:1:20.  When we looked at the flush:minor-compaction ratio for the store files of each column family, the large column family had a 1:2 ratio, but the two small CFs both had a 1:20 ratio!  We are looking at roughly a 10% decrease in network IO if we can bring those other 2 CFs down to a 1:2 ratio as well.
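
As a rough illustration of why the small CFs fare so badly, here is a minimal toy simulation of a 1:1:20 write ratio under region-granularity versus store-granularity flushing. It is a sketch under stated assumptions, not HBase code: the class name, the simulate method, the threshold of 64 units, and the tick-based write model are all hypothetical, and compaction, the WAL, and global memory pressure are not modeled at all.

```java
import java.util.Arrays;

// Toy model of memstore flushing. This is NOT HBase code: every name and
// number below is a hypothetical stand-in chosen for illustration.
class FlushGranularitySketch {

    // Assumed flush threshold, in arbitrary size units.
    static final long FLUSH_THRESHOLD = 64;

    /**
     * Simulates writes to several column families and returns, per family,
     * {number of flushed files, total units flushed}.
     *
     * perStore == false: flush every memstore once their sum reaches the threshold.
     * perStore == true:  flush a memstore only when it alone reaches the threshold.
     */
    static long[][] simulate(long[] writesPerTick, int ticks, boolean perStore) {
        int n = writesPerTick.length;
        long[] memstore = new long[n];
        long[][] stats = new long[n][2];
        for (int t = 0; t < ticks; t++) {
            // Apply this tick's writes and total up the region's memstores.
            long regionTotal = 0;
            for (int i = 0; i < n; i++) {
                memstore[i] += writesPerTick[i];
                regionTotal += memstore[i];
            }
            // Decide, per family, whether its memstore is flushed this tick.
            for (int i = 0; i < n; i++) {
                boolean flush = perStore
                        ? memstore[i] >= FLUSH_THRESHOLD
                        : regionTotal >= FLUSH_THRESHOLD;
                if (flush && memstore[i] > 0) {
                    stats[i][0]++;              // one more flushed file
                    stats[i][1] += memstore[i]; // units written by this flush
                    memstore[i] = 0;
                }
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        long[] rates = {1, 1, 20}; // the 1:1:20 distribution quoted above
        System.out.println("region-level: "
                + Arrays.deepToString(simulate(rates, 192, false)));
        System.out.println("per-store:    "
                + Arrays.deepToString(simulate(rates, 192, true)));
    }
}
```

In this toy model, over 192 ticks the region-level policy writes 64 files of 3 units for each small CF, while the per-store policy writes 3 files of 64 units for the same data. The large CF goes from 64 files of 60 units to 48 files of 80 units. Many tiny files in the small CFs are the kind of input that would be expected to drive up their share of compaction work, although the model itself says nothing about the 1:20 or 10% figures above.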



[jira] Resolved: (HBASE-3450) Per-CF Flushes

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg resolved HBASE-3450.
----------------------------------------

    Resolution: Duplicate

Duplicate of HBASE-3149



[jira] Commented: (HBASE-3450) Per-CF Flushes

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983303#action_12983303 ] 

Jean-Daniel Cryans commented on HBASE-3450:
-------------------------------------------

Should we close HBASE-3149 then?



[jira] Commented: (HBASE-3450) Per-CF Flushes

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983029#action_12983029 ] 

stack commented on HBASE-3450:
------------------------------

Great stuff Nicolas.  Bring it on.
