Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2011/01/18 19:32:44 UTC

[jira] Commented: (HBASE-3149) Make flush decisions per column family

    [ https://issues.apache.org/jira/browse/HBASE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983312#action_12983312 ] 

Nicolas Spiegelberg commented on HBASE-3149:
--------------------------------------------

Some interesting stats. We did some rough calculations internally to see what effect an uneven distribution of data across column families was having on our network IO. Our data distribution across 3 column families was 1:1:20. When we looked at the flush:minor-compaction ratio for the store files of each family, the large column family had a 1:2 ratio, but the small CFs both had a 1:20 ratio! We are looking at roughly a 10% decrease in network IO if we can bring those other 2 CFs down to a 1:2 ratio as well.
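
As a rough illustration of why the small families end up with so many tiny store files: if all families are flushed together once the aggregate memstore size crosses a threshold, each family flushes only its share of that threshold. The sketch below works through that arithmetic for the 1:1:20 distribution. The 64 MB threshold, the class name, and the method name are assumptions made for the example, not values taken from the measurements above, and the sketch assumes writes arrive in the same 1:1:20 proportion as the stored data.

```java
public class FlushShare {
    /**
     * Bytes held by each column family at the moment the aggregate
     * memstore size reaches thresholdBytes, assuming writes arrive in
     * proportion to the given weights (hypothetical helper).
     */
    static long[] perFamilyFlushBytes(long thresholdBytes, int[] weights) {
        long total = 0;
        for (int w : weights) {
            total += w;
        }
        long[] out = new long[weights.length];
        for (int i = 0; i < weights.length; i++) {
            // Integer division: each family's share, rounded down.
            out[i] = thresholdBytes * weights[i] / total;
        }
        return out;
    }

    public static void main(String[] args) {
        long threshold = 64L * 1024 * 1024; // assumed aggregate flush threshold
        int[] weights = {1, 1, 20};         // data distribution from the comment
        long[] sizes = perFamilyFlushBytes(threshold, weights);
        for (int i = 0; i < sizes.length; i++) {
            System.out.println("CF" + i + " flushes about " + sizes[i] + " bytes");
        }
    }
}
```

Under these assumptions each small family writes a store file of roughly 3 MB per flush while the large family writes roughly 58 MB, so the small families accumulate many small files per unit of data.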

> Make flush decisions per column family
> --------------------------------------
>
>                 Key: HBASE-3149
>                 URL: https://issues.apache.org/jira/browse/HBASE-3149
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Karthik Ranganathan
>
> Today, the flush decision is made using the aggregate size of all column families. When large and small column families co-exist, this causes many small flushes of the smaller CF. We need to make per-CF flush decisions.
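
The difference between the two kinds of decision described in the issue can be sketched as follows. This is not HBase code: the class, the method names, and the idea of a fixed per-family threshold are all hypothetical, chosen only to contrast an aggregate check with one possible per-CF check. The issue itself does not specify what the per-CF policy should be.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlushDecisionSketch {
    /** Aggregate decision: flush every family once the summed size crosses the threshold. */
    static boolean aggregateShouldFlush(Map<String, Long> memstoreBytes, long regionThreshold) {
        long total = 0;
        for (long bytes : memstoreBytes.values()) {
            total += bytes;
        }
        return total >= regionThreshold;
    }

    /** Per-family decision (assumed policy): flush only families at or above their own threshold. */
    static List<String> familiesToFlush(Map<String, Long> memstoreBytes, long familyThreshold) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : memstoreBytes.entrySet()) {
            if (entry.getValue() >= familyThreshold) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024;
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("small1", 3 * MB);
        sizes.put("small2", 3 * MB);
        sizes.put("large", 58 * MB);

        // Aggregate check: all three families would be flushed together.
        System.out.println("aggregate flush: " + aggregateShouldFlush(sizes, 64 * MB));
        // Per-family check: only families that are individually large are selected.
        System.out.println("per-CF flush: " + familiesToFlush(sizes, 32 * MB));
    }
}
```

With the aggregate check the two small families are flushed at about 3 MB each whenever the large one fills up. With a per-family check like the one assumed here, they would stay in memory until they had grown large enough to be worth writing out.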
