Posted to dev@hbase.apache.org by "Billy Pearson (JIRA)" <ji...@apache.org> on 2008/08/16 03:37:44 UTC

[jira] Commented: (HBASE-834) Upper bound on files we compact at any one time

    [ https://issues.apache.org/jira/browse/HBASE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623070#action_12623070 ] 

Billy Pearson commented on HBASE-834:
-------------------------------------

HBASE-745 solved the minor compaction problem with incremental compaction, and it can 
still do major compactions sometimes, but not often.

The only downside to HBASE-745 is that it does not guarantee that a major compaction of the old, larger files will ever happen. 
We do have an option to call the compaction with force set to true, which skips the minor compaction.

Suggestions to complete the major compaction part:

1. Add a function in HRegion that returns the creation timestamp of the oldest file, something like HRegion.getOldestHStoreTimestamp().
2. Add an option (hbase.hregion.majorcompaction) to hbase-default.xml that makes a major compaction happen every X seconds, with a default of, say, once per day or once per week.
3. In HStore.compact, compare the hbase.hregion.majorcompaction setting against the oldest timestamp and change force(false) to force(true) when needed, but never the reverse.
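
The three steps above could be sketched roughly as follows. This is only an illustration of the decision logic, not HBase code: the class MajorCompactionCheck, its method names, and the use of milliseconds for both the timestamps and the interval are assumptions made for the example, and the real HRegion/HStore classes would have to supply the file creation times and the configured value.

```java
import java.util.List;

// Hypothetical sketch; none of these names are existing HBase API.
final class MajorCompactionCheck {
    private MajorCompactionCheck() {}

    /**
     * Step 1: oldest creation time (ms since epoch) among the store files,
     * or -1 when there are no files.
     */
    static long oldestTimestamp(List<Long> storeFileCreationTimesMs) {
        long oldest = -1L;
        for (long t : storeFileCreationTimesMs) {
            if (oldest == -1L || t < oldest) {
                oldest = t;
            }
        }
        return oldest;
    }

    /**
     * Step 3: upgrade force=false to force=true once the oldest file is at
     * least one interval old. A caller's force=true is never downgraded.
     * intervalMs stands in for the value of hbase.hregion.majorcompaction
     * (step 2); the unit is an assumption of this sketch.
     */
    static boolean shouldForce(boolean requestedForce, long oldestMs,
                               long nowMs, long intervalMs) {
        if (requestedForce) {
            return true;
        }
        if (oldestMs < 0 || intervalMs <= 0) {
            return false; // no files, or periodic major compaction disabled
        }
        return nowMs - oldestMs >= intervalMs;
    }
}
```

With a one-day interval, a region whose oldest file is two days old would get force(true), while a region whose oldest file is an hour old would keep whatever the caller asked for.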

If someone could help with the HRegion.getOldestHStoreTimestamp() function, or point me in the right direction on how to do that in Hadoop, 
I think I could come up with a patch that gives us a major compaction and adds a limit on the number of regions to compact at one time while we are doing the minor compaction.

Anything I am missing here, stack?

> Upper bound on files we compact at any one time
> -----------------------------------------------
>
>                 Key: HBASE-834
>                 URL: https://issues.apache.org/jira/browse/HBASE-834
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Priority: Minor
>
> From Billy in HBASE-64, which we closed because it got pulled all over the place:
> {code}
> Currently we do a compaction on a region when the hbase.hstore.compactionThreshold is reached (default 3).
> I think we should configure a max number of mapfiles to compact at one time, similar to doing a minor compaction in Bigtable. This keeps compactions from getting tied up in one region too long, which lets other regions get way too many memcache flushes and makes compaction take longer and longer for each region.
> If we did that, then when a region's updates start to slack off, the max number will eventually include all mapfiles, causing a major compaction on that region. Unlike Bigtable, this would leave the master out of the process and let the region server handle the major compaction when it has time.
> When doing a minor compaction on a few files, I think we should compact the newest mapfiles first and leave the larger/older ones for when we have low updates to a region.
> {code}
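
The selection rule in the quoted description (compact at most a fixed number of mapfiles, newest first, once the threshold is reached) might look something like the sketch below. The class MinorCompactionSelection, the StoreFile stand-in, and the maxFiles parameter are hypothetical names for illustration only; only the threshold corresponds to a setting named in the issue (hbase.hstore.compactionThreshold).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; these are not HBase classes.
final class MinorCompactionSelection {
    private MinorCompactionSelection() {}

    /** Minimal stand-in for a mapfile: a name and a creation time in ms. */
    static final class StoreFile {
        final String name;
        final long createdMs;

        StoreFile(String name, long createdMs) {
            this.name = name;
            this.createdMs = createdMs;
        }
    }

    /**
     * Picks up to maxFiles files, newest first, once the number of files
     * has reached compactionThreshold. Returns an empty list otherwise.
     */
    static List<StoreFile> select(List<StoreFile> files,
                                  int compactionThreshold, int maxFiles) {
        if (files.size() < compactionThreshold || maxFiles <= 0) {
            return new ArrayList<>();
        }
        List<StoreFile> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((StoreFile f) -> f.createdMs).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(maxFiles, sorted.size())));
    }

    /** A selection that covers every file amounts to a major compaction. */
    static boolean isMajor(List<StoreFile> selected, List<StoreFile> all) {
        return !all.isEmpty() && selected.size() == all.size();
    }
}
```

When updates to a region slow down and the file count drops to maxFiles or below, the selection covers every file, which is the point at which the quoted text expects a major compaction to fall out naturally.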
