Posted to issues@hbase.apache.org by "Junegunn Choi (JIRA)" <ji...@apache.org> on 2015/12/28 03:34:49 UTC

[jira] [Commented] (HBASE-12712) skipLargeFiles in minor compact but not in major compact

    [ https://issues.apache.org/jira/browse/HBASE-12712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072353#comment-15072353 ] 

Junegunn Choi commented on HBASE-12712:
---------------------------------------

We're having a related issue. In our case it's not the number of versions but the TTL of the column family. We expected old (and large) storefiles to be removed from the system, but they are not, because skipLargeFiles excludes them and thus major compaction is never triggered for them.

It seems trivial to make the method take TTL into account, i.e. do not skip storefiles whose minimum timestamps are older than the TTL. However, I'm not completely sure it's the right way to do it, as one may argue that the current implementation is not "wrong" and that "hbase.hstore.compaction.max.size" simply takes priority over TTL. Also, it does not fix the problem [~mopishv0] is having.
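To illustrate, the TTL-aware variant could look roughly like the following. This is a hypothetical sketch, not the actual HBase compaction-policy code; the method name, parameters, and the millisecond/second conventions are assumptions made for illustration only:

```java
// Hypothetical sketch of a TTL-aware large-file skip check.
// Assumption: a file whose minimum cell timestamp is older than the TTL
// contains expired data and should stay eligible for compaction so that
// major compaction can eventually reclaim the space.
public class CompactionSkipSketch {
    /**
     * Returns true if a storefile should be skipped on size grounds.
     *
     * @param fileSizeBytes   storefile size in bytes
     * @param minTimestampMs  minimum cell timestamp in the file (epoch ms)
     * @param maxCompactSize  value of "hbase.hstore.compaction.max.size"
     * @param ttlSeconds      column family TTL in seconds (0 = no TTL)
     * @param nowMs           current time (epoch ms)
     */
    static boolean shouldSkipLargeFile(long fileSizeBytes, long minTimestampMs,
                                       long maxCompactSize, long ttlSeconds,
                                       long nowMs) {
        if (fileSizeBytes <= maxCompactSize) {
            return false; // not "large" -- never skipped on size grounds
        }
        // If the file's oldest cell has outlived the TTL, the file holds
        // expired data; do NOT skip it, so compaction can remove that data.
        boolean hasExpiredData = ttlSeconds > 0
            && minTimestampMs < nowMs - ttlSeconds * 1000L;
        return !hasExpiredData;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long maxSize = 1L << 30; // 1 GiB size threshold
        long dayMs = 86_400_000L;
        // Large file, all data newer than a 1-day TTL: skipped as today.
        System.out.println(
            shouldSkipLargeFile(2L << 30, now, maxSize, 86_400, now));
        // Large file, oldest data 10 days old with a 1-day TTL: not skipped.
        System.out.println(
            shouldSkipLargeFile(2L << 30, now - 10 * dayMs, maxSize, 86_400, now));
    }
}
```

This keeps the existing size-based behavior for fresh data while letting files with expired cells re-enter compaction selection; as noted above, though, one could equally argue that the max-size setting should simply win over TTL.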

> skipLargeFiles in minor compact but not in major compact
> --------------------------------------------------------
>
>                 Key: HBASE-12712
>                 URL: https://issues.apache.org/jira/browse/HBASE-12712
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>    Affects Versions: 0.98.6
>            Reporter: Liu Junhong
>              Labels: beginner
>             Fix For: 0.98.6
>
>         Attachments: compact.diff
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Here is my case: after repeated minor compactions, some storefiles become very large. Compacting such large storefiles wastes a lot of bandwidth, so I use “hbase.hstore.compaction.max.size” to skip them. But after enabling this setting, I found from reading the source code that major compaction will be skipped for these files forever, so deletes and multi-version data may waste storage. So I had to modify the code.
> I am now trying to submit my patch, but it is not perfect. I think there should be another config in HColumnDescriptor to determine whether large storefiles should join major compaction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)