Posted to issues@hbase.apache.org by "Duo Zhang (JIRA)" <ji...@apache.org> on 2016/07/13 07:44:20 UTC

[jira] [Created] (HBASE-16223) Drop duplicated delete markers in minor compaction

Duo Zhang created HBASE-16223:
---------------------------------

             Summary: Drop duplicated delete markers in minor compaction
                 Key: HBASE-16223
                 URL: https://issues.apache.org/jira/browse/HBASE-16223
             Project: HBase
          Issue Type: Improvement
            Reporter: Duo Zhang


Recently we have been suffering from this problem. One of our customers may delete the same row many times (the record is about 100,000 times), which leaves that many delete markers on the row and causes scans to time out.

For now we trigger a major compaction every day to drop the duplicated delete markers. But this is not a good long-term solution, since the cost of major compaction gets higher as the data gets larger.

In fact, I think only the newest delete marker is useful (if maxversions = 1), so we could retain only that delete marker when doing a minor compaction.
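
To illustrate the idea, here is a minimal sketch in Java. It is not HBase's compaction code: the Entry class and the dropDuplicateDeleteMarkers method are hypothetical names, and the model assumes that every delete marker covers the whole row up to its timestamp and that the input is already sorted by row and then newest-first, as a compaction scanner would see it. Under those assumptions, only the first (newest) delete marker of each row is kept and older ones are skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeleteMarkerDedup {

    // Hypothetical, simplified stand-in for a cell; not HBase's Cell/KeyValue.
    static final class Entry {
        final String row;
        final long timestamp;
        final boolean isDeleteMarker;

        Entry(String row, long timestamp, boolean isDeleteMarker) {
            this.row = row;
            this.timestamp = timestamp;
            this.isDeleteMarker = isDeleteMarker;
        }
    }

    // Assumes input sorted by row ascending, then timestamp descending.
    // Keeps every non-marker entry and only the newest delete marker per row.
    static List<Entry> dropDuplicateDeleteMarkers(List<Entry> sorted) {
        List<Entry> out = new ArrayList<>();
        String rowWithMarker = null; // row for which a marker was already kept
        for (Entry e : sorted) {
            if (e.isDeleteMarker) {
                if (e.row.equals(rowWithMarker)) {
                    continue; // older marker, already covered by the newer one
                }
                rowWithMarker = e.row;
            }
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> input = Arrays.asList(
            new Entry("row1", 30L, true),
            new Entry("row1", 20L, true),
            new Entry("row1", 10L, true),
            new Entry("row1", 5L, false));
        List<Entry> kept = dropDuplicateDeleteMarkers(input);
        System.out.println("entries before: " + input.size() + ", after: " + kept.size());
    }
}
```

A real implementation would also have to distinguish the delete marker types (family, column, and single-version deletes), since a delete that targets one exact version is not covered by another marker with a different timestamp.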



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)