Posted to issues@hbase.apache.org by "Duo Zhang (JIRA)" <ji...@apache.org> on 2016/07/13 07:44:20 UTC
[jira] [Created] (HBASE-16223) Drop duplicated delete markers in minor compaction
Duo Zhang created HBASE-16223:
---------------------------------
Summary: Drop duplicated delete markers in minor compaction
Key: HBASE-16223
URL: https://issues.apache.org/jira/browse/HBASE-16223
Project: HBase
Issue Type: Improvement
Reporter: Duo Zhang
We have recently been suffering from this. One of our customers deletes the same row many times (the record so far is about 100,000 times), which causes scans to time out.
We currently trigger a major compaction every day to drop the duplicated delete markers, but this is not a good solution because the cost of a major compaction grows as the data gets larger.
In fact, I think only the newest delete marker is useful (if maxVersions = 1), so we could retain only that delete marker when doing a minor compaction.
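To illustrate the idea, here is a minimal sketch in plain Java. It does not use any HBase classes: Marker and dropDuplicates are hypothetical names invented for this example, and it assumes row-level delete markers that arrive grouped by row with the newest timestamp first, roughly the order in which a compaction would see them. It only shows the "keep the newest marker per row" rule, not how the change would be wired into the compaction code.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteMarkerDedupSketch {

    // Hypothetical stand-in for a row-level delete marker; not an HBase class.
    static final class Marker {
        final String row;
        final long timestamp;

        Marker(String row, long timestamp) {
            this.row = row;
            this.timestamp = timestamp;
        }
    }

    // Assumes the input is grouped by row, newest timestamp first within a row.
    // Keeps the first (newest) marker of each row and drops the older ones,
    // which the newest marker already covers when maxVersions = 1.
    static List<Marker> dropDuplicates(List<Marker> sorted) {
        List<Marker> kept = new ArrayList<>();
        String lastRow = null;
        for (Marker m : sorted) {
            if (!m.row.equals(lastRow)) {
                kept.add(m);
                lastRow = m.row;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Marker> in = new ArrayList<>();
        in.add(new Marker("row1", 300L));
        in.add(new Marker("row1", 200L));
        in.add(new Marker("row1", 100L));
        in.add(new Marker("row2", 150L));
        in.add(new Marker("row2", 50L));

        // Prints row1@300 and row2@150.
        for (Marker m : dropDuplicates(in)) {
            System.out.println(m.row + "@" + m.timestamp);
        }
    }
}
```

With 100,000 markers on one row, a pass like this would leave a single marker for later scans to skip over, without waiting for a major compaction.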
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)