Posted to issues@kudu.apache.org by "Boris Tyukin (Jira)" <ji...@apache.org> on 2020/04/26 16:15:00 UTC

[jira] [Commented] (KUDU-1625) Schedule compaction on rowsets with high percentage of deleted data

    [ https://issues.apache.org/jira/browse/KUDU-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092770#comment-17092770 ] 

Boris Tyukin commented on KUDU-1625:
------------------------------------

Looks like we got hit with this issue. We just deleted 50% of the data from a 2B-row table, and after a week the tablets have only gotten bigger. Compaction does run, as we can see from the metrics. We are on Kudu 1.9. Is the only workaround to drop and reload these tables?

> Schedule compaction on rowsets with high percentage of deleted data
> -------------------------------------------------------------------
>
>                 Key: KUDU-1625
>                 URL: https://issues.apache.org/jira/browse/KUDU-1625
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet
>    Affects Versions: 1.0.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> Although with KUDU-236 we can now remove rows that were deleted prior to the ancient history mark, we don't actively schedule compactions based on deleted rows. So, if for example we have a fully compacted table and issue a DELETE for every row, the data size actually does not change, because no compactions are triggered.
> We need some way to notice the fact that the ratio of deletes to rows is high and decide to compact those rowsets.
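
The heuristic asked for in the issue description could be sketched roughly as follows. This is only an illustration of the idea, not Kudu's implementation: the RowSetStats type, its fields, the function names, and the 0.5 threshold are all hypothetical, and deleted_rows is assumed to count only rows deleted before the ancient history mark (the ones KUDU-236 allows to be removed).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowSetStats:
    """Hypothetical per-rowset counters; not a real Kudu structure."""
    name: str
    live_rows: int
    deleted_rows: int  # assumed: rows deleted before the ancient history mark


def deleted_ratio(rs: RowSetStats) -> float:
    """Fraction of the rowset's rows that are deleted (0.0 for an empty rowset)."""
    total = rs.live_rows + rs.deleted_rows
    return rs.deleted_rows / total if total else 0.0


def pick_rowsets_to_compact(rowsets: List[RowSetStats],
                            threshold: float = 0.5) -> List[RowSetStats]:
    """Return rowsets whose deleted ratio meets the threshold, worst first."""
    candidates = [rs for rs in rowsets if deleted_ratio(rs) >= threshold]
    return sorted(candidates, key=deleted_ratio, reverse=True)


if __name__ == "__main__":
    stats = [
        RowSetStats("rs-0", live_rows=1000, deleted_rows=0),
        RowSetStats("rs-1", live_rows=100, deleted_rows=900),
        RowSetStats("rs-2", live_rows=500, deleted_rows=500),
    ]
    for rs in pick_rowsets_to_compact(stats):
        print(rs.name, deleted_ratio(rs))
```

A scheduler built on this would still need to weigh the space reclaimed against the I/O cost of rewriting each rowset; the sketch only covers the "notice the ratio" part.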



--
This message was sent by Atlassian Jira
(v8.3.4#803005)