Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/12/20 04:01:00 UTC

[jira] [Commented] (KUDU-1625) Schedule compaction on rowsets with high percentage of deleted data

    [ https://issues.apache.org/jira/browse/KUDU-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649549#comment-17649549 ] 

ASF subversion and git services commented on KUDU-1625:
-------------------------------------------------------

Commit ad920e69fcd67ceefa25ea81a38a10a27d9e3afc in kudu's branch refs/heads/master from kedeng
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=ad920e69f ]

KUDU-3367 [compaction] add supplement to gc algorithm

If a REDO delta is full of delete ops, meaning there is not a
single update operation in the delta, the current compaction
algorithm doesn't run GC on such deltamemstores. The accumulation
of deltamemstores like that negatively affects performance of scan
operations.

This patch is a supplement to KUDU-1625: with it, we can release
storage space for old tablet metadata that does not support the
live count function. See KUDU-3367 for details.

Change-Id: I8b26737dffecc17688b42188da959b2ba16351ed
Reviewed-on: http://gerrit.cloudera.org:8080/18503
Reviewed-by: Alexey Serbin <al...@apache.org>
Tested-by: Alexey Serbin <al...@apache.org>
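
The condition the commit message describes (a delta holding only delete ops and no updates) can be illustrated with a minimal sketch. The enum and function names below are hypothetical and are not taken from the Kudu code base; the sketch only shows the check itself, not how the patch acts on it.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical classification of the operations stored in a delta.
// These names are assumptions made for illustration only.
enum class DeltaOpType { kUpdate, kDelete, kReinsert };

// Returns true if the delta is non-empty and every operation in it is a
// delete, i.e. it contains not a single update operation.
bool IsAllDeletes(const std::vector<DeltaOpType>& ops) {
  return !ops.empty() &&
         std::all_of(ops.begin(), ops.end(), [](DeltaOpType t) {
           return t == DeltaOpType::kDelete;
         });
}
```

An empty delta is deliberately reported as not all-deletes here, since there would be nothing to garbage-collect; that choice is an assumption of the sketch.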


> Schedule compaction on rowsets with high percentage of deleted data
> -------------------------------------------------------------------
>
>                 Key: KUDU-1625
>                 URL: https://issues.apache.org/jira/browse/KUDU-1625
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet
>    Affects Versions: 1.0.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> Although with KUDU-236 we can now remove rows that were deleted prior to the ancient history mark, we don't actively schedule compactions based on deleted rows. So, if for example we have a fully compacted table and issue a DELETE for every row, the data size actually does not change, because no compactions are triggered.
> We need some way to notice the fact that the ratio of deletes to rows is high and decide to compact those rowsets.
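
The idea in the issue description, noticing a high ratio of deletes to rows and deciding to compact, could be sketched roughly as follows. This is only an illustration under stated assumptions: the RowSetStats struct, both function names, and the threshold parameter are hypothetical and do not reflect Kudu's actual statistics or scheduling code.

```cpp
#include <cstdint>

// Hypothetical per-rowset statistics; the field names are assumptions.
struct RowSetStats {
  int64_t total_rows;    // rows stored in the rowset
  int64_t deleted_rows;  // rows among them that have been deleted
};

// Returns the fraction of rows in the rowset that are deleted.
// An empty rowset is treated as having a ratio of zero.
double DeletedRatio(const RowSetStats& stats) {
  if (stats.total_rows <= 0) return 0.0;
  return static_cast<double>(stats.deleted_rows) /
         static_cast<double>(stats.total_rows);
}

// Returns true if the deleted fraction reaches the given threshold,
// i.e. the rowset is a candidate for compaction in this sketch.
bool ShouldScheduleCompaction(const RowSetStats& stats, double threshold) {
  return DeletedRatio(stats) >= threshold;
}
```

In the example from the description, a fully compacted table where every row has been deleted gives a ratio of 1.0, so any threshold at or below 1.0 would mark those rowsets for compaction.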


