Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/01/06 17:45:00 UTC

[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

    [ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655527#comment-17655527 ] 

ASF subversion and git services commented on KUDU-3367:
-------------------------------------------------------

Commit 27072d3382889b1852f4fef58010115585685bd3 in kudu's branch refs/heads/master from Yingchun Lai
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=27072d338 ]

[tools] Add 'kudu local_replica tmeta delete_rowsets' to delete rowsets from tablet

There are some use cases where we need to delete rowsets from a tablet.
For example:
1. Some blocks are corrupted in a single-node cluster and the server cannot
   be started. Note: some data will be lost in this case.
2. Some rowsets are fully deleted but their blocks cannot be GCed (KUDU-3367).
   Note: no data will be lost in this case.

The existing 'kudu pbc edit' CLI tool can achieve that, but it's error-prone
and hard to operate when working with a large amount of data.

This patch introduces a new CLI tool 'kudu local_replica tmeta delete_rowsets'
which makes removing rowsets from a tablet much easier.

Change-Id: If2cf9035babf4c3af4c238cebe8dcecd2c65848f
Reviewed-on: http://gerrit.cloudera.org:8080/19357
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <al...@apache.org>


> Delta file with full of delete op can not be schedule to compact
> ----------------------------------------------------------------
>
>                 Key: KUDU-3367
>                 URL: https://issues.apache.org/jira/browse/KUDU-3367
>             Project: Kudu
>          Issue Type: New Feature
>          Components: compaction
>            Reporter: dengke
>            Assignee: dengke
>            Priority: Major
>         Attachments: image-2022-05-09-14-13-16-525.png, image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png, image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png, image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png, image-2022-05-09-14-32-51-573.png, image-2022-11-14-11-02-33-685.png
>
>
> If a REDO delta file is full of delete ops, which means there are no update ops in the file, the current compaction algorithm will not schedule the file for compaction. If such files exist, after accumulating for a period of time they will greatly slow down scans. However, processing such files on every compaction would reduce compaction performance.
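
The scheduling gap described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kudu's actual compaction code: the RedoDeltaFile type, the update_only_score function, and the "score must be positive" rule are all hypothetical names and policies invented for illustration. It assumes the score counts update ops only, so a delete-only file always scores zero and is never picked.

```python
from dataclasses import dataclass


@dataclass
class RedoDeltaFile:
    """Hypothetical summary of one REDO delta file (not a Kudu type)."""
    name: str
    update_ops: int
    delete_ops: int


def update_only_score(f: RedoDeltaFile) -> int:
    # Assumed policy: only UPDATE ops contribute to the compaction score.
    return f.update_ops


def pick_for_compaction(files, score):
    # Schedule only files with a positive score, highest score first.
    candidates = [f for f in files if score(f) > 0]
    return sorted(candidates, key=score, reverse=True)


files = [
    RedoDeltaFile("delta_0", update_ops=120, delete_ops=5),
    RedoDeltaFile("delta_1", update_ops=0, delete_ops=10_000),  # delete-only
]

scheduled = [f.name for f in pick_for_compaction(files, update_only_score)]
print(scheduled)  # the delete-only file "delta_1" is not in the list
```

In this model the delete-only file is skipped no matter how many delete ops it accumulates, which matches the behavior the issue reports; how a real fix should weigh delete ops against the cost of examining such files is left open here.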



--
This message was sent by Atlassian Jira
(v8.20.10#820010)