Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2021/11/16 01:27:00 UTC

[jira] [Commented] (HUDI-2735) Fix archival of commits in Java client for Kafka Connect

    [ https://issues.apache.org/jira/browse/HUDI-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444216#comment-17444216 ] 

Ethan Guo commented on HUDI-2735:
---------------------------------

The archival process is triggered after every commit.  Deltacommits are archived based on the following config:
{code:java}
"hoodie.keep.max.commits": 8,
"hoodie.keep.min.commits": 6,
"hoodie.cleaner.commits.retained": 4
{code}
However, because of HUDI-2672, rollback instants keep being added to the timeline, and they are not archived on their own.  Some rollbacks are archived only when more deltacommits are added and the number of deltacommits hits the threshold.  It looks like the archival process does not count rollback instants when deciding whether to archive.
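
To make the behavior above concrete, here is a minimal sketch of the trigger logic as described in this comment. It is not Hudi's actual archival code: the class and method names are hypothetical, and the exact comparison (archive when the deltacommit count exceeds hoodie.keep.max.commits, then trim down to hoodie.keep.min.commits) is an assumption made for illustration.

```java
// Hypothetical, simplified model of the behavior described above.
// This is NOT Hudi's implementation; names and the exact threshold
// comparison are assumptions for illustration only.
class ArchivalSketch {
    static final int KEEP_MAX_COMMITS = 8; // hoodie.keep.max.commits
    static final int KEEP_MIN_COMMITS = 6; // hoodie.keep.min.commits

    /**
     * Returns how many of the oldest deltacommits would be archived.
     * The rollback count is accepted but deliberately ignored, which
     * mirrors the observation that rollbacks alone never trigger archival.
     */
    public static int deltacommitsToArchive(int activeDeltacommits, int activeRollbacks) {
        if (activeDeltacommits <= KEEP_MAX_COMMITS) {
            return 0; // below the threshold: nothing is archived
        }
        return activeDeltacommits - KEEP_MIN_COMMITS;
    }

    public static void main(String[] args) {
        // Many rollbacks, few deltacommits: no archival is triggered.
        System.out.println(deltacommitsToArchive(7, 50)); // prints 0
        // One deltacommit past the max: archive down to the min.
        System.out.println(deltacommitsToArchive(9, 50)); // prints 3
    }
}
```

Under this model, the rollback count can grow without bound as long as the deltacommit count stays at or below the maximum, which matches the symptom reported here.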

I filed a separate ticket to track the fix, since the issue is not specific to Kafka Connect: [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-2765?filter=allissues].

Once the rollbacks that occur when there are no Kafka messages are fixed in https://issues.apache.org/jira/browse/HUDI-2672, this issue will be much less severe.

> Fix archival of commits in Java client for Kafka Connect
> --------------------------------------------------------
>
>                 Key: HUDI-2735
>                 URL: https://issues.apache.org/jira/browse/HUDI-2735
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Blocker
>             Fix For: 0.10.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)