Posted to commits@hudi.apache.org by "Teng Huo (Jira)" <ji...@apache.org> on 2023/01/20 03:06:00 UTC

[jira] [Commented] (HUDI-4717) CompactionCommitEvent message corrupted when sent by compact_task

    [ https://issues.apache.org/jira/browse/HUDI-4717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678965#comment-17678965 ] 

Teng Huo commented on HUDI-4717:
--------------------------------

Hi,
We recently hit exactly the same issue in our Flink MOR pipeline.

 !issue.png! 

I checked the Hudi files, and all compaction operations appear to have completed, since the parquet files are intact. I can't understand how events are lost between compact_task and compact_commit.
Is there any way to troubleshoot this issue? Many thanks.

>  CompactionCommitEvent message corrupted when sent by compact_task
> ------------------------------------------------------------------
>
>                 Key: HUDI-4717
>                 URL: https://issues.apache.org/jira/browse/HUDI-4717
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink, flink-sql
>    Affects Versions: 0.10.1
>            Reporter: nonggia.liang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: figure 1.png, figure 2.png, issue.png
>
>
> When running a Flink application that inserts data into a Hudi table with async compaction enabled, we found that after running for some time, compactions became abnormal: they were scheduled and executed successfully, but not committed. We also observed a mismatch between the number of messages compact_task sent and the number compact_commit received, as shown in figure 1.
> By inspecting the abnormal InputChannel state of the compact_commit operator with the Arthas tool, we found that the channel was waiting for a `huge` message of 16M, far larger than a normal CompactionCommitEvent object, as shown in figure 2.
> Currently, in the processElement() method of the CompactFunction class, we use the collector to send CompactionCommitEvent messages asynchronously, but the Collector provided by Flink does not seem to be thread-safe. Could that be the cause of the corrupted message received by the compact_commit operator? Should we use the MailboxExecutorAdapter to run collector.collect, as in StreamReadOperator?
>  
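
The hand-off proposed in the description can be illustrated with a minimal, self-contained sketch. It does not use Flink or Hudi classes: UnsafeCollector, MailboxCollectSketch, and run() are hypothetical names, and a single-threaded java.util.concurrent executor merely stands in for the operator's mailbox thread. It assumes the collector is indeed not thread-safe, which the description suspects but does not confirm. Worker threads never call collect() themselves; they enqueue it for the one thread that owns the collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class MailboxCollectSketch {

    /** Hypothetical stand-in for a collector that must only be used by one thread. */
    static final class UnsafeCollector {
        private final List<String> out = new ArrayList<>(); // ArrayList is not thread-safe

        void collect(String event) {
            out.add(event);
        }

        int size() {
            return out.size();
        }
    }

    /** Runs the hand-off pattern and returns how many events the collector received. */
    static int run(int workers, int eventsPerWorker) {
        UnsafeCollector collector = new UnsafeCollector();
        // Single thread standing in for the operator's task (mailbox) thread.
        ExecutorService mailbox = Executors.newSingleThreadExecutor();
        // Pool standing in for the asynchronous compaction executor.
        ExecutorService compactionPool = Executors.newFixedThreadPool(workers);

        List<CompletableFuture<Void>> jobs = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int workerId = w;
            jobs.add(CompletableFuture.runAsync(() -> {
                for (int i = 0; i < eventsPerWorker; i++) {
                    String event = "commit-event-" + workerId + "-" + i;
                    // Do not call collector.collect(event) on this worker thread;
                    // enqueue it so that only the mailbox thread touches the collector.
                    mailbox.execute(() -> collector.collect(event));
                }
            }, compactionPool));
        }

        // Wait until every worker has enqueued all of its events.
        CompletableFuture.allOf(jobs.toArray(new CompletableFuture[0])).join();
        compactionPool.shutdown();
        mailbox.shutdown(); // already-queued collect() calls still run
        try {
            if (!mailbox.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("mailbox did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return collector.size();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

Because every collect() call executes on the same thread, the collector's internal state is never mutated concurrently, whatever the number of compaction workers. Whether the same hand-off removes the corrupted 16M message in the real pipeline would still need to be verified against the actual operator.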


