Posted to issues@flink.apache.org by "Jark Wu (Jira)" <ji...@apache.org> on 2022/10/14 02:28:00 UTC

[jira] [Comment Edited] (FLINK-26726) Remove the unregistered task from readersAwaitingSplit

    [ https://issues.apache.org/jira/browse/FLINK-26726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616389#comment-17616389 ] 

Jark Wu edited comment on FLINK-26726 at 10/14/22 2:27 AM:
-----------------------------------------------------------

Fixed in 
 - master: ab9e5844703848e79b2a62abae757bc6bd2268d9
 - release-1.16: 9935b0c56b646b301dbd4d11a0838aacfeb5430f
 - release-1.15: a826fe8d501ed8bdd9cbdc5febf7aac4cfa0b947
 - release-1.14: 81d2c1b340f3a4d063e97db2519d6911028d807d


was (Author: jark):
Fixed in 
 - master: ab9e5844703848e79b2a62abae757bc6bd2268d9
 - release-1.16: TODO

> Remove the unregistered task from readersAwaitingSplit
> ------------------------------------------------------
>
>                 Key: FLINK-26726
>                 URL: https://issues.apache.org/jira/browse/FLINK-26726
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Ecosystem
>            Reporter: zoucao
>            Assignee: zoucao
>            Priority: Major
>              Labels: pull-request-available, stale-assigned
>             Fix For: 1.17.0
>
>         Attachments: stack.txt
>
>
> Recently, we hit a problem caused by an unregistered task when using a Hive table as a streaming source.
> I think the cause is that we do not remove unregistered tasks from `readersAwaitingSplit` in `ContinuousHiveSplitEnumerator` and `ContinuousFileSplitEnumerator`.
> Assume there are two tasks, 0 and 1. If no new file appears in the path for a long time, both end up in `readersAwaitingSplit`. Then a new split is generated and assigned to task-1. Unfortunately, task-1 fails to consume the split, and the resulting exception causes all tasks to restart. The failover does not affect `readersAwaitingSplit`, but it clears `SourceCoordinatorContext#registeredReaders`.
> After the restart, task-0 is still in `readersAwaitingSplit` but not in `registeredReaders`. If task-1 registers first and requests a split, the SplitEnumerator assigns splits to both task-1 and task-0, even though task-0 has not registered yet.
> The stack trace is in the attachment.
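
The scenario described above can be reproduced in a small, dependency-free model. The sketch below is not Flink's code: the class `AwaitingReaderSketch` and its methods are hypothetical stand-ins, and only the field names `readersAwaitingSplit` and `registeredReaders` are taken from the issue. It assumes the fix amounts to dropping any waiting reader that is no longer registered before handing out a split.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Hypothetical, simplified stand-in for a split enumerator; not Flink's actual class. */
final class AwaitingReaderSketch {
    // Readers that requested a split while none was available (subtask id -> host name).
    final Map<Integer, String> readersAwaitingSplit = new LinkedHashMap<>();
    // Stand-in for SourceCoordinatorContext#registeredReaders; cleared on failover.
    final Set<Integer> registeredReaders = new HashSet<>();
    // Splits discovered but not yet handed to a reader.
    final Queue<String> pendingSplits = new ArrayDeque<>();
    // Record of which subtask received which split.
    final Map<Integer, String> assignments = new LinkedHashMap<>();

    void registerReader(int subtask) {
        registeredReaders.add(subtask);
    }

    void handleSplitRequest(int subtask, String host) {
        readersAwaitingSplit.put(subtask, host);
        assignSplits();
    }

    /** Models a global failover: registrations are lost, the waiting list is not. */
    void failover() {
        registeredReaders.clear();
    }

    void assignSplits() {
        Iterator<Map.Entry<Integer, String>> it = readersAwaitingSplit.entrySet().iterator();
        while (it.hasNext()) {
            int subtask = it.next().getKey();
            // Assumed fix: a reader that is no longer registered is dropped
            // from the waiting list instead of being assigned a split.
            if (!registeredReaders.contains(subtask)) {
                it.remove();
                continue;
            }
            String split = pendingSplits.poll();
            if (split == null) {
                break; // nothing left to assign; remaining readers keep waiting
            }
            assignments.put(subtask, split);
            it.remove();
        }
    }

    public static void main(String[] args) {
        AwaitingReaderSketch e = new AwaitingReaderSketch();
        e.registerReader(0);
        e.registerReader(1);
        e.handleSplitRequest(0, "host-0"); // no splits yet, so both readers wait
        e.handleSplitRequest(1, "host-1");

        e.failover();                      // registrations are cleared
        e.pendingSplits.add("split-a");
        e.pendingSplits.add("split-b");

        e.registerReader(1);               // task-1 comes back first
        e.handleSplitRequest(1, "host-1");

        System.out.println("assignments = " + e.assignments);
        System.out.println("still waiting = " + e.readersAwaitingSplit);
        System.out.println("still pending = " + e.pendingSplits);
    }
}
```

In this model, removing the registration check makes the stale entry for task-0 receive `split-a` even though task-0 has not registered after the restart, which mirrors the failure described in the issue.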



--
This message was sent by Atlassian Jira
(v8.20.10#820010)