Posted to dev@flink.apache.org by "zoucao (Jira)" <ji...@apache.org> on 2022/03/18 08:49:00 UTC
[jira] [Created] (FLINK-26726) Remove the unregistered task from readersAwaitingSplit
zoucao created FLINK-26726:
------------------------------
Summary: Remove the unregistered task from readersAwaitingSplit
Key: FLINK-26726
URL: https://issues.apache.org/jira/browse/FLINK-26726
Project: Flink
Issue Type: Improvement
Components: Table SQL / Ecosystem
Reporter: zoucao
Attachments: stack.txt
Recently, we hit a problem caused by an unregistered task when using a Hive table as a streaming source.
I think the problem is that we do not remove unregistered tasks from `readersAwaitingSplit` in `ContinuousHiveSplitEnumerator` and `ContinuousFileSplitEnumerator`.
Assume we have two tasks, 0 and 1. If no new file appears in the path for a long time, both of them end up in `readersAwaitingSplit`. Then a new split is generated and assigned to task-1. Unfortunately, task-1 cannot consume the split successfully; an exception is thrown and causes all tasks to restart. The failover does not affect `readersAwaitingSplit`, but it clears `SourceCoordinatorContext#registeredReaders`.
After the restart, task-0 still exists in `readersAwaitingSplit` but not in `registeredReaders`. If task-1 registers first and sends a split request, the SplitEnumerator will try to assign splits to both task-1 and task-0, even though task-0 has not been registered yet.
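The scenario above, and one possible way to guard against it, can be sketched in plain Java without any Flink dependency. This is only an illustration under assumptions: the class `AwaitingReaderSketch`, its methods, and the string-based splits are hypothetical stand-ins, not Flink's actual enumerator code. The idea is to drop any reader found in `readersAwaitingSplit` that is no longer in `registeredReaders` before assigning a split to it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a continuous split enumerator; not Flink's real class.
class AwaitingReaderSketch {
    // subtask id -> hostname of readers that requested a split and are still waiting
    final Map<Integer, String> readersAwaitingSplit = new LinkedHashMap<>();
    // stand-in for SourceCoordinatorContext#registeredReaders (cleared on failover)
    final Set<Integer> registeredReaders = new HashSet<>();
    // splits discovered but not yet handed out (plain strings for illustration)
    final Deque<String> pendingSplits = new ArrayDeque<>();

    void registerReader(int subtask) {
        registeredReaders.add(subtask);
    }

    void requestSplit(int subtask, String host) {
        readersAwaitingSplit.put(subtask, host);
    }

    // Models the failover from the report: registrations are lost,
    // but the waiting list survives.
    void failover() {
        registeredReaders.clear();
    }

    // Assigns pending splits to waiting readers, skipping stale entries.
    List<String> assignSplits() {
        List<String> assignments = new ArrayList<>();
        Iterator<Map.Entry<Integer, String>> it =
                readersAwaitingSplit.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, String> waiting = it.next();
            // The reader asked for a split before the restart and has not
            // re-registered: forget it instead of assigning to it.
            if (!registeredReaders.contains(waiting.getKey())) {
                it.remove();
                continue;
            }
            String split = pendingSplits.poll();
            if (split == null) {
                break; // nothing left to assign; remaining readers keep waiting
            }
            assignments.add(waiting.getKey() + "<-" + split);
            it.remove();
        }
        return assignments;
    }
}
```

With this guard, replaying the sequence from the report (both tasks waiting, failover, only task-1 re-registering and requesting) assigns a split to task-1 only and removes the stale task-0 entry; task-0 is expected to request a split again once it re-registers.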
The stack trace is in the attachment (stack.txt).
--
This message was sent by Atlassian Jira
(v8.20.1#820001)