You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by "Dong Lin (Jira)" <ji...@apache.org> on 2021/04/27 05:44:00 UTC

[jira] [Created] (FLINK-22488) KafkaSourceLegacyITCase.testOneToOneSources failed due to "OperatorEvent from an OperatorCoordinator to a task was lost"

Dong Lin created FLINK-22488:
--------------------------------

             Summary: KafkaSourceLegacyITCase.testOneToOneSources failed due to "OperatorEvent from an OperatorCoordinator to a task was lost"
                 Key: FLINK-22488
                 URL: https://issues.apache.org/jira/browse/FLINK-22488
             Project: Flink
          Issue Type: Improvement
            Reporter: Dong Lin


According to [1], the test KafkaSourceLegacyITCase.testOneToOneSources failed because it runs a streaming job (which uses KafkaSource) with restartAttempts=1. In addition to the failover explicitly triggered by the FailingIdentityMapper, the job additionally failed due to "org.apache.flink.util.FlinkException: An OperatorEvent from an OperatorCoordinator to a task was lost. Triggering task failover to ensure consistency", which is unexpected by the test.

Note that SubtaskGatewayImpl was updated by [2] on 4/14 which triggers task failover if any OperatorEvent was lost. This could explain why those Kafka tests start to fail due to the exception described above.

In order to make this test stable, let's try to understand why there is such a high chance of loosing OperatorEvent in the Azure test pipeline. And if we could not avoid loosing OperatorEvent in the test pipeline, we probably need to update the test to allow the pipeline being restarted arbitrary times (and still be able to stop the test on the happy path).


[1] https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17212&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5&l=6960
[2] https://github.com/apache/flink/pull/15605




--
This message was sent by Atlassian Jira
(v8.3.4#803005)