Posted to dev@flink.apache.org by "Stefan Richter (JIRA)" <ji...@apache.org> on 2017/11/15 14:51:00 UTC
[jira] [Created] (FLINK-8086) FlinkKafkaProducer011 can permanently fail in recovery
Stefan Richter created FLINK-8086:
-------------------------------------
Summary: FlinkKafkaProducer011 can permanently fail in recovery
Key: FLINK-8086
URL: https://issues.apache.org/jira/browse/FLINK-8086
Project: Flink
Issue Type: Bug
Components: Kafka Connector
Affects Versions: 1.4.0, 1.5.0
Reporter: Stefan Richter
Priority: Blocker
A chaos monkey test in a cluster environment can permanently bring down our FlinkKafkaProducer011.
Typically, after a small number of randomly killed TMs, the data generator job is no longer able to recover from a checkpoint because of the following problem:
org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.
The problem is reproducible and happened for me in every run after the chaos monkey killed a couple of TMs.
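For readers unfamiliar with the fencing mechanism named in the exception, here is a minimal sketch in plain Java of the first cause listed in the message: a newer producer registering with the same transactionalId. It is a toy model under stated assumptions, not the Kafka client or broker API. ToyCoordinator, register, commit and isFenced are hypothetical names, the second cause (transaction expiry on the broker) is not modelled, and the sketch makes no claim about which of the two causes is behind this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of epoch-based producer fencing. NOT the Kafka client or broker API.
class FencingSketch {

    /** Hypothetical stand-in for the broker-side transaction coordinator. */
    static class ToyCoordinator {
        // Latest epoch handed out per transactional id.
        private final Map<String, Integer> currentEpoch = new HashMap<>();

        /** Registering a transactional id bumps its epoch, which fences older holders. */
        int register(String transactionalId) {
            return currentEpoch.merge(transactionalId, 1, Integer::sum);
        }

        /** A commit is accepted only from the holder of the latest epoch. */
        void commit(String transactionalId, int epoch) {
            Integer latest = currentEpoch.get(transactionalId);
            if (latest == null || epoch < latest) {
                throw new IllegalStateException(
                    "fenced or unknown: epoch " + epoch + ", latest " + latest);
            }
        }
    }

    /** Returns true if a commit with the given epoch is rejected. */
    static boolean isFenced(ToyCoordinator coordinator, String id, int epoch) {
        try {
            coordinator.commit(id, epoch);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToyCoordinator coordinator = new ToyCoordinator();
        int before = coordinator.register("sink-0"); // producer running before the TM was killed
        int after = coordinator.register("sink-0");  // producer created during recovery
        System.out.println("old producer fenced: " + isFenced(coordinator, "sink-0", before));
        System.out.println("new producer fenced: " + isFenced(coordinator, "sink-0", after));
    }
}
```

In this model, the producer that registered first is rejected as soon as a second registration with the same id has taken place, while the second one can still commit.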
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)