You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "sunjincheng (JIRA)" <ji...@apache.org> on 2019/05/29 09:41:00 UTC
[jira] [Comment Edited] (FLINK-10455) Potential Kafka producer leak
in case of failures
[ https://issues.apache.org/jira/browse/FLINK-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850556#comment-16850556 ]
sunjincheng edited comment on FLINK-10455 at 5/29/19 9:40 AM:
--------------------------------------------------------------
What is the current situation of this JIRA, is there anyone else to follow up? [~till.rohrmann] [~cslotterback]
[~azagrebin] Do you want to continue to solve this problem?
Since this is a blocker issue for release 1.8.1, we're expecting a quick response here.
Thanks.
was (Author: sunjincheng121):
What is the current situation of this JIRA, is there anyone else to follow up? [~till.rohrmann] [~cslotterback]
[~azagrebin] Do you want to continue to solve this problem?
> Potential Kafka producer leak in case of failures
> -------------------------------------------------
>
> Key: FLINK-10455
> URL: https://issues.apache.org/jira/browse/FLINK-10455
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka
> Affects Versions: 1.5.2
> Reporter: Nico Kruber
> Assignee: Andrey Zagrebin
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.5.6, 1.7.0, 1.7.3, 1.9.0, 1.8.1
>
>
> If the Kafka brokers' timeout is too low for our checkpoint interval [1], we may get an {{ProducerFencedException}}. Documentation around {{ProducerFencedException}} explicitly states that we should close the producer after encountering it.
> By looking at the code, it doesn't seem like this is actually done in {{FlinkKafkaProducer011}}. Also, in case one transaction's commit in {{TwoPhaseCommitSinkFunction#notifyCheckpointComplete}} fails with an exception, we don't clean up (nor try to commit) any other transaction.
> -> from what I see, {{TwoPhaseCommitSinkFunction#notifyCheckpointComplete}} simply iterates over the {{pendingCommitTransactions}} which is not touched during {{close()}}
> Now if we restart the failing job on the same Flink cluster, any resources from the previous attempt will still linger around.
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/connectors/kafka.html#kafka-011
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)