Posted to dev@flink.apache.org by "Nico Kruber (JIRA)" <ji...@apache.org> on 2018/03/08 11:20:00 UTC

[jira] [Created] (FLINK-8896) Kafka08Fetcher trying to look up topic "n/a" on partition "-1"

Nico Kruber created FLINK-8896:
----------------------------------

             Summary: Kafka08Fetcher trying to look up topic "n/a" on partition "-1"
                 Key: FLINK-8896
                 URL: https://issues.apache.org/jira/browse/FLINK-8896
             Project: Flink
          Issue Type: Bug
          Components: Kafka Connector
    Affects Versions: 1.4.1, 1.3.2, 1.4.0, 1.3.1, 1.3.0, 1.5.0, 1.6.0
            Reporter: Nico Kruber
            Assignee: Nico Kruber
             Fix For: 1.5.0, 1.6.0


A user on the [mailing list|https://lists.apache.org/thread.html/fa96b09fc1d3a7efdb1bf7946489edafed8cdf138e933e9d0d8948a1@%3Cuser.flink.apache.org%3E] reported this error:
{code}
java.lang.RuntimeException: Unable to find a leader for partitions: [Partition: KafkaTopicPartition{topic='n/a', partition=-1}, KafkaPartitionHandle=[n/a,-1], offset=(not set)]
        at org.apache.flink.streaming.connectors.kafka.internals.Kafka08Fetcher.findLeaderForPartitions(Kafka08Fetcher.java:495)
        at org.apache.flink.streaming.connectors.kafka.internals.Kafka08Fetcher.runFetchLoop(Kafka08Fetcher.java:205)
        at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:449)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:262)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
        at java.lang.Thread.run(Thread.java:748)
{code}

The root cause seems to be that {{Kafka08Fetcher#MARKER}} ends up in the {{unassignedPartitionsQueue}} more than once, which could come from multiple calls to {{Kafka08Fetcher#cancel()}}. One code path leading to this is {{FlinkKafkaConsumerBase#cancel()}} being called in one thread while {{FlinkKafkaConsumerBase}}'s partition discovery loop thread drops out before the first thread has been able to call {{Kafka08Fetcher#cancel()}}.
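
To illustrate the suspected mechanism, here is a minimal standalone sketch. It is not the actual Flink code: {{MarkerQueueSketch}}, its string-typed {{MARKER}}, the {{idempotentCancel}} flag and {{nextBatch()}} are all hypothetical stand-ins. The sketch assumes that the fetch loop drains the queue into a list and strips the marker with a single {{List#remove(Object)}} call, which only removes the first occurrence. Under that assumption, a second marker would survive and be treated like a real partition. An {{AtomicBoolean}} guard is one possible way to make {{cancel()}} idempotent; it is shown only as an option, not as the agreed fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the fetcher; NOT Flink's Kafka08Fetcher. */
class MarkerQueueSketch {
    /** Sentinel entry, modelled on the "n/a" / -1 values in the stack trace. */
    static final String MARKER = "n/a,-1";

    private final BlockingQueue<String> unassignedPartitionsQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private final boolean idempotentCancel;

    MarkerQueueSketch(boolean idempotentCancel) {
        this.idempotentCancel = idempotentCancel;
    }

    /** Enqueues the marker to wake up the fetch loop. */
    void cancel() {
        // With the guard enabled, only the first caller gets past this check.
        if (idempotentCancel && !cancelled.compareAndSet(false, true)) {
            return;
        }
        unassignedPartitionsQueue.add(MARKER);
    }

    /** Assumed fetch-loop step: drain the queue, then strip ONE marker. */
    List<String> nextBatch() {
        List<String> batch = new ArrayList<>();
        unassignedPartitionsQueue.drainTo(batch);
        batch.remove(MARKER); // List.remove(Object) removes the first occurrence only
        return batch;
    }

    public static void main(String[] args) {
        // Two cancel() calls, as if issued from two different threads.
        MarkerQueueSketch unguarded = new MarkerQueueSketch(false);
        unguarded.cancel();
        unguarded.cancel();
        System.out.println("unguarded: " + unguarded.nextBatch()); // prints "unguarded: [n/a,-1]"

        MarkerQueueSketch guarded = new MarkerQueueSketch(true);
        guarded.cancel();
        guarded.cancel();
        System.out.println("guarded: " + guarded.nextBatch()); // prints "guarded: []"
    }
}
```

In the unguarded case, the leftover marker is what a leader lookup would then be attempted for, which would match the {{topic='n/a', partition=-1}} in the exception above. Removing all marker occurrences from the batch would be an alternative to guarding {{cancel()}}.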



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)