Posted to issues@storm.apache.org by "Chen Zhiming (JIRA)" <ji...@apache.org> on 2017/03/03 05:19:45 UTC

[jira] [Created] (STORM-2394) KafkaSpout: Has no leader of partitions for a short time

Chen Zhiming created STORM-2394:
-----------------------------------

             Summary: KafkaSpout: Has no leader of partitions for a short time
                 Key: STORM-2394
                 URL: https://issues.apache.org/jira/browse/STORM-2394
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-kafka
    Affects Versions: 0.9.2-incubating, 0.9.3, 0.10.0, 0.9.3-rc2, 0.9.4, 1.0.0, 0.9.5, 0.9.6, 0.10.1, 2.0.0, 1.0.1, 0.10.2, 1.0.2, 1.1.0, 1.0.3, 1.x, 0.10.3, 1.0.4, 1.1.1
            Reporter: Chen Zhiming


In our case, the network failed for a short time, so some Kafka partitions had no leader.
The nextTuple method of KafkaSpout threw the exception "No leader found for partition 0" at the call "_coordinator.refresh();". The exception comes from the function getLeaderFor in DynamicBrokersReader.java. As a result, the spout hung.
The Kafka partitions recovered within a short time, but the spout could not recover from the failure. This problem has occurred several times on our servers. For example:
Feb 25 06:31:19 CST 2017, KafkaSpout threw the exception.
Feb 25 06:31:21 CST 2017, Kafka partitions recovered.
To make the spout more robust, I think "_coordinator.refresh();" could be retried several times, throwing the exception only after the last attempt fails. The spout is going to die anyway, so why not try one more time?
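
The proposal above could be sketched roughly as follows. This is only an illustration and not the storm-kafka code: the class RefreshRetry, the method runWithRetry, and the parameters maxAttempts and sleepMs are hypothetical names. It assumes Java 8 and that the "No leader found" failure surfaces as an unchecked exception.

```java
// Hypothetical helper, not part of storm-kafka: retries an action a bounded
// number of times and rethrows the last failure if every attempt fails.
public class RefreshRetry {

    public static void runWithRetry(Runnable action, int maxAttempts, long sleepMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return; // success, no further attempts needed
            } catch (RuntimeException e) {
                last = e; // remember the failure in case this was the last attempt
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(sleepMs); // give the partition leader time to come back
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt(); // preserve the interrupt status
                        break; // stop retrying and report the last failure
                    }
                }
            }
        }
        throw last; // all attempts failed (or the wait was interrupted)
    }
}
```

Under these assumptions, the call in nextTuple might become something like RefreshRetry.runWithRetry(() -> _coordinator.refresh(), 3, 1000L). With the timestamps above, where the partitions recovered about two seconds after the exception, a retry window of a few seconds would have been enough for the spout to continue.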



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)