Posted to issues@storm.apache.org by "lilizhi (JIRA)" <ji...@apache.org> on 2017/07/01 11:50:00 UTC

[jira] [Created] (STORM-2611) a batched kafkaspout with offsets in zookeeper

lilizhi created STORM-2611:
------------------------------

             Summary: a batched kafkaspout with offsets in zookeeper
                 Key: STORM-2611
                 URL: https://issues.apache.org/jira/browse/STORM-2611
             Project: Apache Storm
          Issue Type: Improvement
          Components: examples
    Affects Versions: 1.1.0
         Environment: Kafka, storm, zookeeper
            Reporter: lilizhi
            Priority: Trivial
             Fix For: 1.1.0


There are several issues with org.apache.storm.kafka.spout.KafkaSpout:
1. When the topology runs with multiple workers on different supervisors, consumer rebalances are triggered very often, so the stream is not stable and many lost tuples must be retransmitted.
2. When max.uncommitted.offsets is set below 200000 (to limit throughput), the spout sometimes deadlocks: the heartbeat between the spout and Kafka can no longer be performed.
3. When data flows from Storm to HBase, batching is used to improve write throughput, so emitting batches from spout to bolt is a better fit for this scenario.
4. A batched KafkaSpout and bolt that keep their offsets in ZooKeeper would therefore be valuable.
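The batching idea in points 3 and 4 can be sketched independently of Storm's APIs: accumulate Kafka records until a batch-size threshold is reached, emit the whole batch as one tuple, and remember the highest offset so it can be committed (e.g. written to a ZooKeeper node) once the batch is acked. A minimal sketch, with all class and method names hypothetical and not part of storm-kafka-client:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of the proposed batching logic (not part of Storm).
 * Records accumulate until batchSize is reached; the caller emits the
 * returned batch as a single tuple and, after the batch is acked,
 * commits lastOffset() to its offset store (e.g. a ZooKeeper znode).
 */
public class BatchAccumulator {
    private final int batchSize;
    private List<String> buffer = new ArrayList<>();
    private long lastOffset = -1;

    public BatchAccumulator(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Add one Kafka record; returns a full batch, or null while still filling. */
    public List<String> add(long offset, String value) {
        buffer.add(value);
        lastOffset = offset;            // highest offset seen so far
        if (buffer.size() < batchSize) {
            return null;
        }
        List<String> batch = buffer;    // hand the full batch to the caller
        buffer = new ArrayList<>();
        return batch;
    }

    /** Offset to commit after the last emitted batch has been acked. */
    public long lastOffset() {
        return lastOffset;
    }
}
```

Because the spout only commits lastOffset() after the downstream bolt acks the batch, a crash replays at most one uncommitted batch, which is the usual at-least-once trade-off for batched emission.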



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)