Posted to issues@storm.apache.org by "Erik Weathers (JIRA)" <ji...@apache.org> on 2017/12/14 00:56:00 UTC
[jira] [Updated] (STORM-2296) Kafka spout - no duplicates on topic leader changes
[ https://issues.apache.org/jira/browse/STORM-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Weathers updated STORM-2296:
---------------------------------
Summary: Kafka spout - no duplicates on topic leader changes (was: Kafka spout - no duplicates on leader changes)
> Kafka spout - no duplicates on topic leader changes
> ---------------------------------------------------
>
> Key: STORM-2296
> URL: https://issues.apache.org/jira/browse/STORM-2296
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-kafka
> Affects Versions: 1.0.2
> Reporter: Ernestas Vaiciukevičius
> Assignee: Ernestas Vaiciukevičius
> Fix For: 2.0.0, 1.1.0, 1.0.4
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> The Kafka spout currently emits duplicate tuples whenever a Kafka topic's leader changes.
> When a leader change causes an exception, the PartitionManagers are simply recreated, losing the state that tracks which tuples were already emitted, so the new PartitionManager emits them again.
> This is acceptable, since the at-least-once guarantee still holds, but it would be better not to emit duplicate data where possible.
> Moreover, this could be easily avoided by moving the state related to emitted tuples from the old PartitionManager to the new one.
> Pull requests implementing this:
> 1.0.x-branch - https://github.com/apache/storm/pull/1873
> 1.x-branch - https://github.com/apache/storm/pull/1888
> Pull request for related bugfix: https://github.com/apache/storm/pull/1940
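The state hand-off described in the issue can be illustrated with a rough sketch. This is not the code from the linked pull requests and does not use the real storm-kafka classes: SketchPartitionManager, its fields, and its methods are hypothetical stand-ins, and offsets are simulated as plain numbers instead of being fetched from Kafka. It only shows the idea of a recreated manager inheriting the emitted-offset state from its predecessor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/**
 * Hypothetical stand-in for a per-partition manager. It is NOT the real
 * storm-kafka PartitionManager; it only models the state that matters here.
 */
class SketchPartitionManager {
    final int partition;
    // Next offset to emit (simulated; a real spout would fetch from Kafka).
    long emittedToOffset;
    // Offsets that were emitted but not yet acked.
    final SortedSet<Long> pending = new TreeSet<>();

    /**
     * @param committedOffset offset to resume from when no previous state exists
     * @param previous        manager being replaced after a leader change, or null
     */
    SketchPartitionManager(int partition, long committedOffset, SketchPartitionManager previous) {
        this.partition = partition;
        if (previous != null && previous.partition == partition) {
            // Carry the emitted/pending state over, so already emitted
            // offsets are not emitted a second time.
            this.emittedToOffset = previous.emittedToOffset;
            this.pending.addAll(previous.pending);
        } else {
            // No state to inherit: restart from the last committed offset.
            this.emittedToOffset = committedOffset;
        }
    }

    /** Emits the next maxMessages offsets and records them as pending. */
    List<Long> emitBatch(int maxMessages) {
        List<Long> emitted = new ArrayList<>();
        for (int i = 0; i < maxMessages; i++) {
            long offset = emittedToOffset++;
            pending.add(offset);
            emitted.add(offset);
        }
        return emitted;
    }

    /** Marks an emitted offset as fully processed. */
    void ack(long offset) {
        pending.remove(offset);
    }

    /** Lowest offset that is not yet acked; this is what would be committed. */
    long committedOffset() {
        return pending.isEmpty() ? emittedToOffset : pending.first();
    }
}
```

In this sketch, a manager recreated with previous set to null resumes from the committed offset and emits the still-pending offsets again, while one constructed with the old manager continues from where emission stopped. Both variants keep the at-least-once guarantee in the model, because the pending offsets remain tracked until they are acked.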