You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by "Adrian Seungjin Lee (JIRA)" <ji...@apache.org> on 2015/01/09 02:09:35 UTC

[jira] [Updated] (STORM-618) Add spoutconfig option to make kafka spout process messages at most once.

     [ https://issues.apache.org/jira/browse/STORM-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Seungjin Lee updated STORM-618:
--------------------------------------
    Summary:  Add spoutconfig option to make kafka spout process messages at most once.  (was: Kafka spout should provide optional way to implement at-most once semantic)

>  Add spoutconfig option to make kafka spout process messages at most once.
> --------------------------------------------------------------------------
>
>                 Key: STORM-618
>                 URL: https://issues.apache.org/jira/browse/STORM-618
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-kafka
>    Affects Versions: 0.9.3
>            Reporter: Adrian Seungjin Lee
>
> While it's nice for kafka spout to push failed tuple back into a sorted set and try to process it again, this way of guaranteed message processing sometimes makes situation pretty bad when a failed tuple repeatedly fails in downstream bolts since PartitionManager#fill method tries to fetch from that offset repeatedly.
> This is a corresponding code snippet.
>     private void fill() {
> ...
>         if (had_failed) {
>             offset = failed.first();
>         } else {
>             offset = _emittedToOffset;
>         }
> ...
>             msgs = KafkaUtils.fetchMessages(_spoutConfig, _consumer, _partition, offset);
> ...
> So there should be an option for a developer to decide if he wants to process failed tuple again or just skip failed tuple. One of the best thing of Storm is that spout together with trident can be implemented to guarantee at-least-once,exactly-once and at-most-once message processing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)