Posted to issues@storm.apache.org by "Stig Rohde Døssing (JIRA)" <ji...@apache.org> on 2018/01/10 18:41:01 UTC

[jira] [Commented] (STORM-2896) Support automatic migration of offsets from storm-kafka to storm-kafka-client KafkaSpout

    [ https://issues.apache.org/jira/browse/STORM-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320817#comment-16320817 ] 

Stig Rohde Døssing commented on STORM-2896:
-------------------------------------------

I'm happy to work on implementing this, but it probably won't be for a while, so if anyone else is up for it, please take the issue.

> Support automatic migration of offsets from storm-kafka to storm-kafka-client KafkaSpout
> ----------------------------------------------------------------------------------------
>
>                 Key: STORM-2896
>                 URL: https://issues.apache.org/jira/browse/STORM-2896
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka-client
>    Affects Versions: 2.0.0, 1.2.0
>            Reporter: Stig Rohde Døssing
>
> I think we can ease migration for people looking to move from storm-kafka to storm-kafka-client. We should be able to support migrating offsets from the old spout by setting some extra configuration in KafkaSpoutConfig, and by adding a new FirstPollOffsetStrategy (for example FirstPollOffsetStrategy.UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA).
> The old spout stores offsets in Storm's Zookeeper at one of two paths. To locate them, we'll need two parameters from the storm-kafka SpoutConfig, namely zkRoot and id. In addition, we need to know whether the storm-kafka subscription was a wildcard subscription.
> The zookeeper path for commit info is 
> {code}
> zkRoot + "/" + id + "/" + topicName + "/" + "partition_" + partition
> {code}
> if the subscription was a wildcard. Otherwise it is 
> {code}
> zkRoot + "/" + id + "/" + "partition_" + partition
> {code}
> We can get topicName and partition numbers from KafkaConsumer.assignment(). When initialize runs with the strategy set to UNCOMMITTED_MIGRATE_FROM_STORM_KAFKA, we should be able to read the old offset structure from Zookeeper and seek the consumer to those offsets. If the offsets are not present, we can just crash.
> I'd be okay with this feature not being permanent, but I think this feature would make it a lot easier for people to move off the old spout.

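The two Zookeeper path layouts described in the issue can be sketched as a small helper. This is only an illustration and not code from either spout: the class name, method name, and sample values are hypothetical. It also assumes that a "/" separates the topic name from the "partition_" segment in the wildcard case, which should be verified against the actual znodes before relying on it.

```java
final class StormKafkaOffsetPaths {

    /**
     * Builds the Zookeeper path where the old storm-kafka spout is described
     * as storing commit info for one partition.
     *
     * @param zkRoot    zkRoot from the old storm-kafka SpoutConfig
     * @param id        id from the old storm-kafka SpoutConfig
     * @param topicName topic of the partition (only used for wildcard subscriptions)
     * @param partition partition number
     * @param wildcard  whether the old subscription was a wildcard subscription
     */
    public static String commitPath(String zkRoot, String id, String topicName,
                                    int partition, boolean wildcard) {
        String base = zkRoot + "/" + id + "/";
        if (wildcard) {
            // Assumption: the topic name and "partition_" are separated by "/".
            return base + topicName + "/" + "partition_" + partition;
        }
        // Non-wildcard subscriptions do not include the topic name in the path.
        return base + "partition_" + partition;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, for illustration only.
        System.out.println(commitPath("/kafka-offsets", "my-spout", "events", 3, true));
        // prints "/kafka-offsets/my-spout/events/partition_3"
        System.out.println(commitPath("/kafka-offsets", "my-spout", "events", 3, false));
        // prints "/kafka-offsets/my-spout/partition_3"
    }
}
```

In a migration path, the topic name and partition number would come from each TopicPartition in the consumer's assignment, and the resulting string would be the znode to read the old offset from.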


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)