Posted to dev@storm.apache.org by srishtyagrawal <gi...@git.apache.org> on 2018/04/19 00:44:07 UTC

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

GitHub user srishtyagrawal opened a pull request:

    https://github.com/apache/storm/pull/2637

    Map of Spout configurations from `storm-kafka` to `storm-kafka-client`

    As per @srdo's and @ptgoetz's replies on the Storm Dev mailing list, I am adding the spout configuration map to the `storm-kafka-client` document.
    [The gist](https://gist.github.com/srishtyagrawal/850b0c3f661cf3c620c27f314791224b), with the initial changes, had comments from @srdo and questions from me, which I am pasting here for convenience:
    
    Last comment by @srdo:
    Thanks, I think this is nearly there. The maxOffsetBehind section says that "If a failing tuple's offset is less than maxOffsetBehind, the spout stops retrying the tuple.". Shouldn't it be more than? i.e. if the latest offset is 100, and you set maxOffsetBehind to 50, and then offset 30 fails, 30 is more than maxOffsetBehind behind the latest offset, so it is not retried.
    Regarding the links, I think we should try to use links that automatically point at the right release. There's some documentation about it here https://github.com/apache/storm-site#how-release-specific-docs-work, and example usage "The allowed values are listed in the FirstPollOffsetStrategy javadocs" (from https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md). It would be great if you fix any broken links you find, or any links that are hard coded to point at a specific release.
    
    
    My reply:
    I copied the [maxOffsetBehind documentation](https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_storm-component-guide/content/storm-kafkaspout-config-core.html) from there. It is confusing because, in your earlier example, the value 30 itself is less than 100-50, but I like the idea of adding "behind" to make it clearer. As there is more than one scenario where maxOffsetBehind is used, I have modified the documentation to specify the fail scenario as an example.
    Thanks for the documentation on links; I will fix all the existing links, including the ones which are currently broken, in the storm-kafka-client documentation.
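
    For readers following the maxOffsetBehind discussion above, the cutoff can be sketched as follows. This is a hypothetical simplification of the fail path, not the actual storm-kafka implementation; the method name is illustrative:

    ```java
    // Sketch of the maxOffsetBehind retry cutoff discussed above. A failed
    // tuple is retried only while it is no more than maxOffsetBehind offsets
    // behind the latest offset; otherwise the spout stops retrying it.
    public class MaxOffsetBehindSketch {

        static boolean shouldRetry(long latestOffset, long failedOffset, long maxOffsetBehind) {
            return latestOffset - failedOffset <= maxOffsetBehind;
        }

        public static void main(String[] args) {
            // srdo's example: latest = 100, maxOffsetBehind = 50.
            // Offset 30 is 70 behind the latest offset, so it is not retried.
            System.out.println(shouldRetry(100, 30, 50)); // false
            // Offset 60 is only 40 behind, so it is still retried.
            System.out.println(shouldRetry(100, 60, 50)); // true
        }
    }
    ```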
    
    Question:
    It seems that none of the release-related links in [storm-kafka-client.md](https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md) work. I looked at other docs as well, for example [Hooks.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Hooks.md), [Concepts.md](https://github.com/apache/storm/blob/09e01231cc427004bab475c9c70f21fa79cfedef/docs/Concepts.md), [Configuration.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Configuration.md), and [Common-patterns.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Common-patterns.md) (the first 4 documents I checked for relative links), where these links gave a 404. I have yet to figure out why these links don't work.
    
    
     
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srishtyagrawal/storm migrateSpoutConfigs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2637.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2637
    
----
commit 2ca4fc851c17e1cb8a4208fe5cb0c3916551080b
Author: Srishty Agrawal <sa...@...>
Date:   2018-04-19T00:13:57Z

    Map of Spout configurations from storm-kafka to storm-kafka-client

----


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186570995
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
    +[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html) 
    +and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
    +
    +| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
    --- End diff --
    
    Addressed.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186570927
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    --- End diff --
    
    Addressed.


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srishtyagrawal Thanks. +1. If you squash the commits I'll merge to master and 1.x-branch. 


---

[GitHub] storm pull request #2637: STORM-3060: Map of Spout configurations from storm...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2637


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185625951
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Kafka config:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    The only reason I explicitly put javadocs links was that there was no way to point to a specific row/configuration in the table. If you think that is not required, I will change all the config links to point to https://kafka.apache.org/10/documentation.html#newconsumerconfigs and also remove the descriptions and the default values.
    @srdo in that case I will explicitly mention that the default value for `auto.offset.reset` is not the same as the one mentioned in the table.
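
    For readers following along, the default difference being discussed can be sketched as follows. This is illustrative only, based on the table row above (`auto.offset.reset` defaults to `latest`, except `earliest` when the `ProcessingGuarantee` is `AT_LEAST_ONCE`); it is not the actual storm-kafka-client code:

    ```java
    // Sketch of the auto.offset.reset default difference noted above: the
    // Kafka consumer default is "latest", but storm-kafka-client uses
    // "earliest" when the processing guarantee is AT_LEAST_ONCE.
    public class AutoOffsetResetDefaultSketch {

        static String defaultAutoOffsetReset(String processingGuarantee) {
            return "AT_LEAST_ONCE".equals(processingGuarantee) ? "earliest" : "latest";
        }

        public static void main(String[] args) {
            System.out.println(defaultAutoOffsetReset("AT_LEAST_ONCE")); // earliest
            System.out.println(defaultAutoOffsetReset("AT_MOST_ONCE"));  // latest
        }
    }
    ```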


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r182955561
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,25 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> `forceFromStart` <br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine where consumer offsets are being read from, ZooKeeper, beginning, or end of Kafka stream | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to this gist](https://gist.github.com/srishtyagrawal/e8f512b2b7f61f239b1bb5dc15d2437b) for choosing the right `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    --- End diff --
    
    Nit: The phrasing for the old consumer properties is a little awkward. How about "`startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored"
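
    The interaction described in this suggested wording can be sketched as a small decision function. This is a hypothetical illustration of how the old `startOffsetTime`/`forceFromStart` pair maps onto a `FirstPollOffsetStrategy`; the mapping shown is an assumption drawn from this thread, not the definitive helper table:

    ```java
    // Sketch of the startOffsetTime/forceFromStart -> FirstPollOffsetStrategy
    // mapping discussed above. earliestTime stands for "startOffsetTime is set
    // to the earliest timestamp"; the returned strings are enum constant names.
    public class FirstPollOffsetStrategySketch {

        static String firstPollOffsetStrategy(boolean forceFromStart, boolean earliestTime) {
            if (forceFromStart) {
                // Committed (Zookeeper) offsets are ignored; start from the
                // position implied by startOffsetTime.
                return earliestTime ? "EARLIEST" : "LATEST";
            }
            // Committed offsets are honored; startOffsetTime only applies
            // when there is no committed offset.
            return earliestTime ? "UNCOMMITTED_EARLIEST" : "UNCOMMITTED_LATEST";
        }

        public static void main(String[] args) {
            System.out.println(firstPollOffsetStrategy(false, true)); // UNCOMMITTED_EARLIEST
            System.out.println(firstPollOffsetStrategy(true, false)); // LATEST
        }
    }
    ```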


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srishtyagrawal I noticed that too, I think we can fix it by copying a bit of CSS from Github https://github.com/apache/storm-site/pull/5


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srishtyagrawal Thank you for addressing the code review. It is much better now. Besides my two comments above, I still wonder if it would be better to point the links describing the Kafka properties to the top of the [New Consumer Configs](http://kafka.apache.org/10/documentation.html#newconsumerconfigs) table. The Javadocs have no description of the properties and are basically the names of the properties written in capital letters and underscores, which won't be very helpful to the user. It's better to point to the table; the user will then know to search for the name of the property there.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185174572
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Kafka config:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    Keep in mind that our default for this setting is different from the default for the KafkaConsumer.


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srdo thanks for taking care of this!


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185631774
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    --- End diff --
    
    Good point @hmcl. I am renaming the heading to "Translation from `storm-kafka` to `storm-kafka-client` spout properties".
    @srdo I will submit a PR to link this table in the `storm-kafka-migration` README.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185168669
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Kafka config:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    Wouldn't it be better to simply put the link to the Kafka documentation table entry for auto.offset.reset? I would suggest the same for the other Kafka properties.
    
    https://kafka.apache.org/10/documentation.html#newconsumerconfigs
    
    This documentation changes quite often (e.g. this property changed since 1.0.x) and quoting Kafka docs into Storm docs will be hard to maintain.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185161496
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpout
 Config-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    --- End diff --
    
    Are the "Import package:" entries throughout necessary? The ConsumerConfig strings all come from the same Kafka package, and the KafkaSpoutConfig configurations already need to have the package imported when the set* method is called in the code.
    
    It seems most "Usage:" web links are broken. Wouldn't it be better to simply paste the method signature and put the link for that same signature?


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185163739
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    --- End diff --
    
    Do you really mean migration, or rather some sort of parallel between the names and meanings of the properties in storm-kafka vs storm-kafka-client? "Migration" may imply that there is a way to migrate, whereas this isn't really a migration, but rather a way to specify the same behavior in the old and new spout.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r183161344
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,25 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> `forceFromStart` <br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine where consumer offsets are being read from, ZooKeeper, beginning, or end of Kafka stream | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to this gist](https://gist.github.com/srishtyagrawal/e8f512b2b7f61f239b1bb5dc15d2437b) for choosing the right `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffse
 tStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    --- End diff --
    
    @srdo thanks for the comments, I have addressed both of them.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185160807
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpout
 Config-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    --- End diff --
    
    Instead of "Discontinued" I would put **N/A**.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186582501
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
    +[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html) 
    +and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
    +
    +| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
    +| ------------- | ------------------------------------ | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.htm
 l#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | **Setting:** [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Setting:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Setting:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | **N/A** ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Setting:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    @hmcl these changes were reviewed by @srdo [in a gist](https://gist.github.com/srishtyagrawal/850b0c3f661cf3c620c27f314791224b) before they were submitted as this PR. Pasting his explanation about `auto.offset.reset`'s default value:
    
    The auto.offset.reset setting default depends on the picked ProcessingGuarantee. See https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java#L471. We made this change because defaulting to "latest" is a bit of a trap to leave in place for people who want at-least-once processing. We default to "earliest" if the ProcessingGuarantee is at-least-once, "latest" otherwise. I get that even with "earliest", we can't offer at-least-once if the messages get deleted in Kafka, but I figured it was better to skip as few messages as possible.
    
    I have made the explanation for `auto.offset.reset`'s default value more verbose so that it is clearer.
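
    The default-selection behavior described above can be sketched as follows. This is an illustrative stand-alone sketch only (the enum here mirrors `KafkaSpoutConfig.ProcessingGuarantee` but is redeclared locally); the real logic lives inside `KafkaSpoutConfig`:

    ```java
    public class AutoOffsetResetDefaultExample {
        // Local stand-in for org.apache.storm.kafka.spout.KafkaSpoutConfig.ProcessingGuarantee
        enum ProcessingGuarantee { AT_LEAST_ONCE, AT_MOST_ONCE, NO_GUARANTEE }

        // Sketch of the described default: "earliest" for at-least-once, so as few
        // messages as possible are skipped on reset; "latest" otherwise.
        static String defaultAutoOffsetReset(ProcessingGuarantee guarantee) {
            return guarantee == ProcessingGuarantee.AT_LEAST_ONCE ? "earliest" : "latest";
        }

        public static void main(String[] args) {
            System.out.println(defaultAutoOffsetReset(ProcessingGuarantee.AT_LEAST_ONCE)); // earliest
            System.out.println(defaultAutoOffsetReset(ProcessingGuarantee.NO_GUARANTEE));  // latest
        }
    }
    ```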


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186253597
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
    +[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html) 
    +and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
    +
    +| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
    +| ------------- | ------------------------------------ | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.htm
 l#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | **Setting:** [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Setting:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Setting:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | **N/A** ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Setting:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    I am having a difficult time understanding what you mean by: "_Exception: earliest if ProcessingGuarantee is set to AT_LEAST_ONCE_"


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srdo thanks for explaining that. I was looking at the GitHub file links, which is why they were giving 404s.
    
    I have also modified the `socketTimeoutMs` entry in the map to show that it is not supported in `storm-kafka-client`. The previous translation was taken from [Kafka's ConsumerConfig.scala](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/consumer/ConsumerConfig.scala#L30), but it looks like the Scala file is deprecated; also, this configuration is only present in the [old Kafka consumer configs table](https://kafka.apache.org/documentation/#oldconsumerconfigs) and not in the [new Kafka consumer configs table](https://kafka.apache.org/documentation/#newconsumerconfigs).
    
    Per your suggestion, the Storm links have been modified to be generic, hence not tied to a particular release version. However, the links to Kafka properties are still release-specific (latest release: 1.1), because the kafka-site repo doesn't offer release-agnostic links.
    
    Let me know if the table looks fine; I will then publish it to the Storm docs.



---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    I read the documentation for maxOffsetBehind you linked, and I think it is confusingly phrased. It says 
    
    > If a failing tuple's offset is less than maxOffsetBehind, the spout stops retrying the tuple
    
    and what they mean is 
    
    > If a failing tuple's offset is farther behind the current offset than maxOffsetBehind (i.e. if `currentOffset - failingOffset > maxOffsetBehind`), the spout stops retrying the tuple
    
    I misread the description as meaning
    
    > If a failing tuple's offset is less than maxOffsetBehind behind the current offset (i.e. if `currentOffset - failingOffset < maxOffsetBehind`), the spout stops retrying the tuple
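
    The intended reading differs from the misreading only in the direction of the comparison. As a sketch of the intended semantics (hypothetical helper name, not part of the storm-kafka API):

    ```java
    public class MaxOffsetBehindExample {
        // Intended storm-kafka semantics: a failed tuple is retried only while it
        // is no farther behind the current offset than maxOffsetBehind.
        static boolean shouldRetry(long currentOffset, long failingOffset, long maxOffsetBehind) {
            return currentOffset - failingOffset <= maxOffsetBehind;
        }

        public static void main(String[] args) {
            // Latest offset 100, maxOffsetBehind 50:
            System.out.println(shouldRetry(100, 30, 50)); // false: 100 - 30 = 70 > 50, tuple is dropped
            System.out.println(shouldRetry(100, 60, 50)); // true:  100 - 60 = 40 <= 50, tuple is retried
        }
    }
    ```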
    
    The links are pointing into the javadocs directory, which is not checked into git. The javadocs directory is supposed to be created when someone publishes release documentation to the site. See the documentation at https://github.com/apache/storm-site#adding-a-new-release-to-the-website for how to generate the javadoc.
    
    I'd expect the links to work if you check out the repo and generate the javadoc.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185167619
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    --- End diff --
    
    The Storm version is already specified above and could be omitted from the table. If possible, I would suggest presenting the table as:
    ```
    |     SpoutConfig       |     KafkaSpoutConfig   |  
    ---------------------------------------------------
    | prop | desc | default | prop  | desc | default |
    ```


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185609727
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    I have substituted `Kafka Config` with `Setting`; this is because I did not succeed in restructuring the table as suggested in an earlier comment. I have also changed the heading from `KafkaSpoutConfig Name` to `KafkaSpoutConfig/ConsumerConfig Name` because the column is a mix of both kinds of configs.
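
    For readers following along, the helper table's `(startOffsetTime, forceFromStart)` to `FirstPollOffsetStrategy` mapping can be sketched in plain Java. This is illustrative only: `OffsetStrategyMapper` is a hypothetical helper, not part of `storm-kafka-client`, and it returns the strategy names as strings so the sketch compiles without Storm on the classpath.

    ```java
    // Hypothetical sketch of the helper table's mapping from the old
    // storm-kafka pair (startOffsetTime, forceFromStart) to the name of the
    // storm-kafka-client FirstPollOffsetStrategy. Not a real Storm class.
    class OffsetStrategyMapper {
        static String toFirstPollOffsetStrategy(String startOffsetTime, boolean forceFromStart) {
            boolean earliest = "EarliestTime".equals(startOffsetTime);
            if (forceFromStart) {
                // Committed (Zookeeper) offset is ignored: always start from the chosen end
                return earliest ? "EARLIEST" : "LATEST";
            }
            // Committed offset is honored; startOffsetTime applies only when none exists
            return earliest ? "UNCOMMITTED_EARLIEST" : "UNCOMMITTED_LATEST";
        }
    }
    ```

    The string returned here corresponds to the enum constant you would pass to `setFirstPollOffsetStrategy` in real migration code.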


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185606593
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    --- End diff --
    
    Addressed.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185623513
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    In the [FetchRequest Kafka code](https://github.com/apache/kafka/blob/0.10.1.0/core/src/main/scala/kafka/api/FetchRequest.scala#L53-L61), `fetchSize` and `maxBytes` are different variables. Does `val fetchSize = buffer.getInt` mean that `fetchSize` will fetch as many bytes as fit in the buffer, and hence that putting `max` in `fetchSizeBytes`'s description is appropriate? Just trying to understand this parameter better before we make the change.
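
    As a side note for migrators, the `ConsumerConfig` constants used in the table are just named wrappers around plain string keys, so the migrated fetch settings can be sketched with `java.util.Properties` alone. This is illustrative only; the string keys shown are the values behind `ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG` and `ConsumerConfig.RECEIVE_BUFFER_CONFIG`, written out literally so the example runs without the Kafka client jar, and the 1 MB values mirror the defaults listed in the table.

    ```java
    import java.util.Properties;

    // Illustrative sketch: consumer properties equivalent to the setProp calls
    // in the migration table, using literal string keys instead of the
    // ConsumerConfig constants so no Kafka dependency is needed.
    class MigratedFetchProps {
        static Properties build() {
            Properties props = new Properties();
            // storm-kafka fetchSizeBytes -> ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG
            props.setProperty("max.partition.fetch.bytes", String.valueOf(1024 * 1024));
            // storm-kafka bufferSizeBytes -> ConsumerConfig.RECEIVE_BUFFER_CONFIG
            // (TCP SO_RCVBUF size; -1 means use the OS default)
            props.setProperty("receive.buffer.bytes", String.valueOf(1024 * 1024));
            return props;
        }
    }
    ```

    In real migration code you would pass the same key/value pairs through `<KafkaSpoutConfig-Builder>.setProp(...)` rather than a raw `Properties` object.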


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185644653
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    --- End diff --
    
    @srdo I suggested the other way because the name `storm-kafka-migration` suggests that the  module will help with migrating storm-kafka, but either way is fine. I can add the link to `storm-kafka-migration` here.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186572103
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
    +[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html) 
    +and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
    +
    +| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
    +| ------------- | ------------------------------------ | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | **Setting:** [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Setting:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Setting:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | **N/A** ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Setting:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchMaxWait`<br><br> Maximum time in ms to wait for the response <br> **Default:** `10000` | **Setting:** [`fetch.max.wait.ms`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MAX_WAIT_MS_CONFIG) <br><br> **Default:** `500` | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `minFetchByte` | **Setting:** [`fetch.min.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MIN_BYTES_CONFIG) <br><br> **Default:** `1`| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    @hmcl While checking the default value I realized that `minFetchByte` was added in Storm 1.x and is absent in Storm 0.9.6 (against which I am comparing in this PR), so I have removed that entry from the table; the mapping is straightforward anyway: `minFetchByte` --> `fetch.min.bytes`. Thanks for catching this!
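
    Related to the `auto.offset.reset` row quoted above, which accepts only three string values: a migration script might validate the value before calling `setProp`. The sketch below is hypothetical (`OffsetResetValidator` is an invented name, not a Storm or Kafka class) and simply encodes the allowed values from the table.

    ```java
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical pre-flight check for auto.offset.reset values. The old
    // storm-kafka/Kafka 0.8 values ("smallest"/"largest") are not accepted
    // by the new consumer, which expects "earliest", "latest", or "none".
    class OffsetResetValidator {
        private static final List<String> VALID = Arrays.asList("earliest", "latest", "none");

        static boolean isValid(String value) {
            return VALID.contains(value);
        }
    }
    ```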


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    +1
    @srishtyagrawal thank you for your nice and helpful contribution. It will benefit a lot of users.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r182955679
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,25 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> `forceFromStart` <br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine where consumer offsets are being read from, ZooKeeper, beginning, or end of Kafka stream | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to this gist](https://gist.github.com/srishtyagrawal/e8f512b2b7f61f239b1bb5dc15d2437b) for choosing the right `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    --- End diff --
    
    I think we should include the FirstPollOffset table in this file as well


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @hmcl thanks for the comments. I have edited the Kafka property links to point to the [Kafka New Consumer Configs table](http://kafka.apache.org/10/documentation.html#newconsumerconfigs), and have removed the descriptions and defaults for those configs (they are included in the new link).
    
    I have created a JIRA ticket, [STORM-3060](https://issues.apache.org/jira/browse/STORM-3060), and added it to the PR and the commit title. 
    
    The commits have been squashed and the PR is ready to be merged.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185643262
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    --- End diff --
    
    Addressed.


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srdo thanks a lot for publishing the docs. I see the changes, but unfortunately the table is badly formatted. I will look into improving it when I next have time.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186253569
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
    +[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html) 
    +and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
    +
    +| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
    --- End diff --
    
    I suggest removing Name as ConsumerConfig is very explanatory


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185168807
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Kafka config:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchMaxWait`<br><br> Maximum time in ms to wait for the response <br> **Default:** `10000` | **Kafka config:** [`fetch.max.wait.ms`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MAX_WAIT_MS_CONFIG) <br><br> **Default:** `500` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `minFetchByte` | **Kafka config:** [`fetch.min.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MIN_BYTES_CONFIG) <br><br> **Default:** `1`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `maxOffsetBehind`<br><br> Specifies how long a spout attempts to retry the processing of a failed tuple. For example, if a failing tuple's offset is more than `maxOffsetBehind` behind the acked offset, the spout stops retrying the tuple.<br>**Default:** `Long.MAX_VALUE`| No equivalent in `storm-kafka-client` as of now||
    --- End diff --
    
    I suggest N/A instead of "No equivalent in storm-kafka-client as of now"
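
    To make the `setProp`/builder calls referenced throughout the quoted table concrete, a migrated spout configuration might be sketched as follows. This is only an illustration of the mappings in the table: the broker address, topic name, and chosen values are hypothetical, and the snippet requires `storm-kafka-client` and the Kafka client library on the classpath.

    ```java
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.storm.kafka.spout.KafkaSpout;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy;

    public class MigratedSpoutSketch {
        public static KafkaSpout<String, String> buildSpout() {
            KafkaSpoutConfig<String, String> conf = KafkaSpoutConfig
                .<String, String>builder("broker1:9092", "my-topic")   // hypothetical broker/topic
                // startOffsetTime + forceFromStart -> FirstPollOffsetStrategy (see helper table)
                .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST)
                // fetchSizeBytes -> max.partition.fetch.bytes
                .setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576)
                // bufferSizeBytes -> receive.buffer.bytes
                .setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 1048576)
                // fetchMaxWait -> fetch.max.wait.ms
                .setProp(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 10000)
                // minFetchByte -> fetch.min.bytes
                .setProp(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1)
                // useStartOffsetTimeIfOffsetOutOfRange -> auto.offset.reset
                .setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
                .build();
            return new KafkaSpout<>(conf);
        }
    }
    ```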


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srdo can you merge this if it looks ok?


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srdo squashed the commits.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185608174
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    --- End diff --
    
    Addressed. Removed the Storm version from the heading. @hmcl I had initially thought of structuring this the way you suggested above, but was unable to nest columns in markdown. Hence I dropped the idea.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186253611
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
    +[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html) 
    +and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
    +
    +| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
    +| ------------- | ------------------------------------ | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | **Setting:** [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Setting:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Setting:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | **N/A** ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Setting:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchMaxWait`<br><br> Maximum time in ms to wait for the response <br> **Default:** `10000` | **Setting:** [`fetch.max.wait.ms`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MAX_WAIT_MS_CONFIG) <br><br> **Default:** `500` | [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `minFetchByte` | **Setting:** [`fetch.min.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MIN_BYTES_CONFIG) <br><br> **Default:** `1`| [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    There is no default for _minFetchByte_?


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185174192
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    --- End diff --
    
    That's a good point. If users need to migrate a topology, they'll probably want to copy committed offsets as well. Maybe we should link to https://github.com/apache/storm/tree/master/external/storm-kafka-migration


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185168182
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    the (**max?**) number of bytes to attempt ... 


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @erikdw I am reviewing this now. Sorry but I was away the last few days.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185606682
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `bufferSizeBytes`<br><br> Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer <br> **Default:** `1MB`| **Kafka config:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG) <br><br> The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `socketTimeoutMs`<br><br> **Default:** `10000` | Discontinued in `storm-kafka-client` ||
    +| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange`<br><br> **Default:** `true` | **Kafka config:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)<br><br> **Possible values:** `"latest"`, `"earliest"`, `"none"`<br> **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) is set to `AT_LEAST_ONCE` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, <String>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchMaxWait`<br><br> Maximum time in ms to wait for the response <br> **Default:** `10000` | **Kafka config:** [`fetch.max.wait.ms`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MAX_WAIT_MS_CONFIG) <br><br> **Default:** `500` | **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `minFetchByte` | **Kafka config:** [`fetch.min.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#FETCH_MIN_BYTES_CONFIG) <br><br> **Default:** `1`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, <value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `maxOffsetBehind`<br><br> Specifies how long a spout attempts to retry the processing of a failed tuple. For example, if a failing tuple's offset is more than `maxOffsetBehind` behind the acked offset, the spout stops retrying the tuple.<br>**Default:** `Long.MAX_VALUE`| No equivalent in `storm-kafka-client` as of now||
    --- End diff --
    
    Addressed.


---

[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

Posted by erikdw <gi...@git.apache.org>.
Github user erikdw commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @hmcl : would you happen to have a chance (or desire) to skim through this PR Hugo?  I know that you contributed heavily to storm-kafka-client, so thought you might have some thoughts on this.  No worries if not.


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185168952
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpout
 Config-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    +| **Setting:** `scheme`<br><br> The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple <br>**Default:** `RawMultiScheme` | [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)<br><br> [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, <deserializer-class>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    +| **Setting:** `fetchSizeBytes`<br><br> Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server <br> **Default:** `1MB` | **Kafka config:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)<br><br> **Default:** `1MB`| **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` <br><br>  **Usage:** [`<KafkaSpoutConfig-Builder>.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, <int-value>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
    --- End diff --
    
    I suggest omitting "Kafka config:". The link is self-explanatory.
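
    To illustrate the `setProp` usage discussed in the diff above, a minimal sketch of configuring the replacements for `scheme` and `fetchSizeBytes` via plain Kafka consumer properties (the broker address, topic name, and class name are placeholders, not from the PR; assumes `storm-kafka-client` 1.x and `kafka-clients` on the classpath):

    ```java
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;

    public class SpoutConfigSketch {
        public static KafkaSpoutConfig<String, String> build() {
            // "localhost:9092" and "my-topic" are placeholders for illustration.
            return KafkaSpoutConfig.builder("localhost:9092", "my-topic")
                // Replaces storm-kafka's `scheme`: deserialization is configured
                // with ordinary Kafka consumer properties.
                .setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class)
                .setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class)
                // Replaces storm-kafka's `fetchSizeBytes` (default 1MB in both spouts).
                .setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576)
                .build();
        }
    }
    ```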


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185174341
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    +
    +This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
    +[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java) and
    +[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
    +`storm-kafka-client` spout configurations were taken from Storm 1.0.6
    +[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
    +
    +| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | KafkaSpoutConfig usage help |
    +| ------------------------- | ---------------------------- | --------------------------- |
    +| **Setting:** `startOffsetTime`<br><br> **Default:** `EarliestTime`<br>________________________________________________ <br> **Setting:** `forceFromStart` <br><br> **Default:** `false` <br><br> `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)<br><br> **Default:** `UNCOMMITTED_EARLIEST` <br><br> [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | **Import package:** `org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.<strategy-name>` <br><br> **Usage:** [`<KafkaSpoutConfig-Builder>.setFirstPollOffsetStrategy(<strategy-name>)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
    --- End diff --
    
    The links to javadoc are broken here, but should work once this document is put on the site. 
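
    For readers mapping `startOffsetTime`/`forceFromStart`, the table row quoted above translates into a builder call along these lines (a sketch only; the broker address and topic name are placeholders, not from the PR):

    ```java
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy;

    public class OffsetStrategySketch {
        public static KafkaSpoutConfig<String, String> build() {
            // UNCOMMITTED_EARLIEST roughly corresponds to the old storm-kafka
            // defaults (startOffsetTime=EarliestTime, forceFromStart=false):
            // resume from the committed offset if one exists, otherwise start
            // from the earliest available offset.
            return KafkaSpoutConfig.builder("localhost:9092", "my-topic")
                .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST)
                .build();
        }
    }
    ```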


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @srishtyagrawal We need to publish to https://github.com/apache/storm-site in order to update the live site. I'll try to get it done soon


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r185640531
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,37 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Migrating a `storm-kafka` spout to use `storm-kafka-client`
    --- End diff --
    
    @srishtyagrawal I meant the other way around, linking from here to the `storm-kafka-migration` README. Not sure why we'd want to link from `storm-kafka-migration` to here?


---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

Posted by hmcl <gi...@git.apache.org>.
Github user hmcl commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2637#discussion_r186253552
  
    --- Diff: docs/storm-kafka-client.md ---
    @@ -313,4 +313,39 @@ KafkaSpoutConfig<String, String> kafkaConf = KafkaSpoutConfig
       .setTupleTrackingEnforced(true)
     ```
     
    -Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    \ No newline at end of file
    +Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
    +
    +# Translation from `storm-kafka` to `storm-kafka-client` spout properties
    --- End diff --
    
    NIT: Rename translation to Mapping. This is not really a translation.


---

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

Posted by srishtyagrawal <gi...@git.apache.org>.
Github user srishtyagrawal commented on the issue:

    https://github.com/apache/storm/pull/2637
  
    @hmcl @srdo how often do we publish the docs for `2.0.0-SNAPSHOT`? Is it every time a document update is made to this repo? How do I request that this PR be published in the 2.0.0 Storm docs?
    
    Also currently the [2.0.0-SNAPSHOT storm-kafka-client documentation link](http://storm.apache.org/releases/2.0.0-SNAPSHOT/storm-kafka-client.html) is broken.


---