Posted to dev@storm.apache.org by harshach <gi...@git.apache.org> on 2015/03/31 01:12:49 UTC

[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

GitHub user harshach opened a pull request:

    https://github.com/apache/storm/pull/493

    STORM-563. Kafka Spout doesn't pick up from the beginning of the queue unless forceFromStart specified.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/harshach/incubator-storm STORM-563-V2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/493.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #493
    
----
commit d2e10346e58527496bccaeca7b6a1b1d924e7b99
Author: Sriharsha Chintalapani <ma...@harsha.io>
Date:   2015-03-30T23:11:22Z

    STORM-563. Kafka Spout doesn't pick up from the beginning of the queue unless forceFromStart specified.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-135783234
  
    @Renkai In this case, yes, it won't read from the ZK offsets, which is incorrect behavior. Can you file a JIRA on this?



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-90309348
  
    @ptgoetz I addressed your comments. Can you please take a look?



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-90313220
  
    It wouldn't hurt to expand on what `System.currentTimeMillis()` means in that context (i.e. if you have a specific time stored in epoch format, you can start from there).



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-135807506
  
    Thanks @Renkai



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on a diff in the pull request:

    https://github.com/apache/storm/pull/493#discussion_r27512587
  
    --- Diff: external/storm-kafka/README.md ---
    @@ -120,6 +120,23 @@ spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme());
     OpaqueTridentKafkaSpout spout = new OpaqueTridentKafkaSpout(spoutConf);
     ```
     
    +### How KafkaSpout stores offsets of a kafka topic and recovers incase of failures
    +
    +As shown in the above KafkaConfig properties , user can control where in the topic they can start reading by setting **KafkaConfig.startOffsetTime.**
    +
    +There are two options **kafka.api.OffsetRequest.EarliestTime()** which makes the KafkaSpout to read from the begining of the topic and 
    --- End diff --
    
    I would also document the actual values of `EarliestTime()` (`-2`) and `LatestTime()` (`-1`), and that it can also be set to a point in time (a la `System.currentTimeMillis()`).
    
    My reasoning behind documenting the values (as opposed to the kafka API constants) is that the start offset time is likely to be specified via configuration (i.e. outside java code). Either that, or add spout constants that would get evaluated to `EarliestTime()`/`LatestTime()` if for some reason those values were ever changed in the Kafka API -- that seems like a less "leaky" solution.
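
The "spout constants" idea in the comment above could look roughly like the following. This is a minimal sketch, not code from the patch: the class name `StartOffsetTimes`, the method `parse`, and the `"earliest"`/`"latest"` keywords are hypothetical. Only the values `-2` and `-1` come from the comment itself.

```java
// Hypothetical helper; not part of storm-kafka or the Kafka API.
public final class StartOffsetTimes {
    // Values quoted in the review comment for EarliestTime() and LatestTime().
    public static final long EARLIEST_TIME = -2L;
    public static final long LATEST_TIME = -1L;

    private StartOffsetTimes() {}

    /**
     * Turns a configuration string into a startOffsetTime value.
     * Accepts the assumed keywords "earliest" and "latest", or a number:
     * -2, -1, or a point in time in epoch milliseconds.
     */
    public static long parse(String value) {
        String v = value.trim();
        if (v.equalsIgnoreCase("earliest")) {
            return EARLIEST_TIME;
        }
        if (v.equalsIgnoreCase("latest")) {
            return LATEST_TIME;
        }
        // Throws NumberFormatException for anything that is not a long.
        return Long.parseLong(v);
    }

    public static void main(String[] args) {
        System.out.println(parse("earliest"));
        System.out.println(parse("latest"));
        // A point in time, in the same format as System.currentTimeMillis().
        System.out.println(parse(Long.toString(System.currentTimeMillis())));
    }
}
```

With a helper like this, a topology configured outside Java code can say `earliest` or `latest` and stay valid even if the numeric values behind the Kafka constants ever change, because only the two constants above would need updating.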
    




[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-135625738
  
    @Renkai ignoreZkOffsets is a rename of forceFromStart. So if you set ignoreZkOffsets, it will ignore the offsets already stored in ZooKeeper and start from the startOffsetTime.
    "How does the spout detect whether it is starting for the first time or recovering from a failure?"
    Can you explain a bit more about that?
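
The rule described in this comment can be modelled in a few lines. This is a sketch under stated assumptions, not the storm-kafka implementation: the class `StartOffsetChooser`, the method `choose`, and its parameters are hypothetical, and the fallback to startOffsetTime when ZooKeeper holds no offset is an assumption drawn from the pull request description.

```java
import java.util.OptionalLong;

// Hypothetical model of the rule described in this thread; not storm-kafka code.
public final class StartOffsetChooser {

    private StartOffsetChooser() {}

    /**
     * @param ignoreZkOffsets    the flag renamed from forceFromStart
     * @param committedZkOffset  the offset stored in ZooKeeper, if any
     * @param offsetForStartTime the offset that startOffsetTime resolves to
     * @return the offset the spout would start reading from
     */
    public static long choose(boolean ignoreZkOffsets,
                              OptionalLong committedZkOffset,
                              long offsetForStartTime) {
        // A stored offset is used only when the flag is off.
        if (!ignoreZkOffsets && committedZkOffset.isPresent()) {
            return committedZkOffset.getAsLong();
        }
        // Assumed fallback: flag set, or nothing stored in ZooKeeper yet.
        return offsetForStartTime;
    }

    public static void main(String[] args) {
        OptionalLong stored = OptionalLong.of(500L);
        long earliest = 0L; // stand-in for the offset EarliestTime() resolves to
        System.out.println(choose(false, stored, earliest));
        System.out.println(choose(true, stored, earliest));
    }
}
```

Nothing in this model tells a first start apart from a recovery: with the flag set, a stored offset is ignored in both cases, which is the situation the quoted question is asking about.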



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by Renkai <gi...@git.apache.org>.
Github user Renkai commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-135805891
  
    @harshach  Issue created at https://issues.apache.org/jira/browse/STORM-1017



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-88334297
  
    @ptgoetz updated the doc as per your suggestion.  Please take a look.



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/493



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by Parth-Brahmbhatt <gi...@git.apache.org>.
Github user Parth-Brahmbhatt commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-88197325
  
    :100: 



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by Renkai <gi...@git.apache.org>.
Github user Renkai commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-135604162
  
    If I set ignoreZkOffsets to true, will a spout recovering from a failure read from the ZK offsets or use startOffsetTime?
    How does the spout detect whether it is starting for the first time or recovering from a failure?



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-90320879
  
    That being said, I'm +1.



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by Renkai <gi...@git.apache.org>.
Github user Renkai commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-135636635
  
    Suppose ignoreZkOffsets is set to true and startOffsetTime = kafka.api.OffsetRequest.EarliestTime().
    `workers running` -> `topology shut down by the user and restarted` -> `workers will read from the earliest time again`
    `workers running` -> `one worker shuts down by accident and the supervisor restarts it` -> `what offset will the restarted worker read from?`



[GitHub] storm pull request: STORM-563. Kafka Spout doesn't pick up from th...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/493#issuecomment-87875138
  
    @ptgoetz @nathanmarz @revans2 Please take a look at the patch. I renamed forceFromStart to ignoreZkOffsets, and users can now configure where they want to start reading via startOffsetTime.

