You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by nielsbasjes <gi...@git.apache.org> on 2014/10/01 11:59:27 UTC

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

GitHub user nielsbasjes opened a pull request:

    https://github.com/apache/storm/pull/275

    STORM-512 KafkaBolt doesn't handle ticks properly

    I implemented this by 
    1.   Adding an 'boolean isTick()' method to the Tuple interface. 
           This way it is for application builders much easier to  determine if the tuple at hand is a Tick or not.
    2.   Letting the KafkaBolt use that method to ignore the ticks.
    3.   Updating the other place in the code base that handles ticks differently to use this new method.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nielsbasjes/storm STORM-512

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/275.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #275
    
----
commit 6c6bec9b1b2b07b5dfb5000297c11911eda5389d
Author: Niels Basjes <nb...@bol.com>
Date:   2014-10-01T09:50:48Z

    New method in the Tuple interface "boolean isTuple()" for easier handling of TickTuples.

commit 73e54a8f8b0ef9a62955c7c1cee20925d359772b
Author: Niels Basjes <nb...@bol.com>
Date:   2014-10-01T09:51:42Z

    KafkaBolt no longer tries to map/process/send Tick Tuples to Kafka.

commit 8888ae631360ad124af9c480d2f8b621b9c51727
Author: Niels Basjes <nb...@bol.com>
Date:   2014-10-01T09:53:17Z

    Use isTick in all relevant places to avoid code duplication.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-89306290
  
    @nathanmarz I merged this in, but if you feel that there is something wrong with it still I am happy to adjust it on a follow on JIRA, or revert it if it is truly critical.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-63349586
  
    OK I am wrong.  When I tried to compile with this change the following tests failed.
    
    ```
    Failed tests:   shouldEmitSomethingIfTickTupleIsReceived(storm.starter.bolt.IntermediateRankingsBoltTest): Index: 0, Size: 0
      shouldEmitNothingIfNoObjectHasBeenCountedYetAndTickTupleIsReceived(storm.starter.bolt.RollingCountBoltTest): 
      shouldEmitSomethingIfAtLeastOneObjectWasCountedAndTickTupleIsReceived(storm.starter.bolt.RollingCountBoltTest): 
      shouldEmitSomethingIfTickTupleIsReceived(storm.starter.bolt.TotalRankingsBoltTest)
    ```
    
    It looks like to me that the tick is being used somewhere else.  Why exactly do we not want the ticks?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-60935570
  
    The execute of the KafkaBolt puts the tuple into Kafka as a message.
    What I ran into is that the Ticks were put into Kafka as messages too. 
    My consumers of this Kafka queue got really upset (lots of deserialisation exceptions) over these ticks.
    
    I'll update the example code as you indicated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66971975
  
    @nathanmarz Thanks I understand your view now. 
    I refactored my patch to meet this requirement and I've tried to make it as small a difference as possible.
    
    This set of commits should really be squashed before committing to the master.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-63347222
  
    The changes look fine to me.  +1, @d2r in this case the bolt is taking its input and sending it to a kafka queue.  I don't think we want to send ticks to the kafka queue, as the ticks are internal to storm, and will not have the key/value that all other tuples have. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by d2r <gi...@git.apache.org>.
Github user d2r commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-60838468
  
    Looks reasonable to me, but I would like someone more familiar with kafka to comment on whether we never want to execute on tick tuples.
    
    We could move make the static method in [TupleHelpers](https://github.com/apache/storm/blob/5a460384f4eb90e4e3870f29090e81633aeafe7b/examples/storm-starter/src/jvm/storm/starter/util/TupleHelpers.java#L29-L30) a member method in TupleImpl as you have done, remove TupleHelpers, and then change the example code to use this method.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/275


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-74696542
  
    @nathanmarz / @revans2 What must I do/change to get this solution committed? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nathanmarz <gi...@git.apache.org>.
Github user nathanmarz commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-94068063
  
    I didn't get a notification that you were pinging me – but fwiw I'm +1 on it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-69375242
  
    The changes look find to me I am +1 for merging this in.
    
    @nathanmarz I would like to get your opinion on this before merging it in, because you had the original reservations about the patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by d2r <gi...@git.apache.org>.
Github user d2r commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66666316
  
    @nielsbasjes Yeah that fixed it.  I see all tests passing now. Thank you! +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by miguno <gi...@git.apache.org>.
Github user miguno commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-93694857
  
    I created [STORM-786](https://issues.apache.org/jira/browse/STORM-786) to track the tick tuple acking.  Pull request is already sent.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by d2r <gi...@git.apache.org>.
Github user d2r commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66294912
  
    The explanation makes sense.  I still see some test errors:
    ```
    java.lang.NullPointerException
        at backtype.storm.tuple.TupleImpl.isTick(TupleImpl.java:218)
        at storm.kafka.bolt.KafkaBolt.execute(KafkaBolt.java:92)
        at storm.kafka.bolt.KafkaBoltTest.executeWithBrokerDown(KafkaBoltTest.java:145)
    ```
    ```
    java.lang.NullPointerException
        at backtype.storm.tuple.TupleImpl.isTick(TupleImpl.java:218)
        at storm.kafka.bolt.KafkaBolt.execute(KafkaBolt.java:92)
        at storm.kafka.bolt.KafkaBoltTest.executeWithoutKey(KafkaBoltTest.java:134)
    ```
    ```
    java.lang.NullPointerException
        at backtype.storm.tuple.TupleImpl.isTick(TupleImpl.java:218)
        at storm.kafka.bolt.KafkaBolt.execute(KafkaBolt.java:92)
        at storm.kafka.bolt.KafkaBoltTest.executeWithByteArrayKeyAndMessage(KafkaBoltTest.java:104)
    ```
    ```
    java.lang.NullPointerException
        at backtype.storm.tuple.TupleImpl.isTick(TupleImpl.java:218)
        at storm.kafka.bolt.KafkaBolt.execute(KafkaBolt.java:92)
        at storm.kafka.bolt.KafkaBoltTest.executeWithKey(KafkaBoltTest.java:91)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-89020139
  
    @nathanmarz any update on your opinion of this.  Your original concerns were addressed, and I feel we should check this in.  If I don't hear back soon I'll assume that it is OK.  This patch has languished far too long.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-63355950
  
    That is what I expected, thanks for clarifying @nielsbasjes Now I am very confused why there is a test that looks like it explicitly tests that the ticks are persisted.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-63355100
  
    I'll have a look into why these tests failed.
    
    Regarding the 'why':
    A Tick is a message between the Storm framework and a bolt that runs within the framework. Kafka bolt should handle the tick any way suitable, but most importantly is must NOT persist the tick into Kafka.
    I ran into the scenario that a non-Storm consumer of this Kafka topic gave errors because it got a message (= a Tick) that did not adhere to the agreed interface (i.e. I got deserialization exceptions).
    
    So that's why.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66278969
  
    @revans2 & @d2r : I found what went wrong. Apparently these tests do not use the normal TupleImpl but they Mock the tuple without using the normal implementation. I extended the mock to return the right value for the isTick method. 
    Please verify if I did it right (these tests now work).
    Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nathanmarz <gi...@git.apache.org>.
Github user nathanmarz commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66713430
  
    -1. I don't like adding the "isTick" method to Tuple


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-93697965
  
     I had this ack for the tick in there in an earlier version of the change.
    Then I asked on the mailing list if this ack is needed and I was told that it is not required.
    http://mail-archives.apache.org/mod_mbox/storm-user/201411.mbox/%3CCAAognGZ1pEt0RfKXn+sSeo+djy38LKQ4e3gw1GBn-WmMqWk9YQ@mail.gmail.com%3E
    So I took it out. 
    
    Just to understand: Is ackibg a tick needed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by miguno <gi...@git.apache.org>.
Github user miguno commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-93699373
  
    My understanding is yes, you do need to ack tick tuples.  See @nathanmarz [comment](https://groups.google.com/forum/#!topic/storm-user/ZEJabXT5nQA) from some time back:
    
    >> Do I need to ack tick tuples?  Since they're system generated I'm think yes but I'm not sure.
    >
    > Nathan's answer: You should ack all tuples.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-63358387
  
    I think you are right. It seems like the assumption that was made during the development of that test is not 100% accurate in all scenarios.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-61971101
  
    What else can/should I do to get this fix validated and committed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nathanmarz <gi...@git.apache.org>.
Github user nathanmarz commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66858418
  
    I view tick tuples as being built on top of the core ideas of streams and tuples, not as fundamentally intertwined with them. So let's keep them separate and have this function be a static helper method. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66744560
  
    @nathanmarz Why not there? To me it seemed the most logical place. 
    This Tuple interface is one of the core types of Storm and so are the ticks.
    As you can see from the patch I created not all previous implementations were the same.
    So how would you avoid people having to reimplement this very common check?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by miguno <gi...@git.apache.org>.
Github user miguno commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-93690832
  
    Since KafkaBolt is extending BaseRichBolt, I think we should perform a `collector.ack(input)` before `return`.  Tick tuples must be acked like "normal" tuples.
    
    ```
    public class KafkaBolt<K, V> extends BaseRichBolt {
    
        public void execute(Tuple input) {
            if (TupleUtils.isTick(input)) {
              return; // Do not try to send ticks to Kafka
            }
            ...
        }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-512 KafkaBolt doesn't handle ticks prope...

Posted by nielsbasjes <gi...@git.apache.org>.
Github user nielsbasjes commented on the pull request:

    https://github.com/apache/storm/pull/275#issuecomment-66650582
  
    @d2r Thanks! The latest commit I did fixes this problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---