You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by arunmahadevan <gi...@git.apache.org> on 2016/03/07 12:27:02 UTC

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

GitHub user arunmahadevan opened a pull request:

    https://github.com/apache/storm/pull/1190

    [STORM-1608] Fix stateful topology acking behavior

    Right now the acking is automatically taken care of for the non-stateful bolts in a stateful topology. This leads to double acking if BaseRichBolts are part of the topology and the acking does not complete and tuples are re-emitted.
    
    For the non-stateful bolts, its better to let the bolt do the acking rather than automatically acking.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arunmahadevan/storm STORM-1608

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1190.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1190
    
----
commit c0bce3e470ca4d502ee4d2ee953e06e0c0fa96c5
Author: Arun Mahadevan <ai...@hortonworks.com>
Date:   2016-03-07T06:00:35Z

    [STORM-1608] Fix stateful topology acking behavior
    
    Right now the acking is automatically taken care of for the non-stateful bolts in a stateful topology.
    This leads to double acking if BaseRichBolts are part of the topology.
    For the non-stateful bolts, its better to let the bolt do the acking rather than automatically acking.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-196322223
  
    @revans2 @ptgoetz I think the comments in this PR are addressed, can you take a look and merge ? Also raised https://github.com/apache/storm/pull/1210 for updating the doc.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-194341881
  
    @arunmahadevan Thanks for doing this.  I still have not had time to dig into this feature in as much detail as I would like, but from what I have seen so far the contract with users looks good, and it is an important feature to have.  I have one concern now about thread safety in the AckTrackingOutputCollector, but beyond that it looks fine.
    
    I will leave support for best effort state checkpointing without at least once processing for a later time.  Like I said before I made an assumption that was wrong, and that was part of why I got confused.  I don't see it as a must have feature, just confusion on my part.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-196447926
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-193616307
  
    @revans2 anchoring/acking in enforced for stateful & non-stateful bolts in a stateful topology to provide at-least once guarantee for the state updates. For the stateful bolts, the tuples that were part of the last state update are automatically acked after that state update is committed. Non-stateful bolts need to either extend the BaseBasicBolt or do the acking themselves.
    
    The state checkpointing mechanism and the guarantee is documented [here](https://github.com/apache/storm/blob/asf-site/documentation/State-checkpointing.mdl). This does not explicitly mention whether the bolt is expected to ack or not. I can add that to the doc and the javadocs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-196450856
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-193280978
  
    I didn't dig very deeply into the design of the stateful topology check-pointing so perhaps this is a very naive question, but why do all of the tuples flowing through need to be anchored/acked?  and if we get the anchoring/acking wrong what happens?  Will the state be messed up?  will we process things incorrectly?  Why do we even care if someone is OK with dropping tuples?
    
    What exactly is the contract that we have made between the bolts and the state checkpoint system?  Because this changes that contract, but I don't see any corresponding documentation change.
    
    What about the use case where someone does not ack a tuple, because they didn't expect it to be tracked, especially because we auto-anchored it for them in a previous bolt?
    
    Auto-anchoring feels like as big of a mistake as auto-acking something, unless we have a very explicit contract with end users to explain what they are and are not allowed to do.
    
    It makes me very nervous when we try to "fix" someone's code automatically for them assuming that we know what they are doing.  Personally if we cannot guarantee that we are fixing it correctly 100% of the time we should not be fixing it ever.
    
    I am +1 for this patch, but I would like to understand why we are also auto-anchoring emitted tuples, and I would like to know where the contract is documented with end users on what a bolt is and is not allowed to do in this system.
    
    I have quite a few other misgivings about this feature in general and need to dig into it myself, but this is not really the best place for them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-194392972
  
    +1 once @revans2's thread safety concern is addressed.
    
    We may want to document that this feature requires ack'ing to be enabled, but that can be handled separately.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-196487801
  
    Merged to master and 1.x-branch


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-194268412
  
    @revans2 @ptgoetz I have refactored the code and removed the auto anchoring/acking for both stateful and non-stateful bolts in a stateful topology. Currently state checkpointing provides at least once processing and we could consider providing a best effort checkpointing and relax the acking/anchoring requirements as a separate feature. 
    
    We expect IStatefulBolt implementations also to ack/anchor now and hence passing OuputCollector in the stateful bolt's prepare method should be fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-193941111
  
    @arunmahadevan I agree mostly with @ptgoetz I wanted to understand the contract (thanks for the link), and that the new contract is consistent with our current contract.  On the acking side it feels like it now is consistent, but not on the anchoring side.  AnchoringOutputCollector in CheckpointTupleForwarder will anchor non-anchored tuples to the last inputTuple.  I can see lots of situations where this is neither expected nor correct for an IRichBolt. A shell bolt for example would totally get this wrong because it does ansync processing.  If we really want to disallow tuples that are not tracked I would prefer to have an exception thrown rather than "fix" the issue on the fly and possibly get it wrong.
    
    Part of my confusion came from me assuming that we supported state checkpointing without requiring at least once processing.  It feels like we should be able to support that with some kind of a best effort state checkpointing, but now I understand the current limitations and that it would be a separate feature.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1190#discussion_r55532861
  
    --- Diff: storm-core/src/jvm/org/apache/storm/topology/StatefulBoltExecutor.java ---
    @@ -148,4 +148,19 @@ private void fail(List<Tuple> tuples) {
             }
         }
     
    +    private static class AckTrackingOutputCollector extends AnchoringOutputCollector {
    +        private OutputCollector delegate;
    +        private List<Tuple> ackedTuples;
    +
    +        AckTrackingOutputCollector(OutputCollector delegate) {
    +            super(delegate);
    +            this.delegate = delegate;
    +            this.ackedTuples = new ArrayList<>();
    --- End diff --
    
    All access to ackedTuples should ideally be thread safe, as the OutputCollector could be called from a different thread.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by satishd <gi...@git.apache.org>.
Github user satishd commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-193220524
  
    +1 LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1190


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-194412502
  
    Addressed thread safety concern and raised [STORM-1615](https://issues.apache.org/jira/browse/STORM-1615) for updating the doc.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: [STORM-1608] Fix stateful topology acking beha...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1190#issuecomment-193934901
  
    @arunmahadevan Can you document the behavior of non-stateful bolts in a topology with stateful bolts?
    
    In a traditional topology (i.e. one that does not include stateful bolts), for `IBasicBolt` instances input tuples are automatically ack'ed and output tuples are automatically anchored to the input tuple. For `IRichBolt` instances, anchoring and ack'ing is left to the implementation.
    
    I feel (and I think @revans2 is alluding to the same) that that contract must be maintained when adding stateful bolts to a topology. (I believe with this patch it is, please correct me if I'm wrong).
    
    The seconds question is what are `IStatefulBolt` implementations expected to do in terms of ack'ing/anchoring? Since the output collector handed to `IStatefulBolt` is an instance of `OutputCollector` (as opposed to `BasicOutputCollector`) one would assume that ack'ing/anchoring is the responsibility of the implementation (the `ack()` and `fail()` methods are visible in `OutputCollector` but not `BasicOutputCollector`).
    
    If that is not the case (i.e. if stateful bolts are not expected to handle ack'ing/anchoring) it would be best to give `IStatefulBolt`s an output collector implementation that does not expose the ack'ing/anchoring API, similar to the relationship of `IBasicBolt`/`BasicOutputCollector`.
    
    Does that make sense?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---