Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 02:38:27 UTC

[jira] [Updated] (STORM-108) Add commit support to Trident Transactional Spouts

     [ https://issues.apache.org/jira/browse/STORM-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-108:
-------------------------------
    Component/s: storm-core

> Add commit support to Trident Transactional Spouts
> --------------------------------------------------
>
>                 Key: STORM-108
>                 URL: https://issues.apache.org/jira/browse/STORM-108
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/559
> There is no notice from Trident back to the Spout when a batch is successfully completed (for a specific transaction id). When building a Transactional Spout it would be useful to have a success method on the Coordinator to know the batch was completed.
> Looking at code: 
> On completion of a batch, PartitionedTridentSpoutExecutor's success method is called on the Coordinator, but the Coordinator doesn't do anything. And the ITridentSpout.BatchCoordinator interface doesn't even define a 'success' method.
> It looks like what I need to do to complete this code is:
> - change IPartitionedTridentSpout.Coordinator to have a success(long txid) method
> - change PartitionedTridentSpoutExecutor's success to call the coordinator's success method
> Within my own IPartitionedTridentSpout-derived Spout:
> - have a common state object in the Spout accessible by both my Emitter and the Coordinator
> - implement the success() method on the Coordinator
> - when a batch is emitted via emitPartitionBatchNew, write information about which messages were included in that batch to the shared state object, along with the transaction id
> - when the Coordinator's success() method is called, find the transaction and then 'acknowledge' the messages in that batch back to the source
> - to handle failures, have the emitPartitionBatch method check a counter in the shared state for the transaction id and fail after 'x' retries. By 'fail' I mean execute my own logic, such as writing to a dead-letter queue, and then emit no tuples, thus allowing Trident to advance to the next transaction.
> I understand that some messages in the batch may have succeeded when I give up, but I have no way of knowing which ones, so we'll have to handle that in our recovery logic outside of Trident.
> Am I missing anything?
> Is there something in the TridentSpout lifecycle I haven't figured out by looking at the code? I see a 'success' method on the Coordinator but should there be a complementary 'failed' method as well? I didn't see any retry logic on the calls to emitPartitionBatch either so I'm not sure my failure handling above is correct.
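
The shared-state bookkeeping proposed in the description above could be sketched roughly as follows. This is only an illustration, not Storm code: the class SharedBatchState and its methods beginAttempt, recordBatch and complete are hypothetical names, message ids are assumed to be plain strings, and the sketch assumes the Emitter and the Coordinator can reach the same in-memory object, which would not hold if they ran in separate worker processes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical state object shared by the Emitter and the Coordinator (not part of Storm). */
final class SharedBatchState {
    private final int maxAttempts;
    // txid -> ids of the messages emitted in that batch
    private final Map<Long, List<String>> pending = new HashMap<>();
    // txid -> number of times the batch has been attempted so far
    private final Map<Long, Integer> attempts = new HashMap<>();

    SharedBatchState(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /**
     * Emitter side: call at the start of every emit for txid.
     * Returns false once the batch has exceeded maxAttempts, at which point
     * the caller would run its own failure logic and emit no tuples.
     */
    synchronized boolean beginAttempt(long txid) {
        int count = attempts.merge(txid, 1, Integer::sum);
        return count <= maxAttempts;
    }

    /** Emitter side: remember which messages went into the batch for txid. */
    synchronized void recordBatch(long txid, List<String> messageIds) {
        pending.put(txid, new ArrayList<>(messageIds));
    }

    /**
     * Coordinator side: intended to be called from the proposed success(txid).
     * Returns the message ids to acknowledge back to the source and forgets the batch.
     */
    synchronized List<String> complete(long txid) {
        attempts.remove(txid);
        List<String> ids = pending.remove(txid);
        return ids == null ? Collections.<String>emptyList() : ids;
    }
}
```

In this sketch the Emitter would call beginAttempt and recordBatch from its emit methods, and the Coordinator would call complete from success(long txid) and acknowledge the returned ids. A batch that is abandoned after too many attempts may still have been partially processed downstream, as the description notes, so recovery for that case stays outside this object.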



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)