You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by Mark Tomko <mt...@broadinstitute.org> on 2015/05/12 23:03:39 UTC

Spouts emits, acks, and threading

When implementing a spout, is it expected that different threads may be
calling nextTuple() and ack()? That is, if ack() needs to clean up some
local state, does that state have to be held in a synchronized data
structure?

It seems like that would be necessary, but I've read (somewhere, having
trouble finding it now...) that individual Storm components are only
operated on by a single thread and therefore do not need to be thread-safe.

Am I missing a spot in the documentation that explains this?

Thanks,
Mark

Re: Spouts emits, acks, and threading

Posted by Mark Tomko <mt...@broadinstitute.org>.
Of course, and in the javadoc:

Storm executes ack, fail, and nextTuple all on the same thread.

On Tue, May 12, 2015 at 5:10 PM, Jake Dodd <ja...@ontopic.io> wrote:

> From the Storm docs:
>
> The main method on spouts is nextTuple. nextTuple either emits a new
> tuple into the topology or simply returns if there are no new tuples to
> emit. It is imperative that nextTuple does not block for any spout
> implementation, because Storm calls all the spout methods on the same
> thread.
>
> The other main methods on spouts are ack and fail. These are called when
> Storm detects that a tuple emitted from the spout either successfully
> completed through the topology or failed to be completed. ack and fail are
> only called for reliable spouts. See the Javadoc
> <https://storm.apache.org/javadoc/apidocs/backtype/storm/spout/ISpout.html> for
> more information.
>
> https://storm.apache.org/documentation/Concepts.html
>
> Jake
>
>
>
> On May 12, 2015, at 2:03 PM, Mark Tomko <mt...@broadinstitute.org> wrote:
>
> When implementing a spout, is it expected that different threads may be
> calling nextTuple() and ack()? That is, if ack() needs to clean up some
> local state, does that state have to be held in a synchronized data
> structure?
>
> It seems like that would be necessary, but I've read (somewhere, having
> trouble finding it now...) that individual Storm components are only
> operated on by a single thread and therefore do not need to be thread-safe.
>
> Am I missing a spot in the documentation that explains this?
>
> Thanks,
> Mark
>
>
>

Re: Spouts emits, acks, and threading

Posted by Jake Dodd <ja...@ontopic.io>.
From the Storm docs:

The main method on spouts is nextTuple. nextTuple either emits a new tuple into the topology or simply returns if there are no new tuples to emit. It is imperative that nextTuple does not block for any spout implementation, because Storm calls all the spout methods on the same thread.

The other main methods on spouts are ack and fail. These are called when Storm detects that a tuple emitted from the spout either successfully completed through the topology or failed to be completed. ack and fail are only called for reliable spouts. See the Javadoc <https://storm.apache.org/javadoc/apidocs/backtype/storm/spout/ISpout.html> for more information.

https://storm.apache.org/documentation/Concepts.html <https://storm.apache.org/documentation/Concepts.html>

Jake
 

> On May 12, 2015, at 2:03 PM, Mark Tomko <mt...@broadinstitute.org> wrote:
> 
> When implementing a spout, is it expected that different threads may be calling nextTuple() and ack()? That is, if ack() needs to clean up some local state, does that state have to be held in a synchronized data structure?
> 
> It seems like that would be necessary, but I've read (somewhere, having trouble finding it now...) that individual Storm components are only operated on by a single thread and therefore do not need to be thread-safe.
> 
> Am I missing a spot in the documentation that explains this?
> 
> Thanks,
> Mark