Posted to user@storm.apache.org by "Hannum, Daniel" <Da...@PremierInc.com> on 2017/09/21 16:29:56 UTC

Does storm guarantee that all tuples will be 'acked' or 'failed'

I’m writing my own spout, backed by a persistent store outside of storm.

What I need to know is whether Storm guarantees that a spout will always receive an ack() or fail() call for a given tuple, i.e. even if the spout process dies, another instance will make the call.

If this is true, then I can remove the record from storage in nextTuple() and put it back in fail(), and I’ll be sure I’ll never lose any records, even in case of failure.

If this is not true, then I need to keep the record in the underlying storage after nextTuple() and not remove it until ack(). This makes things harder because subsequent nextTuple() calls have to know to skip the in-progress records.

So, I hope Storm provides this guarantee.

Thanks


Re: Does storm guarantee that all tuples will be 'acked' or 'failed'

Posted by "Hannum, Daniel" <Da...@PremierInc.com>.
Thank you!

From: Stig Rohde Døssing <sr...@apache.org>
Reply-To: "user@storm.apache.org" <us...@storm.apache.org>
Date: Thursday, September 21, 2017 at 12:45 PM
To: "user@storm.apache.org" <us...@storm.apache.org>
Subject: Re: Does storm guarantee that all tuples will be 'acked' or 'failed'


Re: Does storm guarantee that all tuples will be 'acked' or 'failed'

Posted by Stig Rohde Døssing <sr...@apache.org>.
Storm guarantees that all tuples will be acked or failed on the spout
instance that emitted them. If the spout emits a tuple and the process dies
and a new one comes up, the new process may or may not receive the old
ack/fail (usually won't but can happen in some cases where the message id
only depends on the emitted message, e.g. the KafkaSpout).

You should leave the records in storage until they are acked. A common
approach to this is to keep identifiers for the in-progress records in
memory in the spout, and then remove them (and delete the underlying record
in your case) when the tuple is acked.
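The pattern described above (keep records in storage until acked, track in-flight ids in memory so nextTuple() skips them) can be sketched roughly as follows. This is a minimal illustration using plain collections, not the real Storm API (ISpout, SpoutOutputCollector, etc. are omitted); the class and method names here are assumptions for the example, with a Map standing in for the durable store.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch of the "leave records in storage until acked" pattern.
// store   : stand-in for the external persistent store
// pending : in-memory ids of records currently in flight
public class PendingTracker {
    private final Map<String, String> store = new LinkedHashMap<>();
    private final Set<String> pending = new HashSet<>();

    public void addRecord(String id, String payload) {
        store.put(id, payload);
    }

    // nextTuple(): emit the next record that is not already in flight,
    // but leave it in the store until ack() arrives.
    public String nextTuple() {
        for (String id : store.keySet()) {
            if (!pending.contains(id)) {
                pending.add(id);
                return id; // would be emitted with this id as the message id
            }
        }
        return null; // nothing ready to emit
    }

    // ack(): the tuple is fully processed; only now is deletion safe.
    public void ack(String id) {
        pending.remove(id);
        store.remove(id);
    }

    // fail(): clear the in-flight marker so nextTuple() replays the record.
    public void fail(String id) {
        pending.remove(id);
    }

    public boolean inStore(String id) {
        return store.containsKey(id);
    }
}
```

Note that if the spout process dies, the pending set is lost, but the records are still in the store, so a restarted spout simply re-emits them; that is why this approach survives crashes where the delete-in-nextTuple() approach does not.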
