Posted to issues@storm.apache.org by "Arun Mahadevan (JIRA)" <ji...@apache.org> on 2017/01/11 06:20:58 UTC

[jira] [Created] (STORM-2282) Streams api - provide options for handling errors

Arun Mahadevan created STORM-2282:
-------------------------------------

             Summary: Streams api - provide options for handling errors
                 Key: STORM-2282
                 URL: https://issues.apache.org/jira/browse/STORM-2282
             Project: Apache Storm
          Issue Type: Sub-task
            Reporter: Arun Mahadevan


Adding relevant discussions from PR 1693 below.

Allow users to be explicit about how to handle errors. I am not aware of any other API that does this, so it may be unique to Storm.

In short:
Broadly speaking, there are two kinds of tuple-processing errors that users need to be concerned about:

1- Retry worthy Errors - For instance, a failure to deliver to the destination service due to connection issues.
2- Not worth retrying - For instance, parsing errors due to bad data in the tuple. Retrying such tuples repeatedly can jam up the pipeline. They can instead be sent to a configurable dead-letter queue.

---

>1- Retry worthy Errors

Right now there's no explicit `fail` API. When a stage in the stream completes processing (and possibly emits results), the underlying tuples are acked automatically. The only way the spout will re-emit a tuple is after the message timeout. It may be good to have a fail-fast API, but I am not sure how it would help: the replayed tuple could fail again in processing. Instead, the processing logic itself can retry (say, up to 3 times) and, if it still fails, forward the tuple to an error stream and ack it.
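
The in-stage retry idea could look roughly like the sketch below. It is plain Java and does not use any Storm class; `RetrySketch`, `Outcome` and `processWithRetry` are hypothetical names made up for illustration, and treating every `RuntimeException` as retryable is an assumption. A failed `Outcome` stands in for "forward to an error stream and ack".

```java
import java.util.function.Function;

// Hypothetical helper, not part of Storm: retry an operation in place and
// report whether the value should go downstream or to an error stream.
final class RetrySketch {

    static final class Outcome<T, R> {
        final boolean succeeded; // true if some attempt returned normally
        final R result;          // the operation's result on success, else null
        final T failedInput;     // the original input on failure, else null
        final int attempts;      // number of attempts actually made

        private Outcome(boolean succeeded, R result, T failedInput, int attempts) {
            this.succeeded = succeeded;
            this.result = result;
            this.failedInput = failedInput;
            this.attempts = attempts;
        }
    }

    // Applies op to input up to maxAttempts times. Any RuntimeException is
    // assumed to be transient (an assumption of this sketch) and is retried.
    static <T, R> Outcome<T, R> processWithRetry(T input, Function<T, R> op, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return new Outcome<>(true, op.apply(input), null, attempt);
            } catch (RuntimeException e) {
                // Swallow and try again; a real stage would likely log or back off here.
            }
        }
        // All attempts failed: the caller would emit failedInput to the error
        // stream and still ack the tuple, so the spout does not replay it.
        return new Outcome<>(false, null, input, maxAttempts);
    }
}
```

A stage would call `processWithRetry(value, op, 3)` and then emit either `result` or `failedInput` depending on `succeeded`.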

> 2- Not worth retrying

This can be handled via branch logic. E.g. send valid values to stream1 and bad values to stream2.
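
As a rough illustration of that routing, the sketch below splits raw values into a valid branch and a bad-data branch. It is plain Java with no Storm dependency; `BranchSketch`, `Branches` and `route` are hypothetical names, the two lists merely stand in for stream1 and stream2, and integer parsing is an assumed example of "bad data".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Storm code: route parseable values to one branch
// and unparseable ones to another instead of retrying them.
final class BranchSketch {

    static final class Branches {
        final List<Integer> stream1 = new ArrayList<>(); // successfully parsed values
        final List<String> stream2 = new ArrayList<>();  // raw values that failed to parse
    }

    static Branches route(List<String> rawValues) {
        Branches out = new Branches();
        for (String raw : rawValues) {
            try {
                out.stream1.add(Integer.parseInt(raw));
            } catch (NumberFormatException e) {
                // Bad data will never parse, so retrying is pointless;
                // keep the raw value so it can be inspected later.
                out.stream2.add(raw);
            }
        }
        return out;
    }
}
```

In a real topology the second branch would typically feed whatever sink plays the role of the dead-letter queue.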




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)