Posted to user@storm.apache.org by M Tarkeshwar Rao <m....@ericsson.com> on 2014/03/20 10:23:31 UTC

fault tolerant strategies

Hi,

Can you please help me find strategies for fault tolerance in (Trident) Storm?

I want to properly send the failure reason to the master controller node.

Regards

Tarkeshwar

Re: fault tolerant strategies

Posted by Svend Vanderveken <sv...@gmail.com>.

Hi Rao,

One typical pattern for reporting non-retryable errors or data in a
non-human-facing application is a dead letter queue, i.e. a queue that
feeds an error monitoring tool. Most companies already have something like
that in place (e.g. Nagios), so the usual approach is to get in contact
with the person responsible for that system and find out how they want
failure reports sent to it.
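
As a rough illustration of the dead letter queue idea, here is a minimal
Java sketch. It assumes an in-memory queue standing in for whatever the
monitoring system actually consumes; the names DeadLetter, DeadLetterQueue
and parseOrDeadLetter are hypothetical and are not part of Storm.

```java
import java.time.Instant;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** One failure report; the fields are illustrative, not a Storm type. */
final class DeadLetter {
    final String payload; // the raw data that could not be processed
    final String reason;  // why it was rejected
    final Instant when;   // when the failure was observed

    DeadLetter(String payload, String reason, Instant when) {
        this.payload = payload;
        this.reason = reason;
        this.when = when;
    }
}

/** In-memory stand-in (an assumption) for a real queue read by a monitoring tool. */
final class DeadLetterQueue {
    private final BlockingQueue<DeadLetter> queue = new LinkedBlockingQueue<>();

    /** Called by processing code; offer() on an unbounded queue does not block. */
    void report(String payload, String reason) {
        queue.offer(new DeadLetter(payload, reason, Instant.now()));
    }

    /** Called by the monitoring side; returns null when nothing is pending. */
    DeadLetter poll() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }
}

final class DeadLetterDemo {
    /** Parses a number; unparseable input goes to the dead letter queue instead. */
    static Integer parseOrDeadLetter(String raw, DeadLetterQueue dlq) {
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            dlq.report(raw, "not a number: " + e.getMessage());
            return null; // the caller skips this record
        }
    }

    public static void main(String[] args) {
        DeadLetterQueue dlq = new DeadLetterQueue();
        parseOrDeadLetter("42", dlq);        // valid, nothing reported
        parseOrDeadLetter("forty-two", dlq); // invalid, lands in the queue
        DeadLetter d = dlq.poll();
        System.out.println(d.payload + " -> " + d.reason);
    }
}
```

In a real deployment the queue would more likely be an external system
that the monitoring tool already reads, so that reports survive a worker
restart.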

Another simple and efficient way to report an error is just to log it,
perhaps with a separate logger instance that writes all failures to a
dedicated file/queue/DB/whatever.
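
A separate failure logger could look roughly like the sketch below. It
uses java.util.logging from the JDK only because that keeps the example
self-contained; the logger name, the file location and the FailureLog
helper are assumptions, and the same idea applies to whichever logging
framework the topology already uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

final class FailureLog {
    /** Builds a logger that writes only to the given file (hypothetical helper). */
    static Logger create(String name, Path file) throws IOException {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false); // keep failures out of the main log
        FileHandler handler = new FileHandler(file.toString(), true); // true = append
        handler.setFormatter(new SimpleFormatter()); // plain text instead of the XML default
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tuple-failures", ".log");
        Logger failures = create("tuple.failures", file);
        failures.log(Level.WARNING, "dropped invalid tuple: {0}", "user_id=abc");
        System.out.println("failures written to " + file);
    }
}
```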

In any case, the reporting itself is usually easy; the trickier part is
making sure your reports get a human's attention so that someone can take
action.





On Fri, Mar 21, 2014 at 8:09 AM, M Rao <m....@ericsson.com> wrote:

> Hi Svend,
>
> ---you need another specific error reporting tool; Storm does not provide
> such a thing (atm at least, I don't know about the future).
>
> Can you please suggest an error reporting tool for us, or any pointer or
> link? It would be a great help to us.
>
> Regards
> Tarkeshwar

Re: fault tolerant strategies

Posted by M Rao <m....@ericsson.com>.

Hi Svend,

---you need another specific error reporting tool; Storm does not
provide such a thing (atm at least, I don't know about the future).

Can you please suggest an error reporting tool for us, or any pointer or
link? It would be a great help to us.

Regards
Tarkeshwar

On 03/20/2014 04:01 PM, Svend Vanderveken wrote:
> Hi Rao,
>
> AFAIK there is no way to do that, and there is actually no master
> controller node.
>
> Errors are reported to Storm whenever we want to trigger Storm's error
> handling / exactly-once semantics, so the first point of decision is at
> the place where the error occurs (or in a wrapper around your
> components):
>
> * if it makes sense to retry later (e.g. DB connection lost): throw a
> FailedException to Storm
> * otherwise (e.g. invalid tuple data): don't report anything to Storm
>
> The error reported to Storm is propagated back to the originating node,
> where the corresponding spout is running. Bear in mind that this is a
> very different concept from a master controller node, since we typically
> have plenty of spout instances: e.g. in the case of Kafka, if we have 100
> Kafka nodes each with 10 partitions, we can start up to 1000 instances of
> Storm Kafka spouts across plenty of hosts. The spout then handles the
> replay of that tuple, according to the transactional/opaque semantics
> that it implements.
>
> If you need to do error reporting for reasons other than the Storm
> replay mechanism (typically reporting at least the tuples you decide not
> to retry, so you can investigate them later without blocking the
> real-time flow of events), then you need a separate error reporting tool.
> Storm does not provide such a thing (atm at least, I don't know about the
> future).
>
> Best regards,
>
> Svend
> http://svendvanderveken.wordpress.com/

Re: fault tolerant strategies

Posted by Svend Vanderveken <sv...@gmail.com>.

Hi Rao,

AFAIK there is no way to do that, and there is actually no master
controller node.

Errors are reported to Storm whenever we want to trigger Storm's error
handling / exactly-once semantics, so the first point of decision is at the
place where the error occurs (or in a wrapper around your components):

* if it makes sense to retry later (e.g. DB connection lost): throw a
FailedException to Storm
* otherwise (e.g. invalid tuple data): don't report anything to Storm
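
The two cases above can be sketched in plain Java as follows. To stay
runnable without Storm on the classpath, this sketch uses a stand-in
RetryLaterException where a real Trident component would throw Storm's
FailedException; Store, TransientStoreException and TupleHandler are
likewise hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for Storm's FailedException so the sketch compiles without Storm. */
final class RetryLaterException extends RuntimeException {
    RetryLaterException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical exception for a transient outage such as a lost DB connection. */
final class TransientStoreException extends Exception {
    TransientStoreException(String message) {
        super(message);
    }
}

/** Hypothetical storage interface. */
interface Store {
    void save(int value) throws TransientStoreException;
}

final class TupleHandler {
    private final Store store;
    final List<String> rejected = new ArrayList<>(); // invalid input kept for later inspection

    TupleHandler(Store store) {
        this.store = store;
    }

    void handle(String raw) {
        int value;
        try {
            value = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Invalid data: a replay would fail the same way, so nothing is reported as failed.
            rejected.add(raw);
            return;
        }
        try {
            store.save(value);
        } catch (TransientStoreException e) {
            // Transient problem: signal a failure so that the data can be replayed later.
            throw new RetryLaterException("store unavailable, retry later", e);
        }
    }
}

final class TupleHandlerDemo {
    public static void main(String[] args) {
        // A store that is always down, to exercise the retry path.
        TupleHandler handler = new TupleHandler(v -> {
            throw new TransientStoreException("connection lost");
        });
        handler.handle("not-a-number");
        System.out.println("rejected without retry: " + handler.rejected);
        try {
            handler.handle("7");
        } catch (RetryLaterException e) {
            System.out.println("asked for a replay: " + e.getMessage());
        }
    }
}
```

The rejected list here is only a placeholder for whatever reporting
channel is chosen for the tuples that are not retried.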

The error reported to Storm is propagated back to the originating node,
where the corresponding spout is running. Bear in mind that this is a very
different concept from a master controller node, since we typically have
plenty of spout instances: e.g. in the case of Kafka, if we have 100 Kafka
nodes each with 10 partitions, we can start up to 1000 instances of Storm
Kafka spouts across plenty of hosts. The spout then handles the replay of
that tuple, according to the transactional/opaque semantics that it
implements.

If you need to do error reporting for reasons other than the Storm replay
mechanism (typically reporting at least the tuples you decide not to
retry, so you can investigate them later without blocking the real-time
flow of events), then you need a separate error reporting tool. Storm does
not provide such a thing (atm at least, I don't know about the future).

Best regards,

Svend
http://svendvanderveken.wordpress.com/

On Thu, Mar 20, 2014 at 5:23 AM, M Tarkeshwar Rao <
m.tarkeshwar.rao@ericsson.com> wrote:

> Hi,
>
> Can you please help me find strategies for fault tolerance in
> (Trident) Storm?
>
> I want to properly send the failure reason to the master controller node.
>
> Regards
>
> Tarkeshwar