Posted to user@spark.apache.org by shahab <sh...@gmail.com> on 2014/12/05 11:02:10 UTC

Increasing the number of retry in case of job failure

Hello,

For some unknown reason, some of my tasks that fetch data from Cassandra
fail very often, and apparently the master removes a task that fails
more than 4 times (in my case).

Is there any way to increase the number of retries?

best,
/Shahab

Re: Increasing the number of retry in case of job failure

Posted by Andrew Or <an...@databricks.com>.
Increasing max failures is one way to do it, but it's probably better
to keep your tasks from failing in the first place. Are your tasks failing
with exceptions from Spark or from your application code? If from Spark,
what is the stack trace? There might be a legitimate Spark bug, in which
case even increasing max failures won't fix your problem.

2014-12-05 5:12 GMT-08:00 Daniel Darabos <da...@lynxanalytics.com>:

> It is controlled by "spark.task.maxFailures". See
> http://spark.apache.org/docs/latest/configuration.html#scheduling.

Re: Increasing the number of retry in case of job failure

Posted by Daniel Darabos <da...@lynxanalytics.com>.
It is controlled by "spark.task.maxFailures". See
http://spark.apache.org/docs/latest/configuration.html#scheduling.
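
As a rough illustration, here is a minimal sketch of raising the limit from
application code. The object name, the app name, and the value 8 are
placeholders chosen for the example, not recommendations. The property has to
be set before the SparkContext is created. According to the configuration page
linked above, the default is 4 and the number of allowed retries is this value
minus 1.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and values are for illustration only.
object MaxFailuresExample {
  def main(args: Array[String]): Unit = {
    // The master URL is assumed to be supplied by spark-submit.
    val conf = new SparkConf()
      .setAppName("cassandra-fetch")          // placeholder app name
      .set("spark.task.maxFailures", "8")     // default is 4

    val sc = new SparkContext(conf)

    // ... run the job that reads from Cassandra here ...

    sc.stop()
  }
}
```

The same property can also be passed at launch time without touching the code,
for example with `--conf spark.task.maxFailures=8` on the spark-submit command
line.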
