Posted to dev@spark.apache.org by "assaf.mendelson" <as...@rsa.com> on 2018/07/03 06:17:04 UTC

Spark data source resiliency

Hi All,

I have implemented a data source V2 which integrates with an internal system,
and I need to make it resilient to errors in that internal system.

The issue is that currently, if there is an exception in the data reader,
the exception seems to fail the entire task. I would prefer instead to just
restart the relevant partition.

Is there a way to do this, or would I need to solve it inside the iterator
itself?
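
For concreteness, by "solve it inside the iterator" I mean something like the
sketch below (withRetries and internalSystemClient are illustrative names, not
part of any Spark API):

    // Hand-rolled retry around the call to the internal system, inside the reader.
    def withRetries[T](maxAttempts: Int)(op: => T): T =
      try op catch {
        case _: Exception if maxAttempts > 1 => withRetries(maxAttempts - 1)(op)
      }

    // Inside the reader's next()/get() implementation (hypothetical client):
    // val record = withRetries(3) { internalSystemClient.fetchNext() }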

Thanks,
    Assaf.





Re: Spark data source resiliency

Posted by "assaf.mendelson" <as...@rsa.com>.
You are correct, this solved it.
Thanks





Re: Spark data source resiliency

Posted by Wenchen Fan <cl...@gmail.com>.
I believe you are using something like `local[8]` as your Spark master,
which can't retry tasks. Please try `local[8, 3]`, which retries failed
tasks up to 3 times.
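
For reference, a minimal sketch of that setup (the app name is arbitrary, and
on a real cluster the analogous knob is spark.task.maxFailures):

    import org.apache.spark.sql.SparkSession

    // local[8,3]: 8 worker threads; the second number allows failed tasks
    // to be retried instead of failing the job on the first error.
    val spark = SparkSession.builder()
      .appName("datasource-v2-retry-test")  // arbitrary, for illustration only
      .master("local[8,3]")
      .getOrCreate()

    // Cluster equivalent (spark.task.maxFailures defaults to 4):
    //   spark-submit --conf spark.task.maxFailures=4 ...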


Re: Spark data source resiliency

Posted by "assaf.mendelson" <as...@rsa.com>.
That is what I expected. However, I ran a very simple test (using println
just to see when the exception is triggered in the iterator) with the local
master, and I saw it fail once and cause the entire operation to fail.

Is this something that may be unique to the local master (or some default
configuration which should be tested)? I can't see a specific configuration
for this in the documentation.

Thanks,
    Assaf.






Re: Spark data source resiliency

Posted by Wenchen Fan <cl...@gmail.com>.
A failure in the data reader results in a task failure, and Spark will
retry the task for you (IIRC, it retries 3 times before failing the job).

Can you check your Spark log and see if the task fails consistently?
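
If you want to see the retry behavior in isolation, here is a hedged sketch.
It does not use the data source V2 reader itself, and it assumes the `spark`
session and a retry-capable master such as `local[8,3]` from the message
above; it simply fails a task only on its first attempt so you can watch the
retry in the log:

    import org.apache.spark.TaskContext

    val rdd = spark.sparkContext.parallelize(1 to 100, 8).mapPartitions { iter =>
      // Simulate a transient reader failure: only the first attempt of each task throws.
      if (TaskContext.get().attemptNumber() == 0) {
        throw new RuntimeException("simulated transient failure")
      }
      iter
    }
    rdd.count()  // succeeds once Spark retries the failed tasks

On a real cluster, spark.task.maxFailures (default 4) caps how many attempts a
task gets before the job is failed.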
