Posted to dev@spark.apache.org by Cody Koeninger <co...@koeninger.org> on 2014/12/30 21:24:45 UTC

Is there any way to tell if compute is being called from a retry?

It looks like taskContext.attemptId doesn't mean what one thinks it might
mean, based on

http://apache-spark-developers-list.1001551.n3.nabble.com/Get-attempt-number-in-a-closure-td8853.html

and the unresolved

https://issues.apache.org/jira/browse/SPARK-4014



Is there any alternative way to tell if compute is being called from a
retry?  Barring that, does anyone have any tips on how it might be possible
to get the attempt count propagated to executors?

It would be extremely useful for Kafka RDD preferred-location awareness.
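For concreteness, here is a minimal sketch (plain Scala, no Spark) of what retry-aware preferred locations could look like. The `attemptNumber` parameter is a hypothetical stand-in for whatever attempt count SPARK-4014 eventually exposes, and `PartitionInfo` is invented purely for illustration:

```scala
// Hypothetical sketch -- not Spark API. `attemptNumber` stands in for the
// attempt count SPARK-4014 would expose; `PartitionInfo` is invented here.
case class PartitionInfo(leaderHost: String, allHosts: Seq[String])

// First attempt: prefer the Kafka leader's host so consumption stays local.
// On a retry, widen to every host so the scheduler is free to move the task
// off a node that may have caused the failure.
def preferredLocations(part: PartitionInfo, attemptNumber: Int): Seq[String] =
  if (attemptNumber == 0) Seq(part.leaderHost)
  else part.allHosts

val part = PartitionInfo("broker-1", Seq("broker-1", "broker-2", "broker-3"))
println(preferredLocations(part, 0))
println(preferredLocations(part, 1))
```

The point is just that the branch on retry has to happen inside compute (or getPreferredLocations), which is why the attempt count needs to reach the executor at all.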

Re: Is there any way to tell if compute is being called from a retry?

Posted by Josh Rosen <ro...@gmail.com>.
This is timely, since I just ran into this issue myself while trying to
write a test to reproduce a bug related to speculative execution (I wanted
to configure a job so that the first attempt to compute a partition would
run slowly, so that a second, fast speculative copy would be launched).

I've opened a PR with a proposed fix:
https://github.com/apache/spark/pull/3849
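As a rough illustration of that test shape (plain Scala, not the actual PR): if per-attempt information were available on the executor, the first attempt could be artificially delayed so that a speculative second attempt wins the race. `attemptNumber` is again a hypothetical stand-in for the value SPARK-4014 would expose:

```scala
// Hypothetical sketch of the test setup described above, not the PR itself.
// Delay only the first attempt so a speculative copy can finish first.
def computePartition(attemptNumber: Int, slowFirstAttemptMillis: Long): String = {
  if (attemptNumber == 0) Thread.sleep(slowFirstAttemptMillis)
  s"attempt-$attemptNumber"
}

// The speculative second attempt (attempt 1) returns immediately, while
// attempt 0 pays the artificial delay.
println(computePartition(1, slowFirstAttemptMillis = 50))
```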



On Tue, Dec 30, 2014 at 12:24 PM, Cody Koeninger <co...@koeninger.org> wrote:

> It looks like taskContext.attemptId doesn't mean what one thinks it might
> mean, based on
>
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/Get-attempt-number-in-a-closure-td8853.html
>
> and the unresolved
>
> https://issues.apache.org/jira/browse/SPARK-4014
>
>
>
> Is there any alternative way to tell if compute is being called from a
> retry?  Barring that, does anyone have any tips on how it might be possible
> to get the attempt count propagated to executors?
>
> It would be extremely useful for Kafka RDD preferred-location
> awareness.
>