Posted to user@spark.apache.org by William Ferrell <wf...@gmail.com> on 2015/07/04 05:08:45 UTC

Re: How to timeout a task?

Ted,

Thanks very much for your reply. It took me almost a week, but I finally
had a chance to implement what you suggested, and it appears to work
locally. However, when I launch it on an EC2 cluster, it doesn't work
reliably.
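
For anyone following along: as I understand it, the pattern from that
answer is a SIGALRM-based wrapper, roughly along these lines (a simplified
sketch, not our exact code):

import signal

class ExtractorTimeout(Exception):
    pass

def _on_alarm(signum, frame):
    raise ExtractorTimeout()

def call_with_alarm(func, args, timeout_secs, default):
    # The alarm only interrupts func if the interpreter gets a chance to
    # run the handler between bytecodes; it also assumes we are in the
    # worker's main thread, since signal handlers are restricted to it.
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_secs)
    try:
        return func(*args)
    except ExtractorTimeout:
        return default
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)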

To expand: I think the issue is that some of our code holds the Python GIL
for long stretches (presumably because it calls into C extension code), so
no in-process timeout ever gets a chance to fire. That is why I was hoping
to learn of a task-level timeout -- something at the Spark level, the
management level -- so that Spark itself can decide a task has taken too
long, kill it, and move on.

Does this make sense?  Are you familiar with any such options?
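
In the meantime, the only GIL-proof workaround I can see is to run each
extractor call in a child process, which the parent can kill no matter
what the child is executing. A rough sketch (assuming the extractor
function and its arguments are picklable; run_extractor and the 30-second
limit are made up for illustration):

import multiprocessing

def run_with_timeout(func, args, timeout_secs, default):
    # A separate OS process can be killed even while it is stuck inside
    # C extension code, since it has its own interpreter and its own GIL.
    pool = multiprocessing.Pool(processes=1)
    async_result = pool.apply_async(func, args)
    try:
        return async_result.get(timeout=timeout_secs)
    except multiprocessing.TimeoutError:
        return default
    finally:
        pool.terminate()  # SIGTERM the child; get() alone does not kill it
        pool.join()

# e.g.: rdd.map(lambda f: run_with_timeout(run_extractor, (f,), 30, None))

The obvious cost is a fork per call; reusing one pool per partition (via
mapPartitions) would amortize that.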

Best,

- Bill


On Sat, Jun 27, 2015 at 9:26 AM, Ted Yu <yu...@gmail.com> wrote:

> Have you looked at:
>
> http://stackoverflow.com/questions/2281850/timeout-function-if-it-takes-too-long-to-finish
>
> FYI
>
> On Sat, Jun 27, 2015 at 8:33 AM, wasauce <wf...@gmail.com> wrote:
>
>> Hello!
>>
>> We use pyspark to run a set of data extractors (think regexes). The
>> extractors generally run quite quickly and find a few matches, which are
>> returned and stored in a database.
>>
>> My question is: is it possible to give the function that runs the
>> extractors a timeout? I.e., if the extractor runs for more than X seconds
>> on a given file, it terminates and returns a default value?
>>
>> Here is a code snippet of what we are doing, with comments marking the
>> function I am looking to time out.
>>
>> code: https://gist.github.com/wasauce/42a956a1371a2b564918
>>
>> Thank you
>>
>> - Bill