Posted to user@spark.apache.org by Utkarsh Sengar <ut...@gmail.com> on 2016/06/16 04:45:32 UTC

How to deal with tasks running too long?

This SO question was asked about a year ago:
http://stackoverflow.com/questions/31799755/how-to-deal-with-tasks-running-too-long-comparing-to-others-in-job-in-yarn-cli

I answered it with a suggestion to try speculation, but that doesn't
quite do what the OP expects. I have been running into this issue more
often lately: out of 5000 tasks, 4950 complete in about 5 minutes, but
the last 50 never finish; I have waited as long as 4 hours. It could be
a memory issue, or it could be how Spark's fine-grained mode works with
Mesos, so I am trying to enable JmxSink to get a heap dump.
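
For reference, a minimal sketch of turning speculation on in Java; the
threshold values here are illustrative, not recommendations:

    import org.apache.spark.SparkConf;

    SparkConf conf = new SparkConf()
        .set("spark.speculation", "true")            // re-launch suspected stragglers
        .set("spark.speculation.quantile", "0.75")   // check only after 75% of tasks finish
        .set("spark.speculation.multiplier", "1.5"); // "slow" = 1.5x the median task time

Note that speculation launches a duplicate of a slow task rather than
killing the original, which is why it often doesn't help when a task is
slow for deterministic reasons such as skew. The JmxSink I mentioned is
enabled in metrics.properties, along the lines of:

    *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink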

But in the meantime, is there a better fix for this (in any version of
Spark; I am on 1.5.1 but can upgrade)? It would be great if the last 50
tasks in my example could be killed (timed out) so the stage completes
successfully.

-- 
Thanks,
-Utkarsh

Re: How to deal with tasks running too long?

Posted by Utkarsh Sengar <ut...@gmail.com>.
Thanks all. I know I have data skew, but the data is unpredictable and
the skewed keys are hard to pin down every time.
Do you think this workaround is reasonable?

import java.util.concurrent.*;

// Run the simulation on a separate thread so we can bound the wait.
ExecutorService executor = Executors.newCachedThreadPool();
Callable<Result> task = () -> simulation.run();
Future<Result> future = executor.submit(task);
try {
    // Block for at most 20 minutes waiting for a result.
    simResult = future.get(20, TimeUnit.MINUTES);
} catch (TimeoutException ex) {
    SPARKLOG.info("Task timed out");
    future.cancel(true); // interrupt the runaway simulation thread
} catch (InterruptedException | ExecutionException ex) {
    SPARKLOG.info("Task failed", ex);
}

This forces a timeout on the task if it runs for more than 20 minutes.
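
One caveat (assuming simulation.run() is the long computation running
inside the Spark task): future.get() only bounds how long we wait; after
a timeout the worker thread keeps running unless it responds to
interruption, which is why the sketch above calls future.cancel(true).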





-- 
Thanks,
-Utkarsh

Re: How to deal with tasks running too long?

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

I'd check the Details for Stage page in the web UI.

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski





Re: How to deal with tasks running too long?

Posted by Jeff Zhang <zj...@gmail.com>.
This may be due to data skew.
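
A quick way to check (a sketch; "pairs" is a placeholder for a
JavaPairRDD<String, Record> keyed the same way as the slow stage):

    import java.util.Map;

    // Count records per key and print the heaviest keys; a few keys
    // holding most of the data usually explains straggler tasks.
    Map<String, Long> counts = pairs.countByKey();
    counts.entrySet().stream()
          .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
          .limit(10)
          .forEach(e -> System.out.println(e.getKey() + " -> " + e.getValue()));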




-- 
Best Regards

Jeff Zhang