Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2019/07/30 14:56:00 UTC

[jira] [Assigned] (SPARK-26755) Optimize Spark Scheduler to dequeue speculative tasks more efficiently

     [ https://issues.apache.org/jira/browse/SPARK-26755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Imran Rashid reassigned SPARK-26755:
------------------------------------

    Assignee: Parth Gandhi

> Optimize Spark Scheduler to dequeue speculative tasks more efficiently
> ----------------------------------------------------------------------
>
>                 Key: SPARK-26755
>                 URL: https://issues.apache.org/jira/browse/SPARK-26755
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>    Affects Versions: 3.0.0
>            Reporter: Parth Gandhi
>            Assignee: Parth Gandhi
>            Priority: Minor
>         Attachments: Screen Shot 2019-01-28 at 11.21.05 AM.png, Screen Shot 2019-01-28 at 11.21.25 AM.png, Screen Shot 2019-01-28 at 11.22.42 AM.png
>
>
> Currently, the Spark scheduler takes quite some time to dequeue speculative tasks for large task sets within a stage (for example, 100000 tasks or more) when speculation is turned on. On further analysis, it was found that the "task-result-getter" threads remain blocked on one of the dispatcher-event-loop threads holding the lock on the TaskSchedulerImpl object
> {code:java}
> def resourceOffers(offers: IndexedSeq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
> {code}
> which takes quite some time to execute the method "dequeueSpeculativeTask" in TaskSetManager.scala, thus slowing down the overall running time of the Spark job. We monitored the time utilization of that lock for the whole duration of the job, and it was close to 50%, i.e., the code within the synchronized block ran for almost half the duration of the entire Spark job. Screenshots of the thread dump are attached for reference.
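
The lock-utilization figure quoted above can be approximated with a small wrapper around a synchronized section. The following is only a minimal sketch under stated assumptions: LockTimer, timed, and utilization are hypothetical names introduced for illustration and are not part of Spark, and the issue does not say how the reporter actually collected the measurement. The clock is injectable so that the arithmetic can be checked deterministically.

```scala
// Hypothetical helper (not part of Spark): records how long this object's
// monitor is held, so the held time can be compared with total elapsed time.
final class LockTimer(clock: () => Long = () => System.nanoTime()) {
  private val created = clock()   // reference point for total elapsed time
  private var heldNanos = 0L      // accumulated time spent inside `timed`

  /** Runs `body` while holding this object's monitor and records the time spent inside. */
  def timed[T](body: => T): T = synchronized {
    // The start time is taken after the monitor is acquired, so time spent
    // waiting for the lock is not counted as time holding it.
    val start = clock()
    try body
    finally heldNanos += clock() - start
  }

  /** Fraction of the time since construction during which the monitor was held. */
  def utilization: Double = synchronized {
    val elapsed = clock() - created
    if (elapsed <= 0) 0.0 else heldNanos.toDouble / elapsed
  }
}
```

Wrapping the body of a method such as resourceOffers in timed { ... } and reading utilization at the end of a job would give a ratio comparable to the roughly 50% described above. Nested calls to timed on the same instance would be counted twice, so this sketch assumes the section is not re-entered.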


