Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:11:04 UTC

[jira] [Resolved] (SPARK-21695) Spark scheduler locality algorithm can take longer than expected

     [ https://issues.apache.org/jira/browse/SPARK-21695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-21695.
----------------------------------
    Resolution: Incomplete

> Spark scheduler locality algorithm can take longer than expected
> ----------------------------------------------------------------
>
>                 Key: SPARK-21695
>                 URL: https://issues.apache.org/jira/browse/SPARK-21695
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.1.0
>            Reporter: Thomas Graves
>            Priority: Major
>              Labels: bulk-closed
>
> Reference JIRA: https://issues.apache.org/jira/browse/SPARK-21656
> I'm seeing an issue with some jobs where the scheduler takes a long time to schedule tasks on executors. The default locality wait is 3 seconds, so I expected an executor to get a task within at most 9 seconds (node local, rack local, any), but it's taking far longer than that. In the case of SPARK-21656 it takes 60+ seconds and the executors hit their idle timeout.
> We should investigate why and see if we can fix this.
> On an initial look, it seems the scheduler resets the locality lastLaunchTime whenever it places any task on a node at that locality level. This appears to mean that it can take much longer than 3 seconds for any particular task to fall back to a less local level, but this needs to be verified.
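
To illustrate the suspected behavior, here is a toy model in Python. It is not Spark's scheduler code: the function name, the four level names, and the launch pattern are assumptions made for illustration. It only shows how resetting a shared timer on every launch at a more local level could delay fallback well beyond the expected 9 seconds (three waits of 3 seconds).

```python
def time_until_any(local_launches, wait=3.0, reset_on_launch=True):
    """Return the time (seconds) at which a toy scheduler first allows ANY.

    local_launches: sorted times at which some task is launched at the most
    local level (hypothetical input, not taken from a real job).
    """
    # Illustrative level names; three waits separate the first level from ANY.
    levels = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]
    any_index = len(levels) - 1
    level = 0           # index of the currently allowed locality level
    last_launch = 0.0   # timestamp the locality wait is measured against

    for t in local_launches:
        # Step through every level whose wait expired before this launch.
        while level < any_index and t - last_launch >= wait:
            last_launch += wait
            level += 1
        if level == any_index:
            return last_launch
        if reset_on_launch:
            # Suspected behavior: a launch at a local level restarts the
            # timer, so the wait for fallback starts over.
            level = 0
            last_launch = t

    # No further launches: the remaining levels expire one wait at a time.
    return last_launch + wait * (any_index - level)


# Hypothetical workload: one local launch every 2 seconds for 60 seconds.
launches = list(range(2, 61, 2))
print(time_until_any([]))                               # no competing launches
print(time_until_any(launches, reset_on_launch=True))   # timer reset each launch
print(time_until_any(launches, reset_on_launch=False))  # timer never reset
```

In this model, with no competing launches or with the reset disabled, ANY is allowed after 9 seconds. With a reset on each launch it is not allowed until 69 seconds, that is, 9 seconds after the last local launch. Whether Spark behaves this way in practice is what the issue asks to be verified.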



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
