Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:11:04 UTC
[jira] [Resolved] (SPARK-21695) Spark scheduler locality algorithm can take longer than expected
[ https://issues.apache.org/jira/browse/SPARK-21695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-21695.
----------------------------------
Resolution: Incomplete
> Spark scheduler locality algorithm can take longer than expected
> ----------------------------------------------------------------
>
> Key: SPARK-21695
> URL: https://issues.apache.org/jira/browse/SPARK-21695
> Project: Spark
> Issue Type: Bug
> Components: Scheduler
> Affects Versions: 2.1.0
> Reporter: Thomas Graves
> Priority: Major
> Labels: bulk-closed
>
> Reference JIRA: https://issues.apache.org/jira/browse/SPARK-21656
> I'm seeing an issue with some jobs where the scheduler takes a long time to schedule tasks on executors. The default locality wait is 3 seconds, so I was expecting an executor to get some task within at most 9 seconds (node local, rack local, any), but it's taking way more time than that. In the case of SPARK-21656 it takes 60+ seconds and the executors hit their idle timeout.
> We should investigate why and see if we can fix this.
> Upon an initial look, it seems the scheduler resets the locality lastLaunchTime whenever it places any task on a node at that locality level. If so, it can take way longer than 3 seconds for any particular task to fall back to the next level, but this needs to be verified.
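
The suspected behavior can be illustrated with a toy model. The sketch below is not Spark's scheduler code and does not verify the hypothesis. It is a simplified simulation under stated assumptions: four locality levels (PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY), a 3-second wait at each level before ANY, and a hypothetical function seconds_until_any that steps a clock in half-second ticks. It compares how long it takes to fall back to ANY when launches at the most-local level do or do not reset the timer.

```python
LOCALITY_WAIT_S = 3.0  # default locality wait mentioned in the report
# Assumed level order for this toy model; the wait applies to every level before ANY.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]


def seconds_until_any(local_launch_times, reset_on_launch,
                      horizon_s=120.0, tick_s=0.5):
    """Return the first tick at which the allowed locality level is ANY.

    local_launch_times: times (seconds) at which some task is launched at the
        most-local level on another executor.
    reset_on_launch: if True, each such launch resets the level and the timer,
        which is the behavior the report suspects.
    Returns None if ANY is not reached within horizon_s.
    """
    level = 0
    last_launch = 0.0
    launches = sorted(local_launch_times)
    next_launch = 0
    t = 0.0
    while t <= horizon_s:
        # Apply every launch that has happened up to the current tick.
        while next_launch < len(launches) and launches[next_launch] <= t:
            if reset_on_launch:
                level = 0
                last_launch = launches[next_launch]
            next_launch += 1
        # Move to a less-local level each time the wait has fully elapsed.
        while level < len(LEVELS) - 1 and t - last_launch >= LOCALITY_WAIT_S:
            last_launch += LOCALITY_WAIT_S
            level += 1
        if LEVELS[level] == "ANY":
            return t
        t += tick_s
    return None


if __name__ == "__main__":
    # A local task launches somewhere every 2 seconds for one minute.
    launches = [2.0 * i for i in range(1, 31)]
    print(seconds_until_any([], reset_on_launch=True))        # 9.0
    print(seconds_until_any(launches, reset_on_launch=False))  # 9.0
    print(seconds_until_any(launches, reset_on_launch=True))   # 69.0
```

In this model, as long as local launches arrive more often than the 3-second wait, the timer never expires, so an executor that can only be used at ANY stays idle until the local launches stop and the full 9 seconds then elapse.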
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)