Posted to issues@spark.apache.org by "gao (JIRA)" <ji...@apache.org> on 2018/05/22 09:38:00 UTC

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

     [ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gao updated SPARK-24342:
------------------------
    Description: 
When a set of tasks runs concurrently, the overall execution time can be shortened if the relatively large (long-running) tasks are scheduled before the 
small ones.
Therefore, Spark should add a feature that schedules the large tasks of a task set first and the small tasks afterwards.
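
To illustrate the claim, here is a minimal sketch in plain Python (not Spark code) that simulates a fixed number of parallel slots and compares two start orders. The function name makespan and the sample durations are assumptions made for this example. Starting the largest tasks first is the well-known longest-processing-time-first heuristic; it often shortens the finish time but is not guaranteed to be optimal.

```python
import heapq


def makespan(durations, slots):
    """Return the finish time when tasks start in the given order on `slots` parallel slots."""
    # Each heap entry is the time at which one slot becomes free.
    free_at = [0.0] * slots
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # the slot that frees up earliest takes the next task
        heapq.heappush(free_at, start + d)
    return max(free_at)


# Hypothetical task durations: four small tasks and one large task.
durations = [1, 1, 1, 1, 8]

small_first = makespan(sorted(durations), slots=2)
large_first = makespan(sorted(durations, reverse=True), slots=2)
print(small_first, large_first)  # 10.0 8.0
```

With two slots, starting the large task last leaves one slot idle while the large task finishes, whereas starting it first lets the small tasks fill the other slot in the meantime.
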
   The expected duration of a task can be estimated from its historical execution time: a task that took a long time in the past is assumed to take a long time in 
the future. Record each task's last execution time together with the task's key in a log file, and load that log file on the next run. Using the 
RangePartitioner and related partitioning methods to prioritize large tasks can then optimize the execution of concurrent tasks.
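
The record-and-reload idea could look roughly like the sketch below, again in plain Python rather than Spark. The JSON file format, the helper names (save_history, load_history, order_large_first), and the choice to treat tasks without history as duration 0 are all assumptions made for illustration; the proposal does not specify them.

```python
import json
import os
import tempfile


def save_history(path, durations):
    """Write a {task key: last execution time in seconds} mapping to a log file."""
    with open(path, "w") as f:
        json.dump(durations, f)


def load_history(path):
    """Load the mapping written by save_history; return an empty one on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def order_large_first(task_keys, history, default=0.0):
    """Sort task keys so that tasks with the longest recorded time come first.

    Keys without history get `default` (an assumption; 0.0 puts them last).
    """
    return sorted(task_keys, key=lambda k: history.get(k, default), reverse=True)


# Previous run: record how long each task took (hypothetical values).
path = os.path.join(tempfile.mkdtemp(), "task_history.json")
save_history(path, {"a": 2.5, "b": 40.0, "c": 7.0})

# Next run: load the log and decide the start order; "d" has no history yet.
history = load_history(path)
order = order_large_first(["a", "b", "c", "d"], history)
print(order)  # ['b', 'c', 'a', 'd']
```

How new tasks without history should be ranked is a design choice: a default of 0 schedules them last, while a large default would schedule them first as a precaution.
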

> Large Task prior scheduling to Reduce overall execution time
> ------------------------------------------------------------
>
>                 Key: SPARK-24342
>                 URL: https://issues.apache.org/jira/browse/SPARK-24342
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer
>    Affects Versions: 2.3.0
>            Reporter: gao
>            Priority: Minor
>
> When a set of tasks runs concurrently, the overall execution time can be shortened if the relatively large (long-running) tasks are scheduled before the 
> small ones.
> Therefore, Spark should add a feature that schedules the large tasks of a task set first and the small tasks afterwards.
>    The expected duration of a task can be estimated from its historical execution time: a task that took a long time in the past is assumed to take a long time in 
> the future. Record each task's last execution time together with the task's key in a log file, and load that log file on the next run. Using the 
> RangePartitioner and related partitioning methods to prioritize large tasks can then optimize the execution of concurrent tasks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
