Posted to issues@spark.apache.org by "Erik van Oosten (JIRA)" <ji...@apache.org> on 2019/03/02 08:44:00 UTC

[jira] [Comment Edited] (SPARK-27025) Speed up toLocalIterator

    [ https://issues.apache.org/jira/browse/SPARK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782322#comment-16782322 ] 

Erik van Oosten edited comment on SPARK-27025 at 3/2/19 8:43 AM:
-----------------------------------------------------------------

The point is to _not_ fetch proactively.

I have a program in which several steps need to be executed before anything can be transferred to the driver. So why can't the executors start executing immediately, and only transfer the results to the driver when they are ready?
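The requested behavior can be sketched without Spark itself. The following is a minimal model (plain Python, not Spark code; the names `compute_partition` and `eager_local_iterator` are made up for illustration): all partition computations are submitted up front so they run in parallel, while results are still consumed one partition at a time, in order.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_partition(i):
    # Stand-in for the per-partition job an executor would run.
    return [i * 10 + j for j in range(3)]

def eager_local_iterator(num_partitions, pool_size=4):
    # Submit every partition computation immediately so the "executors"
    # (pool workers) start in parallel, but yield ("download") results
    # one partition at a time, in partition order.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(compute_partition, i)
                   for i in range(num_partitions)]
        for f in futures:
            yield from f.result()

result = list(eager_local_iterator(4))
```

In actual Spark, a rough workaround with a similar effect is to force full computation first (e.g. cache the RDD and run a cheap action such as `count()`) before calling `toLocalIterator`, at the cost of materializing all partitions.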


was (Author: erikvanoosten):
I have a program in which several steps need to be executed before anything can be transferred to the driver. So why can't the executors start executing immediately, and only transfer the results to the driver when its ready?

> Speed up toLocalIterator
> ------------------------
>
>                 Key: SPARK-27025
>                 URL: https://issues.apache.org/jira/browse/SPARK-27025
>             Project: Spark
>          Issue Type: Wish
>          Components: Spark Core
>    Affects Versions: 2.3.3
>            Reporter: Erik van Oosten
>            Priority: Major
>
> Method {{toLocalIterator}} fetches the partitions to the driver one by one. However, as far as I can see, the computation required for the yet-to-be-fetched partitions is not kicked off until each partition is fetched. Effectively only one partition is being computed at a time.
> Desired behavior: immediately start computation of all partitions while retaining the download-one-partition-at-a-time behavior.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org