Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2014/09/14 01:22:33 UTC

[jira] [Resolved] (SPARK-3030) reuse python worker

     [ https://issues.apache.org/jira/browse/SPARK-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen resolved SPARK-3030.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0

Issue resolved by pull request 2259
[https://github.com/apache/spark/pull/2259]

> reuse python worker
> -------------------
>
>                 Key: SPARK-3030
>                 URL: https://issues.apache.org/jira/browse/SPARK-3030
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>            Reporter: Davies Liu
>            Assignee: Davies Liu
>             Fix For: 1.2.0
>
>
> Currently, Spark forks a new Python worker for each task; it would be better if we could reuse workers for later tasks.
> This would be very useful for large datasets with big broadcast variables, since the broadcast data would not need to be sent to the workers again and again. It would also reduce the overhead of launching a task.
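The idea in the description can be sketched as follows. This is a toy illustration of fork-per-task versus worker reuse; the `Worker` class and helper functions here are hypothetical and stand in for PySpark's actual daemon/worker protocol, which is more involved:

```python
class Worker:
    def __init__(self, broadcast):
        # Expensive one-time setup: in real PySpark this corresponds to
        # forking a Python process and shipping broadcast variables to it.
        self.broadcast = dict(broadcast)
        self.setup_count = 1

    def run(self, task):
        # Broadcast data is already resident in the worker; only the
        # task itself runs.
        return task(self.broadcast)

def run_tasks_fork_per_task(tasks, broadcast):
    # Old behavior: a fresh worker (and a fresh broadcast copy) per task.
    results, setups = [], 0
    for task in tasks:
        w = Worker(broadcast)
        setups += w.setup_count
        results.append(w.run(task))
    return results, setups

def run_tasks_reused(tasks, broadcast):
    # New behavior: one worker is set up once and reused for every task.
    w = Worker(broadcast)
    results = [w.run(task) for task in tasks]
    return results, w.setup_count
```

Both strategies compute the same results, but the reused worker pays the setup (and broadcast transfer) cost only once instead of once per task. In Spark 1.2 this behavior is controlled by the `spark.python.worker.reuse` configuration option.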



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org