Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2014/08/14 08:49:12 UTC

[jira] [Updated] (SPARK-3029) Disable local execution of Spark jobs by default

     [ https://issues.apache.org/jira/browse/SPARK-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-3029:
-------------------------------

            Priority: Blocker  (was: Major)
    Target Version/s: 1.1.0

> Disable local execution of Spark jobs by default
> ------------------------------------------------
>
>                 Key: SPARK-3029
>                 URL: https://issues.apache.org/jira/browse/SPARK-3029
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Aaron Davidson
>            Assignee: Aaron Davidson
>            Priority: Blocker
>
> Currently, local execution of Spark jobs is used only by take(), and it can be problematic because it can load a significant amount of data onto the driver. The worst cases occur when the RDD is cached (the whole partition is guaranteed to be loaded), when its elements are very large, or when the partition is simply large and we apply a filter with high selectivity or high computational overhead.
> Additionally, jobs run locally in this manner do not show up in the web UI, which makes them harder to track and understand.
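To see why a highly selective filter makes driver-side execution expensive, here is a minimal Python sketch (not Spark's actual implementation; the function and data are hypothetical) of a take(n) that scans partitions locally on the driver:

```python
# Hedged sketch: simulates driver-side "local execution" of take().
# Every element of every scanned partition is pulled onto the driver,
# even though only a handful pass the filter.

def take_locally(partitions, predicate, n):
    """Scan partitions sequentially on the driver until n matches are found."""
    results, scanned = [], 0
    for part in partitions:
        for elem in part:          # each element is materialized on the driver
            scanned += 1
            if predicate(elem):
                results.append(elem)
                if len(results) == n:
                    return results, scanned
    return results, scanned

# Four partitions of 1,000 elements each; only multiples of 997 pass.
partitions = [range(i * 1000, (i + 1) * 1000) for i in range(4)]
taken, scanned = take_locally(partitions, lambda x: x % 997 == 0, 3)
# The driver scans 1,995 elements to return just 3 results.
```

With larger partitions or costlier predicates, this driver-side scan grows unboundedly, which is exactly the risk the ticket describes; submitting a normal distributed job instead keeps the heavy scan on the executors.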



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org