Posted to issues@spark.apache.org by "Andrew Ash (JIRA)" <ji...@apache.org> on 2014/11/11 09:28:33 UTC
[jira] [Commented] (SPARK-572) Forbid update of static mutable variables
[ https://issues.apache.org/jira/browse/SPARK-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206143#comment-14206143 ]
Andrew Ash commented on SPARK-572:
----------------------------------
Static mutable variables are now a standard way of having code run on a per-executor basis.
To run code per entry you can use map(), and per partition you can use mapPartitions(), but to run code once per executor you need static variables or initializers. For example, if you want to open a connection to another data storage system and write all of an executor's data into that system, a static connection object is the common way to do that.
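A minimal sketch of that static-connection pattern, written in plain Scala so it runs without a cluster. SinkConnection, SinkHolder, PerExecutorDemo and writePartition are hypothetical names invented for illustration, and the threads only stand in for concurrent tasks inside one executor JVM; none of this is Spark API.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a client of an external storage system.
final class SinkConnection {
  private val buffer = ArrayBuffer.empty[String]
  def write(record: String): Unit = synchronized { buffer += record }
  def written: Int = synchronized { buffer.size }
}

// A Scala `object` is initialized at most once per JVM (per class loader),
// and a `lazy val` inside it is evaluated thread-safely on first access.
object SinkHolder {
  // Counts how many times the connection is actually created.
  val opened = new AtomicInteger(0)

  lazy val connection: SinkConnection = {
    opened.incrementAndGet()
    new SinkConnection
  }
}

object PerExecutorDemo {
  // The kind of body one would hand to a per-partition operation.
  def writePartition(records: Iterator[String]): Unit =
    records.foreach(r => SinkHolder.connection.write(r))

  def main(args: Array[String]): Unit = {
    // Simulate four tasks running concurrently in the same JVM.
    val partitions = Seq(Seq("a", "b"), Seq("c"), Seq("d", "e", "f"), Seq.empty[String])
    val threads = partitions.map { p =>
      new Thread(new Runnable { def run(): Unit = writePartition(p.iterator) })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())

    // All tasks shared one connection: opened=1, written=6.
    println(s"opened=${SinkHolder.opened.get}, written=${SinkHolder.connection.written}")
  }
}
```

In a Spark job, a function like writePartition would typically be passed to rdd.foreachPartition, so that each executor JVM creates SinkHolder.connection on first use and every task on that executor reuses it. The driver's copy of SinkHolder is a separate object in a separate JVM, which is also why this technique can be confusing.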
I would propose closing this ticket as "Won't Fix". Using this technique is confusing, but prohibiting it would be difficult and would put additional roadblocks in front of Spark power users.
cc [~rxin]
> Forbid update of static mutable variables
> -----------------------------------------
>
> Key: SPARK-572
> URL: https://issues.apache.org/jira/browse/SPARK-572
> Project: Spark
> Issue Type: Improvement
> Reporter: tjhunter
>
> Consider the following piece of code:
> <pre>
> object Foo {
>   var xx = -1
>   def main() {
>     xx = 1
>     val sc = new SparkContext(...)
>     sc.broadcast(xx)
>     sc.parallelize(0 to 10).map(i => { ... xx ... })
>   }
> }
> </pre>
> Can you guess the value of xx? It is 1 when you use the local scheduler and -1 when you use the Mesos scheduler. Given the complications, it should probably just be forbidden for now...