Posted to user@spark.apache.org by Serge Franchois <se...@altran.com> on 2015/07/20 19:00:06 UTC

Broadcast variables in R

I've searched high and low for a way to use broadcast variables in R.
Is it possible at all? I don't see them mentioned in the SparkR API.
Or is there another way of using this feature?

I need to share a large amount of data between executors. 
At the moment, I get warned about my task being too large. 

I have tried pyspark, and there I can use them.

Wkr,

Serge





---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Broadcast variables in R

Posted by "Eskilson,Aleksander" <Al...@Cerner.com>.
Hi Serge,

The broadcast function was made private when SparkR merged into Apache
Spark for the 1.4.0 release. You can still use broadcast by specifying the
private namespace though.

SparkR:::broadcast(sc, obj)
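
For context, a fuller sketch of how that call might fit into a job is below. It assumes SparkR 1.4.x, where the other RDD functions it relies on (parallelize, value, lapply and collect on RDDs) are likewise reachable only through the private namespace. The local master, the variable names and the small random matrix standing in for the large shared data are all made up for illustration, and private functions can change or disappear between releases.

```r
# Sketch only: assumes SparkR 1.4.x, where the RDD API is private (hence `:::`).
library(SparkR)

# A local master is an assumption; point this at your own cluster.
sc <- sparkR.init(master = "local[2]")

# Hypothetical stand-in for the large data that every task needs.
lookup <- matrix(rnorm(1000), nrow = 100, ncol = 10)

# Ship the object to the executors once, instead of with every task.
lookupBr <- SparkR:::broadcast(sc, lookup)

# A small RDD with two partitions to run tasks over.
rdd <- SparkR:::parallelize(sc, 1:4, 2L)

# Inside the task, read the data through the broadcast handle
# rather than referencing `lookup` directly in the closure.
scaled <- SparkR:::lapply(rdd, function(x) {
  sum(SparkR:::value(lookupBr) * x)
})

SparkR:::collect(scaled)

sparkR.stop()
```

The point of going through SparkR:::value(lookupBr) is that the function passed to lapply then carries only the broadcast handle, not the matrix itself, which is what should keep the serialized task small.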

The RDD methods were considered very low-level, and the SparkR devs are
still figuring out which of them they'd like to expose along with the
higher-level DataFrame API. You can see the rationale for the decision on
the project JIRA [1].

[1] -- https://issues.apache.org/jira/browse/SPARK-7230

Hope that helps,
Alek
