Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2014/08/30 21:10:54 UTC

[jira] [Commented] (SPARK-3327) Make broadcasted value mutable for caching useful information

    [ https://issues.apache.org/jira/browse/SPARK-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116532#comment-14116532 ] 

Apache Spark commented on SPARK-3327:
-------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/2217

> Make broadcasted value mutable for caching useful information
> -------------------------------------------------------------
>
>                 Key: SPARK-3327
>                 URL: https://issues.apache.org/jira/browse/SPARK-3327
>             Project: Spark
>          Issue Type: New Feature
>            Reporter: Liang-Chi Hsieh
>
> When implementing some algorithms, it is helpful to cache intermediate information for later use.
> Specifically, we would like to perform operation "A" on each partition of the data, which updates some variables. Then we want to run operation "B" on the data as well, and "B" uses the variables updated by "A".
> One example is Liblinear on Spark from Dr. Lin. The problem is discussed in Section IV.D of the paper "Large-scale Logistic Regression and Linear Support Vector Machines Using Spark."
> Currently, broadcast variables only partially satisfy this need. We can broadcast variables to reduce communication costs. However, because broadcast variables cannot be modified, they do not solve the problem: we may need to collect the updated variables back to the master and broadcast them again before running the next data operation.
> I would like to add an interface to broadcast variables that makes them mutable, so that later data operations can reuse them.
>  
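
For illustration only, the collect-and-rebroadcast workaround described above can be sketched in plain Python. This is a simulation under stated assumptions, not Spark code and not the interface proposed in the pull request: partitions are ordinary lists, and broadcast(), operation_a(), operation_b(), and the "weight"/"offset" state are hypothetical names chosen for the example.

```python
# Plain-Python simulation of the collect-and-rebroadcast workaround.
# No Spark involved: "partitions" are lists and a "broadcast" is a
# read-only snapshot of a dict. All names here are hypothetical.
from types import MappingProxyType


def broadcast(values):
    """Stand-in for a broadcast variable: a read-only view of a copy."""
    return MappingProxyType(dict(values))


def operation_a(partition, bc):
    """Per-partition step that computes an update from the data."""
    # The snapshot cannot be modified here, so the update is returned.
    return sum(x * bc["weight"] for x in partition)


def operation_b(partition, bc):
    """Per-partition step that needs the state produced by operation A."""
    return [x - bc["offset"] for x in partition]


def run(partitions):
    bc = broadcast({"weight": 2, "offset": 0})
    # Operation A runs on every partition; results are gathered on the driver.
    partial = [operation_a(p, bc) for p in partitions]
    count = sum(len(p) for p in partitions)
    # The driver merges the updates and broadcasts a fresh snapshot.
    bc = broadcast({"weight": 2, "offset": sum(partial) / count})
    # Operation B reads the re-broadcast state.
    return [operation_b(p, bc) for p in partitions]


result = run([[1, 2], [3, 4]])
```

The extra round trip through the driver between operation A and operation B is the cost the issue hopes to avoid by letting the broadcast state be updated in place.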



