Posted to user@spark.apache.org by Daedalus <tu...@gmail.com> on 2014/06/20 07:54:03 UTC
Repeated Broadcasts
I'm trying to use Spark (Java) for an optimization algorithm that needs
repeated server-node exchanges of information (the ADMM algorithm, for
anyone familiar with it). In each iteration, I need to update a set of values on
the nodes and collect them on the server, which will update its own set of
values and pass them to ALL nodes.
Say each node optimizes a variable X={x1, x2, x3...}, while the server
optimizes a variable Z={z1, z2, z3...}.
I am currently using an Accumulable object to collect the updated X's from
each node into an array maintained on the server.
Each node requires a copy of Z to optimize X, and this value of Z will
change on every iteration during optimization.
So, is there any computational advantage to broadcasting Z at each
iteration over simply passing it as a parameter to each node? (Remember, Z
changes on each iteration.)
That is, which of the following snippets should I be implementing:
for (int i = 0; i < iters; i++) {
    broadVar = sc.broadcast(Z);  // re-broadcast the updated Z every iteration
    dataRDD.foreach(new VoidFunction<Data>() {
        public void call(Data d) {
            X = d.optimize(broadVar.value());
            accum.add(X);
        }
    });
    Z = optimize_Z(accum);
}
*OR*
for (int i = 0; i < iters; i++) {
    dataRDD.foreach(new VoidFunction<Data>() {
        public void call(Data d) {
            X = d.optimize(Z);  // Z is captured directly by the function
            accum.add(X);
        }
    });
    Z = optimize_Z(accum);
}
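
For reference, the exchange pattern shared by both snippets (each node computes X from the current Z, the server gathers the contributions and recomputes Z) can be simulated in plain Java without Spark. This is only a sketch under stated assumptions: the class AdmmSketch, the Node type, the scalar objective 0.5 * (x - a)^2 per node, and the penalty parameter rho are all hypothetical choices made so the example runs on its own. It says nothing about which of the two Spark variants is cheaper.

```java
import java.util.ArrayList;
import java.util.List;

public class AdmmSketch {

    /** Hypothetical stand-in for one node's data and local state. */
    static final class Node {
        final double a;   // local data: this node minimizes 0.5 * (x - a)^2
        double x = 0.0;   // local variable X
        double u = 0.0;   // scaled dual variable (assumed; standard in consensus ADMM)

        Node(double a) { this.a = a; }

        /** One local step given the server's current Z; returns the value to accumulate. */
        double optimize(double z, double rho) {
            u += x - z;                             // dual update using the newly received Z
            x = (a + rho * (z - u)) / (1.0 + rho);  // closed-form minimizer of the local problem
            return x + u;
        }
    }

    /** Runs the node/server loop locally and returns the final Z. */
    static double run(double[] data, double rho, int iters) {
        List<Node> nodes = new ArrayList<>();
        for (double a : data) {
            nodes.add(new Node(a));
        }

        double z = 0.0;
        for (int i = 0; i < iters; i++) {
            final double zSnapshot = z;  // the single Z value every node sees this iteration
            double accum = 0.0;          // plays the role of the accumulator on the server
            for (Node n : nodes) {
                accum += n.optimize(zSnapshot, rho);
            }
            z = accum / nodes.size();    // server-side update of Z
        }
        return z;
    }

    public static void main(String[] args) {
        // For this toy objective, Z should approach the mean of the data (3.0 here).
        System.out.println(run(new double[] {1.0, 2.0, 6.0}, 1.0, 200));
    }
}
```

The zSnapshot variable marks the point where, in Spark, Z would either be wrapped in a broadcast variable or captured by the function passed to foreach. Either way, every node must see the same Z within one iteration and a fresh Z in the next.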
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Repeated-Broadcasts-tp7977.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Repeated Broadcasts
Posted by Daedalus <tu...@gmail.com>.
Has anyone used this sort of construct? (Read: bump)
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Repeated-Broadcasts-tp7977p8063.html