Posted to user@spark.apache.org by Ghousia Taj <gh...@gmail.com> on 2014/06/19 14:26:36 UTC
Query on Merge Message (Graph: pregel operator)
Hi,
Can someone please clarify a small question about the Graph.pregel operator?
As per the documentation for the mergeMsg function, only two inbound messages
can be merged into a single value. Is that actually the case? If so, how can
one merge n inbound messages?
Any help is truly appreciated.
Many Thanks,
Ghousia.
Re: Query on Merge Message (Graph: pregel operator)
Posted by Ankur Dave <an...@gmail.com>.
Many merge operations can be broken up to work incrementally. For example,
if the merge operation is to sum *n* rank updates, then you can set mergeMsg
= (a, b) => a + b and this function will be applied pairwise to all *n*
updates in arbitrary order to yield a final sum. Addition, multiplication,
min, and max are operations that work in this manner (they are associative
and commutative). A mean can be handled the same way by merging (sum, count)
pairs and dividing at the end; averaging two values at a time is not
associative on its own.
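
As a rough illustration of the point above, here is a minimal sketch in plain Scala with no Spark dependency. The names MergeDemo and mergeAll are hypothetical: mergeAll only stands in for the framework applying a two-argument mergeMsg repeatedly to *n* inbound messages, and is not part of GraphX.

```scala
object MergeDemo {
  // Pairwise merge functions of the shape mergeMsg takes: (A, A) => A.
  val sumMsg: (Double, Double) => Double = (a, b) => a + b
  val maxMsg: (Double, Double) => Double = (a, b) => math.max(a, b)

  // A mean is not directly mergeable, but (sum, count) pairs are.
  val meanMsg: ((Double, Long), (Double, Long)) => (Double, Long) =
    (a, b) => (a._1 + b._1, a._2 + b._2)

  // Hypothetical stand-in for the framework: fold n messages down to one
  // value by applying the two-argument function repeatedly.
  def mergeAll[A](msgs: Seq[A])(mergeMsg: (A, A) => A): A = msgs.reduce(mergeMsg)

  val updates: Seq[Double] = Seq(4.0, 1.0, 3.0, 2.0)

  val total: Double = mergeAll(updates)(sumMsg)
  val largest: Double = mergeAll(updates)(maxMsg)

  // Each message carries (value, 1); divide once after everything is merged.
  val (sum, count) = mergeAll(updates.map(u => (u, 1L)))(meanMsg)
  val mean: Double = sum / count
}
```

Because the functions are associative and commutative, reordering updates does not change the merged result, which is why the order in which messages arrive does not matter.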
If you absolutely must operate on all *n* messages at once, for example to
find the median, then a workaround is to emit Array(m) instead of m in the
sendMsg function, and then to set mergeMsg = (a, b) => a ++ b. This will
accumulate all inbound messages into an array which you can access in vprog.
However, it will be much slower for graphs with high-degree vertices,
because the accumulated arrays can grow very large.
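
The array workaround could look roughly like the sketch below. It assumes a graph whose vertex attribute is a Double and whose edge attribute is an Int, and runs a single Pregel round; the object name MedianOfInbound and the helpers median and run are made up for illustration, and Spark GraphX must be on the classpath.

```scala
import org.apache.spark.graphx._

object MedianOfInbound {
  // Median of a non-empty array (mean of the two middle values for even n).
  def median(xs: Array[Double]): Double = {
    val s = xs.sorted
    val mid = s.length / 2
    if (s.length % 2 == 1) s(mid) else (s(mid - 1) + s(mid)) / 2.0
  }

  // One Pregel round: each vertex replaces its value with the median of the
  // values sent along its inbound edges. Vertices that receive nothing keep
  // their current value.
  def run(graph: Graph[Double, Int]): Graph[Double, Int] =
    graph.pregel(Array.empty[Double], maxIterations = 1)(
      // vprog sees the accumulated array (empty for the initial message).
      (id, attr, msgs) => if (msgs.isEmpty) attr else median(msgs),
      // sendMsg wraps each value in a one-element array.
      triplet => Iterator((triplet.dstId, Array(triplet.srcAttr))),
      // mergeMsg concatenates, so all n messages survive the pairwise merge.
      (a, b) => a ++ b
    )
}
```

Concatenation is itself associative, which is what lets it serve as a mergeMsg; the cost is that a vertex with *n* inbound edges ends up holding an *n*-element array.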
Ankur <http://www.ankurdave.com/>