Posted to user@spark.apache.org by Ghousia Taj <gh...@gmail.com> on 2014/06/19 14:26:36 UTC

Query on Merge Message (Graph: pregel operator)

Hi,

Can someone please clarify a small question about the Graph.pregel operator?
According to the documentation for the merge message function (mergeMsg),
only two inbound messages can be merged into a single value. Is that actually
the case? If so, how can one merge n inbound messages?

Any help is truly appreciated.

Many Thanks,
Ghousia.




Re: Query on Merge Message (Graph: pregel operator)

Posted by Ankur Dave <an...@gmail.com>.
Many merge operations can be broken up to work incrementally. For example,
if the merge operation is to sum *n* rank updates, then you can set mergeMsg
= (a, b) => a + b and this function will be applied to all *n* updates in
arbitrary order to yield a final sum. Addition, multiplication, min, and max
are operations that work in this manner (they are associative and
commutative). Mean is not associative on its own, but it can be computed the
same way by merging (sum, count) pairs and dividing once at the end.
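
To illustrate the incremental merge described above, here is a minimal
plain-Scala sketch with no Spark dependency. The names MergeMsgSketch,
mergeAll, and mergeSumCount are hypothetical and are not part of GraphX;
mergeAll only stands in for the runtime repeatedly applying the two-argument
function to all inbound messages. The (sum, count) encoding for the mean is
one possible approach, shown here as an assumption.

```scala
object MergeMsgSketch {
  // Binary merge function, the same shape as the mergeMsg argument in the post.
  val mergeMsg: (Double, Double) => Double = (a, b) => a + b

  // Hypothetical stand-in for the runtime: fold n messages pairwise into one.
  def mergeAll[A](msgs: Seq[A])(merge: (A, A) => A): A = msgs.reduce(merge)

  // Assumed encoding for a mean: merge (sum, count) pairs, divide at the end.
  val mergeSumCount: ((Double, Long), (Double, Long)) => (Double, Long) =
    (a, b) => (a._1 + b._1, a._2 + b._2)

  // Four example rank updates arriving at one vertex (made-up values).
  val updates: Seq[Double] = Seq(0.5, 1.5, 2.0, 4.0)

  // The merge order does not change the sum for these values.
  val total: Double = mergeAll(updates)(mergeMsg)
  val reversedTotal: Double = mergeAll(updates.reverse)(mergeMsg)

  // Each update is sent as (value, 1) so the counts add up alongside the sums.
  val (sum, count) = mergeAll(updates.map(u => (u, 1L)))(mergeSumCount)
  val mean: Double = sum / count
}
```

Because the function only ever sees two values at a time, any number of
inbound messages can be reduced this way as long as the operation gives the
same result regardless of grouping and order.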

If you absolutely must operate on all *n* messages at once, for example to
find the median, then a workaround is to emit Array(m) instead of m in the
sendMsg function, and then to set mergeMsg = (a, b) => a ++ b. This will
accumulate all inbound messages into an array which you can access in vprog.
However, it will be much slower for graphs with high-degree vertices,
because the accumulated arrays can grow very large.
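
The array workaround above can be sketched in plain Scala as follows, again
without Spark. CollectMessagesSketch, wrap, and median are hypothetical names
used only for illustration: wrap plays the role of emitting Array(m) from
sendMsg, and median is the kind of computation vprog could run on the
accumulated array.

```scala
object CollectMessagesSketch {
  // What sendMsg would emit: each message wrapped in a one-element array.
  def wrap(m: Double): Array[Double] = Array(m)

  // mergeMsg concatenates two partial arrays, as suggested in the post.
  val mergeMsg: (Array[Double], Array[Double]) => Array[Double] =
    (a, b) => a ++ b

  // What vprog could compute once it receives all messages in one array.
  def median(msgs: Array[Double]): Double = {
    require(msgs.nonEmpty, "median of an empty message array is undefined")
    val sorted = msgs.sorted
    val mid = sorted.length / 2
    if (sorted.length % 2 == 1) sorted(mid)
    else (sorted(mid - 1) + sorted(mid)) / 2.0
  }

  // Five example inbound messages for one vertex (made-up values).
  val inbound: Seq[Double] = Seq(7.0, 1.0, 5.0, 3.0, 9.0)

  // Pairwise concatenation accumulates every message into a single array.
  val collected: Array[Double] = inbound.map(wrap).reduce(mergeMsg)
  val result: Double = median(collected)
}
```

In an actual Pregel program the vertex program also receives the initial
message on the first superstep, so a real vprog would likely need to handle
an empty array rather than fail as this sketch's median does.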

Ankur <http://www.ankurdave.com/>