Posted to user@giraph.apache.org by li...@gmx.net on 2011/12/20 16:29:06 UTC

Is there a global state I can use?

Hi all,

I plan to use Giraph for a use case where nodes send messages depending on the global distribution of a value. For instance, each node has a numeric value, so there is a global distribution of that value. I want every node whose value is in, say, the top 1% of all values to take an action, i.e., send messages.
How could I do this?
Thinking in Hadoop MapReduce terms, I'd use the distributed cache to maintain a fingerprint of the global distribution.
Would this work in Giraph too?

Thanks and BR!
christoph

Re: Is there a global state I can use?

Posted by Claudio Martella <cl...@gmail.com>.
Hi,

a general way of collecting data from all the vertices is to use an
Aggregator. An aggregator collects values from all the vertices (those
that choose to write to it), and the aggregated result can be read by
all the vertices. You could easily compute your statistics from there.
Aggregators are computed both on the workers and on the master, so this
should scale quite well.
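
To make the idea concrete, here is a minimal plain-Java sketch of the
pattern, not the Giraph Aggregator API. It assumes the value range is
known up front and that a fixed-bucket histogram is an acceptable
"fingerprint" of the distribution. The class name DistributionSketch
and its methods add, merge and thresholdForTop are hypothetical names
chosen for illustration. Each worker fills a partial histogram, the
partial histograms are merged, and every vertex can then compare its
own value against the resulting threshold.

```java
// Plain-Java sketch of the aggregator idea; none of these names are Giraph API.
class DistributionSketch {
    private final double min;     // assumed known lower bound of the values
    private final double max;     // assumed known upper bound of the values
    private final long[] counts;  // one counter per equal-width bucket

    DistributionSketch(double min, double max, int buckets) {
        this.min = min;
        this.max = max;
        this.counts = new long[buckets];
    }

    /** Called once per vertex to contribute its value (the "write" side). */
    void add(double value) {
        int i = (int) ((value - min) * counts.length / (max - min));
        // Clamp out-of-range values into the first or last bucket.
        counts[Math.max(0, Math.min(counts.length - 1, i))]++;
    }

    /** Combines two partial histograms with the same layout, e.g. one per worker. */
    void merge(DistributionSketch other) {
        for (int i = 0; i < counts.length; i++) {
            counts[i] += other.counts[i];
        }
    }

    /** Lower edge of the bucket at which the top `fraction` of values begins. */
    double thresholdForTop(double fraction) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        long budget = Math.max(1, Math.round(total * fraction));
        long seen = 0;
        // Walk from the highest bucket down until enough values are covered.
        for (int i = counts.length - 1; i >= 0; i--) {
            seen += counts[i];
            if (seen >= budget) {
                return min + i * (max - min) / counts.length;
            }
        }
        return min; // empty histogram
    }

    public static void main(String[] args) {
        // Two "workers", each aggregating the values of its own vertices.
        DistributionSketch worker1 = new DistributionSketch(0, 1000, 1000);
        DistributionSketch worker2 = new DistributionSketch(0, 1000, 1000);
        for (int v = 0; v < 1000; v++) {
            (v % 2 == 0 ? worker1 : worker2).add(v);
        }
        worker1.merge(worker2); // the "master" step: combine partial results
        double threshold = worker1.thresholdForTop(0.01);
        System.out.println("Vertices with value >= " + threshold + " send messages");
    }
}
```

The threshold is only as precise as the bucket width, so more buckets
give a tighter cut at the cost of a larger aggregated value. In Giraph
itself, a vertex would presumably read the aggregated result in a later
superstep than the one in which the values were contributed.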

Hope it helps,
Claudio




-- 
   Claudio Martella
   claudio.martella@gmail.com