Posted to dev@giraph.apache.org by Sebastian Schelter <ss...@apache.org> on 2013/01/16 17:05:34 UTC

Reduce memory footprint of RandomWalkVertex

Hi,

I'm currently working on GIRAPH-480 to add a convergence check to
RandomWalkVertex (which is an abstract version of PageRank and
RandomWalkWithRestart).

RandomWalkVertex extends LongDoubleFloatDoubleEdgeListVertex, which means
that the edge values (the transition probabilities between the vertices)
are explicitly modeled. AFAIK in most cases these probabilities are
taken as uniform, so we could simply use 1 / getNumEdges() as the
transition probability and save a lot of space by omitting the edge
values for each vertex. RandomWalkVertex could then simply extend
LongDoubleNullDoubleVertex.
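
To make the idea concrete, here is a minimal standalone sketch in plain
Java. It does not use the Giraph API; the class and method names
(UniformTransitionSketch, transitionProbability, step) and the adjacency
array representation are hypothetical and only for illustration. It
derives the transition probability from the out-degree instead of
reading a stored edge value. Note that in Java the division has to be
done in floating point (1.0 / numEdges), since 1 / numEdges with an int
would truncate to 0 for any vertex with more than one edge. Dangling
vertices are simply skipped here, which a real implementation would
have to handle.

```java
import java.util.Arrays;

/** Standalone sketch without Giraph dependencies; all names are hypothetical. */
public class UniformTransitionSketch {

  /** Uniform transition probability for a vertex with numEdges out-edges. */
  public static double transitionProbability(int numEdges) {
    if (numEdges <= 0) {
      throw new IllegalArgumentException("vertex has no out-edges");
    }
    // Floating-point division; 1 / numEdges would be integer division.
    return 1.0 / numEdges;
  }

  /**
   * One random-walk step: every vertex spreads its current probability
   * mass evenly over its out-neighbors. adjacency[v] lists the targets
   * of v's out-edges; no per-edge value is stored.
   */
  public static double[] step(int[][] adjacency, double[] current) {
    double[] next = new double[current.length];
    for (int v = 0; v < adjacency.length; v++) {
      int numEdges = adjacency[v].length;
      if (numEdges == 0) {
        continue; // dangling vertex: its mass is dropped in this sketch
      }
      double p = transitionProbability(numEdges);
      for (int target : adjacency[v]) {
        next[target] += current[v] * p;
      }
    }
    return next;
  }

  public static void main(String[] args) {
    // Tiny example graph: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}
    int[][] adjacency = {{1, 2}, {2}, {0}};
    double[] state = {1.0 / 3, 1.0 / 3, 1.0 / 3};
    System.out.println(Arrays.toString(step(adjacency, state)));
  }
}
```

The only per-vertex information needed is the list of targets, which is
what makes dropping the edge values possible in the uniform case.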

I think this issue is pretty important, as RandomWalkVertex should be
the basis for a real-world PageRank implementation (that can deal with
dangling nodes and has a convergence check).

Best,
Sebastian

PS: It's great to see how much progress Giraph has made over the last
few months!