Posted to user@giraph.apache.org by Pradeep Gollakota <pr...@gmail.com> on 2013/01/10 21:23:34 UTC

PageRankBenchmark scaling

Hi All,

I'm trying to run some benchmarks using the PageRankBenchmark tool on my
cluster. However, I'm seeing some scaling issues.

My cluster has 4 nodes, configured to run 24 map tasks. I'm running the
benchmark with 23 workers. I've been able to get it to scale up to 256 million
edges (16 million vertices with 16 edges per vertex). However, when I try to
scale higher than that, I get "GC overhead limit exceeded" errors. I tried
modifying the PageRankComputation class to reuse objects, but to no avail.

Does anyone have any thoughts on how I can scale this higher on my cluster?
I'm trying to get to about 50 million vertices with 150 edges per vertex
(7.5 billion edges).

Thanks
Pradeep
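
For a sense of scale, the jump from the working run to the target can be put into numbers. The sketch below is only a back-of-the-envelope estimate: it assumes 16 bytes per serialized edge (a long target id plus a double value) and 16 bytes per serialized vertex (a long id plus a double value). Those byte counts and the function name are assumptions for illustration, not measurements, and live Java objects typically take several times more heap than their serialized form.

```python
def graph_size(vertices, edges_per_vertex, workers,
               bytes_per_edge=16, bytes_per_vertex=16):
    """Return (total_edges, serialized GiB per worker) for a uniform graph.

    bytes_per_edge and bytes_per_vertex are assumed serialized sizes
    (long id + double value); they are not measured from Giraph.
    """
    edges = vertices * edges_per_vertex
    total_bytes = edges * bytes_per_edge + vertices * bytes_per_vertex
    per_worker_gib = total_bytes / workers / 2**30
    return edges, per_worker_gib


# The run that works today: 16 million vertices, 16 edges each, 23 workers.
current_edges, current_gib = graph_size(16_000_000, 16, 23)

# The target: 50 million vertices, 150 edges each, same 23 workers.
target_edges, target_gib = graph_size(50_000_000, 150, 23)

print(f"current: {current_edges:,} edges, ~{current_gib:.2f} GiB/worker")
print(f"target:  {target_edges:,} edges, ~{target_gib:.2f} GiB/worker")
print(f"edge growth: {target_edges / current_edges:.1f}x")
```

Under these assumptions the target graph has about 29 times as many edges, and its serialized form alone comes to roughly 4.9 GiB per worker before counting messages or object overhead.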

Re: PageRankBenchmark scaling

Posted by Claudio Martella <cl...@gmail.com>.
I suggest you start by trying ByteArrayPartition, then move on to
out-of-core messages and/or out-of-core graph.
Also, make sure the mapper tasks get enough heap memory in the Hadoop
cluster configuration.
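
As a rough illustration of the suggestions above, the sketch below assembles the corresponding `-D` options for a job launch. The property names (`giraph.partitionClass`, `giraph.useOutOfCoreGraph`, `giraph.useOutOfCoreMessages`, `mapred.child.java.opts`) and the package of `ByteArrayPartition` are assumptions based on Giraph 1.0-era and Hadoop 1.x naming, and the helper function itself is hypothetical; check them against the Giraph and Hadoop versions actually in use.

```python
def giraph_options(heap_mb, out_of_core_graph=False, out_of_core_messages=False):
    """Build a list of -D options for a Giraph job (hypothetical helper).

    All property names below are assumptions to verify against your version.
    """
    opts = {
        # Heap available to each map task (Hadoop 1.x style property).
        "mapred.child.java.opts": f"-Xmx{heap_mb}m",
        # Store each partition as a serialized byte array instead of objects.
        "giraph.partitionClass": "org.apache.giraph.partition.ByteArrayPartition",
    }
    if out_of_core_graph:
        opts["giraph.useOutOfCoreGraph"] = "true"
    if out_of_core_messages:
        opts["giraph.useOutOfCoreMessages"] = "true"
    return [f"-D{key}={value}" for key, value in opts.items()]


# First step: ByteArrayPartition plus a larger heap only.
step_one = giraph_options(heap_mb=4096)

# Next step: additionally spill messages and graph partitions to disk.
step_two = giraph_options(heap_mb=4096,
                          out_of_core_graph=True,
                          out_of_core_messages=True)

print(" ".join(step_two))
```

The 4096 MB heap is a placeholder; the right value depends on how much memory each node has for its map slots.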

-- 
   Claudio Martella
   claudio.martella@gmail.com