Posted to user@giraph.apache.org by Alexander Frolov <al...@gmail.com> on 2014/02/07 15:11:46 UTC

Performance tuning of Giraph on small cluster with Infiniband

Hi, team!

Having read the previous threads, I have started evaluating Giraph on an
IB cluster. I want to share my results here (in case they are useful to
anybody) and ask for your ideas on how to improve performance further.

Test system:
* 8 nodes, each with dual Intel Xeon E5-2630 CPUs (6 cores/CPU) and 80GB RAM
* InfiniBand FDR Dual-Port 4x
* SUSE 11.2
* jdk1.7.0_51

At the moment I am running experiments with the
SimpleShortestPathsComputation test on a generated RMAT graph. I attach a
plot which shows the scalability of Giraph up to 32 workers.

As can be seen from the plot, scalability is almost linear up to 8 workers,
and then (from 8 to 32) the speed does not increase. It seems strange to me
that using the additional cores on each node does not improve the execution
time. Has anybody encountered such behaviour?

Next I am going to use threads instead of workers to utilize the cores.
I am also going to switch to the Hadoop-RDMA project.
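
As a hedged illustration of the threads-instead-of-workers plan: Giraph has
a giraph.numComputeThreads option that can be passed to GiraphRunner with
-ca, while -w sets the number of workers. The sketch below only assembles
such a command line. The jar name is hypothetical, the input/output format
flags that a real job needs are left to the caller, and the choice of one
worker per node with 12 threads (2 CPUs x 6 cores) is an assumption, not a
tested recommendation.

```python
def giraph_command(jar, computation, workers, compute_threads, extra_args=()):
    """Build an argument list for launching a Giraph job via GiraphRunner."""
    cmd = [
        "hadoop", "jar", jar,
        "org.apache.giraph.GiraphRunner", computation,
        "-w", str(workers),  # number of Giraph workers
        "-ca", f"giraph.numComputeThreads={compute_threads}",  # threads per worker
    ]
    cmd.extend(extra_args)  # e.g. input/output format and path flags
    return cmd


cmd = giraph_command(
    jar="giraph-examples.jar",  # hypothetical jar name
    computation="org.apache.giraph.examples.SimpleShortestPathsComputation",
    workers=8,           # assumption: one worker per node
    compute_threads=12,  # assumption: one thread per physical core
)
print(" ".join(cmd))
```

Sweeping workers and compute_threads while keeping their product at or
below the total core count would give a like-for-like comparison with the
workers-only runs.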


If anybody has any suggestions on how I can achieve maximum performance
with Giraph on this cluster, I would be much obliged ;-)

I look forward to your feedback.

Best,
  Alex

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Claudio Martella <cl...@gmail.com>.

That makes around 1M vertices and 32M edges. It's a pretty small graph. Can
you scale it up by an order of magnitude?
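
For reference, the arithmetic behind these figures can be written out as a
small sketch. It assumes that the edge count is the number of adjacency
entries (vertices times average degree), which is how the 32M figure follows
from 2^20 vertices with average degree 32.

```python
def graph_size(scale, avg_degree):
    """Return (vertices, adjacency_entries) for a graph with 2**scale vertices."""
    vertices = 2 ** scale
    edges = vertices * avg_degree  # one adjacency entry per unit of degree
    return vertices, edges


print(graph_size(20, 32))  # (1048576, 33554432)
print(graph_size(24, 32))  # (16777216, 536870912)
```

With the same average degree, the 2^24 graph reported later in this thread
is 16 times larger than the 2^20 one.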


On Fri, Feb 7, 2014 at 5:41 PM, Alexander Frolov
<al...@gmail.com> wrote:

> Undirected RMAT graph, generated by a tool extracted from Graph500. The
> size is 2^20 vertices; the average degree is 32.


-- 
   Claudio Martella

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Sebastian Schelter <ss...@apache.org>.

Hi Alexander,

I think this graph is far too small for serious performance benchmarks;
it only has 1M vertices and 32M edges.

I'd suggest you either create a much larger graph or take a snapshot of
a real one. I personally like to use a snapshot of the Twitter follower
graph [1] from 2009 with 50M vertices and 2B edges for experimenting.

Best,
Sebastian


[1] http://konect.uni-koblenz.de/networks/twitter_mpi


On 02/07/2014 05:41 PM, Alexander Frolov wrote:
> Undirected RMAT graph, generated by a tool extracted from Graph500. The
> size is 2^20 vertices; the average degree is 32.

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Alexander Frolov <al...@gmail.com>.

Hi,

I have obtained new results for a larger graph (undirected RMAT, 2^24
vertices, average degree 32). Here are the results (only computation time
is included):

Workers        2^20          2^24
---------------------------------
1           274.155           n/a
2           103.843     11517.708
4            47.619       815.444
8            29.236       344.801
16           32.681       205.311
32           47.754       175.448
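
To make the scaling in the table easier to read, speedup and parallel
efficiency can be derived from the reported times. This is only a sketch
over the numbers above. The baseline for 2^24 is taken as the 4-worker run,
which is my assumption, since the 1-worker run failed and the 2-worker run
looks anomalous.

```python
# Computation times copied from the table above, keyed by worker count.
times_2_20 = {1: 274.155, 2: 103.843, 4: 47.619, 8: 29.236, 16: 32.681, 32: 47.754}
times_2_24 = {2: 11517.708, 4: 815.444, 8: 344.801, 16: 205.311, 32: 175.448}


def speedup(times, base_workers):
    """Speedup of every run relative to the run with base_workers workers."""
    base = times[base_workers]
    return {w: base / t for w, t in sorted(times.items())}


def efficiency(times, base_workers):
    """Speedup divided by the growth in worker count (1.0 means ideal scaling)."""
    return {w: s * base_workers / w for w, s in speedup(times, base_workers).items()}


for name, times, base in (("2^20", times_2_20, 1), ("2^24", times_2_24, 4)):
    print(name, "fastest at", min(times, key=times.get), "workers")
    for w, e in efficiency(times, base).items():
        print(f"  {w:>2} workers: speedup {speedup(times, base)[w]:.2f}, efficiency {e:.2f}")
```

For 2^20 the 8-worker run is more than 8 times faster than the 1-worker
run, so the single-worker time may be a pessimistic baseline.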


* For graph size 2^24, the single-worker run failed at superstep 3.
  I do not yet understand the reason.
* For graph size 2^24, the 2-worker run was unnaturally slow (4hrs, 13min,
  35sec). I monitored CPU utilization, and it was also very low, i.e. ~5-8%
  (during computation time).
  I have not found the reason yet. It looks like swapping, but I need to
  check. I don't know whether the Hadoop processes can swap, given that I
  have set mapred.child.java.opts to -Xmx32g
  (I have 80G per node in the cluster).
* Scalability for 2^24 is poor starting from 8 workers.
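
A rough way to reason about the swapping hypothesis is to compare the
combined heap ceiling of the child JVMs on one node with the node's RAM.
The sketch below assumes one child JVM per worker, each allowed to grow to
the -Xmx value, and ignores off-heap memory, the OS and other daemons. How
many JVMs actually land on one node depends on the Hadoop slot
configuration, which is not stated in this thread.

```python
def heap_ceiling_gb(jvms_per_node, xmx_gb):
    """Upper bound on the combined Java heap of all child JVMs on one node."""
    return jvms_per_node * xmx_gb


def fits_in_ram(jvms_per_node, xmx_gb, node_ram_gb):
    """True if all heaps could grow to -Xmx without exceeding physical RAM."""
    return heap_ceiling_gb(jvms_per_node, xmx_gb) <= node_ram_gb


# -Xmx32g and 80 GB per node are the values reported above.
for jvms in (1, 2, 3, 4):
    print(jvms, "JVM(s):", heap_ceiling_gb(jvms, 32), "GB,",
          "fits" if fits_in_ram(jvms, 32, 80) else "may swap")
```

Under these assumptions two child JVMs fit on a node and three or more do
not, so whether swapping is possible depends on how the tasks were placed.
Checking the actual resident memory on the nodes would settle it.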


The scalability plot and the superstep time distribution plot are attached.

Next, I am going to try running larger graphs (2^26) and using
multithreading. I am also going to increase the number of workers.

Best,
   Alex


On Fri, Feb 7, 2014 at 8:48 PM, Alexander Frolov
<al...@gmail.com> wrote:

> I forgot to note that the time scale includes only computation time
> (i.e. the sum of superstep times).
>
> Yes, this is not a big graph, I will come up with larger graphs soon.
>
> Best,
>   Alex

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Alexander Frolov <al...@gmail.com>.

I forgot to note that the time scale includes only computation time
(i.e. the sum of superstep times).

Yes, this is not a big graph; I will come up with larger graphs soon.

Best,
  Alex


On Fri, Feb 7, 2014 at 7:41 PM, Alexander Frolov
<al...@gmail.com> wrote:

> Undirected RMAT graph, generated by a tool extracted from Graph500. The
> size is 2^20 vertices; the average degree is 32.

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Alexander Frolov <al...@gmail.com>.

Undirected RMAT graph, generated by a tool extracted from Graph500. The
size is 2^20 vertices; the average degree is 32.
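
For readers unfamiliar with R-MAT, here is a minimal sketch of the idea. It
is not the Graph500 tool itself. Each edge is drawn by recursively picking a
quadrant of the adjacency matrix with the Graph500 probabilities A=0.57,
B=0.19, C=0.19, D=0.05. The vertex relabeling of the reference generator is
omitted, and duplicates and self-loops are kept. The edge factor of 16 is an
assumption: since each undirected edge counts towards the degree of both
endpoints, it gives an average degree of 32.

```python
import random

# Graph500 R-MAT quadrant probabilities; D is the remainder, 0.05.
A, B, C = 0.57, 0.19, 0.19


def rmat_edge(scale, rng):
    """Draw one (src, dst) pair by choosing a matrix quadrant per bit level."""
    src = dst = 0
    for _ in range(scale):
        r = rng.random()
        src <<= 1
        dst <<= 1
        if r < A:
            pass          # top-left quadrant: both bits stay 0
        elif r < A + B:
            dst |= 1      # top-right quadrant
        elif r < A + B + C:
            src |= 1      # bottom-left quadrant
        else:
            src |= 1      # bottom-right quadrant
            dst |= 1
    return src, dst


def rmat_edges(scale, edge_factor, seed=0):
    """Generate edge_factor * 2**scale edges over 2**scale vertices."""
    rng = random.Random(seed)
    return [rmat_edge(scale, rng) for _ in range(edge_factor * 2 ** scale)]


edges = rmat_edges(scale=10, edge_factor=16)  # small scale for a quick run
avg_degree = 2 * len(edges) / 2 ** 10         # each edge touches two endpoints
print(len(edges), avg_degree)  # 16384 32.0
```

The skewed quadrant probabilities concentrate edges on a few vertices, so
work may be unevenly distributed across workers even when vertices are
evenly partitioned.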


On Fri, Feb 7, 2014 at 5:35 PM, Claudio Martella <claudio.martella@gmail.com> wrote:

> Looks like a very small graph. What's the size of the graph and the
> topology?

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Claudio Martella <cl...@gmail.com>.

Looks like a very small graph. What's the size of the graph and the
topology?



-- 
   Claudio Martella

Re: Performance tuning of Giraph on small cluster with Infiniband

Posted by Sebastian Schelter <ss...@apache.org>.

Hi Alexander,

How large is the graph that you run your test on, in terms of vertices
and edges?

--sebastian
