Posted to user@giraph.apache.org by Alexander Frolov <al...@gmail.com> on 2014/02/07 15:11:46 UTC
Performance tuning of Giraph on small cluster with Infiniband
Hi, team!
Following up on previous threads, I have started evaluating Giraph on an
InfiniBand cluster. I want to share my results here (in case they are useful
to anybody) and ask for your ideas on how to improve performance further.
Test system:
* 8 nodes, each with dual Intel Xeon CPU E5-2630 (6 cores/CPU) and 80GB of RAM
* Infiniband FDR Dual-Port 4x
* SUSE 11.2
* jdk1.7.0_51
At the moment I am running experiments with the
SimpleShortestPathsComputation test on a generated RMAT graph. I attach a plot
which shows the scalability of Giraph up to 32 workers.
As can be seen from the plot, scalability is almost linear up to 8 workers,
and then (from 8 to 32) performance stops improving. It seems strange to me
that using the additional cores on the nodes does not bring any gain in
execution time. Has anybody encountered such behaviour?
Next I am going to use threads instead of workers to utilize the cores.
Also I am going to switch to the Hadoop-RDMA project.
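
The trade-off between workers and threads can be laid out with a small sketch. It only enumerates the ways to cover the 12 physical cores of one node (2 CPUs x 6 cores, from the test system above) with worker JVMs times compute threads per worker; the helper name `layouts` is hypothetical, and hyper-threading is ignored. In Giraph the per-worker thread count is, as far as I know, controlled by the `giraph.numComputeThreads` option, but that should be checked against the documentation of the Giraph version in use.

```python
# Hypothetical helper, not part of Giraph: list the ways to split the cores
# of one node between worker JVMs and compute threads per worker.
NODES = 8                # from the test system description
CORES_PER_NODE = 2 * 6   # dual CPU, 6 cores each; hyper-threading ignored


def layouts(cores_per_node):
    """Return (workers_per_node, threads_per_worker) pairs that use every core."""
    return [
        (workers, cores_per_node // workers)
        for workers in range(1, cores_per_node + 1)
        if cores_per_node % workers == 0
    ]


for workers, threads in layouts(CORES_PER_NODE):
    total_workers = NODES * workers
    print(f"{workers:2d} worker(s)/node x {threads:2d} thread(s)/worker "
          f"= {total_workers:2d} workers in total")
```

Fewer workers with more threads each means fewer JVMs exchanging messages over the network, which is the motivation for trying threads; whether it actually helps here can only be settled by measurement.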
If anybody has any suggestions on how I can achieve maximum performance with
Giraph on the cluster, I will be obliged to you ;-)
Hope for your feedback.
Best,
Alex
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Claudio Martella <cl...@gmail.com>.
That makes around 1M vertices and 32M edges. It's a pretty small graph. Can
you scale it up by an order of magnitude?
On Fri, Feb 7, 2014 at 5:41 PM, Alexander Frolov
<al...@gmail.com> wrote:
> Undirected RMAT graph, generated by tool extracted from Graph500. Size is
> 2^20 vertices, average degree is 32.
--
Claudio Martella
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Sebastian Schelter <ss...@apache.org>.
Hi Alexander,
I think this graph is far too small for serious performance benchmarks,
it only has 1M vertices and 32M edges.
I'd suggest you either create a much larger graph or take a snapshot of
a real one. I personally like to use a snapshot of the twitter follower
graph [1] from 2009 with 50M vertices and 2B edges for experimenting.
Best,
Sebastian
[1] http://konect.uni-koblenz.de/networks/twitter_mpi
On 02/07/2014 05:41 PM, Alexander Frolov wrote:
> Undirected RMAT graph, generated by tool extracted from Graph500. Size is
> 2^20 vertices, average degree is 32.
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Alexander Frolov <al...@gmail.com>.
Hi,
I have obtained new results for a larger graph (undirected RMAT, 2^24
vertices, average degree 32). Here are the results (only computation time
is included):
Workers    2^20       2^24
---------------------------------------------
1          274.155    n/a
2          103.843    11517.708
4          47.619     815.444
8          29.236     344.801
16         32.681     205.311
32         47.754     175.448
* For graph size 2^24, the single-worker run failed on superstep 3.
I do not understand the reason.
* For graph size 2^24, the run with 2 workers was abnormally slow (4 hrs,
13 min, 35 sec). I monitored CPU utilization, and it was also very low,
i.e. ~5-8% (during computation time).
I have not found the reason yet. It looks like swapping, but I need to
check it. I don't know whether Hadoop processes can swap if I have
set mapred.child.java.opts to -Xmx32g
(I have 80G per node in the cluster).
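
A rough way to reason about the swapping hypothesis is to compare the worst-case heap of all task JVMs on a node with the node's RAM. The sketch below uses only the two figures quoted above (-Xmx32g and 80G per node); the number of task JVMs that land on one node is an assumption, since it depends on how the scheduler places workers, and the helper name `heap_budget` is hypothetical. Note that -Xmx bounds only the Java heap, so the real footprint of each JVM is somewhat larger.

```python
NODE_RAM_GB = 80   # RAM per node, from the post
MAX_HEAP_GB = 32   # -Xmx32g from mapred.child.java.opts


def heap_budget(tasks_per_node, node_ram_gb=NODE_RAM_GB, max_heap_gb=MAX_HEAP_GB):
    """Return (worst-case total heap in GB, True if it fits into node RAM)."""
    worst_case = tasks_per_node * max_heap_gb
    return worst_case, worst_case <= node_ram_gb


# tasks_per_node is an assumed value; check the actual placement on the cluster.
for tasks in (1, 2, 3, 4):
    worst_case, fits = heap_budget(tasks)
    verdict = "fits" if fits else "may exceed RAM, swapping possible"
    print(f"{tasks} task JVM(s)/node: up to {worst_case} GB heap -> {verdict}")
```

Under these figures two task JVMs per node stay within RAM even at full heap, while three or more could oversubscribe it. This does not by itself explain the slow 2-worker run; it only shows which placements are worth checking with the operating system's swap counters.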
* Scalability of 2^32 is poor starting from 8 nodes.
Scalability plot and superstep time distribution plot are attached.
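
Since the attached plots are not available in the archive, the scaling behaviour can be recomputed from the table above. The following sketch derives speedup and parallel efficiency from the reported computation times; the function name `scaling` and the choice of baseline are assumptions made for illustration. For the 2^24 graph the 4-worker run is used as the baseline, because the 1-worker run failed and the 2-worker run was abnormally slow.

```python
# Computation times copied from the table above; None marks the failed run.
TIMES = {
    "2^20": {1: 274.155, 2: 103.843, 4: 47.619, 8: 29.236, 16: 32.681, 32: 47.754},
    "2^24": {1: None, 2: 11517.708, 4: 815.444, 8: 344.801, 16: 205.311, 32: 175.448},
}


def scaling(times, baseline=None):
    """Return {workers: (speedup, efficiency)} relative to the baseline run.

    If no baseline is given, the smallest successful worker count is used.
    """
    runs = {w: t for w, t in times.items() if t is not None}
    if baseline is None:
        baseline = min(runs)
    base_time = runs[baseline]
    result = {}
    for workers in sorted(runs):
        speedup = base_time / runs[workers]
        efficiency = speedup / (workers / baseline)
        result[workers] = (speedup, efficiency)
    return result


for graph, baseline in (("2^20", 1), ("2^24", 4)):
    print(f"Graph {graph}, baseline = {baseline} worker(s)")
    for workers, (speedup, efficiency) in scaling(TIMES[graph], baseline).items():
        print(f"  {workers:2d} workers: speedup {speedup:7.2f}, "
              f"efficiency {efficiency:5.2f}")
```

An efficiency above 1 relative to a small baseline usually indicates that the baseline itself was handicapped (for example by memory pressure) rather than that the larger runs are unusually fast.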
Next, I am going to try running larger graphs (2^26) and using
multithreading. Also I am going to increase the number of workers.
Best,
Alex
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Alexander Frolov <al...@gmail.com>.
I forgot to note that the time scale includes only computation time
(i.e. the sum of superstep times).
Yes, this is not a big graph, I will come up with larger graphs soon.
Best,
Alex
On Fri, Feb 7, 2014 at 7:41 PM, Alexander Frolov
<al...@gmail.com> wrote:
> Undirected RMAT graph, generated by tool extracted from Graph500. Size is
> 2^20 vertices, average degree is 32.
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Alexander Frolov <al...@gmail.com>.
Undirected RMAT graph, generated by tool extracted from Graph500. Size is
2^20 vertices, average degree is 32.
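
These two parameters translate into graph size as follows. The sketch assumes that "average degree" is meant in the usual sense for an undirected graph, so that every undirected edge is counted once at each endpoint; the function name `graph_size` is hypothetical, and how the input format actually stores the edges is not stated in the thread.

```python
def graph_size(scale, avg_degree):
    """Return (vertices, directed edge entries, undirected edges).

    Assumes the sum of all vertex degrees equals twice the number of
    undirected edges, i.e. each edge contributes to both endpoints.
    """
    vertices = 2 ** scale
    directed_entries = vertices * avg_degree
    undirected_edges = directed_entries // 2
    return vertices, directed_entries, undirected_edges


for scale in (20, 24):
    vertices, entries, edges = graph_size(scale, 32)
    print(f"2^{scale}: {vertices:,} vertices, {entries:,} directed edge entries, "
          f"{edges:,} undirected edges")
```

For 2^20 vertices this gives 1,048,576 vertices and 33,554,432 directed edge entries, which matches the "1M vertices and 32M edges" figure used elsewhere in the thread.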
On Fri, Feb 7, 2014 at 5:35 PM, Claudio Martella
<claudio.martella@gmail.com> wrote:
> looks like a very small graph. what's the size of the graph and the
> topology?
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Claudio Martella <cl...@gmail.com>.
Looks like a very small graph. What's the size of the graph and the
topology?
--
Claudio Martella
Re: Performance tuning of Giraph on small cluster with Infiniband
Posted by Sebastian Schelter <ss...@apache.org>.
Hi Alexander,
How large is the graph that you run your test on, in terms of vertices
and edges?
--sebastian