You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by Bo Wang <da...@gmail.com> on 2012/04/24 08:11:38 UTC

Review Request: Improve concurrency of putMsg / putMsgList

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/
-----------------------------------------------------------

Review request for giraph.


Summary
-------

Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to message map. The concurrencyLevel of ConcurrentHashMap uses the default value. There may be some performance gain by tuning this value.


This addresses bug GIRAPH-185.
    https://issues.apache.org/jira/browse/GIRAPH-185


Diffs
-----

  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1328747 

Diff: https://reviews.apache.org/r/4852/diff


Testing
-------


Thanks,

Bo


Re: Review Request: Improve concurrency of putMsg / putMsgList

Posted by Avery Ching <av...@gmail.com>.

> On 2012-04-24 20:53:33, Avery Ching wrote:
> > http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java, lines 776-777
> > <https://reviews.apache.org/r/4852/diff/1/?file=104059#file104059line776>
> >
> >     Bo, I'm a little leery about converting the List and ArrayList to LinkedList and ConcurrentLinkedList.  I believe that linked list's will use more memory than the array list due to the double links (forward and backward).  Also, is ConcurrentLinkedList supposted to outperform a synchronized ArrayList?  I haven't seen much on that.
> >     
> >     The concurrenthashmap changes look good.
> 
> Bo Wang wrote:
>     Avery, thanks for the comments. I just measured the sizes of these classes and below are an estimation. 
>     
>     java.util.ArrayList: 149 bytes
>     java.util.LinkedList: 101 bytes
>     java.util.concurrent.ConcurrentLinkedQueue: 118 bytes
>     
>     The tool I was using is a program from the link below.
>     http://www.javapractices.com/topic/TopicAction.do?Id=83
>     
>     In terms of performance, here is a benchmark.
>     http://www.javacodegeeks.com/2010/09/java-best-practices-queue-battle-and.html
>     
>     In its test #1 (adding element), ConcurrentLinkedQueue performed slightly better than LinkedList. In test #3 (iterator), LinkedList outperformed ConcurrentLinkedQueue. I think the most time consuming part is add, while iteration is also heavily used but no concurrent accesses. 
>     
>

Thanks for the response Bo.

Those numbers are for the empty data structures I'm assuming.  I was referring to the incremental cost of adding elements (messages) to the data structures.  The performance isn't a a concern to me (unless we call size() somewhere).


- Avery


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
-----------------------------------------------------------


On 2012-04-24 06:11:38, Bo Wang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4852/
> -----------------------------------------------------------
> 
> (Updated 2012-04-24 06:11:38)
> 
> 
> Review request for giraph.
> 
> 
> Summary
> -------
> 
> Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to message map. The concurrencyLevel of ConcurrentHashMap uses the default value. There may be some performance gain by tuning this value.
> 
> 
> This addresses bug GIRAPH-185.
>     https://issues.apache.org/jira/browse/GIRAPH-185
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1328747 
> 
> Diff: https://reviews.apache.org/r/4852/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bo
> 
>


Re: Review Request: Improve concurrency of putMsg / putMsgList

Posted by Avery Ching <av...@gmail.com>.

> On 2012-04-24 20:53:33, Avery Ching wrote:
> > http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java, lines 776-777
> > <https://reviews.apache.org/r/4852/diff/1/?file=104059#file104059line776>
> >
> >     Bo, I'm a little leery about converting the List and ArrayList to LinkedList and ConcurrentLinkedList.  I believe that linked list's will use more memory than the array list due to the double links (forward and backward).  Also, is ConcurrentLinkedList supposted to outperform a synchronized ArrayList?  I haven't seen much on that.
> >     
> >     The concurrenthashmap changes look good.
> 
> Bo Wang wrote:
>     Avery, thanks for the comments. I just measured the sizes of these classes and below are an estimation. 
>     
>     java.util.ArrayList: 149 bytes
>     java.util.LinkedList: 101 bytes
>     java.util.concurrent.ConcurrentLinkedQueue: 118 bytes
>     
>     The tool I was using is a program from the link below.
>     http://www.javapractices.com/topic/TopicAction.do?Id=83
>     
>     In terms of performance, here is a benchmark.
>     http://www.javacodegeeks.com/2010/09/java-best-practices-queue-battle-and.html
>     
>     In its test #1 (adding element), ConcurrentLinkedQueue performed slightly better than LinkedList. In test #3 (iterator), LinkedList outperformed ConcurrentLinkedQueue. I think the most time consuming part is add, while iteration is also heavily used but no concurrent accesses. 
>     
>
> 
> Avery Ching wrote:
>     Thanks for the response Bo.
>     
>     Those numbers are for the empty data structures I'm assuming.  I was referring to the incremental cost of adding elements (messages) to the data structures.  The performance isn't a a concern to me (unless we call size() somewhere).

By the incremental cost, I mean the memory cost, sorry.


- Avery


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
-----------------------------------------------------------


On 2012-04-24 06:11:38, Bo Wang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4852/
> -----------------------------------------------------------
> 
> (Updated 2012-04-24 06:11:38)
> 
> 
> Review request for giraph.
> 
> 
> Summary
> -------
> 
> Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to message map. The concurrencyLevel of ConcurrentHashMap uses the default value. There may be some performance gain by tuning this value.
> 
> 
> This addresses bug GIRAPH-185.
>     https://issues.apache.org/jira/browse/GIRAPH-185
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1328747 
> 
> Diff: https://reviews.apache.org/r/4852/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bo
> 
>


Re: Review Request: Improve concurrency of putMsg / putMsgList

Posted by Bo Wang <da...@gmail.com>.

> On 2012-04-24 20:53:33, Avery Ching wrote:
> > http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java, lines 776-777
> > <https://reviews.apache.org/r/4852/diff/1/?file=104059#file104059line776>
> >
> >     Bo, I'm a little leery about converting the List and ArrayList to LinkedList and ConcurrentLinkedList.  I believe that linked list's will use more memory than the array list due to the double links (forward and backward).  Also, is ConcurrentLinkedList supposted to outperform a synchronized ArrayList?  I haven't seen much on that.
> >     
> >     The concurrenthashmap changes look good.

Avery, thanks for the comments. I just measured the sizes of these classes and below are an estimation. 

java.util.ArrayList: 149 bytes
java.util.LinkedList: 101 bytes
java.util.concurrent.ConcurrentLinkedQueue: 118 bytes

The tool I was using is a program from the link below.
http://www.javapractices.com/topic/TopicAction.do?Id=83

In terms of performance, here is a benchmark.
http://www.javacodegeeks.com/2010/09/java-best-practices-queue-battle-and.html

In its test #1 (adding element), ConcurrentLinkedQueue performed slightly better than LinkedList. In test #3 (iterator), LinkedList outperformed ConcurrentLinkedQueue. I think the most time consuming part is add, while iteration is also heavily used but no concurrent accesses. 


- Bo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
-----------------------------------------------------------


On 2012-04-24 06:11:38, Bo Wang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4852/
> -----------------------------------------------------------
> 
> (Updated 2012-04-24 06:11:38)
> 
> 
> Review request for giraph.
> 
> 
> Summary
> -------
> 
> Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to message map. The concurrencyLevel of ConcurrentHashMap uses the default value. There may be some performance gain by tuning this value.
> 
> 
> This addresses bug GIRAPH-185.
>     https://issues.apache.org/jira/browse/GIRAPH-185
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1328747 
> 
> Diff: https://reviews.apache.org/r/4852/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bo
> 
>


Re: Review Request: Improve concurrency of putMsg / putMsgList

Posted by Avery Ching <av...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7185
-----------------------------------------------------------



http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
<https://reviews.apache.org/r/4852/#comment15860>

    Bo, I'm a little leery about converting the List and ArrayList to LinkedList and ConcurrentLinkedList.  I believe that linked list's will use more memory than the array list due to the double links (forward and backward).  Also, is ConcurrentLinkedList supposted to outperform a synchronized ArrayList?  I haven't seen much on that.
    
    The concurrenthashmap changes look good.


- Avery


On 2012-04-24 06:11:38, Bo Wang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4852/
> -----------------------------------------------------------
> 
> (Updated 2012-04-24 06:11:38)
> 
> 
> Review request for giraph.
> 
> 
> Summary
> -------
> 
> Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to message map. The concurrencyLevel of ConcurrentHashMap uses the default value. There may be some performance gain by tuning this value.
> 
> 
> This addresses bug GIRAPH-185.
>     https://issues.apache.org/jira/browse/GIRAPH-185
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1328747 
> 
> Diff: https://reviews.apache.org/r/4852/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bo
> 
>


Re: Review Request: Improve concurrency of putMsg / putMsgList

Posted by Claudio Martella <cl...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4852/#review7169
-----------------------------------------------------------


Looks good to me, wound't hurt to see some stress test to check performance, although I wouldn't expect this to be slower than the synchronized block. Also, I'd agree that moving the messages directly from the inMessages to the Vertex could be a memory win.

- Claudio


On 2012-04-24 06:11:38, Bo Wang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4852/
> -----------------------------------------------------------
> 
> (Updated 2012-04-24 06:11:38)
> 
> 
> Review request for giraph.
> 
> 
> Summary
> -------
> 
> Use ConcurrentHashMap and ConcurrentLinkedQueue to allow concurrent assess to message map. The concurrencyLevel of ConcurrentHashMap uses the default value. There may be some performance gain by tuning this value.
> 
> 
> This addresses bug GIRAPH-185.
>     https://issues.apache.org/jira/browse/GIRAPH-185
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1328747 
> 
> Diff: https://reviews.apache.org/r/4852/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bo
> 
>