Posted to dev@hama.apache.org by "Thomas Jungblut (JIRA)" <ji...@apache.org> on 2012/06/18 11:01:46 UTC
[jira] [Created] (HAMA-593) Improve RPC scalability
Thomas Jungblut created HAMA-593:
------------------------------------
Summary: Improve RPC scalability
Key: HAMA-593
URL: https://issues.apache.org/jira/browse/HAMA-593
Project: Hama
Issue Type: Sub-task
Components: bsp core, messaging
Affects Versions: 0.5.0
Reporter: Thomas Jungblut
To improve scalability, we can open RPC connections one after another instead of keeping all N possible connections open the whole time.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HAMA-593) Improve RPC scalability
Posted by "Mayank Mishra (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Mishra updated HAMA-593:
-------------------------------
Attachment: HAMA-593_2.patch
Got a day off for celebrating Independence Day. :) Adding a patch with LRU cache support. We now close a connection when its entry gets evicted from the cache. Please review it.
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Attachments: HAMA-593_1.patch, HAMA-593_2.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Comment Edited] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433139#comment-13433139 ]
Thomas Jungblut edited comment on HAMA-593 at 8/14/12 12:50 AM:
----------------------------------------------------------------
We could also use the CacheBuilders from Guava.
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
Guava should be in our classpath.
bq. Sorry but I didn't got clues of using just LinkedHashSet for conceptualizing LRU cache.
Sending is a sequential operation, so access to the connection cache does not need to be synchronized, and there is no need for synchronized structures. Simply subclassing LinkedHashMap and overriding removeEldestEntry should be enough.
See:
https://github.com/thomasjungblut/thomasjungblut-common/blob/master/src/de/jungblut/datastructure/LRUCache.java
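A minimal sketch of the idea, assuming single-threaded access as described above: a LinkedHashMap subclass in access order whose removeEldestEntry closes the evicted connection. The names LruConnectionCache and FakeConnection are hypothetical stand-ins for illustration; this is not the code from the attached patches or from the linked LRUCache.java.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache that closes a connection when its entry is evicted. */
class LruConnectionCache<K, V extends Closeable> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int capacity;

  LruConnectionCache(int capacity) {
    // accessOrder = true: iteration runs from least to most recently used.
    super(16, 0.75f, true);
    this.capacity = capacity;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    if (size() <= capacity) {
      return false; // still within the limit, keep everything
    }
    try {
      eldest.getValue().close(); // release the connection before dropping it
    } catch (IOException e) {
      // Sketch only: a real implementation would at least log this.
    }
    return true; // tell LinkedHashMap to remove the eldest entry
  }
}

/** Stand-in for an RPC connection to a peer (assumption, not a Hama class). */
class FakeConnection implements Closeable {
  final String peer;
  boolean closed;

  FakeConnection(String peer) {
    this.peer = peer;
  }

  @Override
  public void close() {
    closed = true;
  }
}

class LruConnectionCacheDemo {
  public static void main(String[] args) {
    LruConnectionCache<String, FakeConnection> cache = new LruConnectionCache<>(2);
    FakeConnection a = new FakeConnection("peer-a");
    FakeConnection b = new FakeConnection("peer-b");
    FakeConnection c = new FakeConnection("peer-c");

    cache.put("peer-a", a);
    cache.put("peer-b", b);
    cache.get("peer-a");    // touch peer-a, so peer-b is now least recently used
    cache.put("peer-c", c); // exceeds the capacity of 2 and evicts peer-b

    System.out.println(cache.keySet()); // [peer-a, peer-c]
    System.out.println(b.closed);       // true
  }
}
```

Because the map is not synchronized, this only holds up as long as a single thread does the sending; with concurrent senders it would need external locking or a different structure.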
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Mayank Mishra (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433135#comment-13433135 ]
Mayank Mishra commented on HAMA-593:
------------------------------------
Yes, I agree with the intent of using an LRU cache with an upper capacity. But rather than going with a synchronized LinkedHashMap, don't you think we should build the LRUCache by extending ConcurrentHashMap? That way we should still get good performance as concurrent usage increases. Sorry, but I didn't get how a plain LinkedHashSet could serve as an LRU cache.
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435840#comment-13435840 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
Oh sorry, I didn't want to disturb your holiday!
Will review it when I'm back from work. Thanks!
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Attachments: HAMA-593_1.patch, HAMA-593_2.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436093#comment-13436093 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
The build is fine for me. I just made a constant key for the configuration and mapped it into the default conf. I also moved the LRU cache to our utils, because I'm sure it can be used more throughout the project.
Thank you very much for that contribution. I will open a follow-up for the other scalability problem.
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Fix For: 0.6.0
>
> Attachments: HAMA-593_1.patch, HAMA-593_2.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435991#comment-13435991 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
Looks good to me. I think we should move the constant to a higher abstraction level and make it configurable, but that is just a minor thing that I will do.
Thanks!
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Attachments: HAMA-593_1.patch, HAMA-593_2.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433139#comment-13433139 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
We could also use the CacheBuilders from Guava.
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
Guava should be in our classpath.
bq. Sorry but I didn't got clues of using just LinkedHashSet for conceptualizing LRU cache.
Sending is a sequential operation, so access to the connection cache does not need to be synchronized, and there is no need for synchronized structures. Simply subclassing LinkedHashMap and overriding removeEldestEntry should be enough.
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Mayank Mishra (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431624#comment-13431624 ]
Mayank Mishra commented on HAMA-593:
------------------------------------
Can I get some more description of this issue? Are we talking about getting BSPPeerConnections (i.e. HadoopMessageManager) one after another on a per-request basis, or about Avro's message bundle transmission over NettyTransceiver?
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Resolved] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Jungblut resolved HAMA-593.
----------------------------------
Resolution: Fixed
Fix Version/s: 0.6.0
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Fix For: 0.6.0
>
> Attachments: HAMA-593_1.patch, HAMA-593_2.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433277#comment-13433277 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
Do you want to add this caching behaviour?
Patch otherwise looks good to me.
We have to add a follow up issue for the other scalability bottleneck.
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435267#comment-13435267 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
Feel free to add another patch then; if not, I will commit this on Saturday and add the caching myself.
Thanks!
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431653#comment-13431653 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
In both of the managers (Hadoop|Avro), the RPC connections to the other peers are cached in a map called "peers".
This always leaves a connection open to every other task. Imagine you have 1k tasks: each task will hold 1k connections (999 outbound, 1 local).
1. The caching must be removed, and the connections must be closed once the messages have been sent.
Then there is a second problem when all 1k peers attempt to send to a single peer (let's say a master task in a graph algorithm that aggregates). In this case the receiving peer will start 1k threads, which uses an enormous amount of memory. I wouldn't mind if this could be done smarter, but I have no solution at hand currently.
Would be cool if 1. could be resolved :)
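To put numbers on the 1k-task example, here is a small arithmetic sketch. The class ConnectionCount and the maxOpen parameter are hypothetical names introduced only for illustration; maxOpen stands for any per-task upper bound on simultaneously open outbound connections, for example 1 when connections are opened one after another.

```java
/** Hypothetical helper that counts open connections for N tasks. */
class ConnectionCount {

  /** Every task caches a connection to every task, itself included. */
  static long clusterWideUncapped(long tasks) {
    return tasks * tasks;
  }

  /** Outbound connections only, i.e. excluding each task's local one. */
  static long outboundUncapped(long tasks) {
    return tasks * (tasks - 1);
  }

  /** Outbound connections when each task keeps at most maxOpen open. */
  static long outboundCapped(long tasks, long maxOpen) {
    return tasks * Math.min(tasks - 1, maxOpen);
  }

  public static void main(String[] args) {
    long tasks = 1000;
    System.out.println(clusterWideUncapped(tasks)); // 1000000
    System.out.println(outboundUncapped(tasks));    // 999000
    System.out.println(outboundCapped(tasks, 1));   // 1000
  }
}
```

The count grows quadratically with the number of tasks while everything is cached, and only linearly once each task bounds how many connections it keeps open.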
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Assigned] (HAMA-593) Improve RPC scalability
Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Edward J. Yoon reassigned HAMA-593:
-----------------------------------
Assignee: Mayank Mishra
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436397#comment-13436397 ]
Hudson commented on HAMA-593:
-----------------------------
Integrated in Hama-Nightly #644 (See [https://builds.apache.org/job/Hama-Nightly/644/])
[HAMA-593]: Improve RPC scalability (Mayank Mishra via tjungblut) (Revision 1373915)
Result = FAILURE
tjungblut :
Files :
* /hama/trunk/CHANGES.txt
* /hama/trunk/conf/hama-default.xml
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/AbstractMessageManager.java
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/AvroMessageManagerImpl.java
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/CompressableMessageManager.java
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/HadoopMessageManagerImpl.java
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/MemoryQueue.java
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/MessageManager.java
* /hama/trunk/core/src/main/java/org/apache/hama/bsp/message/Sender.java
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Assignee: Mayank Mishra
> Fix For: 0.6.0
>
> Attachments: HAMA-593_1.patch, HAMA-593_2.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Commented] (HAMA-593) Improve RPC scalability
Posted by "Thomas Jungblut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432925#comment-13432925 ]
Thomas Jungblut commented on HAMA-593:
--------------------------------------
Hi, I will review this soon.
Do you think it may be beneficial to make the maximum number of tasks for which sockets are cached configurable?
Maybe a simple LRU cache backed by a LinkedHashSet.
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.
[jira] [Updated] (HAMA-593) Improve RPC scalability
Posted by "Mayank Mishra (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Mishra updated HAMA-593:
-------------------------------
Attachment: HAMA-593_1.patch
Patch for HAMA-593. Removed the caching of RPC connections in both the Hadoop and Avro managers; a connection is now created for each transfer and closed once the transfer to the peer has completed.
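The open, send, close pattern described here can be sketched roughly as follows. PeerLink, LinkStats and SendThenClose are hypothetical stand-ins, not the classes touched by the patch; the sketch only shows that with sequential sending at most one outbound connection is open at a time.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simple counters so the demo can observe how many links are open. */
class LinkStats {
  int open;
  int peakOpen;
  int delivered;
}

/** Hypothetical stand-in for an RPC connection to one peer. */
class PeerLink implements Closeable {
  final String peer;
  private final LinkStats stats;

  PeerLink(String peer, LinkStats stats) {
    this.peer = peer;
    this.stats = stats;
    stats.open++;
    stats.peakOpen = Math.max(stats.peakOpen, stats.open);
  }

  void send(String message) {
    stats.delivered++; // a real link would write the message to the peer here
  }

  @Override
  public void close() {
    stats.open--;
  }
}

class SendThenClose {
  /** Opens a link per peer, sends that peer's messages, then closes the link. */
  static void flush(Map<String, List<String>> outgoing, LinkStats stats) {
    for (Map.Entry<String, List<String>> entry : outgoing.entrySet()) {
      // try-with-resources closes the link even if a send throws
      try (PeerLink link = new PeerLink(entry.getKey(), stats)) {
        for (String message : entry.getValue()) {
          link.send(message);
        }
      }
    }
  }

  public static void main(String[] args) {
    Map<String, List<String>> outgoing = new LinkedHashMap<>();
    outgoing.put("peer-a", Arrays.asList("m1", "m2"));
    outgoing.put("peer-b", Arrays.asList("m3"));

    LinkStats stats = new LinkStats();
    flush(outgoing, stats);

    System.out.println(stats.delivered); // 3
    System.out.println(stats.peakOpen);  // 1
    System.out.println(stats.open);      // 0
  }
}
```

The trade-off is that every transfer pays the connection setup cost again, which is what the later LRU cache discussion in this thread tries to soften.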
> Improve RPC scalability
> -----------------------
>
> Key: HAMA-593
> URL: https://issues.apache.org/jira/browse/HAMA-593
> Project: Hama
> Issue Type: Sub-task
> Components: bsp core, messaging
> Affects Versions: 0.5.0
> Reporter: Thomas Jungblut
> Attachments: HAMA-593_1.patch
>
>
> To improve scalability we can start a RPC connection after another instead of keeping all possible N connections open the whole time.