Posted to dev@giraph.apache.org by "Maja Kabiljo (JIRA)" <ji...@apache.org> on 2012/11/01 01:27:11 UTC

[jira] [Created] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Maja Kabiljo created GIRAPH-397:
-----------------------------------

             Summary: We should have copies of aggregators per thread to avoid synchronizing on aggregate()
                 Key: GIRAPH-397
                 URL: https://issues.apache.org/jira/browse/GIRAPH-397
             Project: Giraph
          Issue Type: Improvement
            Reporter: Maja Kabiljo
            Assignee: Maja Kabiljo
            Priority: Minor


This fixes one of the TODOs from GIRAPH-273. Giving each thread its own copy of the aggregators means we no longer have to synchronize on every aggregate() call. The values aggregated by each thread can then be combined once computation is finished.
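
As a rough illustration of the idea (not the code from the patch), here is a minimal Java sketch. LongSumAggregator and PerThreadAggregation are hypothetical names made up for this example and are not Giraph classes. Each thread aggregates into its own unsynchronized copy, and the copies are merged once all threads are done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for an aggregator; not Giraph's Aggregator interface. */
final class LongSumAggregator {
  private long value;

  /** No synchronization: each instance is only ever touched by one thread. */
  void aggregate(long delta) {
    value += delta;
  }

  long getAggregatedValue() {
    return value;
  }
}

/** Hypothetical driver showing per-thread copies merged after computation. */
final class PerThreadAggregation {
  private PerThreadAggregation() { }

  /** Runs numThreads workers (numThreads must be positive) and returns the merged sum. */
  static long aggregateInParallel(int numThreads, final int callsPerThread) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<LongSumAggregator>> partials = new ArrayList<>();
      for (int t = 0; t < numThreads; t++) {
        partials.add(pool.submit(() -> {
          // Thread-local copy, so aggregate() needs no lock.
          LongSumAggregator local = new LongSumAggregator();
          for (int i = 0; i < callsPerThread; i++) {
            local.aggregate(1L);
          }
          return local;
        }));
      }
      // Merge step: runs on one thread after every worker has finished.
      LongSumAggregator merged = new LongSumAggregator();
      for (Future<LongSumAggregator> partial : partials) {
        merged.aggregate(partial.get().getAggregatedValue());
      }
      return merged.getAggregatedValue();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(aggregateInParallel(4, 1000));
  }
}
```

The only cross-thread hand-off here is Future.get(), so the per-call cost of a lock is replaced by one merge per thread at the end.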

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/GIRAPH-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo updated GIRAPH-397:
--------------------------------

    Attachment: GIRAPH-397.patch

Eli, thanks for looking! I don't think it makes a big difference, since there are just a few extra empty maps, but ok, I turned it off by default. That's the only change in the patch.
Will commit by the end of the day if nobody objects.
                


[jira] [Updated] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/GIRAPH-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo updated GIRAPH-397:
--------------------------------

    Attachment: GIRAPH-397.patch

I added this as an option (on by default). For one of our applications with huge aggregators we don't want to use it, since we don't want to spend that much memory on many copies of the aggregators.

Passes verify and AggregatorsBenchmark on cluster.
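
To make the trade-off behind the option concrete, here is a small Java sketch under stated assumptions. The flag useThreadLocalCopies and the classes SumAggregator, LocalSum, SharedSum and AggregatorChooser are all hypothetical names for this example; they are not the option or the classes in the patch. With the flag on, every thread gets its own copy (more memory, no locking). With it off, all threads share one synchronized instance (one copy, a lock on every call).

```java
/** Hypothetical aggregator interface; Giraph's real API differs. */
interface SumAggregator {
  void aggregate(long delta);

  long get();
}

/** Unsynchronized: safe only when owned by a single thread. */
final class LocalSum implements SumAggregator {
  private long value;

  @Override
  public void aggregate(long delta) {
    value += delta;
  }

  @Override
  public long get() {
    return value;
  }
}

/** One shared instance: no extra memory, but every call takes a lock. */
final class SharedSum implements SumAggregator {
  private long value;

  @Override
  public synchronized void aggregate(long delta) {
    value += delta;
  }

  @Override
  public synchronized long get() {
    return value;
  }
}

/** Hands each compute thread either a private copy or the shared instance. */
final class AggregatorChooser {
  private final boolean useThreadLocalCopies; // hypothetical option flag
  private final SharedSum shared = new SharedSum();

  AggregatorChooser(boolean useThreadLocalCopies) {
    this.useThreadLocalCopies = useThreadLocalCopies;
  }

  /** Intended to be called once per compute thread. */
  SumAggregator forNewThread() {
    return useThreadLocalCopies ? new LocalSum() : shared;
  }
}
```

With the flag on, the per-thread copies would still have to be merged after computation, which this sketch leaves out.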
                


[jira] [Commented] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/GIRAPH-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491714#comment-13491714 ] 

Eli Reisman commented on GIRAPH-397:
------------------------------------

No problem! I like what you guys have been doing, and I think the speed
increases are great.

I am a bit worried about the changes making the low-memory per-worker,
scale-out model less effective, but in the end my goal there is to make the
use cases I was being pestered about work for anyone who might be curious
to adopt Giraph into their Hadoop solution set. If this stuff is helping
the FB case, let's get FB up on the "Powered By" page ASAP; this will also
help with wider adoption! Then we can worry about making it work for
different kinds of clusters/resource pools. Wide adoption for Giraph is my
motivating goal.

Great work, everyone. Can't wait to jump back in with some code!



                


[jira] [Commented] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/GIRAPH-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491207#comment-13491207 ] 

Hudson commented on GIRAPH-397:
-------------------------------

Integrated in Giraph-trunk-Commit #272 (See [https://builds.apache.org/job/Giraph-trunk-Commit/272/])
    GIRAPH-397: We should have copies of aggregators per thread to avoid synchronizing on aggregate() (Revision 1406040)

     Result = SUCCESS
maja : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406040
Files : 
* /giraph/trunk/CHANGELOG
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/aggregators/AggregatorUtils.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/ComputeCallable.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphState.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/Vertex.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/WorkerAggregatorHandler.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/WorkerContext.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/WorkerThreadAggregatorUsage.java

                


[jira] [Commented] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/GIRAPH-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490306#comment-13490306 ] 

Eli Reisman commented on GIRAPH-397:
------------------------------------

I like this, but I would make the default not to use it, since some applications don't use aggregators at all and definitely don't have the extra storage to spare. +1 from me!

                


[jira] [Resolved] (GIRAPH-397) We should have copies of aggregators per thread to avoid synchronizing on aggregate()

Posted by "Maja Kabiljo (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/GIRAPH-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maja Kabiljo resolved GIRAPH-397.
---------------------------------

    Resolution: Fixed
    
