Posted to dev@giraph.apache.org by "Avery Ching (Created) (JIRA)" <ji...@apache.org> on 2011/12/13 20:14:30 UTC

[jira] [Created] (GIRAPH-104) Save half of maximum memory used from messaging

Save half of maximum memory used from messaging
-----------------------------------------------

                 Key: GIRAPH-104
                 URL: https://issues.apache.org/jira/browse/GIRAPH-104
             Project: Giraph
          Issue Type: Improvement
            Reporter: Avery Ching
            Priority: Critical


Currently, Giraph uses a very large amount of memory for messaging.  This JIRA reduces the maximum messaging memory by half and adds periodic memory reports for debugging.  Details are below:

Refactored RandomMessageBenchmark to use an internal vertex class.  Added aggregators to RandomMessageBenchmark to track bytes, messages, and time spent on messaging.  Moved the postSuperstep() call to after flush() for more accurate timings.

Added periodic per-minute updates for message flushing, which can take a while, especially on the memory benchmark.  These show how the flush is progressing and give an ETA.
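
As a rough illustration of the periodic flush updates, here is a minimal sketch of a once-a-minute progress report with a linear ETA.  The class name, method names, and the one-minute constant are assumptions made for this example; they are not taken from the GIRAPH-104 patch.

```java
// Hypothetical sketch, not the Giraph implementation.
final class FlushProgressSketch {
    // Assumed reporting interval: one minute.
    static final long REPORT_INTERVAL_MILLIS = 60_000L;

    /** True when at least one interval has passed since the last report. */
    static boolean shouldReport(long nowMillis, long lastReportMillis) {
        return nowMillis - lastReportMillis >= REPORT_INTERVAL_MILLIS;
    }

    /**
     * Linear ETA: assumes the remaining items take as long per item as
     * the ones already done. Returns -1 when nothing has been done yet.
     */
    static long etaMillis(long elapsedMillis, long done, long total) {
        if (done <= 0) {
            return -1L;
        }
        return elapsedMillis * (total - done) / done;
    }

    public static void main(String[] args) {
        long elapsed = 60_000L;
        long done = 25;
        long total = 100;
        if (shouldReport(elapsed, 0L)) {
            System.out.println("flush: " + done + " of " + total
                + " done, ETA millis = " + etaMillis(elapsed, done, total));
        }
    }
}
```

A linear estimate is only meaningful when the per-destination flush cost is fairly uniform, which may hold for a regular benchmark but not for every workload.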

Memory optimizations include:
- Clear the message list after computation 
- Free vertex messages on the source as the flush is going on 
- TreeMap -> HashMap for VertexMutations
- Sizing the ArrayList properly in transientInMessages
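
The optimizations listed above can be illustrated with a small, self-contained sketch.  All names here (MessagingMemorySketch, presizedCopy, computeAndClear, flushAndFree) are hypothetical and the data types are simplified; the actual changes live in the attached diff.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the four ideas, not the Giraph code.
final class MessagingMemorySketch {
    /** Size the ArrayList exactly, so no spare backing-array capacity is held. */
    static <M> List<M> presizedCopy(List<M> incoming) {
        List<M> copy = new ArrayList<>(incoming.size());
        copy.addAll(incoming);
        return copy;
    }

    /** Stand-in for compute(): consume the messages, then clear the list. */
    static int computeAndClear(List<Integer> messages) {
        int sum = 0;
        for (int m : messages) {
            sum += m;
        }
        messages.clear(); // the messages become collectable right away
        return sum;
    }

    /** Flush outgoing messages, dropping each entry as soon as it is sent. */
    static int flushAndFree(Map<Long, List<Integer>> outgoing) {
        int sent = 0;
        Iterator<Map.Entry<Long, List<Integer>>> it =
            outgoing.entrySet().iterator();
        while (it.hasNext()) {
            sent += it.next().getValue().size(); // stand-in for the RPC send
            it.remove(); // free on the source while the flush is in progress
        }
        return sent;
    }

    public static void main(String[] args) {
        // HashMap instead of TreeMap: usable when sorted key order is not needed.
        Map<Long, List<Integer>> outgoing = new HashMap<>();
        outgoing.put(1L, presizedCopy(List.of(1, 2, 3)));
        outgoing.put(2L, presizedCopy(List.of(4, 5)));
        System.out.println("sent = " + flushAndFree(outgoing));
        System.out.println("left = " + outgoing.size());
    }
}
```

Removing entries through the iterator keeps the sender from holding a full copy of the outgoing messages until the flush ends, which is the idea behind freeing messages on the source as the flush proceeds.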


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Avery Ching (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching reassigned GIRAPH-104:
----------------------------------

    Assignee: Avery Ching
    

[jira] [Resolved] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Avery Ching (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching resolved GIRAPH-104.
--------------------------------

    Resolution: Fixed

Thanks for the quick review, Claudio!  On to GIRAPH-57...
                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168645#comment-13168645 ] 

jiraposter@reviews.apache.org commented on GIRAPH-104:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3175/
-----------------------------------------------------------

Review request for giraph.


Summary
-------

Currently, Giraph uses a very large amount of memory for messaging. This JIRA reduces the maximum messaging memory by half and adds periodic memory reports for debugging. Details are below:

Refactored RandomMessageBenchmark to use an internal vertex class. Added aggregators to RandomMessageBenchmark to track bytes, messages, and time spent on messaging. Moved the postSuperstep() call to after flush() for more accurate timings.

Added periodic per-minute updates for message flushing, which can take a while, especially on the memory benchmark. These show how the flush is progressing and give an ETA.

Memory optimizations include:

- Clear the message list after computation
- Free vertex messages on the source as the flush is going on
- TreeMap -> HashMap for VertexMutations
- Sizing the ArrayList properly in transientInMessages


This addresses bug GIRAPH-104.
    https://issues.apache.org/jira/browse/GIRAPH-104


Diffs
-----

  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/benchmark/RandomMessageBenchmark.java 1213849 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1213849 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/LongSumAggregator.java 1213849 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1213849 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java 1213849 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/WorkerContext.java 1213849 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/utils/MemoryUtils.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3175/diff


Testing
-------

Passed local and Hadoop unit tests.  RandomMessageBenchmark was run at scale on a real cluster.


Thanks,

Avery


                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168646#comment-13168646 ] 

Avery Ching commented on GIRAPH-104:
------------------------------------

The reduction in the maximum amount of heap used for messaging during the life of an application is quite large.  As an example, here are some runs I did prior to the optimizations:

2011-12-12 22:57:51,961 INFO org.apache.giraph.graph.BspServiceWorker: startSuperstep: Superstep - after prepare 6 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 122.46955M
2011-12-12 22:57:52,354 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: before flush - Superstep 6 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 119.091606M
2011-12-12 22:57:52,354 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep 6 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 119.091606M
2011-12-12 22:57:59,337 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: ended for superstep 6 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 4.349098M
2011-12-12 22:57:59,337 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep 6 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 4.349098M
2011-12-12 22:58:01,403 INFO org.apache.giraph.comm.BasicRPCCommunications: prepareSuperstep: totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 4.156639M
2011-12-12 22:58:04,426 INFO org.apache.giraph.comm.BasicRPCCommunications: prepareSuperstep: Superstep - after inMessage assignmnt 7 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 121.982346M

Note how the free memory would dip to 4 MB at times.  After the fixes I don't see the dips:

2011-12-12 23:39:49,260 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep 8 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 110.11537M
2011-12-12 23:39:49,274 INFO org.apache.giraph.comm.BasicRPCCommunications: prepareSuperstep: Superstep 9 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 110.102M
2011-12-12 23:39:49,458 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep 9 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 103.08128M
2011-12-12 23:39:51,728 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: ended for superstep 9 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 106.01724M
2011-12-12 23:39:51,728 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep 9 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 106.01724M
2011-12-12 23:39:51,747 INFO org.apache.giraph.comm.BasicRPCCommunications: prepareSuperstep: Superstep 10 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 105.48416M
2011-12-12 23:39:51,786 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep 10 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 119.71583M
2011-12-12 23:39:51,786 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: ended for superstep 10 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 119.5272M
2011-12-12 23:39:51,786 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep 10 totalMem = 252.8125M, maxMem = 252.8125M, freeMem = 119.5272M

We should include this ASAP.
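
For readers who want to produce similar memory lines, here is a minimal sketch built on java.lang.Runtime.  The class name MemorySnapshotSketch and its methods are assumptions for this example; the patch adds its own MemoryUtils class, whose contents are not shown in this thread.  The output format only imitates the log lines above.

```java
// Hypothetical sketch; not the MemoryUtils class from the patch.
final class MemorySnapshotSketch {
    private static final float BYTES_PER_MB = 1024f * 1024f;

    /** Formats byte counts in megabytes, in the style of the logs above. */
    static String format(long totalBytes, long maxBytes, long freeBytes) {
        return "totalMem = " + (totalBytes / BYTES_PER_MB) + "M, maxMem = "
            + (maxBytes / BYTES_PER_MB) + "M, freeMem = "
            + (freeBytes / BYTES_PER_MB) + "M";
    }

    /** Snapshot of the current JVM heap figures. */
    static String current() {
        Runtime runtime = Runtime.getRuntime();
        return format(runtime.totalMemory(), runtime.maxMemory(),
            runtime.freeMemory());
    }

    public static void main(String[] args) {
        System.out.println("flush: starting " + current());
    }
}
```

Note that freeMemory() is measured against totalMemory() (the currently committed heap), so the figures are easiest to compare when totalMem equals maxMem, as in the runs above.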


                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168936#comment-13168936 ] 

Avery Ching commented on GIRAPH-104:
------------------------------------

Messaging pattern was from RandomMessageBenchmark (very regular).  =)  I was so happy to fix it and save a lot of messaging memory.  I'll wait until your final review before committing.  Thanks for taking a look!
                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168938#comment-13168938 ] 

Avery Ching commented on GIRAPH-104:
------------------------------------

By the way, here's example output from the changes to RandomMessageBenchmark.  It will help us qualify messaging improvements.

2011-12-12 23:58:54,887 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Outputing statistics for superstep 4
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: superstep total bytes sent : 60000000000
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: total bytes sent : 240000000000
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: superstep total messages : 6000000
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: total messages : 24000000
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: superstep total millis : 854309
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: total millis : 3718123
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: workers : 5
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Superstep megabytes / second = 334.8932235547969
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Total megabytes / second = 307.7921789267058
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Superstep messages / second = 35116.09967821947
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Total messages / second = 32274.349181024943
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Superstep megaabytes / second / worker = 66.97864471095939
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Total megabytes / second / worker = 61.55843578534116
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Superstep messages / second / worker = 7023.219935643894
2011-12-12 23:58:54,888 INFO org.apache.giraph.benchmark.RandomMessageBenchmark$RandomMessageBenchmarkWorkerContext: Total messages / second / worker = 6454.869836204989
2011-12-12 23:58:57,627 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep 4 totalMem = 20463.375M, maxMem = 20463.375M, freeMem = 6571.4233M
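
The rates in this output can be reproduced from the raw counters.  The sketch below is inferred from the logged numbers alone, not from the benchmark source: it assumes that "total millis" is summed across workers (so it is divided by the worker count to get wall-clock time) and that a megabyte is 1024 * 1024 bytes.  The class and method names are hypothetical.

```java
// Hypothetical sketch; formulas inferred from the logged values.
final class ThroughputSketch {
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /** Average per-worker time in seconds, given millis summed over workers. */
    static double seconds(long summedMillis, int workers) {
        return (summedMillis / (double) workers) / 1000.0;
    }

    static double megabytesPerSecond(long bytes, long summedMillis, int workers) {
        return (bytes / BYTES_PER_MB) / seconds(summedMillis, workers);
    }

    static double messagesPerSecond(long messages, long summedMillis, int workers) {
        return messages / seconds(summedMillis, workers);
    }

    public static void main(String[] args) {
        // Counters for superstep 4 from the log output above.
        long bytes = 60_000_000_000L;
        long messages = 6_000_000L;
        long millis = 854_309L;
        int workers = 5;
        double mbps = megabytesPerSecond(bytes, millis, workers);
        double mps = messagesPerSecond(messages, millis, workers);
        System.out.println("Superstep megabytes / second = " + mbps);
        System.out.println("Superstep messages / second = " + mps);
        System.out.println("Superstep megabytes / second / worker = " + mbps / workers);
        System.out.println("Superstep messages / second / worker = " + mps / workers);
    }
}
```

With these assumptions the superstep figures come out at about 334.89 megabytes per second and 35116.1 messages per second, in line with the logged values.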

                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169611#comment-13169611 ] 

Hudson commented on GIRAPH-104:
-------------------------------

Integrated in Giraph-trunk-Commit #47 (See [https://builds.apache.org/job/Giraph-trunk-Commit/47/])
    GIRAPH-104: Save half of maximum memory used from messaging. (aching)

aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1214406
Files : 
* /incubator/giraph/trunk/CHANGELOG
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/benchmark/RandomMessageBenchmark.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/LongSumAggregator.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GraphMapper.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/WorkerContext.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/utils/MemoryUtils.java

                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169539#comment-13169539 ] 

Claudio Martella commented on GIRAPH-104:
-----------------------------------------

Went through it more carefully. Looks very clean,  great work.

+1 from me.
                

[jira] [Commented] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168794#comment-13168794 ] 

Claudio Martella commented on GIRAPH-104:
-----------------------------------------

Supposing the messaging pattern doesn't change between superstep 6 and superstep 8 :)

This looks like a great improvement, great work. I went through the review, frankly quite quickly, and it looks very good.

I'll look at it more carefully tomorrow and will +1.
                

[jira] [Updated] (GIRAPH-104) Save half of maximum memory used from messaging

Posted by "Avery Ching (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-104:
-------------------------------

    Attachment: GIRAPH-104.diff
    