Posted to dev@giraph.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2011/08/31 04:59:09 UTC
[jira] [Updated] (GIRAPH-18) Refactor BspServiceWorker::loadVertices()
[ https://issues.apache.org/jira/browse/GIRAPH-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated GIRAPH-18:
------------------------------
Attachment: GIRAPH-18.patch
Patch:
* Refactors BspServiceWorker::loadVertices() into several smaller functions, each responsible for a subsection of the logic. This includes:
** gathering the input split from ZK
** actually reading the vertices from the input split
** generating the vertex ranges
** calculating the range size
** building the range stat instance that's returned.
* Replaces the single ArrayList that's created with one that's generated for each input split path. This is a performance win because, with spiky or dense graphs, the backing array of the ArrayList was resized but never shrunk, putting pressure on the heap. The new per-call ArrayLists are short-lived, so they quickly become eligible for GC.
* Specifies an exact size of 2 for the list used in the maxIndexStatMap, saving some space. This adds up!
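To illustrate the two memory points above, here is a minimal sketch. It is not code from the patch: the class PerSplitListSketch, the methods readSplit and buildStat, and the assumption that a stat entry holds two counters (a vertex count and an edge count) are all hypothetical. The underlying facts are standard ArrayList behavior: clear() on a reused list keeps the grown backing array, while a list created per call can be collected together with its array, and the no-argument constructor uses a default capacity of 10, whereas new ArrayList<>(2) allocates room for exactly two elements.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class and method names are hypothetical,
// not taken from GIRAPH-18.patch.
final class PerSplitListSketch {

    /** Stand-in for reading one input split; returns a list owned by this call. */
    static List<Long> readSplit(long[] vertexIds) {
        // Fresh list per split: once the caller drops it, the list and its
        // (possibly large) backing array can be garbage collected together.
        List<Long> vertices = new ArrayList<>();
        for (long id : vertexIds) {
            vertices.add(id);
        }
        return vertices;
    }

    /** Assumed shape of a stat entry: exactly two counters. */
    static List<Long> buildStat(long vertexCount, long edgeCount) {
        // Capacity 2 instead of ArrayList's default capacity of 10.
        List<Long> stat = new ArrayList<>(2);
        stat.add(vertexCount);
        stat.add(edgeCount);
        return stat;
    }

    public static void main(String[] args) {
        long[][] splits = { {1L, 2L, 3L}, {4L, 5L} };
        for (long[] split : splits) {
            // Each iteration gets its own short-lived list.
            List<Long> vertices = readSplit(split);
            System.out.println(buildStat(vertices.size(), 0L));
        }
    }
}
```

The saving per stat list is small, but it is paid once per map entry, which is why it adds up when there are many entries.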
Post-refactor, these methods will be much easier to unit test, since they all have defined inputs and outputs.
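As a rough sketch of why small methods with defined inputs and outputs are easier to test, consider a helper that only turns a sorted list of vertex ids into ranges. Everything here is an assumption made for illustration: the class LoadVerticesSketch, the methods generateRanges and totalSize, and the scheme of keying each range by its maximum id are hypothetical and not the actual logic in the patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and the range scheme are assumptions,
// not the contents of GIRAPH-18.patch.
final class LoadVerticesSketch {

    /**
     * Splits sorted, unique vertex ids into at most rangeCount contiguous
     * ranges. Returns a map from each range's max id to its vertex count.
     */
    static TreeMap<Long, Integer> generateRanges(List<Long> sortedIds, int rangeCount) {
        if (rangeCount <= 0) {
            throw new IllegalArgumentException("rangeCount must be positive");
        }
        TreeMap<Long, Integer> ranges = new TreeMap<>();
        if (sortedIds.isEmpty()) {
            return ranges;
        }
        // Ceiling division so that no more than rangeCount ranges are produced.
        int perRange = (sortedIds.size() + rangeCount - 1) / rangeCount;
        for (int start = 0; start < sortedIds.size(); start += perRange) {
            int end = Math.min(start + perRange, sortedIds.size());
            ranges.put(sortedIds.get(end - 1), end - start);
        }
        return ranges;
    }

    /** Sums the sizes of all ranges. */
    static long totalSize(Map<Long, Integer> ranges) {
        long total = 0;
        for (int size : ranges.values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        TreeMap<Long, Integer> ranges =
            generateRanges(Arrays.asList(1L, 2L, 3L, 4L, 5L), 2);
        System.out.println(ranges + " total=" + totalSize(ranges));
    }
}
```

Because neither method touches ZooKeeper or an input split, a unit test can call them directly with a hand-built list and assert on the returned map.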
Patch follows current style and passes unit tests.
> Refactor BspServiceWorker::loadVertices()
> -----------------------------------------
>
> Key: GIRAPH-18
> URL: https://issues.apache.org/jira/browse/GIRAPH-18
> Project: Giraph
> Issue Type: Improvement
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Attachments: GIRAPH-18.patch
>
>
> Currently BspServiceWorker::loadVertices() is more than 200 lines and convoluted. I found it difficult to grok while debugging today.