Posted to dev@giraph.apache.org by "Benjamin Heitmann (JIRA)" <ji...@apache.org> on 2012/05/22 14:43:41 UTC

[jira] [Commented] (GIRAPH-192) Move aggregators to a separate sub-package

    [ https://issues.apache.org/jira/browse/GIRAPH-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280909#comment-13280909 ] 

Benjamin Heitmann commented on GIRAPH-192:
------------------------------------------

Jan, your description of this issue sounds like you have already used aggregators extensively.

I have recently tried using aggregators for generating statistics about my algorithm, 
and I am not completely sure if the reported numbers are accurate. 

One issue I had is that aggregators which are reset, e.g. between supersteps, do not report accurate numbers when manually compared to aggregators which count the same thing but are not reset.

The other issue is that some aggregators (without a reset) reported overall numbers that were way too high.

The second issue may well be due to a mistake in my own code, but the first issue really seemed like a bug to me.
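
The comparison described above can be sketched in plain Java. This is only an illustration of the invariant being checked, not Giraph code: the classes CountingSketch and ResetConsistencySketch and the sample data are hypothetical. If resets behave correctly, the per-superstep values of a reset aggregator, added up by hand, should equal the final value of an aggregator that is never reset.

```java
/** Hypothetical counter standing in for an aggregator; not a Giraph class. */
final class CountingSketch {
    private long value = 0L;

    void aggregate(long delta) { value += delta; }

    long get() { return value; }

    void reset() { value = 0L; }
}

class ResetConsistencySketch {
    /**
     * Feeds the same events into a counter that is reset after every
     * superstep and into one that is never reset, then compares the
     * manually summed per-superstep values with the cumulative value.
     */
    static boolean consistent(long[][] eventsPerSuperstep) {
        CountingSketch perSuperstep = new CountingSketch();
        CountingSketch cumulative = new CountingSketch();
        long manualTotal = 0L;

        for (long[] superstep : eventsPerSuperstep) {
            for (long event : superstep) {
                perSuperstep.aggregate(event);
                cumulative.aggregate(event);
            }
            // Record the per-superstep value before resetting it.
            manualTotal += perSuperstep.get();
            perSuperstep.reset();
        }
        return manualTotal == cumulative.get();
    }

    public static void main(String[] args) {
        // Made-up sample: three supersteps with 3, 2 and 1 counted events.
        long[][] events = { {1, 1, 1}, {1, 1}, {1} };
        System.out.println(consistent(events)); // prints "true"
    }
}
```

A discrepancy in a check of this shape would point at the reset handling rather than at the counting logic, since both counters see identical input.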


Jan, do you have any insight into the inner workings of the aggregators? Are there unit tests for the aggregators? Is it possible that they are buggy in some way (similar to what I described), or are you certain that their code is reliable and that they just need to be moved to their own sub-package?

It would be really good to have aggregators working properly outside of the example package, not just for me, but for everybody, as they form a part of the original Google Pregel BSP design. 
                
> Move aggregators to a separate sub-package
> ------------------------------------------
>
>                 Key: GIRAPH-192
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-192
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>            Reporter: Jan van der Lugt
>            Priority: Minor
>         Attachments: GIRAPH-192.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Since aggregators will be re-used throughout many projects and algorithms, it makes sense to implement the most common ones in a separate sub-package. This will reduce the time required for users when they implement their projects based on Giraph, because the required aggregators are already in place. I implemented the following ones:
> for int/long/float/double: min, max, product, sum, overwrite
> for boolean: and, or, overwrite
> Most of them speak for themselves, except for the overwrite one. This aggregator simply overwrites the stored value when a new value is aggregated. This is useful when one node is in some way a master node (for example a source node in a routing algorithm), and this node wants to broadcast a value to all other nodes.
> Attached is a patch against trunk implementing the aggregators and patching some existing files so they use the .aggregators package instead of the .examples one.
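
The aggregator semantics listed in the description can be illustrated with a minimal plain-Java sketch. The interface and class names below (SketchAggregator, SketchLongSum, SketchLongOverwrite, SketchBooleanAnd) are hypothetical and are not taken from the attached patch or from Giraph's API; the initial values (0 for sum and overwrite, true for and) are likewise assumptions.

```java
/** Hypothetical stand-in for an aggregator; not Giraph's actual interface. */
interface SketchAggregator<T> {
    void aggregate(T value);

    T getAggregatedValue();
}

/** Sum: adds every aggregated value to the stored one. */
final class SketchLongSum implements SketchAggregator<Long> {
    private long sum = 0L; // assumed initial value

    @Override public void aggregate(Long value) { sum += value; }

    @Override public Long getAggregatedValue() { return sum; }
}

/** Overwrite: the most recently aggregated value replaces the stored one. */
final class SketchLongOverwrite implements SketchAggregator<Long> {
    private long stored = 0L; // assumed initial value

    @Override public void aggregate(Long value) { stored = value; }

    @Override public Long getAggregatedValue() { return stored; }
}

/** And: stays true only while every aggregated value is true. */
final class SketchBooleanAnd implements SketchAggregator<Boolean> {
    private boolean result = true; // assumed initial value

    @Override public void aggregate(Boolean value) { result = result && value; }

    @Override public Boolean getAggregatedValue() { return result; }
}

class AggregatorSketchDemo {
    public static void main(String[] args) {
        SketchLongSum sum = new SketchLongSum();
        sum.aggregate(2L);
        sum.aggregate(3L);
        System.out.println(sum.getAggregatedValue()); // prints 5

        // A single "master" node aggregates the value it wants to broadcast.
        SketchLongOverwrite broadcast = new SketchLongOverwrite();
        broadcast.aggregate(42L);
        System.out.println(broadcast.getAggregatedValue()); // prints 42

        SketchBooleanAnd all = new SketchBooleanAnd();
        all.aggregate(true);
        all.aggregate(false);
        System.out.println(all.getAggregatedValue()); // prints false
    }
}
```

Note that the overwrite aggregator only gives a well-defined result when a single node writes to it in a given superstep, which matches the broadcast use case in the description.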
