Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2011/08/29 04:57:37 UTC

[jira] [Created] (GIRAPH-10) Aggregators are not exported

Aggregators are not exported
----------------------------

                 Key: GIRAPH-10
                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
             Project: Giraph
          Issue Type: New Feature
            Reporter: Avery Ching


Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by Avery Ching <ac...@apache.org>.
Claudio,

For the AggregatorWriter, shouldn't your case be addressed?  Since only 
a single thread will use the AggregatorWriter, it can dump the 
aggregators to HDFS at every superstep.  Am I missing something?

Avery

On 9/30/11 2:20 AM, Claudio Martella wrote:
> The first option sounds like a good idea to me. It fits quite directly
> with the current interface.
>
> This design though doesn't fit with my current need for dumping data
> to disk. I'll write an email about it with a couple of new questions.
>
> I think that with some help I can produce some code on this JIRA.


Re: [jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by Claudio Martella <cl...@gmail.com>.
The first option sounds like a good idea to me. It fits quite directly
with the current interface.

This design though doesn't fit with my current need for dumping data
to disk. I'll write an email about it with a couple of new questions.

I think that with some help I can produce some code on this JIRA.

On Thu, Sep 29, 2011 at 9:49 PM, Avery Ching (Commented) (JIRA)
<ji...@apache.org> wrote:
>
>    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117569#comment-13117569 ]
>
> Avery Ching commented on GIRAPH-10:
> -----------------------------------
>
> This is definitely an open question here.  Here's a quick thought off the top of my head.
>
> How about something like:
>
> public interface AggregatorsWriter {
>    void initialize(TaskAttemptContext context, int superstep) throws IOException;
>
>    void writeAggregators(Collection<Aggregator<? extends Writable>> aggregators)
>        throws IOException, InterruptedException;
>
>    void close(TaskAttemptContext context)
>        throws IOException, InterruptedException;
> }
>
> Then we could define a method that lets the user select the frequency of how frequently to write the aggregators maybe with an enum (ALL_SUPERSTEPS, LAST_SUPERSTEP...).
>
> We could easily implement a default AggregatorsWriter that simply dumps the name and aggregator values with the registered aggregator name (maybe class name too) and then the value.toString().  And users can implement something better if they like.
>
> Another alternative would be to support writing aggregators separately by the type, but might be a little excessive and could be done later....hopefully someone else also chimes in with some thoughts.
>
>> Aggregators are not exported
>> ----------------------------
>>
>>                 Key: GIRAPH-10
>>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>>             Project: Giraph
>>          Issue Type: New Feature
>>            Reporter: Avery Ching
>>            Priority: Minor
>>
>> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.



-- 
    Claudio Martella
    claudio.martella@gmail.com

[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:
-----------------------------------

    Attachment: GIRAPH-10.diff

Exports Aggregators, adds unit test for the new class. Passes unit tests.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:
-----------------------------------

    Attachment: GIRAPH-10.diff

Exports Aggregators, adds unit test for the new class. Passes unit tests.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163121#comment-13163121 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

Great, thanks for the feedback.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117164#comment-13117164 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

I was thinking: since BasicVertex already has pre/post Application/Superstep methods, wouldn't it make sense to add them to Aggregator as well? That would allow better management of both the aggregation logic and persistence (it's much easier to handle file opening and flushing/closing through those methods). Maybe give it access to JobConf too.
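
A minimal sketch of what such lifecycle hooks on an aggregator might look like. The interface, method names, and class below are hypothetical stand-ins written for illustration; they are not the Giraph Aggregator or BasicVertex API, and a plain list stands in for the file that would be opened and closed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hooks, modeled on the pre/post Application/Superstep idea. */
interface LifecycleHooks {
    void preApplication();
    void preSuperstep(long superstep);
    void postSuperstep(long superstep);
    void postApplication();
}

/** Hypothetical sum aggregator that records its value after every superstep. */
final class LoggingSumAggregator implements LifecycleHooks {
    private long sum;
    /** Stands in for an output file: "open", one entry per superstep, "close". */
    final List<String> log = new ArrayList<>();

    void aggregate(long value) {
        sum += value;
    }

    @Override
    public void preApplication() {
        log.add("open");
    }

    @Override
    public void preSuperstep(long superstep) {
        sum = 0; // start each superstep from a clean value
    }

    @Override
    public void postSuperstep(long superstep) {
        log.add(superstep + ":" + sum);
    }

    @Override
    public void postApplication() {
        log.add("close");
    }

    public static void main(String[] args) {
        LoggingSumAggregator aggregator = new LoggingSumAggregator();
        aggregator.preApplication();
        aggregator.preSuperstep(0);
        aggregator.aggregate(2);
        aggregator.aggregate(3);
        aggregator.postSuperstep(0);
        aggregator.postApplication();
        System.out.println(aggregator.log);
    }
}
```

With hooks like these, opening the output belongs in preApplication() and flushing/closing in postApplication(), which is the management benefit described above.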
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162173#comment-13162173 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

Please ignore the first one.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162640#comment-13162640 ] 

Avery Ching commented on GIRAPH-10:
-----------------------------------

Hi Claudio, great stuff!

A couple of questions.

I downloaded GIRAPH-10.diff and think you are missing AggregatorWriter.java and TextAggregatorWriter.java from the diff.  Also, I was thinking, shouldn't the default be to not write any aggregator data?
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117228#comment-13117228 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

Also, I've given some thought to the implementation (without considering my recent comment about the change of interface). The idea would be to create a saveAggregatedValues() at the beginning of BspServiceMaster.cleanup(), just like saveVertices() is at the beginning of BspServiceWorker.cleanup(), and just dump the key:value pairs created in ZK by BspServiceMaster.collectAndProcessAggregatedValues() to a configured HDFS output file (any better idea here?).

Any comment about this?
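
A minimal sketch of the ordering proposed above, assuming a stand-in class rather than the real BspServiceMaster: an in-memory map replaces the values kept in ZooKeeper, and a PrintWriter replaces the HDFS output file. All names other than saveAggregatedValues() and cleanup() are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the master-side service; not the real BspServiceMaster. */
final class MasterCleanupSketch {
    private final Map<String, String> aggregatedValues = new TreeMap<>();
    private final PrintWriter output;

    MasterCleanupSketch(PrintWriter output) {
        this.output = output;
    }

    /** Stands in for the values the master collected during the job. */
    void recordAggregatedValue(String name, String value) {
        aggregatedValues.put(name, value);
    }

    /** Dumps one "name:value" line per aggregator, sorted by name. */
    void saveAggregatedValues() {
        for (Map.Entry<String, String> entry : aggregatedValues.entrySet()) {
            output.print(entry.getKey() + ":" + entry.getValue() + "\n");
        }
        output.flush();
    }

    /** Saves the aggregated values first, then releases the output. */
    void cleanup() {
        saveAggregatedValues();
        output.close();
    }

    public static void main(String[] args) {
        StringWriter sink = new StringWriter();
        MasterCleanupSketch master = new MasterCleanupSketch(new PrintWriter(sink));
        master.recordAggregatedValue("sum", "42");
        master.recordAggregatedValue("max", "7");
        master.cleanup();
        System.out.print(sink);
    }
}
```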
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:
-----------------------------------

    Attachment: GIRAPH-10.diff

AggregatorWriter and TextAggregatorWriter are in this patch as well. Messed up the diff by creating the patch on a freshly checked out trunk :)
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Resolved] (GIRAPH-10) Aggregators are not exported

Posted by "Avery Ching (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching resolved GIRAPH-10.
-------------------------------

    Resolution: Fixed

Nice work Claudio!
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Arun Suresh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150283#comment-13150283 ] 

Arun Suresh commented on GIRAPH-10:
-----------------------------------

Avery, I am assuming that a Vertex implemented by the user would have to implement this interface if he requires his aggregators to be persisted, right?

Firstly, I feel that Aggregators MUST be persisted at the end of each superstep if the Vertex has implemented AggregatorWriter.

Secondly, the problem with exposing it as an interface, I guess, is this: assume we have a default AggregatorWriter; how would the user include the default behavior in his custom vertex?
I guess a better approach would be to let the user specify the AggregatorWriter class in the Giraph JobConf and pass an instance of the class to methods in the BspServiceWorker class, like [~cmartella] mentioned in his comment.

With respect to writing Aggregators by type, do you mean persisting a Map<AggregatorType, List<Entry<String, Aggregator>>> for each job?
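
A minimal sketch of the "specify the writer class in the job configuration" approach, under stated assumptions: java.util.Properties stands in for the Giraph JobConf, and the property key, interface, and classes are hypothetical names chosen for illustration. The fallback is a writer that does nothing.

```java
import java.util.Properties;

/** Hypothetical writer contract; not the Giraph interface. */
interface ValueWriter {
    String describe();
}

/** Used when the configuration names no writer class. */
final class NoOpValueWriter implements ValueWriter {
    @Override
    public String describe() {
        return "no-op";
    }
}

/** A user-selectable writer. */
final class TextValueWriter implements ValueWriter {
    @Override
    public String describe() {
        return "text";
    }
}

final class WriterFactory {
    /** Hypothetical configuration key. */
    static final String WRITER_CLASS_KEY = "example.aggregatorWriterClass";

    private WriterFactory() { }

    /** Instantiates the configured writer class through its no-argument constructor. */
    static ValueWriter create(Properties conf) {
        String className =
            conf.getProperty(WRITER_CLASS_KEY, NoOpValueWriter.class.getName());
        try {
            Class<? extends ValueWriter> writerClass =
                Class.forName(className).asSubclass(ValueWriter.class);
            return writerClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(WRITER_CLASS_KEY, TextValueWriter.class.getName());
        System.out.println(create(conf).describe());
    }
}
```

Because the class is selected by configuration rather than implemented by the vertex, a user who wants the default behavior simply leaves the property unset.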
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-10:
------------------------------

    Priority: Minor  (was: Major)

> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Assigned] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella reassigned GIRAPH-10:
--------------------------------------

    Assignee: Claudio Martella
    
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:
-----------------------------------

    Attachment: GIRAPH-10.diff

Fixed according to Avery's feedback.

About tabs and spaces, sorry about that; I probably wrote a file before setting up my IDE correctly.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Arun Suresh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150287#comment-13150287 ] 

Arun Suresh commented on GIRAPH-10:
-----------------------------------

Regarding my previous comment, I meant that Aggregators MUST be persisted at the end of EACH superstep rather than giving the user control over the frequency. One thought, though: if we assume the Aggregator value is stored reliably in the ZooKeeper server for the duration of the job, why persist it in the first place? I guess the only reason would be that the actual artifact of interest after a job has run is the Aggregator value rather than the state of the vertices.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:
-----------------------------------

    Attachment:     (was: GIRAPH-10.diff)
    
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117569#comment-13117569 ] 

Avery Ching commented on GIRAPH-10:
-----------------------------------

This is definitely an open question here.  Here's a quick thought off the top of my head.

How about something like:

public interface AggregatorsWriter {
    void initialize(TaskAttemptContext context, int superstep) throws IOException;

    void writeAggregators(Collection<Aggregator<? extends Writable>> aggregators)
        throws IOException, InterruptedException;

    void close(TaskAttemptContext context)
        throws IOException, InterruptedException;
}

Then we could define a method that lets the user select how frequently to write the aggregators, maybe with an enum (ALL_SUPERSTEPS, LAST_SUPERSTEP...).

We could easily implement a default AggregatorsWriter that simply dumps each aggregator's registered name (maybe its class name too) followed by value.toString().  And users can implement something better if they like.

Another alternative would be to support writing aggregators separately by type, but that might be a little excessive and could be done later....hopefully someone else also chimes in with some thoughts.
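
A minimal sketch of the frequency enum and default text dump described above. It deliberately avoids the Hadoop and Giraph types: a plain Map of names to values stands in for the aggregator collection, and the class name TextAggregatorDump and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical frequency setting, named after the enum values suggested above. */
enum WriteFrequency { ALL_SUPERSTEPS, LAST_SUPERSTEP }

/** Hypothetical default dump: one tab-separated line per aggregator. */
final class TextAggregatorDump {
    private TextAggregatorDump() { }

    /** Returns true when the given frequency asks for output at this superstep. */
    static boolean shouldWrite(WriteFrequency frequency, boolean lastSuperstep) {
        return frequency == WriteFrequency.ALL_SUPERSTEPS || lastSuperstep;
    }

    /** Formats superstep, name, value class, and value.toString() for each non-null value. */
    static String format(long superstep, Map<String, ?> values) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, ?> entry : values.entrySet()) {
            Object value = entry.getValue();
            sb.append(superstep).append('\t')
              .append(entry.getKey()).append('\t')
              .append(value.getClass().getSimpleName()).append('\t')
              .append(value).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> values = new LinkedHashMap<>();
        values.put("sum", 42L);
        values.put("max", 7.5);
        if (shouldWrite(WriteFrequency.LAST_SUPERSTEP, true)) {
            System.out.print(format(3, values));
        }
    }
}
```

Users who need something other than toString() output would supply their own implementation instead of this default.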
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150327#comment-13150327 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

Yes, this is actually quite the point.

1) You may want to see the different values of your aggregators at different supersteps (for debugging, to mention the most obvious reason). On the other hand, once you're done debugging, or for performance reasons, you may just want to see the final result and avoid useless I/O.

2) By Aggregators by type we mean a Map<AggregatorType, AggregatorWriter>, so you can have a fine-grained aggregator writer for each type. You may want different persistence depending on the type.

I think it makes sense to let the user decide how often they are persisted, and the easiest way I was thinking about follows the logic of the checkpointing frequency parameter. I was going to implement it that way.

About AggregatorWriter being an interface, the idea of having a default implementation is just to give the user something that prints the values without needing to implement it. It makes sense. If the default behavior is not what he wants, he's free to implement his own type. On the other hand, I can also see AggregatorWriter being a type to be extended by the user (or even something abstract, like we're doing with WorkerContext). I expect to have a clearer view once I'm writing the code.
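
A minimal sketch of the two ideas above: a numeric write frequency and a per-type formatter map. The semantics assumed here (a frequency of 0 or less disables writing; otherwise values are written every 'frequency' supersteps) are an assumption for illustration, not a statement of how Giraph's checkpoint frequency behaves, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class FrequencyAndTypeSketch {
    /** Maps a value type to the formatter used to persist values of that type. */
    private final Map<Class<?>, Function<Object, String>> formatters = new HashMap<>();

    /** Assumed semantics: frequency <= 0 never writes; otherwise every 'frequency' supersteps. */
    static boolean shouldWrite(long superstep, int frequency) {
        return frequency > 0 && superstep % frequency == 0;
    }

    /** Registers a formatter for exactly this value type. */
    <T> void register(Class<T> type, Function<? super T, String> formatter) {
        formatters.put(type, value -> formatter.apply(type.cast(value)));
    }

    /** Uses the registered formatter for the value's class, else falls back to toString(). */
    String format(Object value) {
        Function<Object, String> formatter = formatters.get(value.getClass());
        return formatter != null ? formatter.apply(value) : String.valueOf(value);
    }

    public static void main(String[] args) {
        FrequencyAndTypeSketch sketch = new FrequencyAndTypeSketch();
        sketch.register(Double.class, d -> String.format(Locale.ROOT, "%.2f", d));
        for (long superstep = 0; superstep < 5; superstep++) {
            if (shouldWrite(superstep, 2)) {
                System.out.println(superstep + " " + sketch.format(1.5) + " " + sketch.format(7L));
            }
        }
    }
}
```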
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163111#comment-13163111 ] 

Hudson commented on GIRAPH-10:
------------------------------

Integrated in Giraph-trunk-Commit #46 (See [https://builds.apache.org/job/Giraph-trunk-Commit/46/])
    GIRAPH-10: Aggregators are not exported.

claudio : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1210662
Files : 
* /incubator/giraph/trunk/CHANGELOG
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleAggregatorWriter.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimplePageRankVertex.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/AggregatorWriter.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspUtils.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/TextAggregatorWriter.java
* /incubator/giraph/trunk/src/test/java/org/apache/giraph/BspCase.java
* /incubator/giraph/trunk/src/test/java/org/apache/giraph/TestBspBasic.java

                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


        

[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163050#comment-13163050 ] 

Avery Ching commented on GIRAPH-10:
-----------------------------------

Much improved, Claudio.

A couple more minor suggestions:

Please add a javadoc comment for the class SimpleAggregatorWriter on what it does (given that it is in examples and users will be looking for help on what it is doing).

TestBspBasic.java

Before line 356: assertTrue(job.run(true));

You should add something like the following (as in other tests in that file):

Path outputPath = new Path("/tmp/" + getCallingMethodName());
removeAndSetOutput(job, outputPath);

If you don't do this, then the next time someone adds a test that doesn't set its own output dir, I think it could cause issues if that test relies on specific contents of the dir.
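
The idea of giving each test its own freshly cleared output directory can be illustrated outside Hadoop. The following is only a minimal sketch using java.nio: resetOutputDir is a hypothetical stand-in for what removeAndSetOutput is asked to do here, not the actual helper in BspCase, and the directory and file names are made up for the demonstration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PerTestOutputDir {
  /**
   * Hypothetical helper: removes dir and everything under it if it exists,
   * then recreates it empty so the test starts from a known state.
   */
  static Path resetOutputDir(Path dir) {
    try {
      if (Files.exists(dir)) {
        List<Path> toDelete;
        try (Stream<Path> paths = Files.walk(dir)) {
          // Reverse order so children are deleted before their parents.
          toDelete = paths.sorted(Comparator.reverseOrder())
              .collect(Collectors.toList());
        }
        for (Path p : toDelete) {
          Files.delete(p);
        }
      }
      return Files.createDirectories(dir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Simulates a stale file left by an earlier test and checks whether it survives. */
  static boolean staleFileSurvivesReset() {
    try {
      Path base = Files.createTempDirectory("giraph-sketch");
      // Hypothetical per-test directory, named after the test method.
      Path out = base.resolve("testBspAggregatorWriter");
      Files.createDirectories(out);
      Files.write(out.resolve("stale.txt"), "left over".getBytes());
      resetOutputDir(out);
      return Files.exists(out.resolve("stale.txt"));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(staleFileSurvivesReset());
  }
}
```

Naming the directory after the calling test method, as in the snippet above, keeps tests from reading each other's leftovers.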

By the way, good bug fix in the test.

If you agree and make those changes, this is effectively a +1; please upload your final diff and feel free to commit. =)

                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


        

[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162722#comment-13162722 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

@Avery: yes, as you'll see, TextAggregatorWriter doesn't write anything by default.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


        

[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Avery Ching (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162929#comment-13162929 ] 

Avery Ching commented on GIRAPH-10:
-----------------------------------

Thanks for the revised diff.  I have some suggestions; let me know what you think.

First, org.apache.giraph.examples.SimpleAggregatorWriter is missing from the diff.

Hence I am getting errors in my build:

[ERROR] /Users/aching/Avery/source/giraph_trunk/src/test/java/org/apache/giraph/TestBspBasic.java:[24,33] cannot find symbol
symbol  : class SimpleAggregatorWriter
location: package org.apache.giraph.examples
[ERROR] /Users/aching/Avery/source/giraph_trunk/src/test/java/org/apache/giraph/TestBspBasic.java:[355,37] cannot find symbol

Also, your IDE is using tabs.  The CODE_CONVENTIONS asks for spaces instead of tabs.  Can you please convert all your tabs to spaces?

In AggregatorWriter.java
- Quite a few tabs in this file, please change them to spaces.
- Indentation issues on lines 60 and greater in AggregatorWriter.java
- line 52: "The methods is called at the" => "This method is called at the"
- line 60: map is a bit non-descriptive here.  Can you change it to something else, e.g. aggregatorNameValueMap or even just aggregatorMap?
- line 64: "successfull" => "successful"

TextAggregatorWriter.java
- line 44: "aggreatos" => "aggregators"

TestBspBasic.java
- line 371:  "for (i=0; ; i++) {" =>  "for (i = 0; ; i++) {"
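
To make the naming suggestion for line 60 concrete, here is a rough sketch of a writer whose method takes a map from aggregator name to value. The interface and classes below are hypothetical stand-ins written for illustration with plain java.io; they are not the AggregatorWriter or TextAggregatorWriter from the diff, and the real signatures may differ.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;
import java.util.TreeMap;

public class AggregatorWriterSketch {
  /** Hypothetical stand-in for an aggregator writer; not the Giraph interface. */
  interface SketchAggregatorWriter {
    /** Called once per superstep with the current aggregator values. */
    void writeAggregator(Map<String, Object> aggregatorMap, long superstep)
        throws IOException;

    /** Called once when the job is done. */
    void close() throws IOException;
  }

  /** Writes one "superstep<TAB>name=value" line per aggregator. */
  static class TextSketchWriter implements SketchAggregatorWriter {
    private final Writer out;

    TextSketchWriter(Writer out) {
      this.out = out;
    }

    @Override
    public void writeAggregator(Map<String, Object> aggregatorMap, long superstep)
        throws IOException {
      for (Map.Entry<String, Object> entry : aggregatorMap.entrySet()) {
        out.write(superstep + "\t" + entry.getKey() + "=" + entry.getValue() + "\n");
      }
    }

    @Override
    public void close() throws IOException {
      out.close();
    }
  }

  /** Convenience wrapper: dumps the map for one superstep into a string. */
  static String dump(Map<String, Object> aggregatorMap, long superstep) {
    StringWriter buffer = new StringWriter();
    try {
      SketchAggregatorWriter writer = new TextSketchWriter(buffer);
      writer.writeAggregator(aggregatorMap, superstep);
      writer.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toString();
  }

  public static void main(String[] args) {
    // TreeMap keeps the aggregator names in sorted order for stable output.
    Map<String, Object> aggregatorMap = new TreeMap<>();
    aggregatorMap.put("min", 1.0);
    aggregatorMap.put("max", 3.0);
    System.out.print(dump(aggregatorMap, 2L));
  }
}
```

With a parameter named aggregatorMap, the loop body reads as "for each aggregator name and value", which is the readability point behind the rename.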


                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


        

[jira] [Updated] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:
-----------------------------------

    Attachment: GIRAPH-10.diff

Fixed according to the last feedback. Committing.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Assignee: Claudio Martella
>            Priority: Minor
>         Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff
>
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.


        

[jira] [Commented] (GIRAPH-10) Aggregators are not exported

Posted by "Claudio Martella (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131681#comment-13131681 ] 

Claudio Martella commented on GIRAPH-10:
----------------------------------------

OK, I think I'll take this one as you proposed in your last comment.
                
> Aggregators are not exported
> ----------------------------
>
>                 Key: GIRAPH-10
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-10
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Avery Ching
>            Priority: Minor
>
> Currently, aggregator values cannot be saved after a Giraph job.  There should be a way to do this.
