You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by Armando <a....@student.vu.nl> on 2013/08/23 03:34:47 UTC

Review Request 13756: GIRAPH-732: EdgeOutputFormat API

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/
-----------------------------------------------------------

Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.


Bugs: GIRAPH-732
    https://issues.apache.org/jira/browse/GIRAPH-732


Repository: giraph-git


Description
-------

This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
and modified TextVertexOutputFormat so that a subdirectory can be specified.
The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.

passed "maven verify"


Diffs
-----

  giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
  giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
  giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 

Diff: https://reviews.apache.org/r/13756/diff/


Testing
-------


Thanks,

Armando


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.

> On Aug. 26, 2013, 12:34 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1093
> > <https://reviews.apache.org/r/13756/diff/3/?file=344713#file344713line1093>
> >
> >     can you count these in the for-loop above instead of using getNumEdges()? As you're already iterating them, it costs nothing, and we avoid recounting them for those implementations of OutEdges that actually count edges every time at that call (such as the byte-serialized ones).
> 
> Armando wrote:
>     I don't understand what you mean: since getNumEdges is a method of Vertex, I call it once per vertex not more, am I wrong?

I'm saying that for some implementations of OutEdges the cost of this method might not be O(1) (Vertex calls the method on the OutEdges). As you're already iterating over the edges, you might piggyback that count earlier.


- Claudio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25549
-----------------------------------------------------------


On Aug. 26, 2013, 12:17 p.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2013, 12:17 p.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java bd69586 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 26, 2013, 12:34 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1093
> > <https://reviews.apache.org/r/13756/diff/3/?file=344713#file344713line1093>
> >
> >     can you count these in the for-loop above instead of using getNumEdges()? As you're already iterating them, it costs nothing, and we avoid recounting them for those implementations of OutEdges that actually count edges every time at that call (such as the byte-serialized ones).

I don't understand what you mean: since getNumEdges is a method of Vertex, I call it once per vertex not more, am I wrong?


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25549
-----------------------------------------------------------


On Aug. 26, 2013, 12:17 p.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2013, 12:17 p.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java bd69586 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25549
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
<https://reviews.apache.org/r/13756/#comment50014>

    can you count these in the for-loop above instead of using getNumEdges()? As you're already iterating them, it costs nothing, and we avoid recounting them for those implementations of OutEdges that actually count edges every time at that call (such as the byte-serialized ones).


- Claudio Martella


On Aug. 26, 2013, 12:17 p.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2013, 12:17 p.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java bd69586 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/
-----------------------------------------------------------

(Updated Aug. 26, 2013, 12:17 p.m.)


Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.


Changes
-------

Uploaded the new diff fixed with the review suggestions. I think I processed everything, but do you see anything I forgot?


Bugs: GIRAPH-732
    https://issues.apache.org/jira/browse/GIRAPH-732


Repository: giraph-git


Description
-------

This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
and modified TextVertexOutputFormat so that a subdirectory can be specified.
The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.

passed "maven verify"


Diffs (updated)
-----

  giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
  giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java bd69586 
  giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
  giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 

Diff: https://reviews.apache.org/r/13756/diff/


Testing
-------


Thanks,

Armando


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 23, 2013, 12:18 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java, line 216
> > <https://reviews.apache.org/r/13756/diff/2/?file=344102#file344102line216>
> >
> >     i think a default should be set

I don't. Currently users will have this in the main output path and would expect it to be there. See next point.


> On Aug. 23, 2013, 12:18 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java, line 224
> > <https://reviews.apache.org/r/13756/diff/2/?file=344102#file344102line224>
> >
> >     i think a default should be set

This could be set by default. Related to vertex path, this could be more advisable. In this situation path/ will contain the verteces and path/edges/ the edges. This can avoid clashes and mess up the computation if the user forgets to specify the path.


> On Aug. 23, 2013, 12:18 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1096
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1096>
> >
> >     form? Typo?

yes a typo, I'll fix it.


> On Aug. 23, 2013, 12:18 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1114
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1114>
> >
> >     shouldn't be save-edges?

nope. I dont' have the statistics for the edges anyway and to avoid adding complexity I used a similar statistic information since edges are outputted for each vertex.


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25468
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 23, 2013, 12:18 p.m., Claudio Martella wrote:
> > giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java, line 224
> > <https://reviews.apache.org/r/13756/diff/2/?file=344102#file344102line224>
> >
> >     i think a default should be set
> 
> Armando wrote:
>     This could be set by default. Related to vertex path, this could be more advisable. In this situation path/ will contain the verteces and path/edges/ the edges. This can avoid clashes and mess up the computation if the user forgets to specify the path.

Actually, I was thinking about this and maybe the best way to handle it is to add an exception to break the computation immediately. If the user does not provide both esb and vsb while providing both OutputFormats I can rise an illegalState exception and provide the user with the info. In this manner there would be no "path/" with edges and "path/edges" in it.


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25468
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25468
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
<https://reviews.apache.org/r/13756/#comment49892>

    i think a default should be set



giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
<https://reviews.apache.org/r/13756/#comment49893>

    i think a default should be set



giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49894>

    single space between variable name and "="



giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
<https://reviews.apache.org/r/13756/#comment49890>

    form? Typo?



giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
<https://reviews.apache.org/r/13756/#comment49891>

    shouldn't be save-edges?


- Claudio Martella


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1099
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1099>
> >
> >     Should we track and add something in here about number of edges saved instead of just vertices, seeing as this is all about edge writing?
> 
> Armando wrote:
>     something like this is acceptable?
>     
>     LoggerUtils.setStatusAndLog(getContext(), LOG, Level.INFO,         
>                             "saveEdges: Saved " + edges +                                  
>                             " edges out of " + partition.getEdgeCount() +                  
>                             " partition edges, on partition " + partitionIndex +           
>                             " out of " + numPartitions);

this could work, but you need to cache the edge count value for the partition, because for some implementations that method iterates through all the vertices to compute it.


- Claudio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java, line 98
> > <https://reviews.apache.org/r/13756/diff/2/?file=344110#file344110line98>
> >
> >     remove extra spaces in all these methods


> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1056
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1056>
> >
> >     remove final modifiers here, we don't use them in method variables
> 
> Armando wrote:
>     I fixing this also in the method "saveVertices" which had them there.

re-checking: the final here is needed to use the variables inside the CallableFactory anonymous class.


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Nitay Joffe <ni...@apache.org>.

On Aug. 25, 2013, 8:30 p.m., Armando wrote:
> > Overall logic looks good, but there is lots of style issues. Please fix them. Did you run checkstyle and everything by the way?
> 
> Armando wrote:
>     Yes I did, and it passed. Maybe some additional checkes could be useful (actually I ran mvn verify also and passed and it looked to me that it compiles and runs also checkstyle but anyway I did also run it). I'll fix all these things. Thanks for the feedback.

Yeah I think unfortunately some things checkstyle isn't actually able to check for, it's just our convention and we like to have all the code follow the patterns.


- Nitay


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java, line 79
> > <https://reviews.apache.org/r/13756/diff/2/?file=344107#file344107line79>
> >
> >     what is the use case for this reversed output?
> 
> Armando wrote:
>     Honestly, I don't know. Inspecting the code of IdWithValueTextOutputFormat I assumed that having that use-case could be useful to someone. Can be dropped, just let me know

I also wondered about the uses cases, but I don't think it will make any damage, but potentially could help people who'd find one.


- Claudio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1099
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1099>
> >
> >     Should we track and add something in here about number of edges saved instead of just vertices, seeing as this is all about edge writing?

something like this is acceptable?

LoggerUtils.setStatusAndLog(getContext(), LOG, Level.INFO,         
                        "saveEdges: Saved " + edges +                                  
                        " edges out of " + partition.getEdgeCount() +                  
                        " partition edges, on partition " + partitionIndex +           
                        " out of " + numPartitions);


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java, line 54
> > <https://reviews.apache.org/r/13756/diff/2/?file=344106#file344106line54>
> >
> >     Giraph style (and Java in general) is to define things where they're used / needed rather than all up top like in C. Please fix.
> 
> Armando wrote:
>     I am just wondering: I usually declare a variable outside e.g. an if-statement if I need it both in the if-clause and in the else-clause. Is it ok or should I define it when used separately in both clauses?

In general, I think it's fine in both cases. The "definition" would want the variable to be declared inside the scope where it is used, hence inside both blocks (given that it is not used outside). I think these are details though. The point Nitay was making is that the declaration of a variable should be close to where the variable is used, so that people don't have to scroll pages of code to see the variable type. Once this condition is matched, you are more or less free.


- Claudio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Nitay Joffe <ni...@apache.org>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1056
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1056>
> >
> >     remove final modifiers here, we don't use them in method variables
> 
> Armando wrote:
>     I fixing this also in the method "saveVertices" which had them there.
> 
> Armando wrote:
>     re-checking: the final here is needed to use the variables inside the CallableFactory anonymous class.

Ah ok didn't notice that, in that case it's fine.


- Nitay


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 26, 2013, 12:17 p.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2013, 12:17 p.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java bd69586 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java, line 82
> > <https://reviews.apache.org/r/13756/diff/2/?file=344107#file344107line82>
> >
> >     Use a StringBuilder instead of all this first/second/third stuff.

I am fixing this also for IdWithValueTextOutputFormat, if this is ok.


> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java, line 54
> > <https://reviews.apache.org/r/13756/diff/2/?file=344106#file344106line54>
> >
> >     Giraph style (and Java in general) is to define things where they're used / needed rather than all up top like in C. Please fix.

I am just wondering: I usually declare a variable outside e.g. an if-statement if I need it both in the if-clause and in the else-clause. Is it ok or should I define it when used separately in both clauses?


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

On Aug. 25, 2013, 8:30 p.m., Armando wrote:
> > Overall logic looks good, but there is lots of style issues. Please fix them. Did you run checkstyle and everything by the way?

Yes I did, and it passed. Maybe some additional checkes could be useful (actually I ran mvn verify also and passed and it looked to me that it compiles and runs also checkstyle but anyway I did also run it). I'll fix all these things. Thanks for the feedback.


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java, line 1056
> > <https://reviews.apache.org/r/13756/diff/2/?file=344112#file344112line1056>
> >
> >     remove final modifiers here, we don't use them in method variables

I fixing this also in the method "saveVertices" which had them there.


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Armando <a....@student.vu.nl>.

> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java, line 79
> > <https://reviews.apache.org/r/13756/diff/2/?file=344107#file344107line79>
> >
> >     what is the use case for this reversed output?

Honestly, I don't know. Inspecting the code of IdWithValueTextOutputFormat I assumed that having that use-case could be useful to someone. Can be dropped, just let me know


> On Aug. 25, 2013, 8:30 p.m., Nitay Joffe wrote:
> > giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java, line 87
> > <https://reviews.apache.org/r/13756/diff/2/?file=344113#file344113line87>
> >
> >     srcIdDstIdEdgeValueTestWorker, only classes begin with upper case

True, mistake.


- Armando


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Nitay Joffe <ni...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25523
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49963>

    Giraph style (and Java in general) is to define things where they're used / needed rather than all up top like in C. Please fix.



giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49962>

    inline this with above.



giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49964>

    inline with above



giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49966>

    what is the use case for this reversed output?



giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49965>

    Use a StringBuilder instead of all this first/second/third stuff.



giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49968>

    This whole function should be one line:
    return EDGE_OUTPUT_FORMAT_SUBDIR.get(getConf());



giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49969>

    return VERTEX_OUTPUT_FORMAT_SUBDIR.get(getConf());



giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49971>

    final



giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49970>

    remove space



giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49972>

    remove extra spaces in all these methods



giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
<https://reviews.apache.org/r/13756/#comment49973>

    remove final modifiers here, we don't use them in method variables



giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
<https://reviews.apache.org/r/13756/#comment49974>

    Should we track and add something in here about number of edges saved instead of just vertices, seeing as this is all about edge writing?



giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java
<https://reviews.apache.org/r/13756/#comment49975>

    srcIdDstIdEdgeValueTestWorker, only classes begin with upper case


Overall logic looks good, but there is lots of style issues. Please fix them. Did you run checkstyle and everything by the way?

- Nitay Joffe


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>


Re: Review Request 13756: GIRAPH-732: EdgeOutputFormat API

Posted by Claudio Martella <cl...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13756/#review25517
-----------------------------------------------------------

Ship it!


Ship It!

- Claudio Martella


On Aug. 23, 2013, 1:34 a.m., Armando wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13756/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2013, 1:34 a.m.)
> 
> 
> Review request for giraph, Armando, Claudio Martella, and Nitay Joffe.
> 
> 
> Bugs: GIRAPH-732
>     https://issues.apache.org/jira/browse/GIRAPH-732
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch is a possible solution to provide an EdgeOutputFormat API in Giraph.
> The idea is to add the same logic as the one for the vertices for edges. For this reason I add saveEdges after saveVertices.
> This requires the user to have to different paths for the resulting files. For this reason I have implemented TextEdgeOutputFormat
> and modified TextVertexOutputFormat so that a subdirectory can be specified.
> The code I provide should be compliant with previous versions of hadoop as well as YARN (I was yet not able to test it on YARN).
> It is also retro-compatible since when not specified, TextVertexOutputFormat behaves as before.
> I am also providing an actual usable implementation with the associated tests (SrcIdDstIdEdgeValueOutputFormat).
> Also very interesting is the fact that this implementation is totally transparent to the classes implementing TextVertexOutputFormat.
> 
> passed "maven verify"
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 1bd79b5 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 71fe885 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 23bcd32 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c276c2a 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 49a2ebc 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/EdgeWriter.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/GiraphTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/SrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java c91d543 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeOutputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java 745764b 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java da1e7fb 
>   giraph-core/src/test/java/org/apache/giraph/io/TestSrcIdDstIdEdgeValueTextOutputFormat.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/13756/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Armando
> 
>