Posted to dev@giraph.apache.org by Maja Kabiljo <ma...@fb.com> on 2013/04/21 19:40:26 UTC
Review Request: GIRAPH-648: Allow IO formats to add parameters to Configuration
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10690/
-----------------------------------------------------------
Review request for giraph.
Description
-------
Currently we rely heavily on runners (HCatGiraphRunner and HiveGiraphRunner) to prepare the Configuration before the application starts, and we have no way of using hcat/hive IO without these runners. It would be better and more flexible if IO formats added what the underlying IO needs to the Configuration themselves.
Unfortunately this is not as straightforward as it sounds, because the methods of IO formats, readers, writers and OutputCommitter take a JobContext or TaskAttemptContext as an argument, and in some cases those hold a copy of the Configuration, not the original. So I added a way to track which parameters were added to GiraphConfiguration, and wrapped all IO-related calls to append those parameters to the JobContext/TaskAttemptContext before passing control to the actual IO formats.
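To illustrate the idea, here is a minimal, dependency-free sketch of tracking added parameters and re-applying them to a context that holds a configuration copy. It is not the code from the patch: SimpleConf, TrackingConf, Context, SketchFormat and WrappedFormat are hypothetical stand-ins for Configuration, GiraphConfiguration, JobContext and the wrapped IO formats, and the "hive.table" key is made up for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class TrackedConfSketch {
  /** Hypothetical stand-in for a Hadoop-style Configuration. */
  public static class SimpleConf {
    private final Map<String, String> values = new HashMap<>();

    public SimpleConf() { }

    /** Copy constructor: takes a snapshot of another configuration. */
    public SimpleConf(SimpleConf other) {
      values.putAll(other.values);
    }

    public void set(String key, String value) {
      values.put(key, value);
    }

    public String get(String key) {
      return values.get(key);
    }
  }

  /** Remembers which keys were set after tracking was switched on. */
  public static class TrackingConf extends SimpleConf {
    private final Set<String> added = new LinkedHashSet<>();
    private boolean tracking;

    public void startTracking() {
      tracking = true;
    }

    @Override
    public void set(String key, String value) {
      super.set(key, value);
      if (tracking) {
        added.add(key);
      }
    }

    /** Append every tracked parameter to another configuration. */
    public void copyAddedTo(SimpleConf target) {
      for (String key : added) {
        target.set(key, get(key));
      }
    }
  }

  /** Hypothetical stand-in for JobContext: holds a copy, not the original. */
  public static class Context {
    public final SimpleConf conf;

    public Context(SimpleConf original) {
      this.conf = new SimpleConf(original);
    }
  }

  /** Hypothetical IO format with a single context-taking method. */
  public interface SketchFormat {
    String read(Context context);
  }

  /** Re-applies tracked parameters, then delegates to the real format. */
  public static class WrappedFormat implements SketchFormat {
    private final SketchFormat delegate;
    private final TrackingConf tracked;

    public WrappedFormat(SketchFormat delegate, TrackingConf tracked) {
      this.delegate = delegate;
      this.tracked = tracked;
    }

    @Override
    public String read(Context context) {
      tracked.copyAddedTo(context.conf);
      return delegate.read(context);
    }
  }

  public static void main(String[] args) {
    TrackingConf conf = new TrackingConf();
    conf.startTracking();
    Context context = new Context(conf);   // the copy is taken here
    conf.set("hive.table", "edges");       // added afterwards by an IO format
    SketchFormat inner = c -> c.conf.get("hive.table");
    System.out.println(inner.read(context));                          // null
    System.out.println(new WrappedFormat(inner, conf).read(context)); // edges
  }
}
```

In this sketch the unwrapped format sees a stale copy and misses the late parameter, while the wrapped one receives it just before the delegate runs.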
I cleaned up HiveGiraphRunner and moved all control to its IO formats; I can do the same for HCatalog in a separate patch.
This will also help us do GIRAPH-639 in a cleaner way, and it will actually be possible to mix different kinds of input formats (hcat, hive, hbase, or whatever).
This addresses bug GIRAPH-648.
https://issues.apache.org/jira/browse/GIRAPH-648
Diffs
-----
giraph-core/src/main/java/org/apache/giraph/bsp/BspOutputFormat.java 574895c
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
giraph-core/src/main/java/org/apache/giraph/io/EdgeInputFormat.java 43cc7be
giraph-core/src/main/java/org/apache/giraph/io/VertexInputFormat.java b3f234f
giraph-core/src/main/java/org/apache/giraph/io/VertexOutputFormat.java 71eb665
giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeInputFormat.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedVertexInputFormat.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedVertexOutputFormat.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/internal/package-info.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/superstep_output/MultiThreadedSuperstepOutput.java af086e1
giraph-core/src/main/java/org/apache/giraph/io/superstep_output/SynchronizedSuperstepOutput.java 2a7af29
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java c426032
giraph-examples/src/test/java/org/apache/giraph/TestBspBasic.java e034b2f
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java 097aeef
giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java 45c9ca3
giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java 0215428
Diff: https://reviews.apache.org/r/10690/diff/
Testing
-------
mvn clean verify
Real application run with hive io
Thanks,
Maja Kabiljo
Re: Review Request: GIRAPH-648: Allow IO formats to add parameters to Configuration
Posted by Nitay Joffe <ni...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10690/#review19593
-----------------------------------------------------------
Ship it!
- Nitay Joffe