You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by Maja Kabiljo <ma...@fb.com> on 2013/04/17 00:42:55 UTC
Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/
-----------------------------------------------------------
Review request for giraph.
Description
-------
For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
This addresses bug GIRAPH-639.
https://issues.apache.org/jira/browse/GIRAPH-639
Diffs
-----
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
giraph-core/src/main/java/org/apache/giraph/io/InputFormatWithIndex.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 09476cd
Diff: https://reviews.apache.org/r/10564/diff/
Testing
-------
mvn clean verify
Run application with two edge input tables - verified results.
Thanks,
Maja Kabiljo
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Alessandro Presta <al...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/#review19954
-----------------------------------------------------------
This is looking much cleaner now.
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/EdgeInputFormatDescription.java
<https://reviews.apache.org/r/10564/#comment41184>
I would specify in the javadoc that this is only relevant to multiple inputs (I know it's in "multi", but it would make it clearer).
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/EdgeInputFormatDescription.java
<https://reviews.apache.org/r/10564/#comment41187>
"ret" is not a great name for a variable.
How about "edgeInputFormats"?
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/InputFormatDescription.java
<https://reviews.apache.org/r/10564/#comment41185>
Same as for EdgeInputFormatDescription.
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment41188>
I think this multitude of comma-separated options, where order is relevant, is not that robust.
I'd prefer a representation with tuples (format, table, filter) akin to what you did with the multi-input description in giraph-core. It doesn't have to be one JSON object. You could make it so that an arbitrary number of inputs can be specified. Example:
-edgeInput (MyFormat, my_table, my_partition)
-edgeInput (MyOtherFormat, my_other_table, my_other_partition)
...
- Alessandro Presta
On April 30, 2013, 7:05 a.m., Maja Kabiljo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10564/
> -----------------------------------------------------------
>
> (Updated April 30, 2013, 7:05 a.m.)
>
>
> Review request for giraph.
>
>
> Description
> -------
>
> For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
> Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
>
>
> This addresses bug GIRAPH-639.
> https://issues.apache.org/jira/browse/GIRAPH-639
>
>
> Diffs
> -----
>
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 795cd0c
> giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java f5a926f
> giraph-core/src/main/java/org/apache/giraph/io/EdgeInputFormat.java 2aac1f0
> giraph-core/src/main/java/org/apache/giraph/io/GiraphInputFormat.java 13efc93
> giraph-core/src/main/java/org/apache/giraph/io/VertexInputFormat.java c4d7fe2
> giraph-core/src/main/java/org/apache/giraph/io/formats/multi/EdgeInputFormatDescription.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/io/formats/multi/InputFormatDescription.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/io/formats/multi/MultiEdgeInputFormat.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/io/formats/multi/package-info.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeInputFormat.java 9b14727
> giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 2c1a679
> giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 01937ab
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java de1e774
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
> giraph-core/src/main/java/org/apache/giraph/worker/InputSplitsCallable.java a3a9ab7
> giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java 224856e
> giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallableFactory.java 4eff3b8
> giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 8d67c1d
>
> Diff: https://reviews.apache.org/r/10564/diff/
>
>
> Testing
> -------
>
> mvn clean verify
> Run application with two edge input tables - verified results.
>
>
> Thanks,
>
> Maja Kabiljo
>
>
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/
-----------------------------------------------------------
(Updated April 30, 2013, 7:05 a.m.)
Review request for giraph.
Changes
-------
Updated per offline discussion with Alessandro - instead of having infrastructure handle several input formats, we can create special input format which wraps several input formats. Some things from the infrastructure needed to be changed in order to make this work, like moving the read/write InputSplit functionality to input formats. The regular use case is now unchanged, HiveGiraphRunner has direct support for several input formats.
Description
-------
For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
This addresses bug GIRAPH-639.
https://issues.apache.org/jira/browse/GIRAPH-639
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 795cd0c
giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java f5a926f
giraph-core/src/main/java/org/apache/giraph/io/EdgeInputFormat.java 2aac1f0
giraph-core/src/main/java/org/apache/giraph/io/GiraphInputFormat.java 13efc93
giraph-core/src/main/java/org/apache/giraph/io/VertexInputFormat.java c4d7fe2
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/EdgeInputFormatDescription.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/InputFormatDescription.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/MultiEdgeInputFormat.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/formats/multi/package-info.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedEdgeInputFormat.java 9b14727
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 2c1a679
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 01937ab
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java de1e774
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
giraph-core/src/main/java/org/apache/giraph/worker/InputSplitsCallable.java a3a9ab7
giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java 224856e
giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallableFactory.java 4eff3b8
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 8d67c1d
Diff: https://reviews.apache.org/r/10564/diff/
Testing
-------
mvn clean verify
Run application with two edge input tables - verified results.
Thanks,
Maja Kabiljo
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/
-----------------------------------------------------------
(Updated April 25, 2013, 12:27 a.m.)
Review request for giraph.
Changes
-------
I revisited my approach - instead of each input format having to know its index and having configuration parameters set based on that index, I am putting all input format specific settings inside of one Configuration parameter, in JSON format, like
giraph.edgeInputFormatDescriptions=[["EIF1",{"p":"v1"}],["EIF2",{"p":"v2","q":"v"}]]
This would mean that we are using two edge input formats - EIF1 and EIF2. What happens in the infrastructure is that when EIF1 is created we also create a copy of Configuration in which we set p=v1,
and for EIF2 p=v2 and q=v. This means that input formats can just call conf.get("p") and they will be getting the parameters they need - they don't need to know anything about existence of other input formats or their index.
Again I did only edge input part, and added support only in HiveGiraphRunner, hoping for constructive feedback before proceeding :-)
Description
-------
For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
This addresses bug GIRAPH-639.
https://issues.apache.org/jira/browse/GIRAPH-639
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 795cd0c
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java f5a926f
giraph-core/src/main/java/org/apache/giraph/io/internal/EdgeInputFormatDescription.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/io/internal/InputFormatDescription.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 2c1a679
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 01937ab
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java de1e774
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 8d67c1d
Diff: https://reviews.apache.org/r/10564/diff/
Testing
-------
mvn clean verify
Run application with two edge input tables - verified results.
Thanks,
Maja Kabiljo
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Maja Kabiljo <ma...@fb.com>.
> On April 17, 2013, 5:11 p.m., Alessandro Presta wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java, line 664
> > <https://reviews.apache.org/r/10564/diff/1-2/?file=281712#file281712line664>
> >
> > Isn't the VertexInputFormat also always the same? Looks like we should only be logging the HiveToVertex class and input description.
> > Also, for uniformity with edge input formats, you may want to prepend the line "Vertex input format:".
As I said initially, I did everything just for edge input to get some feedback, once we agree on design I'll go ahead and make changes for vertex input too.
> On April 17, 2013, 5:11 p.m., Alessandro Presta wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java, line 47
> > <https://reviews.apache.org/r/10564/diff/1/?file=281713#file281713line47>
> >
> > Shouldn't this be "classes" too?
This is going to be used as:
giraph.hive.to.edge.class.0
giraph.hive.to.edge.class.1
and so on, so I think this name is fine.
- Maja
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/#review19322
-----------------------------------------------------------
On April 17, 2013, 4:19 a.m., Maja Kabiljo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10564/
> -----------------------------------------------------------
>
> (Updated April 17, 2013, 4:19 a.m.)
>
>
> Review request for giraph.
>
>
> Description
> -------
>
> For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
> Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
>
>
> This addresses bug GIRAPH-639.
> https://issues.apache.org/jira/browse/GIRAPH-639
>
>
> Diffs
> -----
>
> giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
> giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
> giraph-core/src/main/java/org/apache/giraph/io/InputFormatWithIndex.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
> giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
> giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
> giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
> giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
> giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
> giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 09476cd
>
> Diff: https://reviews.apache.org/r/10564/diff/
>
>
> Testing
> -------
>
> mvn clean verify
> Run application with two edge input tables - verified results.
>
>
> Thanks,
>
> Maja Kabiljo
>
>
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Alessandro Presta <al...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/#review19322
-----------------------------------------------------------
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment39999>
Isn't the VertexInputFormat also always the same? Looks like we should only be logging the HiveToVertex class and input description.
Also, for uniformity with edge input formats, you may want to prepend the line "Vertex input format:".
giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java
<https://reviews.apache.org/r/10564/#comment40001>
Shouldn't this be "classes" too?
- Alessandro Presta
On April 17, 2013, 4:19 a.m., Maja Kabiljo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10564/
> -----------------------------------------------------------
>
> (Updated April 17, 2013, 4:19 a.m.)
>
>
> Review request for giraph.
>
>
> Description
> -------
>
> For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
> Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
>
>
> This addresses bug GIRAPH-639.
> https://issues.apache.org/jira/browse/GIRAPH-639
>
>
> Diffs
> -----
>
> giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
> giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
> giraph-core/src/main/java/org/apache/giraph/io/InputFormatWithIndex.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
> giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
> giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
> giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
> giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
> giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
> giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 09476cd
>
> Diff: https://reviews.apache.org/r/10564/diff/
>
>
> Testing
> -------
>
> mvn clean verify
> Run application with two edge input tables - verified results.
>
>
> Thanks,
>
> Maja Kabiljo
>
>
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/
-----------------------------------------------------------
(Updated April 17, 2013, 4:19 a.m.)
Review request for giraph.
Changes
-------
Alessandro's comments
Description
-------
For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
This addresses bug GIRAPH-639.
https://issues.apache.org/jira/browse/GIRAPH-639
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
giraph-core/src/main/java/org/apache/giraph/io/InputFormatWithIndex.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 09476cd
Diff: https://reviews.apache.org/r/10564/diff/
Testing
-------
mvn clean verify
Run application with two edge input tables - verified results.
Thanks,
Maja Kabiljo
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Maja Kabiljo <ma...@fb.com>.
> On April 17, 2013, 12:37 a.m., Alessandro Presta wrote:
> > Can you add a test case with multiple edge input formats?
We have no tests in giraph-hive, and that's the only place where I added a way to set multiple edge inputs. I am not sure it makes sense to add method to InternalVertexRunner which accepts several String[], one for each edge input (and then for vertex input to).
> On April 17, 2013, 12:37 a.m., Alessandro Presta wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java, line 687
> > <https://reviews.apache.org/r/10564/diff/1/?file=281712#file281712line687>
> >
> > This is going to print "-hiveToEdgeClass=XXX" for multiple classes. I think this is confusing, seems like the same option is overridden multiple times. Probably better to rename the option to "-hiveToEdgeClasses" and print the comma-separated list.
I changed the logging a bit, now it will print this for each edge input:
Edge input format:
hiveToEdgeClass=org.apache.giraph.hive.SomeInputFormat
edgeInputTable=table_name
edgeInputFilter="ds='2013-01-01'"
The whole HiveGiraphRunner is a bit messy, I think adding something like BenchmarkOption stuff I added some time ago could make it much cleaner, but I'll leave that for a separate issue.
> On April 17, 2013, 12:37 a.m., Alessandro Presta wrote:
> > giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java, line 86
> > <https://reviews.apache.org/r/10564/diff/1/?file=281713#file281713line86>
> >
> > Is there a way you can map EdgeInputFormat classes to HiveToEdge classes without going through the index?
> > The index seems a bit low-level, and also forces Giraph to maintain the ordering.
Can you please give some details about the design which you have in mind?
- Maja
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/#review19285
-----------------------------------------------------------
On April 17, 2013, 4:19 a.m., Maja Kabiljo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10564/
> -----------------------------------------------------------
>
> (Updated April 17, 2013, 4:19 a.m.)
>
>
> Review request for giraph.
>
>
> Description
> -------
>
> For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
> Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
>
>
> This addresses bug GIRAPH-639.
> https://issues.apache.org/jira/browse/GIRAPH-639
>
>
> Diffs
> -----
>
> giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
> giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
> giraph-core/src/main/java/org/apache/giraph/io/InputFormatWithIndex.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
> giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
> giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
> giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
> giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
> giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
> giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 09476cd
>
> Diff: https://reviews.apache.org/r/10564/diff/
>
>
> Testing
> -------
>
> mvn clean verify
> Run application with two edge input tables - verified results.
>
>
> Thanks,
>
> Maja Kabiljo
>
>
Re: Review Request: GIRAPH-639: Add support for multiple Vertex/Edge inputs
Posted by Alessandro Presta <al...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10564/#review19285
-----------------------------------------------------------
Can you add a test case with multiple edge input formats?
giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java
<https://reviews.apache.org/r/10564/#comment39951>
class->classes
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment39958>
If this can hold multiple classes, it should be called -hiveToEdgeClasses.
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment39956>
If this can hold multiple tables, it should be called -edgeInputTables.
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment39957>
Same here.
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment39959>
This is going to print "-hiveToEdgeClass=XXX" for multiple classes. I think this is confusing, seems like the same option is overridden multiple times. Probably better to rename the option to "-hiveToEdgeClasses" and print the comma-separated list.
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
<https://reviews.apache.org/r/10564/#comment39960>
Why do we print both the HiveToEdge class and the EdgeInputFormat class? Isn't the latter always HiveEdgeInputFormat?
giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java
<https://reviews.apache.org/r/10564/#comment39961>
Is there a way you can map EdgeInputFormat classes to HiveToEdge classes without going through the index?
The index seems a bit low-level, and also forces Giraph to maintain the ordering.
- Alessandro Presta
On April 16, 2013, 10:42 p.m., Maja Kabiljo wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10564/
> -----------------------------------------------------------
>
> (Updated April 16, 2013, 10:42 p.m.)
>
>
> Review request for giraph.
>
>
> Description
> -------
>
> For now, I did this only for Edge input, once I get some feedback I'll do the exactly same thing for vertex input.
> Also, I added direct support only to HiveGiraphRunner, we can extend it later to others as well.
>
>
> This addresses bug GIRAPH-639.
> https://issues.apache.org/jira/browse/GIRAPH-639
>
>
> Diffs
> -----
>
> giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 178c96f
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 4a0e8f7
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 7f9e38e
> giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6a5949e
> giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 8dfe546
> giraph-core/src/main/java/org/apache/giraph/io/InputFormatWithIndex.java PRE-CREATION
> giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java cc6b126
> giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java d01dbb4
> giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 029cb5d
> giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 037cdfc
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java afb636b
> giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallableFactory.java 4a1705b
> giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6e40b7f
> giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java f8363b1
> giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java 892d443
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java c482cf0
> giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java 09476cd
>
> Diff: https://reviews.apache.org/r/10564/diff/
>
>
> Testing
> -------
>
> mvn clean verify
> Run application with two edge input tables - verified results.
>
>
> Thanks,
>
> Maja Kabiljo
>
>