You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@giraph.apache.org by Pavan Kumar Athivarapu <pa...@outlook.com> on 2014/06/04 17:11:48 UTC

Review Request 22234: support for partitioned input in giraph

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/
-----------------------------------------------------------

Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.


Repository: giraph-git


Description
-------

There are many changes here:

A new input format - MappingInputFormat - and related dependencies has been defined
New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
New mapping section defined to define & declare MappingStore format & some sample implementations
The code can take 2 paths based on what the user wants

Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]

Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]


Diffs
-----

  giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
  giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
  giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
  giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
  giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
  giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
  giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
  giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
  giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
  giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
  giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
  giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
  giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
  giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 

Diff: https://reviews.apache.org/r/22234/diff/


Testing
-------

ran pagerank jobs multiple times
mvn clean verify


Thanks,

Pavan Kumar Athivarapu


Re: Review Request 22234: support for partitioned input in giraph

Posted by Igor Kabiljo <ik...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44804
-----------------------------------------------------------

Ship it!


Cool, all looks good.


- Igor Kabiljo


On June 5, 2014, 6:47 a.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 5, 2014, 6:47 a.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/LongMappingStorePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>


Re: Review Request 22234: support for partitioned input in giraph

Posted by Pavan Kumar Athivarapu <pa...@outlook.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/
-----------------------------------------------------------

(Updated June 5, 2014, 6:47 a.m.)


Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.


Changes
-------

address comments.


Repository: giraph-git


Description
-------

There are many changes here:

A new input format - MappingInputFormat - and related dependencies has been defined
New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
New mapping section defined to define & declare MappingStore format & some sample implementations
The code can take 2 paths based on what the user wants

Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]

Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]


Diffs (updated)
-----

  giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
  giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
  giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
  giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
  giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
  giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
  giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
  giraph-core/src/main/java/org/apache/giraph/partition/LongMappingStorePartitionerFactory.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
  giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
  giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
  giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
  giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
  giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
  giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
  giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
  giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 

Diff: https://reviews.apache.org/r/22234/diff/


Testing
-------

ran pagerank jobs multiple times
mvn clean verify


Thanks,

Pavan Kumar Athivarapu


Re: Review Request 22234: support for partitioned input in giraph

Posted by Igor Kabiljo <ik...@fb.com>.

> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java, line 47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603360#file603360line47>
> >
> >     it is not clear that unset can ever happen - all vertex IDs and edges are translated (with target being -1 here)
> >     
> >     I assume you need to add 1, so that if ID is not in the map is treated the same as if embedTargetInfo was never called (i.e. user creating vertices with specific IDs later)
> 
> Pavan Kumar Athivarapu wrote:
>     sorry, I mean not-set

yeah I understood that - but it is not clear that things can ever be not-set - all input is going to be transformed (as oppose to just not being in the mapping, which will return -1, which can be encoded and read as -1 without having +1)


> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java, lines 39-47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603359#file603359line39>
> >
> >     as is, you don't need to override these functions?
> >     
> >     Also, it might be better to add:
> >     byte getEmbededInfo(long id),
> >     and have implementation of getPartition.
> >     
> >     (you could also implement above two methods, through two abstract primitive methods:
> >     long embedTargetInfo(long id);
> >     long removeTargetInfo(long id);
> 
> Pavan Kumar Athivarapu wrote:
>     If u see subclasses of AbstractLongByteOps -
>     DefaultEmbeddedLongByteOps - this implements both embedTargetInfo & removeTargetInfo, as it needs them
>     but DefaultLongByteOps - need not, because it computes partition by looking up the MappingStore everytime it is called
>     
>     sure byte getEmbededInfo(long id) was in the initial implementation I had, just merged it into getPartition, since that is the only place which uses it
>     
>     so if u want it to return long, that becomes very restrictive, what about generic vertexIds, which are not longs, the interface MappingStoreOps needs to account for all those as well right?
>

Well, I would then put those unsupported implementations inside of DefaultLongByteOps, instead of in AbstractLongByteOps.

Also, seems like it might be better to add MappingStoreOps function boolean hasEmbedding, instead of having flags giraph.embedInfoVertexIds and giraph.edgeTranslationClass. (better not to have redudnant flags)
I.e. if hasEmbedding of active MappingStoreOps returns true, embedding should be used throughout. 


- Igor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44768
-----------------------------------------------------------


On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 4, 2014, 3:11 p.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>


Re: Review Request 22234: support for partitioned input in giraph

Posted by Pavan Kumar Athivarapu <pa...@outlook.com>.

> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java, line 47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603360#file603360line47>
> >
> >     it is not clear that unset can ever happen - all vertex IDs and edges are translated (with target being -1 here)
> >     
> >     I assume you need to add 1, so that if ID is not in the map is treated the same as if embedTargetInfo was never called (i.e. user creating vertices with specific IDs later)
> 
> Pavan Kumar Athivarapu wrote:
>     sorry, I mean not-set
> 
> Igor Kabiljo wrote:
>     yeah I understood that - but it is not clear that things can ever be not-set - all input is going to be transformed (as oppose to just not being in the mapping, which will return -1, which can be encoded and read as -1 without having +1)

but what about stuff created later - that is not in the mapping, so it would have prefix bits 0, now 0 for them does not mean all such vertices belong in worker 0
and such created vertices are not translated, translation is done only in input phase


> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java, lines 39-47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603359#file603359line39>
> >
> >     as is, you don't need to override these functions?
> >     
> >     Also, it might be better to add:
> >     byte getEmbededInfo(long id),
> >     and have implementation of getPartition.
> >     
> >     (you could also implement above two methods, through two abstract primitive methods:
> >     long embedTargetInfo(long id);
> >     long removeTargetInfo(long id);
> 
> Pavan Kumar Athivarapu wrote:
>     If u see subclasses of AbstractLongByteOps -
>     DefaultEmbeddedLongByteOps - this implements both embedTargetInfo & removeTargetInfo, as it needs them
>     but DefaultLongByteOps - need not, because it computes partition by looking up the MappingStore everytime it is called
>     
>     sure byte getEmbededInfo(long id) was in the initial implementation I had, just merged it into getPartition, since that is the only place which uses it
>     
>     so if u want it to return long, that becomes very restrictive, what about generic vertexIds, which are not longs, the interface MappingStoreOps needs to account for all those as well right?
>
> 
> Igor Kabiljo wrote:
>     Well, I would then put those unsupported implementations inside of DefaultLongByteOps, instead of in AbstractLongByteOps.
>     
>     Also, seems like it might be better to add MappingStoreOps function boolean hasEmbedding, instead of having flags giraph.embedInfoVertexIds and giraph.edgeTranslationClass. (better not to have redudnant flags)
>     I.e. if hasEmbedding of active MappingStoreOps returns true, embedding should be used throughout. 
>

makes total sense, this mappingstoreops was a last minute re-org, so did not modify the original structure reg, embedInfo flag.
I made all the changes, any other comments? I'll prepare a diff & put it up addressing all comments at once.


- Pavan Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44768
-----------------------------------------------------------


On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 4, 2014, 3:11 p.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>


Re: Review Request 22234: support for partitioned input in giraph

Posted by Igor Kabiljo <ik...@fb.com>.

> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java, line 47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603360#file603360line47>
> >
> >     it is not clear that unset can ever happen - all vertex IDs and edges are translated (with target being -1 here)
> >     
> >     I assume you need to add 1, so that if ID is not in the map is treated the same as if embedTargetInfo was never called (i.e. user creating vertices with specific IDs later)
> 
> Pavan Kumar Athivarapu wrote:
>     sorry, I mean not-set
> 
> Igor Kabiljo wrote:
>     yeah I understood that - but it is not clear that things can ever be not-set - all input is going to be transformed (as oppose to just not being in the mapping, which will return -1, which can be encoded and read as -1 without having +1)
> 
> Pavan Kumar Athivarapu wrote:
>     but what about stuff created later - that is not in the mapping, so it would have prefix bits 0, now 0 for them does not mean all such vertices belong in worker 0
>     and such created vertices are not translated, translation is done only in input phase

I meant just add more explanation in the comments, leave the code as is. 


> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java, lines 39-47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603359#file603359line39>
> >
> >     as is, you don't need to override these functions?
> >     
> >     Also, it might be better to add:
> >     byte getEmbededInfo(long id),
> >     and have implementation of getPartition.
> >     
> >     (you could also implement above two methods, through two abstract primitive methods:
> >     long embedTargetInfo(long id);
> >     long removeTargetInfo(long id);
> 
> Pavan Kumar Athivarapu wrote:
>     If u see subclasses of AbstractLongByteOps -
>     DefaultEmbeddedLongByteOps - this implements both embedTargetInfo & removeTargetInfo, as it needs them
>     but DefaultLongByteOps - need not, because it computes partition by looking up the MappingStore everytime it is called
>     
>     sure byte getEmbededInfo(long id) was in the initial implementation I had, just merged it into getPartition, since that is the only place which uses it
>     
>     so if u want it to return long, that becomes very restrictive, what about generic vertexIds, which are not longs, the interface MappingStoreOps needs to account for all those as well right?
>
> 
> Igor Kabiljo wrote:
>     Well, I would then put those unsupported implementations inside of DefaultLongByteOps, instead of in AbstractLongByteOps.
>     
>     Also, seems like it might be better to add MappingStoreOps function boolean hasEmbedding, instead of having flags giraph.embedInfoVertexIds and giraph.edgeTranslationClass. (better not to have redudnant flags)
>     I.e. if hasEmbedding of active MappingStoreOps returns true, embedding should be used throughout. 
>
> 
> Pavan Kumar Athivarapu wrote:
>     makes total sense, this mappingstoreops was a last minute re-org, so did not modify the original structure reg, embedInfo flag.
>     I made all the changes, any other comments? I'll prepare a diff & put it up addressing all comments at once.

Cool - everything else looks good.


- Igor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44768
-----------------------------------------------------------


On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 4, 2014, 3:11 p.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>


Re: Review Request 22234: support for partitioned input in giraph

Posted by Pavan Kumar Athivarapu <pa...@outlook.com>.

> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java, line 47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603360#file603360line47>
> >
> >     it is not clear that unset can ever happen - all vertex IDs and edges are translated (with target being -1 here)
> >     
> >     I assume you need to add 1, so that if ID is not in the map is treated the same as if embedTargetInfo was never called (i.e. user creating vertices with specific IDs later)

sorry, I mean not-set


> On June 4, 2014, 10:53 p.m., Igor Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java, lines 39-47
> > <https://reviews.apache.org/r/22234/diff/1/?file=603359#file603359line39>
> >
> >     as is, you don't need to override these functions?
> >     
> >     Also, it might be better to add:
> >     byte getEmbededInfo(long id),
> >     and have implementation of getPartition.
> >     
> >     (you could also implement above two methods, through two abstract primitive methods:
> >     long embedTargetInfo(long id);
> >     long removeTargetInfo(long id);

If u see subclasses of AbstractLongByteOps -
DefaultEmbeddedLongByteOps - this implements both embedTargetInfo & removeTargetInfo, as it needs them
but DefaultLongByteOps - need not, because it computes partition by looking up the MappingStore everytime it is called

sure byte getEmbededInfo(long id) was in the initial implementation I had, just merged it into getPartition, since that is the only place which uses it

so if u want it to return long, that becomes very restrictive, what about generic vertexIds, which are not longs, the interface MappingStoreOps needs to account for all those as well right?


- Pavan Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44768
-----------------------------------------------------------


On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 4, 2014, 3:11 p.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>


Re: Review Request 22234: support for partitioned input in giraph

Posted by Igor Kabiljo <ik...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44768
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java
<https://reviews.apache.org/r/22234/#comment79311>

    as is, you don't need to override these functions?
    
    Also, it might be better to add:
    byte getEmbededInfo(long id),
    and have implementation of getPartition.
    
    (you could also implement above two methods, through two abstract primitive methods:
    long embedTargetInfo(long id);
    long removeTargetInfo(long id);



giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java
<https://reviews.apache.org/r/22234/#comment79312>

    it is not clear that unset can ever happen - all vertex IDs and edges are translated (with target being -1 here)
    
    I assume you need to add 1, so that if ID is not in the map is treated the same as if embedTargetInfo was never called (i.e. user creating vertices with specific IDs later)


- Igor Kabiljo


On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 4, 2014, 3:11 p.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>


Re: Review Request 22234: support for partitioned input in giraph

Posted by Igor Kabiljo <ik...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22234/#review44791
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java
<https://reviews.apache.org/r/22234/#comment79329>

    Lets rename it to something meaningful - MappingStoreLongPartitionerFactory
    
    There is no reason to have ByteWritable anywhere in it, just put '?'. 


- Igor Kabiljo


On June 4, 2014, 3:11 p.m., Pavan Kumar Athivarapu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22234/
> -----------------------------------------------------------
> 
> (Updated June 4, 2014, 3:11 p.m.)
> 
> 
> Review request for giraph, Avery Ching, Sergey Edunov, Igor Kabiljo, and Maja Kabiljo.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> There are many changes here:
> 
> A new input format - MappingInputFormat - and related dependencies has been defined
> New hive input format classes to read Mapping table + some examples which can be used for running hellopagerank with modifications in test plan have been defined
> Changes to main giraph classes to read the mapping & use it for getting partition info - in worker, master & partition sections
> New mapping section defined to define & declare MappingStore format & some sample implementations
> The code can take 2 paths based on what the user wants
> 
> Embed info into vertexId by implementing proper contracts (TranslateEdge.interface + 2 methods in MappingStore)
> then use an EmbeddedGraphPartitioner to directly read worker info off of vertex ids [translate once & user freely for the rest of app]
> 
> Always read from the MappingStore (this has more overhead probably because of cache misses in processor) [never translate but pay cost of map lookup each time partition / worker info is needed]
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java ec0ddbb 
>   giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java bda967d 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java 3337621 
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 6b36418 
>   giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 95e029d 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/MappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingInputFormat.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/internal/WrappedMappingReader.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/io/iterables/MappingReaderWrapper.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/AbstractLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultEmbeddedLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/DefaultLongByteOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/LongByteMappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingEntry.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStore.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/MappingStoreOps.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/LongByteTranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/TranslateEdge.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/mapping/translate/package-info.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 90dc9f3 
>   giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 15dbe07 
>   giraph-core/src/main/java/org/apache/giraph/partition/DefaultLongBytePartitionerFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/GraphPartitionerFactory.java 4200d79 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashPartitionerFactory.java 7cc5651 
>   giraph-core/src/main/java/org/apache/giraph/partition/HashRangePartitionerFactory.java 1eeece7 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java 8ab692f 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java 2989598 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimplePartitionerFactory.java 15b0756 
>   giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 600d7a3 
>   giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java aff7084 
>   giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java 828eac4 
>   giraph-core/src/main/java/org/apache/giraph/worker/FullInputSplitCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/LocalData.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallable.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/MappingInputSplitsCallableFactory.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java e3e04d6 
>   giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 4e19cd2 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 603910b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java c7ad63b 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java 2388673 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/AbstractHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingInputFormat.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveMappingReader.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/HiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/SimpleHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/LongInt2ByteHiveToMapping.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/examples/package-info.java PRE-CREATION 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/mapping/package-info.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22234/diff/
> 
> 
> Testing
> -------
> 
> ran pagerank jobs multiple times
> mvn clean verify
> 
> 
> Thanks,
> 
> Pavan Kumar Athivarapu
> 
>