Posted to commits@tinkerpop.apache.org by ok...@apache.org on 2015/04/06 15:33:01 UTC

incubator-tinkerpop git commit: tested Hadoop-Gremlin plugin and updated docs accordingly. A few twiddles here and there on the OLAP classes.

Repository: incubator-tinkerpop
Updated Branches:
  refs/heads/master fd6013178 -> 41be51fb0


tested Hadoop-Gremlin plugin and updated docs accordingly. A few twiddles here and there on the OLAP classes.


Project: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/commit/41be51fb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/tree/41be51fb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/diff/41be51fb

Branch: refs/heads/master
Commit: 41be51fb08bbe0d9c327db1d462456175b0c7598
Parents: fd60131
Author: Marko A. Rodriguez <ok...@gmail.com>
Authored: Mon Apr 6 07:32:43 2015 -0600
Committer: Marko A. Rodriguez <ok...@gmail.com>
Committed: Mon Apr 6 07:32:58 2015 -0600

----------------------------------------------------------------------
 docs/src/implementations.asciidoc                            | 8 ++++----
 .../tinkerpop/gremlin/process/computer/VertexProgram.java    | 2 --
 .../gremlin/hadoop/process/computer/spark/SparkExecutor.java | 3 +--
 .../hadoop/process/computer/spark/SparkMessenger.java        | 2 +-
 4 files changed, 6 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/41be51fb/docs/src/implementations.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/implementations.asciidoc b/docs/src/implementations.asciidoc
index 8341247..56ec85e 100644
--- a/docs/src/implementations.asciidoc
+++ b/docs/src/implementations.asciidoc
@@ -452,9 +452,9 @@ Installing Hadoop-Gremlin
 To the `.bash_profile` file, add the following environmental variable (of course, be sure the directories are respective of the local machine locations). The `HADOOP_GREMLIN_LIBS` is the location of all the Hadoop-Gremlin jars. It is possible to place developer jars into this directory for loading into the Hadoop job's classpath. Or, better yet, note that `HADOOP_GREMLIN_LIBS` can be a colon-separated (`:`) list of locations and thus will load all jars into the cluster at all provided locations.
 
 [source,shell]
-export HADOOP_GREMLIN_LIBS=/usr/local/gremlin-console/ext/hadoop-gremlin:/usr/local/gremlin-console/lib
+export HADOOP_GREMLIN_LIBS=/usr/local/gremlin-console/ext/hadoop-gremlin/lib
 
-If using <<gremlin-console,Gremlin Console>>, it is important to install the Hadoop-Gremlin plugin. Note that Hadoop-Gremlin requires a Gremlin Console restart after installing,
+If using <<gremlin-console,Gremlin Console>>, it is important to install the Hadoop-Gremlin plugin. Note that Hadoop-Gremlin requires a Gremlin Console restart after installing.
 
 [source,text]
 ----
@@ -525,7 +525,7 @@ A review of the Hadoop-Gremlin specific properties are provided in the table bel
 |gremlin.hadoop.graphInputFormat |The format that the graph input file(s) are represented in.
 |gremlin.hadoop.outputLocation |The location to write the computed HadoopGraph to.
 |gremlin.hadoop.graphOutputFormat |The format that the output file(s) should be represented in.
-|gremlin.hadoop.memoryOutputFormat |The format of any resultant GraphComputer Memory.
+|gremlin.hadoop.memoryOutputFormat |The format of any resultant GraphComputer Memory (best to always have this be `SequenceFileOutputFormat`).
 |gremlin.hadoop.jarsInDistributedCache |Whether to upload the Hadoop-Gremlin jars to Hadoop's distributed cache (necessary if jars are not on machines' classpaths).
 |=========================================================
 
@@ -572,7 +572,7 @@ A `Graph` in TinkerPop3 can support any number of `GraphComputer` implementation
 * <<giraphgraphcomputer,`GiraphGraphComputer`>>: Leverages Giraph to execute TinkerPop3 OLAP computations.
 ** The graph should fit within the total RAM of the Hadoop cluster (graph size restriction), though "out-of-core" processing is possible. Messages passing is coordinated via ZooKeeper for the in-memory graph (speedy traversals).
 * <<sparkgraphcomputer,`SparkGraphComputer`>>: Leverages Spark to execute TinkerPop3 OLAP computations.
-** The graph may fit within the total RAM of the cluster (supports larger graphs). Message passing is coordinated via Spark map/reduce operations on in-memory and disk-cached data (middle speed traversals).
+** The graph may fit within the total RAM of the cluster (supports larger graphs). Message passing is coordinated via Spark map/reduce/join operations on in-memory and disk-cached data (average speed traversals).
 * <<mapreducegraphcomputer,`MapReduceGraphComputer`>>: Leverages Hadoop's MapReduce to execute TinkerPop3 OLAP computations. (*coming soon*)
 ** The graph must fit within the total disk space of the Hadoop cluster (supports massive graphs). Message passing is coordinated via MapReduce jobs over the on-disk graph (slow traversals).
 

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/41be51fb/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/VertexProgram.java
----------------------------------------------------------------------
diff --git a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/VertexProgram.java b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/VertexProgram.java
index 06f5f4d..533ea60 100644
--- a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/VertexProgram.java
+++ b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/VertexProgram.java
@@ -45,8 +45,6 @@ public interface VertexProgram<M> extends Cloneable {
 
     public static final String VERTEX_PROGRAM = "gremlin.vertexProgram";
 
-    public enum Status {SUCCESS, FAILURE}
-
     /**
      * When it is necessary to store the state of the VertexProgram, this method is called.
      * This is typically required when the VertexProgram needs to be serialized to another machine.

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/41be51fb/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkExecutor.java
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkExecutor.java b/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkExecutor.java
index c7a18dd..00ae3ab 100644
--- a/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkExecutor.java
+++ b/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkExecutor.java
@@ -127,10 +127,9 @@ public final class SparkExecutor {
                         (ViewIncomingPayload<M>) payload :                    // this happens if there is a vertex with incoming messages
                         new ViewIncomingPayload<>((ViewPayload) payload));    // this happens if there is a vertex with no incoming messages
 
-        newViewIncomingRDD // TODO? .cache() // cache so there is no history recomputation which can effect the computer memory
+        newViewIncomingRDD
                 .foreachPartition(partitionIterator -> {
                 }); // need to complete a task so its BSP and the memory for this iteration is updated
-        // TODO? if(null != viewIncomingRDD) viewIncomingRDD.unpersist(); // unpersist the previous view and messages at this point because they are no longer needed
         return newViewIncomingRDD;
     }
 

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/41be51fb/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkMessenger.java
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkMessenger.java b/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkMessenger.java
index 49847de..5d9223c 100644
--- a/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkMessenger.java
+++ b/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/process/computer/spark/SparkMessenger.java
@@ -36,7 +36,7 @@ import java.util.List;
 /**
  * @author Marko A. Rodriguez (http://markorodriguez.com)
  */
-public class SparkMessenger<M> implements Messenger<M> {
+public final class SparkMessenger<M> implements Messenger<M> {
 
     private Vertex vertex;
     private Iterable<M> incomingMessages;