Posted to commits@tinkerpop.apache.org by ok...@apache.org on 2017/11/01 18:07:43 UTC
[03/14] tinkerpop git commit: Moved gremlin.spark.persistContext back to recipe
Moved gremlin.spark.persistContext back to recipe
Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/df73338f
Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/df73338f
Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/df73338f
Branch: refs/heads/master
Commit: df73338fe29a8ba1faa1c873a6ed8ff597607b60
Parents: 16f3ee7
Author: HadoopMarc <vt...@xs4all.nl>
Authored: Tue Oct 3 21:20:16 2017 +0200
Committer: HadoopMarc <vt...@xs4all.nl>
Committed: Thu Oct 12 21:55:28 2017 +0200
----------------------------------------------------------------------
docs/src/recipes/olap-spark-yarn.asciidoc | 6 ++++++
hadoop-gremlin/conf/hadoop-gryo.properties | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/df73338f/docs/src/recipes/olap-spark-yarn.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/olap-spark-yarn.asciidoc b/docs/src/recipes/olap-spark-yarn.asciidoc
index 01aedcb..1bcc443 100644
--- a/docs/src/recipes/olap-spark-yarn.asciidoc
+++ b/docs/src/recipes/olap-spark-yarn.asciidoc
@@ -104,6 +104,7 @@ conf.setProperty('spark.yarn.appMasterEnv.CLASSPATH', "./__spark_libs__/*:$hadoo
conf.setProperty('spark.executor.extraClassPath', "./__spark_libs__/*:$hadoopConfDir")
conf.setProperty('spark.driver.extraLibraryPath', "$hadoop/lib/native:$hadoop/lib/native/Linux-amd64-64")
conf.setProperty('spark.executor.extraLibraryPath', "$hadoop/lib/native:$hadoop/lib/native/Linux-amd64-64")
+conf.setProperty('gremlin.spark.persistContext', 'true')
graph = GraphFactory.open(conf)
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().group().by(values('name')).by(both().count())
@@ -125,9 +126,14 @@ as the directory named `+__spark_libs__+` in the Yarn containers. The `spark.exe
`spark.yarn.appMasterEnv.CLASSPATH` properties point to the jars inside this directory.
This is why they contain the `+./__spark_libs__/*+` item. Just because a Spark executor got the archive with
jars loaded into its container does not mean it knows how to access them.
+
Also, the `HADOOP_GREMLIN_LIBS` mechanism is not used because it cannot work for Spark on Yarn as implemented (jars
added to the `SparkContext` are not available to the Yarn application master).
+The `gremlin.spark.persistContext` property is explained in the reference documentation of
+http://tinkerpop.apache.org/docs/current/reference/#sparkgraphcomputer[SparkGraphComputer]: it speeds up
+follow-up OLAP queries because the overhead of requesting resources from Yarn is skipped.
+
Additional configuration options
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This recipe does most of the graph configuration in the gremlin console so that environment variables can be used and
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/df73338f/hadoop-gremlin/conf/hadoop-gryo.properties
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/conf/hadoop-gryo.properties b/hadoop-gremlin/conf/hadoop-gryo.properties
index ec56abc..c156a98 100644
--- a/hadoop-gremlin/conf/hadoop-gryo.properties
+++ b/hadoop-gremlin/conf/hadoop-gryo.properties
@@ -29,11 +29,11 @@ gremlin.hadoop.outputLocation=output
####################################
spark.master=local[4]
spark.executor.memory=1g
-gremlin.spark.persistContext=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
# spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer (3.2.x model)
# gremlin.spark.graphStorageLevel=MEMORY_AND_DISK
+# gremlin.spark.persistContext=true
# gremlin.spark.graphWriter=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD
# gremlin.spark.persistStorageLevel=DISK_ONLY
# spark.kryo.registrationRequired=true
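
For context: after this commit, `gremlin.spark.persistContext` is no longer enabled by default in `hadoop-gryo.properties`; it survives there only as a commented-out hint, while the Spark-on-Yarn recipe sets it explicitly via `conf.setProperty(...)`. A user who wants the old default behaviour outside the recipe can simply re-enable the line. A minimal sketch of the relevant section of the file after this change (keys and values taken from the second hunk above; only the last line is un-commented by the user):

```
# hadoop-gremlin/conf/hadoop-gryo.properties (excerpt, post-commit)
spark.master=local[4]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
# re-enabled by hand to keep the SparkContext alive between OLAP jobs:
gremlin.spark.persistContext=true
```

Keeping the context alive avoids re-negotiating Yarn resources for each traversal, at the cost of holding the executors' resources between jobs.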