You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tinkerpop.apache.org by ok...@apache.org on 2016/10/25 14:02:13 UTC

tinkerpop git commit: UnshadedKryoShimService is now the default for SparkGraphComputer. Users no longer have to System.setProperty() as its all handled in the SparkGraphComputer constructor. I would really like to get rid of the 'service model' (cc @dal

Repository: tinkerpop
Updated Branches:
  refs/heads/TINKERPOP-1389 a0ce3178e -> 532ed59c2


UnshadedKryoShimService is now the default for SparkGraphComputer. Users no longer have to System.setProperty() as its all handled in the SparkGraphComputer constructor. I would really like to get rid of the 'service model' (cc @dalaro) as UnshadedKryoShimService only works with Spark and thus, we can't have it leak over to Giraph (and other Hadoop-based graph computeters). Its leak-free right now, but in general, it would be good to not even have this as a priority/service thing. Trying to think how to do this....


Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/532ed59c
Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/532ed59c
Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/532ed59c

Branch: refs/heads/TINKERPOP-1389
Commit: 532ed59c2ed05d48e9a3eeed67105fb4b9ce32fa
Parents: a0ce317
Author: Marko A. Rodriguez <ok...@gmail.com>
Authored: Tue Oct 25 08:02:07 2016 -0600
Committer: Marko A. Rodriguez <ok...@gmail.com>
Committed: Tue Oct 25 08:02:07 2016 -0600

----------------------------------------------------------------------
 CHANGELOG.asciidoc                              |  1 +
 docs/src/upgrade/release-3.3.x.asciidoc         | 21 ++++++++
 .../process/computer/GiraphGraphComputer.java   |  3 ++
 hadoop-gremlin/conf/hadoop-graphson.properties  |  2 +
 .../conf/hadoop-grateful-gryo.properties        | 25 +++++-----
 hadoop-gremlin/conf/hadoop-gryo.properties      |  6 ++-
 hadoop-gremlin/conf/hadoop-script.properties    |  5 +-
 .../tinkerpop/gremlin/hadoop/Constants.java     |  2 +
 .../process/computer/SparkGraphComputer.java    | 11 +++++
 .../unshaded/UnshadedKryoShimService.java       | 22 +++++----
 ...tratorGraphComputerProcessIntegrateTest.java | 33 -------------
 ...alizerGraphComputerProcessIntegrateTest.java | 33 +++++++++++++
 ...SparkHadoopGraphGryoRegistratorProvider.java | 52 --------------------
 .../SparkHadoopGraphGryoSerializerProvider.java | 45 +++++++++++++++++
 .../computer/SparkHadoopGraphProvider.java      | 17 +++++--
 15 files changed, 162 insertions(+), 116 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/CHANGELOG.asciidoc
----------------------------------------------------------------------
diff --git a/CHANGELOG.asciidoc b/CHANGELOG.asciidoc
index 11d747e..b2db5c1 100644
--- a/CHANGELOG.asciidoc
+++ b/CHANGELOG.asciidoc
@@ -27,6 +27,7 @@ TinkerPop 3.3.0 (Release Date: NOT OFFICIALLY RELEASED YET)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 * Bumped to support Spark 2.0.0.
+* Added `UnshadedKryoShimService` as the new default serializer model for `SparkGraphComputer`.
 * Replaced term `REST` with `HTTP` to remove any confusion as to the design of the API.
 * Moved `gremlin-benchmark` under `gremlin-tools` module.
 * Added `gremlin-tools` and its submodule `gremlin-coverage`.

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/docs/src/upgrade/release-3.3.x.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/upgrade/release-3.3.x.asciidoc b/docs/src/upgrade/release-3.3.x.asciidoc
index a026771..d1a19d9 100644
--- a/docs/src/upgrade/release-3.3.x.asciidoc
+++ b/docs/src/upgrade/release-3.3.x.asciidoc
@@ -31,3 +31,24 @@ Please see the link:https://github.com/apache/tinkerpop/blob/3.3.3/CHANGELOG.asc
 
 Upgrading for Users
 ~~~~~~~~~~~~~~~~~~~
+
+SparkGraphComputer GryoRegistrator
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Historically, `SparkGraphComputer` has  used `GryoSerializer` to handle the serialization of objects in Spark. The reason
+this exists is because TinkerPop uses a shaded version of Kryo and thus, couldn't use the standard `KryoSerializer`-model
+provided by Spark. However, a "shim model" was created which allows for the shaded and unshaded versions of Kryo to
+interact with one another. To this end, `KryoSerializer` can now be used with a `GryoRegistrator`. The properties file
+for a `SparkGraphComputer` now looks as follows:
+
+```
+spark.serializer=org.apache.spark.serializer.KryoSerializer
+spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
+```
+
+If the old `GryoSerializer` model is desired, then the properties file should simply look as before:
+
+```
+spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
+```
+

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/giraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/giraph/process/computer/GiraphGraphComputer.java
----------------------------------------------------------------------
diff --git a/giraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/giraph/process/computer/GiraphGraphComputer.java b/giraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/giraph/process/computer/GiraphGraphComputer.java
index b06b40a..6ffd5ea 100644
--- a/giraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/giraph/process/computer/GiraphGraphComputer.java
+++ b/giraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/giraph/process/computer/GiraphGraphComputer.java
@@ -44,6 +44,7 @@ import org.apache.tinkerpop.gremlin.hadoop.process.computer.util.MapReduceHelper
 import org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.FileSystemStorage;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.GraphFilterAware;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.InputOutputHelper;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.ObjectWritable;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.ObjectWritableIterator;
@@ -57,6 +58,7 @@ import org.apache.tinkerpop.gremlin.process.computer.VertexProgram;
 import org.apache.tinkerpop.gremlin.process.computer.util.DefaultComputerResult;
 import org.apache.tinkerpop.gremlin.process.computer.util.MapMemory;
 import org.apache.tinkerpop.gremlin.structure.io.Storage;
+import org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimServiceLoader;
 import org.apache.tinkerpop.gremlin.util.Gremlin;
 import org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils;
 
@@ -82,6 +84,7 @@ public final class GiraphGraphComputer extends AbstractHadoopGraphComputer imple
 
     public GiraphGraphComputer(final HadoopGraph hadoopGraph) {
         super(hadoopGraph);
+        System.setProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE, HadoopPoolShimService.class.getCanonicalName()); // HadoopPools only with Giraph
         final Configuration configuration = hadoopGraph.configuration();
         configuration.getKeys().forEachRemaining(key -> this.giraphConfiguration.set(key, configuration.getProperty(key).toString()));
         this.giraphConfiguration.setMasterComputeClass(GiraphMemory.class);

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/hadoop-gremlin/conf/hadoop-graphson.properties
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/conf/hadoop-graphson.properties b/hadoop-gremlin/conf/hadoop-graphson.properties
index 10025de..c37cf28 100644
--- a/hadoop-gremlin/conf/hadoop-graphson.properties
+++ b/hadoop-gremlin/conf/hadoop-graphson.properties
@@ -30,6 +30,8 @@ gremlin.hadoop.jarsInDistributedCache=true
 # SparkGraphComputer Configuration #
 ####################################
 spark.master=local[4]
+spark.serializer=org.apache.spark.serializer.KryoSerializer
+spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
 
 #####################################
 # GiraphGraphComputer Configuration #

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/hadoop-gremlin/conf/hadoop-grateful-gryo.properties
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/conf/hadoop-grateful-gryo.properties b/hadoop-gremlin/conf/hadoop-grateful-gryo.properties
index 17aeadd..92ed942 100644
--- a/hadoop-gremlin/conf/hadoop-grateful-gryo.properties
+++ b/hadoop-gremlin/conf/hadoop-grateful-gryo.properties
@@ -15,9 +15,6 @@
 # specific language governing permissions and limitations
 # under the License.
 
-#
-# Hadoop Graph Configuration
-#
 gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
 gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
 gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
@@ -25,9 +22,17 @@ gremlin.hadoop.inputLocation=grateful-dead.kryo
 gremlin.hadoop.outputLocation=output
 gremlin.hadoop.jarsInDistributedCache=true
 
-#
-# GiraphGraphComputer Configuration
-#
+####################################
+# SparkGraphComputer Configuration #
+####################################
+spark.master=local[1]
+spark.executor.memory=1g
+spark.serializer=org.apache.spark.serializer.KryoSerializer
+spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
+
+#####################################
+# GiraphGraphComputer Configuration #
+#####################################
 giraph.minWorkers=1
 giraph.maxWorkers=1
 giraph.useOutOfCoreGraph=true
@@ -37,11 +42,3 @@ mapred.reduce.child.java.opts=-Xmx1024m
 giraph.numInputThreads=4
 giraph.numComputeThreads=4
 giraph.maxMessagesInMemory=100000
-
-#
-# SparkGraphComputer Configuration
-#
-spark.master=local[1]
-spark.executor.memory=1g
-spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
-

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/hadoop-gremlin/conf/hadoop-gryo.properties
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/conf/hadoop-gryo.properties b/hadoop-gremlin/conf/hadoop-gryo.properties
index aaab24d..c156a98 100644
--- a/hadoop-gremlin/conf/hadoop-gryo.properties
+++ b/hadoop-gremlin/conf/hadoop-gryo.properties
@@ -14,6 +14,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+
 gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
 gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
 gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
@@ -28,7 +29,9 @@ gremlin.hadoop.outputLocation=output
 ####################################
 spark.master=local[4]
 spark.executor.memory=1g
-spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
+spark.serializer=org.apache.spark.serializer.KryoSerializer
+spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
+# spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer (3.2.x model)
 # gremlin.spark.graphStorageLevel=MEMORY_AND_DISK
 # gremlin.spark.persistContext=true
 # gremlin.spark.graphWriter=org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD
@@ -39,7 +42,6 @@ spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerial
 # spark.eventLog.dir=/tmp/spark-event-logs
 # spark.ui.killEnabled=true
 
-
 #####################################
 # GiraphGraphComputer Configuration #
 #####################################

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/hadoop-gremlin/conf/hadoop-script.properties
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/conf/hadoop-script.properties b/hadoop-gremlin/conf/hadoop-script.properties
index 0c27dd7..88bf6d9 100644
--- a/hadoop-gremlin/conf/hadoop-script.properties
+++ b/hadoop-gremlin/conf/hadoop-script.properties
@@ -14,6 +14,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+
 gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
 gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
 gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
@@ -28,7 +29,9 @@ gremlin.hadoop.outputLocation=output
 ####################################
 spark.master=local[4]
 spark.executor.memory=1g
-spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
+spark.serializer=org.apache.spark.serializer.KryoSerializer
+spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
+# spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer (3.2.x model)
 # spark.kryo.registrationRequired=true
 # spark.storage.memoryFraction=0.2
 # spark.eventLog.enabled=true

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java b/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java
index 9c79b53..3ff8e2a 100644
--- a/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java
+++ b/hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java
@@ -61,6 +61,8 @@ public final class Constants {
     public static final String GREMLIN_SPARK_SKIP_PARTITIONER = "gremlin.spark.skipPartitioner"; // don't partition the loadedGraphRDD
     public static final String GREMLIN_SPARK_SKIP_GRAPH_CACHE = "gremlin.spark.skipGraphCache";  // don't cache the loadedGraphRDD (ignores graphStorageLevel)
     public static final String SPARK_SERIALIZER = "spark.serializer";
+    public static final String SPARK_KRYO_REGISTRATOR = "spark.kryo.registrator";
+    public static final String SPARK_KRYO_REGISTRATION_REQUIRED = "spark.kryo.registrationRequired";
 
     public static String getGraphLocation(final String location) {
         return location.endsWith("/") ? location + Constants.HIDDEN_G : location + "/" + Constants.HIDDEN_G;

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java b/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java
index 36618a1..c7d0cfb 100644
--- a/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java
+++ b/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java
@@ -35,6 +35,7 @@ import org.apache.spark.SparkContext;
 import org.apache.spark.api.java.JavaPairRDD;
 import org.apache.spark.api.java.JavaSparkContext;
 import org.apache.spark.launcher.SparkLauncher;
+import org.apache.spark.serializer.KryoSerializer;
 import org.apache.spark.storage.StorageLevel;
 import org.apache.tinkerpop.gremlin.hadoop.Constants;
 import org.apache.tinkerpop.gremlin.hadoop.process.computer.AbstractHadoopGraphComputer;
@@ -43,6 +44,7 @@ import org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration;
 import org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.FileSystemStorage;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.GraphFilterAware;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService;
 import org.apache.tinkerpop.gremlin.hadoop.structure.io.VertexWritable;
 import org.apache.tinkerpop.gremlin.hadoop.structure.util.ConfUtil;
 import org.apache.tinkerpop.gremlin.process.computer.ComputerResult;
@@ -67,7 +69,9 @@ import org.apache.tinkerpop.gremlin.spark.structure.io.OutputRDD;
 import org.apache.tinkerpop.gremlin.spark.structure.io.PersistedInputRDD;
 import org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD;
 import org.apache.tinkerpop.gremlin.spark.structure.io.SparkContextStorage;
+import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator;
 import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer;
+import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoShimService;
 import org.apache.tinkerpop.gremlin.structure.Direction;
 import org.apache.tinkerpop.gremlin.structure.io.Storage;
 import org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimServiceLoader;
@@ -106,6 +110,13 @@ public final class SparkGraphComputer extends AbstractHadoopGraphComputer {
         super(hadoopGraph);
         this.sparkConfiguration = new HadoopConfiguration();
         ConfigurationUtils.copy(this.hadoopGraph.configuration(), this.sparkConfiguration);
+        if (KryoSerializer.class.getCanonicalName().equals(this.sparkConfiguration.getString(Constants.SPARK_SERIALIZER, null)) &&
+                GryoRegistrator.class.getCanonicalName().equals(this.sparkConfiguration.getString(Constants.SPARK_KRYO_REGISTRATOR, null))) {
+            System.setProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE, UnshadedKryoShimService.class.getCanonicalName());
+        } else if (GryoSerializer.class.getCanonicalName().equals(this.sparkConfiguration.getString(Constants.SPARK_SERIALIZER, null)) &&
+                !this.sparkConfiguration.containsKey(Constants.SPARK_KRYO_REGISTRATOR)) {
+            System.setProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE, HadoopPoolShimService.class.getCanonicalName());
+        }
         if (null != System.getProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE, null)) {
             final String shimService = System.getProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE);
             this.sparkConfiguration.setProperty(SparkLauncher.EXECUTOR_EXTRA_JAVA_OPTIONS,

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/structure/io/gryo/kryoshim/unshaded/UnshadedKryoShimService.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/structure/io/gryo/kryoshim/unshaded/UnshadedKryoShimService.java b/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/structure/io/gryo/kryoshim/unshaded/UnshadedKryoShimService.java
index 41e0001..0789d6a 100644
--- a/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/structure/io/gryo/kryoshim/unshaded/UnshadedKryoShimService.java
+++ b/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/structure/io/gryo/kryoshim/unshaded/UnshadedKryoShimService.java
@@ -30,6 +30,7 @@ import com.esotericsoftware.kryo.io.Output;
 import org.apache.commons.configuration.BaseConfiguration;
 import org.apache.commons.configuration.Configuration;
 import org.apache.spark.SparkConf;
+import org.apache.tinkerpop.gremlin.hadoop.Constants;
 import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.IoRegistryAwareKryoSerializer;
 import org.apache.tinkerpop.gremlin.structure.io.gryo.GryoPool;
 import org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimService;
@@ -48,7 +49,8 @@ public class UnshadedKryoShimService implements KryoShimService {
 
     private static volatile boolean initialized;
 
-    public UnshadedKryoShimService() { }
+    public UnshadedKryoShimService() {
+    }
 
     @Override
     public Object readClassAndObject(final InputStream source) {
@@ -123,23 +125,23 @@ public class UnshadedKryoShimService implements KryoShimService {
                         sparkConf.set(GryoPool.CONFIG_IO_REGISTRY, regStr);
                     }
                     // Setting spark.serializer here almost certainly isn't necessary, but it doesn't hurt
-                    sparkConf.set("spark.serializer", IoRegistryAwareKryoSerializer.class.getCanonicalName());
+                    sparkConf.set(Constants.SPARK_SERIALIZER, IoRegistryAwareKryoSerializer.class.getCanonicalName());
 
-                    final String registrator = conf.getString("spark.kryo.registrator");
+                    final String registrator = conf.getString(Constants.SPARK_KRYO_REGISTRATOR);
                     if (null != registrator) {
-                        sparkConf.set("spark.kryo.registrator", registrator);
-                        log.info("Copied spark.kryo.registrator: {}", registrator);
+                        sparkConf.set(Constants.SPARK_KRYO_REGISTRATOR, registrator);
+                        log.info("Copied " + Constants.SPARK_KRYO_REGISTRATOR + ": {}", registrator);
                     } else {
-                        log.info("Not copying spark.kryo.registrator");
+                        log.info("Not copying " + Constants.SPARK_KRYO_REGISTRATOR);
                     }
 
-                    // Reuse Gryo poolsize for Kryo poolsize (no need to copy this to SparkConf)
-                    final int poolSize = conf.getInt(GryoPool.CONFIG_IO_GRYO_POOL_SIZE,
-                            GryoPool.CONFIG_IO_GRYO_POOL_SIZE_DEFAULT);
                     // Instantiate the spark.serializer
                     final IoRegistryAwareKryoSerializer ioReg = new IoRegistryAwareKryoSerializer(sparkConf);
-                    // Setup a pool backed by our spark.serializer instance
 
+                    // Setup a pool backed by our spark.serializer instance
+                    // Reuse Gryo poolsize for Kryo poolsize (no need to copy this to SparkConf)
+                    final int poolSize = conf.getInt(GryoPool.CONFIG_IO_GRYO_POOL_SIZE,
+                            GryoPool.CONFIG_IO_GRYO_POOL_SIZE_DEFAULT);
                     for (int i = 0; i < poolSize; i++) {
                         KRYOS.add(ioReg.newKryo());
                     }

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoRegistratorGraphComputerProcessIntegrateTest.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoRegistratorGraphComputerProcessIntegrateTest.java b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoRegistratorGraphComputerProcessIntegrateTest.java
deleted file mode 100644
index 29f627d..0000000
--- a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoRegistratorGraphComputerProcessIntegrateTest.java
+++ /dev/null
@@ -1,33 +0,0 @@
-/*
- *  Licensed to the Apache Software Foundation (ASF) under one
- *  or more contributor license agreements.  See the NOTICE file
- *  distributed with this work for additional information
- *  regarding copyright ownership.  The ASF licenses this file
- *  to you under the Apache License, Version 2.0 (the
- *  "License"); you may not use this file except in compliance
- *  with the License.  You may obtain a copy of the License at
- *
- *  http://www.apache.org/licenses/LICENSE-2.0
- *
- *  Unless required by applicable law or agreed to in writing,
- *  software distributed under the License is distributed on an
- *  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- *  KIND, either express or implied.  See the License for the
- *  specific language governing permissions and limitations
- *  under the License.
- */
-
-package org.apache.tinkerpop.gremlin.spark.process.computer;
-
-import org.apache.tinkerpop.gremlin.GraphProviderClass;
-import org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph;
-import org.apache.tinkerpop.gremlin.process.ProcessComputerSuite;
-import org.junit.runner.RunWith;
-
-/**
- * @author Marko A. Rodriguez (http://markorodriguez.com)
- */
-@RunWith(ProcessComputerSuite.class)
-@GraphProviderClass(provider = SparkHadoopGraphGryoRegistratorProvider.class, graph = HadoopGraph.class)
-public class SparkGryoRegistratorGraphComputerProcessIntegrateTest {
-}

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoSerializerGraphComputerProcessIntegrateTest.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoSerializerGraphComputerProcessIntegrateTest.java b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoSerializerGraphComputerProcessIntegrateTest.java
new file mode 100644
index 0000000..b04513c
--- /dev/null
+++ b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGryoSerializerGraphComputerProcessIntegrateTest.java
@@ -0,0 +1,33 @@
+/*
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing,
+ *  software distributed under the License is distributed on an
+ *  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ *  KIND, either express or implied.  See the License for the
+ *  specific language governing permissions and limitations
+ *  under the License.
+ */
+
+package org.apache.tinkerpop.gremlin.spark.process.computer;
+
+import org.apache.tinkerpop.gremlin.GraphProviderClass;
+import org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph;
+import org.apache.tinkerpop.gremlin.process.ProcessComputerSuite;
+import org.junit.runner.RunWith;
+
+/**
+ * @author Marko A. Rodriguez (http://markorodriguez.com)
+ */
+@RunWith(ProcessComputerSuite.class)
+@GraphProviderClass(provider = SparkHadoopGraphGryoSerializerProvider.class, graph = HadoopGraph.class)
+public class SparkGryoSerializerGraphComputerProcessIntegrateTest {
+}

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoRegistratorProvider.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoRegistratorProvider.java b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoRegistratorProvider.java
deleted file mode 100644
index fcebbd0..0000000
--- a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoRegistratorProvider.java
+++ /dev/null
@@ -1,52 +0,0 @@
-/*
- *  Licensed to the Apache Software Foundation (ASF) under one
- *  or more contributor license agreements.  See the NOTICE file
- *  distributed with this work for additional information
- *  regarding copyright ownership.  The ASF licenses this file
- *  to you under the Apache License, Version 2.0 (the
- *  "License"); you may not use this file except in compliance
- *  with the License.  You may obtain a copy of the License at
- *
- *  http://www.apache.org/licenses/LICENSE-2.0
- *
- *  Unless required by applicable law or agreed to in writing,
- *  software distributed under the License is distributed on an
- *  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- *  KIND, either express or implied.  See the License for the
- *  specific language governing permissions and limitations
- *  under the License.
- */
-
-package org.apache.tinkerpop.gremlin.spark.process.computer;
-
-import org.apache.spark.serializer.KryoSerializer;
-import org.apache.tinkerpop.gremlin.LoadGraphWith;
-import org.apache.tinkerpop.gremlin.hadoop.Constants;
-import org.apache.tinkerpop.gremlin.spark.structure.Spark;
-import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator;
-import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoShimService;
-import org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimServiceLoader;
-
-import java.util.Map;
-
-import static org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimServiceLoader.KRYO_SHIM_SERVICE;
-
-/**
- * @author Marko A. Rodriguez (http://markorodriguez.com)
- */
-public final class SparkHadoopGraphGryoRegistratorProvider extends SparkHadoopGraphProvider {
-
-    public Map<String, Object> getBaseConfiguration(final String graphName, final Class<?> test, final String testMethodName, final LoadGraphWith.GraphData loadGraphWith) {
-        Spark.close();
-        final Map<String, Object> config = super.getBaseConfiguration(graphName, test, testMethodName, loadGraphWith);
-        // ensure the context doesn't stay open for the GryoSerializer tests to follow
-        // this is primarily to ensure that the KryoShimService loaded specifically in these tests don't leak to the other tests
-        config.put(Constants.GREMLIN_SPARK_PERSIST_CONTEXT, false);
-        config.put("spark.serializer", KryoSerializer.class.getCanonicalName());
-        config.put("spark.kryo.registrator", GryoRegistrator.class.getCanonicalName());
-        System.setProperty(KRYO_SHIM_SERVICE, UnshadedKryoShimService.class.getCanonicalName());
-        KryoShimServiceLoader.load(true);
-        System.clearProperty(KRYO_SHIM_SERVICE);
-        return config;
-    }
-}

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoSerializerProvider.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoSerializerProvider.java b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoSerializerProvider.java
new file mode 100644
index 0000000..9820b7b
--- /dev/null
+++ b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphGryoSerializerProvider.java
@@ -0,0 +1,45 @@
+/*
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing,
+ *  software distributed under the License is distributed on an
+ *  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ *  KIND, either express or implied.  See the License for the
+ *  specific language governing permissions and limitations
+ *  under the License.
+ */
+
+package org.apache.tinkerpop.gremlin.spark.process.computer;
+
+import org.apache.tinkerpop.gremlin.LoadGraphWith;
+import org.apache.tinkerpop.gremlin.hadoop.Constants;
+import org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService;
+import org.apache.tinkerpop.gremlin.spark.structure.Spark;
+import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer;
+import org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimServiceLoader;
+
+import java.util.Map;
+
+/**
+ * @author Marko A. Rodriguez (http://markorodriguez.com)
+ */
+public final class SparkHadoopGraphGryoSerializerProvider extends SparkHadoopGraphProvider {
+
+    public Map<String, Object> getBaseConfiguration(final String graphName, final Class<?> test, final String testMethodName, final LoadGraphWith.GraphData loadGraphWith) {
+        if (this.getClass().equals(SparkHadoopGraphGryoSerializerProvider.class) &&
+                !HadoopPoolShimService.class.getCanonicalName().equals(System.getProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE, null)))
+            Spark.close();
+        final Map<String, Object> config = super.getBaseConfiguration(graphName, test, testMethodName, loadGraphWith);
+        config.put(Constants.SPARK_SERIALIZER, GryoSerializer.class.getCanonicalName());
+        config.remove(Constants.SPARK_KRYO_REGISTRATOR);
+        return config;
+    }
+}

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/532ed59c/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphProvider.java
----------------------------------------------------------------------
diff --git a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphProvider.java b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphProvider.java
index d4201b5..c5b5083 100644
--- a/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphProvider.java
+++ b/spark-gremlin/src/test/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkHadoopGraphProvider.java
@@ -18,6 +18,8 @@
  */
 package org.apache.tinkerpop.gremlin.spark.process.computer;
 
+import org.apache.spark.launcher.SparkLauncher;
+import org.apache.spark.serializer.KryoSerializer;
 import org.apache.tinkerpop.gremlin.GraphProvider;
 import org.apache.tinkerpop.gremlin.LoadGraphWith;
 import org.apache.tinkerpop.gremlin.groovy.util.SugarTestHelper;
@@ -39,8 +41,10 @@ import org.apache.tinkerpop.gremlin.spark.structure.Spark;
 import org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD;
 import org.apache.tinkerpop.gremlin.spark.structure.io.SparkContextStorageCheck;
 import org.apache.tinkerpop.gremlin.spark.structure.io.ToyGraphInputRDD;
-import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer;
+import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator;
+import org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoShimService;
 import org.apache.tinkerpop.gremlin.structure.Graph;
+import org.apache.tinkerpop.gremlin.structure.io.gryo.kryoshim.KryoShimServiceLoader;
 
 import java.util.Map;
 
@@ -52,6 +56,10 @@ public class SparkHadoopGraphProvider extends HadoopGraphProvider {
 
     @Override
     public Map<String, Object> getBaseConfiguration(final String graphName, final Class<?> test, final String testMethodName, final LoadGraphWith.GraphData loadGraphWith) {
+        if (this.getClass().equals(SparkHadoopGraphProvider.class) &&
+                !UnshadedKryoShimService.class.getCanonicalName().equals(System.getProperty(KryoShimServiceLoader.KRYO_SHIM_SERVICE, null)))
+            Spark.close();
+
         final Map<String, Object> config = super.getBaseConfiguration(graphName, test, testMethodName, loadGraphWith);
         config.put(Constants.GREMLIN_SPARK_PERSIST_CONTEXT, true);  // this makes the test suite go really fast
 
@@ -82,9 +90,10 @@ public class SparkHadoopGraphProvider extends HadoopGraphProvider {
         }
 
         config.put(Constants.GREMLIN_HADOOP_DEFAULT_GRAPH_COMPUTER, SparkGraphComputer.class.getCanonicalName());
-        config.put("spark.master", "local[4]");
-        config.put("spark.serializer", GryoSerializer.class.getCanonicalName());
-        config.put("spark.kryo.registrationRequired", true);
+        config.put(SparkLauncher.SPARK_MASTER, "local[4]");
+        config.put(Constants.SPARK_SERIALIZER, KryoSerializer.class.getCanonicalName());
+        config.put(Constants.SPARK_KRYO_REGISTRATOR, GryoRegistrator.class.getCanonicalName());
+        config.put(Constants.SPARK_KRYO_REGISTRATION_REQUIRED, true);
         return config;
     }