Posted to dev@tinkerpop.apache.org by "Dylan Bethune-Waddell (JIRA)" <ji...@apache.org> on 2016/06/16 20:23:05 UTC
[jira] [Created] (TINKERPOP-1341) UnshadedKryoAdapter fails to
deserialize StarGraph when SparkConf sets spark.rdd.compress=true whereas
GryoSerializer works
Dylan Bethune-Waddell created TINKERPOP-1341:
------------------------------------------------
Summary: UnshadedKryoAdapter fails to deserialize StarGraph when SparkConf sets spark.rdd.compress=true whereas GryoSerializer works
Key: TINKERPOP-1341
URL: https://issues.apache.org/jira/browse/TINKERPOP-1341
Project: TinkerPop
Issue Type: Bug
Components: io
Affects Versions: 3.2.1, 3.3.0
Reporter: Dylan Bethune-Waddell
Priority: Minor
While bulk loading a large dataset into Titan I kept running into OOM errors, so I tried tweaking some Spark configuration settings. I was already having trouble bulk loading with the new GryoRegistrator/UnshadedKryo serialization shim in master: a few hundred tasks into the edge loading stage (stage 5), exceptions are thrown complaining that CompactBuffer[].class must be explicitly registered with Kryo. With spark.rdd.compress=true, this approach instead fails a few hundred tasks into the vertex loading stage (stage 1) of BulkLoaderVertexProgram.

Using GryoSerializer instead of KryoSerializer with GryoRegistrator does not fail: with the compression flag turned on it loads the data successfully, whereas before I would just get OOM errors until the job was set back so far that it failed. So this setting is desirable in at least some cases, and the new serialization code seems to break it. This could be an upstream Spark issue, based on this open JIRA ticket (https://issues.apache.org/jira/browse/SPARK-3630).

Here is the exception that is thrown, with the middle bits cut out:
com.esotericsoftware.kryo.KryoException: java.io.IOException: PARSING_ERROR(2)
at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
at com.esotericsoftware.kryo.io.Input.require(Input.java:169)
at com.esotericsoftware.kryo.io.Input.readLong_slow(Input.java:715)
at com.esotericsoftware.kryo.io.Input.readLong(Input.java:665)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:113)
at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:103)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:48)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:30)
at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.readEdges(StarGraphSerializer.java:134)
at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:91)
at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:45)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:42)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:30)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:46)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:36)
at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
........................................................ and so on .....................................
Caused by: java.io.IOException: PARSING_ERROR(2)
at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)
at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:594)
at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:358)
at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:167)
at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:150)
at com.esotericsoftware.kryo.io.Input.fill(Input.java:140)
... 51 more
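
For reference, the two serializer setups compared above can be sketched as plain key/value settings. This is only an illustration, not the exact configuration used for the failing job: the report does not include it. The keys are the standard Spark configuration keys; the fully-qualified names for GryoRegistrator and GryoSerializer are assumed from the package shown in the stack trace, and the class and method names (SerializerConfigs and so on) are hypothetical.

```java
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper that only builds the settings; it does not start Spark.
public class SerializerConfigs {
    // Assumption: both classes live in the package seen in the stack trace.
    static final String GRYO_PKG = "org.apache.tinkerpop.gremlin.spark.structure.io.gryo";

    // Setup reported to fail in stage 1: KryoSerializer + GryoRegistrator + RDD compression.
    public static Properties kryoWithGryoRegistrator() {
        Properties p = new Properties();
        p.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        p.setProperty("spark.kryo.registrator", GRYO_PKG + ".GryoRegistrator");
        p.setProperty("spark.rdd.compress", "true");
        return p;
    }

    // Setup reported to load the data successfully: GryoSerializer + RDD compression.
    public static Properties gryoSerializer() {
        Properties p = new Properties();
        p.setProperty("spark.serializer", GRYO_PKG + ".GryoSerializer");
        p.setProperty("spark.rdd.compress", "true");
        return p;
    }

    // Print the settings as key=value lines, sorted by key.
    static void print(String title, Properties p) {
        System.out.println("# " + title);
        for (String key : new TreeSet<>(p.stringPropertyNames())) {
            System.out.println(key + "=" + p.getProperty(key));
        }
    }

    public static void main(String[] args) {
        print("fails with spark.rdd.compress=true", kryoWithGryoRegistrator());
        print("works with spark.rdd.compress=true", gryoSerializer());
    }
}
```

The only differences between the two are the spark.serializer value and the presence of spark.kryo.registrator; spark.rdd.compress is true in both.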
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)