Posted to dev@tinkerpop.apache.org by "Dylan Bethune-Waddell (JIRA)" <ji...@apache.org> on 2016/06/16 20:23:05 UTC

[jira] [Created] (TINKERPOP-1341) UnshadedKryoAdapter fails to deserialize StarGraph when SparkConf sets spark.rdd.compress=true whereas GryoSerializer works

Dylan Bethune-Waddell created TINKERPOP-1341:
------------------------------------------------

             Summary: UnshadedKryoAdapter fails to deserialize StarGraph when SparkConf sets spark.rdd.compress=true whereas GryoSerializer works
                 Key: TINKERPOP-1341
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1341
             Project: TinkerPop
          Issue Type: Bug
          Components: io
    Affects Versions: 3.2.1, 3.3.0
            Reporter: Dylan Bethune-Waddell
            Priority: Minor


While bulk loading a large dataset into Titan I kept running into OOM errors, so I tried tweaking some Spark configuration settings, including spark.rdd.compress=true.

I was already having trouble bulk loading with the new GryoRegistrator/UnshadedKryo serialization shim in master: a few hundred tasks into the edge loading stage (stage 5), exceptions are thrown complaining that CompactBuffer[].class must be explicitly registered with Kryo. With spark.rdd.compress=true, the same setup fails earlier, a few hundred tasks into the vertex loading stage (stage 1) of BulkLoaderVertexProgram.

Using GryoSerializer instead of KryoSerializer with GryoRegistrator does not fail: with the compression flag on, the data loads successfully, whereas before I would just get OOM errors until the job was set back so far that it failed. So this setting is desirable in some cases, and the new serialization code seems to break it. This could be an upstream Spark issue; see the open JIRA ticket SPARK-3630 (https://issues.apache.org/jira/browse/SPARK-3630).

Here is the exception that is thrown, with the middle of the trace cut out:

com.esotericsoftware.kryo.KryoException: java.io.IOException: PARSING_ERROR(2)
        at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
        at com.esotericsoftware.kryo.io.Input.require(Input.java:169)
        at com.esotericsoftware.kryo.io.Input.readLong_slow(Input.java:715)
        at com.esotericsoftware.kryo.io.Input.readLong(Input.java:665)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:113)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$LongSerializer.read(DefaultSerializers.java:103)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:48)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readClassAndObject(UnshadedKryoAdapter.java:30)
        at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.readEdges(StarGraphSerializer.java:134)
        at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:91)
        at org.apache.tinkerpop.gremlin.structure.util.star.StarGraphSerializer.read(StarGraphSerializer.java:45)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:42)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedKryoAdapter.readObject(UnshadedKryoAdapter.java:30)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:46)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.VertexWritableSerializer.read(VertexWritableSerializer.java:36)
        at org.apache.tinkerpop.gremlin.spark.structure.io.gryo.kryoshim.unshaded.UnshadedSerializerAdapter.read(UnshadedSerializerAdapter.java:55)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)

........................................................ and so on .....................................

Caused by: java.io.IOException: PARSING_ERROR(2)
        at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:84)
        at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)
        at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:594)
        at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:358)
        at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:167)
        at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:150)
        at com.esotericsoftware.kryo.io.Input.fill(Input.java:140)
        ... 51 more
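
For anyone trying to reproduce this, the two setups compared above can be written down as plain Spark properties. This is only a minimal sketch, not the exact configuration from the failing job: the property keys are standard Spark settings, but the fully qualified class names are assumptions inferred from the spark-gremlin package in the stack trace, and the SerializerConfigs class and its method names are hypothetical.

```java
import java.util.Properties;

// Sketch only: the two serializer setups compared in this report, expressed as
// plain Spark properties. The fully qualified class names are assumptions based
// on the spark-gremlin package that appears in the stack trace.
final class SerializerConfigs {

    static final String GRYO_PACKAGE =
            "org.apache.tinkerpop.gremlin.spark.structure.io.gryo";

    // Setup reported to fail in stage 1: Spark's KryoSerializer plus
    // GryoRegistrator, with RDD compression enabled.
    static Properties kryoWithGryoRegistrator() {
        Properties p = new Properties();
        p.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        p.setProperty("spark.kryo.registrator", GRYO_PACKAGE + ".GryoRegistrator");
        p.setProperty("spark.rdd.compress", "true");
        return p;
    }

    // Setup reported to load successfully: GryoSerializer, no registrator,
    // with RDD compression enabled.
    static Properties gryoSerializer() {
        Properties p = new Properties();
        p.setProperty("spark.serializer", GRYO_PACKAGE + ".GryoSerializer");
        p.setProperty("spark.rdd.compress", "true");
        return p;
    }

    public static void main(String[] args) {
        // Print both setups so they can be pasted into a properties file.
        kryoWithGryoRegistrator().forEach((k, v) -> System.out.println(k + "=" + v));
        System.out.println();
        gryoSerializer().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The only difference between the two is the serializer and registrator pair; spark.rdd.compress=true is set in both, which is what points at the shim rather than at compression alone.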


