Posted to dev@spark.apache.org by Maciej Szymkiewicz <ms...@gmail.com> on 2016/09/28 07:18:02 UTC

java.util.NoSuchElementException when serializing Map with default value

Hi everyone,

I suspect there is no point in submitting a JIRA to fix this (not a
Spark issue?), but I would like to know if this problem is documented
anywhere. Somehow Kryo is losing the default value during serialization:

    scala> import org.apache.spark.{SparkContext, SparkConf}
    import org.apache.spark.{SparkContext, SparkConf}

    scala> val aMap = Map[String, Long]().withDefaultValue(0L)
    aMap: scala.collection.immutable.Map[String,Long] = Map()

    scala> aMap("a")
    res6: Long = 0

    scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer"))

    scala> sc.parallelize(Seq(aMap)).map(_("a")).first
    16/09/28 09:13:47 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
    java.util.NoSuchElementException: key not found: a

while the Java serializer works just fine:

    scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.JavaSerializer"))

    scala> sc.parallelize(Seq(aMap)).map(_("a")).first
    res9: Long = 0
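
For reference, the default is not stored in the map's entries at all:
withDefaultValue wraps the map in a separate class, and that wrapper is
presumably what gets lost in the Kryo round trip. A quick check (res
number from a local Scala 2.11 REPL; yours will differ):

    scala> aMap.getClass.getName
    res10: String = scala.collection.immutable.Map$WithDefault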

-- 
Best regards,
Maciej


Re: java.util.NoSuchElementException when serializing Map with default value

Posted by Maciej Szymkiewicz <ms...@gmail.com>.
Thanks guys.

This is not a big issue in general, more an annoyance, but it can be
rather confusing when encountered for the first time.


On 09/29/2016 02:05 AM, Jakob Odersky wrote:
> I agree with Sean's answer; you can check out the relevant serializer
> here: https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/Traversable.scala
>
> On Wed, Sep 28, 2016 at 3:11 AM, Sean Owen <so...@cloudera.com> wrote:
>> My guess is that Kryo handles Maps specially, or relies on some
>> mechanism that does, and iterates over all key/value pairs as part of
>> that; of course there aren't actually any key/value pairs in the map.
>> Java serialization is a much more literal (and expensive)
>> field-by-field serialization, which works here because there's no
>> special treatment. I think you could register a custom serializer
>> that handles this case, or work around it in your client code. I know
>> there have been other issues with Kryo and Map because, for example,
>> sometimes a Map in an application is actually some non-serializable
>> wrapper view.
>>
>> On Wed, Sep 28, 2016 at 3:18 AM, Maciej Szymkiewicz
>> <ms...@gmail.com> wrote:
>>> Hi everyone,
>>>
>>> I suspect there is no point in submitting a JIRA to fix this (not a Spark
>>> issue?), but I would like to know if this problem is documented anywhere.
>>> Somehow Kryo is losing the default value during serialization:
>>>
>>> scala> import org.apache.spark.{SparkContext, SparkConf}
>>> import org.apache.spark.{SparkContext, SparkConf}
>>>
>>> scala> val aMap = Map[String, Long]().withDefaultValue(0L)
>>> aMap: scala.collection.immutable.Map[String,Long] = Map()
>>>
>>> scala> aMap("a")
>>> res6: Long = 0
>>>
>>> scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer"))
>>>
>>> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
>>> 16/09/28 09:13:47 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
>>> java.util.NoSuchElementException: key not found: a
>>>
>>> while the Java serializer works just fine:
>>>
>>> scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.JavaSerializer"))
>>>
>>> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
>>> res9: Long = 0
>>>
>>> --
>>> Best regards,
>>> Maciej

-- 
Best regards,
Maciej





Re: java.util.NoSuchElementException when serializing Map with default value

Posted by Jakob Odersky <ja...@odersky.com>.
I agree with Sean's answer; you can check out the relevant serializer
here: https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/Traversable.scala
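
That serializer round-trips a collection by writing out its elements and
rebuilding the collection through a builder, which explains the symptom:
the default lives on the Map.WithDefault wrapper rather than in the
entries, so an element-by-element rebuild produces a plain Map. A minimal
plain-Scala sketch of the same effect (no Spark or Kryo involved; names
are illustrative):

    val m = Map.empty[String, Long].withDefaultValue(0L)
    // rebuild element by element, as a generic Traversable serializer does
    val rebuilt = (Map.newBuilder[String, Long] ++= m).result()

    m("a")        // 0: the default still applies
    rebuilt("a")  // java.util.NoSuchElementException: key not found: a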

On Wed, Sep 28, 2016 at 3:11 AM, Sean Owen <so...@cloudera.com> wrote:
> My guess is that Kryo handles Maps specially, or relies on some
> mechanism that does, and iterates over all key/value pairs as part of
> that; of course there aren't actually any key/value pairs in the map.
> Java serialization is a much more literal (and expensive)
> field-by-field serialization, which works here because there's no
> special treatment. I think you could register a custom serializer
> that handles this case, or work around it in your client code. I know
> there have been other issues with Kryo and Map because, for example,
> sometimes a Map in an application is actually some non-serializable
> wrapper view.
>
> On Wed, Sep 28, 2016 at 3:18 AM, Maciej Szymkiewicz
> <ms...@gmail.com> wrote:
>> Hi everyone,
>>
>> I suspect there is no point in submitting a JIRA to fix this (not a Spark
>> issue?), but I would like to know if this problem is documented anywhere.
>> Somehow Kryo is losing the default value during serialization:
>>
>> scala> import org.apache.spark.{SparkContext, SparkConf}
>> import org.apache.spark.{SparkContext, SparkConf}
>>
>> scala> val aMap = Map[String, Long]().withDefaultValue(0L)
>> aMap: scala.collection.immutable.Map[String,Long] = Map()
>>
>> scala> aMap("a")
>> res6: Long = 0
>>
>> scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer"))
>>
>> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
>> 16/09/28 09:13:47 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
>> java.util.NoSuchElementException: key not found: a
>>
>> while the Java serializer works just fine:
>>
>> scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.JavaSerializer"))
>>
>> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
>> res9: Long = 0
>>
>> --
>> Best regards,
>> Maciej
>



Re: java.util.NoSuchElementException when serializing Map with default value

Posted by Sean Owen <so...@cloudera.com>.
My guess is that Kryo handles Maps specially, or relies on some
mechanism that does, and iterates over all key/value pairs as part of
that; of course there aren't actually any key/value pairs in the map.
Java serialization is a much more literal (and expensive)
field-by-field serialization, which works here because there's no
special treatment. I think you could register a custom serializer
that handles this case, or work around it in your client code. I know
there have been other issues with Kryo and Map because, for example,
sometimes a Map in an application is actually some non-serializable
wrapper view.
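
If Kryo is required end to end, one possible shape for such a custom
serializer is to fall back to Java serialization for just the wrapper
class. An untested sketch (the registrator below is hypothetical, not
something Spark ships):

    import com.esotericsoftware.kryo.Kryo
    import com.esotericsoftware.kryo.serializers.JavaSerializer
    import org.apache.spark.serializer.KryoRegistrator

    // Hypothetical registrator: route only Map$WithDefault through Kryo's
    // Java-serialization fallback so the default function survives, while
    // everything else keeps the normal Kryo treatment.
    class MapWithDefaultRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo): Unit = {
        kryo.register(
          Class.forName("scala.collection.immutable.Map$WithDefault"),
          new JavaSerializer())
      }
    }

It would be enabled with .set("spark.kryo.registrator",
"MapWithDefaultRegistrator") on the SparkConf. The client-side
workaround is simpler: supply the default at the point of use, e.g.
sc.parallelize(Seq(aMap)).map(_.getOrElse("a", 0L)).first, which no
longer depends on the default surviving serialization.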

On Wed, Sep 28, 2016 at 3:18 AM, Maciej Szymkiewicz
<ms...@gmail.com> wrote:
> Hi everyone,
>
> I suspect there is no point in submitting a JIRA to fix this (not a Spark
> issue?), but I would like to know if this problem is documented anywhere.
> Somehow Kryo is losing the default value during serialization:
>
> scala> import org.apache.spark.{SparkContext, SparkConf}
> import org.apache.spark.{SparkContext, SparkConf}
>
> scala> val aMap = Map[String, Long]().withDefaultValue(0L)
> aMap: scala.collection.immutable.Map[String,Long] = Map()
>
> scala> aMap("a")
> res6: Long = 0
>
> scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer"))
>
> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
> 16/09/28 09:13:47 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
> java.util.NoSuchElementException: key not found: a
>
> while the Java serializer works just fine:
>
> scala> val sc = new SparkContext(new SparkConf().setAppName("bar").set("spark.serializer", "org.apache.spark.serializer.JavaSerializer"))
>
> scala> sc.parallelize(Seq(aMap)).map(_("a")).first
> res9: Long = 0
>
> --
> Best regards,
> Maciej
