Posted to user@spark.apache.org by Tobias Pfeiffer <tg...@preferred.jp> on 2015/03/04 10:17:41 UTC
scala.Double vs java.lang.Double in RDD
Hi,
I have a function with signature
def aggFun1(rdd: RDD[(Long, (Long, Double))]): RDD[(Long, Any)]
and one with
def aggFun2[_Key: ClassTag, _Index](rdd: RDD[(_Key, (_Index, Double))]): RDD[(_Key, Double)]
where all "Double" classes involved are "scala.Double" classes (according
to IDEA) and my implementation of aggFun1 is just calling aggFun2 (type
parameters _Key and _Index are inferred by the Scala compiler).
Now I am writing a test as follows:
val result: Map[Long, Any] = aggFun1(input).collect().toMap
result.values.foreach(v => println(v.getClass))
result.values.foreach(_ shouldBe a[Double])
and I get the following output:
class java.lang.Double
class java.lang.Double
[info] avg
[info] - should compute the average *** FAILED ***
[info] 1.75 was not an instance of double, but an instance of java.lang.Double
So I am wondering about what magic is going on here. Are scala.Double
values in RDDs automatically converted to java.lang.Doubles or am I just
missing the implicit back-conversion etc.?
Any help appreciated,
Tobias
Re: scala.Double vs java.lang.Double in RDD
Posted by Tobias Pfeiffer <tg...@preferred.jp>.
Hi,
On Thu, Mar 5, 2015 at 12:20 AM, Imran Rashid <ir...@cloudera.com> wrote:
> This doesn't involve Spark at all; I think it is entirely an issue with
> how Scala deals with primitives and boxing. Often Scala can hide the
> details for you, but IMO that just leads to far more confusing errors when
> things don't work out. The issue here is that your map has value type Any,
> which leads Scala to leave each value as a boxed java.lang.Double.
>
I see, thank you very much for your explanation and the code examples!
They help a lot.
Thanks
Tobias
Re: scala.Double vs java.lang.Double in RDD
Posted by Imran Rashid <ir...@cloudera.com>.
This doesn't involve Spark at all; I think it is entirely an issue with
how Scala deals with primitives and boxing. Often Scala can hide the
details for you, but IMO that just leads to far more confusing errors when
things don't work out. The issue here is that your map has value type Any,
which leads Scala to leave each value as a boxed java.lang.Double.
scala> val x = 1.5
x: Double = 1.5
scala> x.getClass()
res0: Class[Double] = double
scala> x.getClass() == classOf[java.lang.Double]
res1: Boolean = false
scala> x.getClass() == classOf[Double]
res2: Boolean = true
scala> val arr = Array(1.5,2.5)
arr: Array[Double] = Array(1.5, 2.5)
scala> arr.getClass().getComponentType() == x.getClass()
res5: Boolean = true
scala> arr.getClass().getComponentType() == classOf[java.lang.Double]
res6: Boolean = false

// this map has java.lang.Double
scala> val map: Map[String, Any] = arr.map{x => x.toString -> x}.toMap
map: Map[String,Any] = Map(1.5 -> 1.5, 2.5 -> 2.5)
scala> map("1.5").getClass()
res15: Class[_] = class java.lang.Double
scala> map("1.5").getClass() == x.getClass()
res10: Boolean = false
scala> map("1.5").getClass() == classOf[java.lang.Double]
res11: Boolean = true

// this one has Double
scala> val map2: Map[String, Double] = arr.map{x => x.toString -> x}.toMap
map2: Map[String,Double] = Map(1.5 -> 1.5, 2.5 -> 2.5)
scala> map2("1.5").getClass()
res12: Class[Double] = double
scala> map2("1.5").getClass() == x.getClass()
res13: Boolean = true
scala> map2("1.5").getClass() == classOf[java.lang.Double]
res14: Boolean = false
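
The same behaviour can be reproduced outside the REPL. What follows is a minimal sketch, not code from the thread: the object BoxingDemo and its helpers isDouble, isPrimitiveDoubleInstance and onlyDoubles are hypothetical names, and a plain Scala Map stands in for the collected RDD. It assumes Scala 2. The idea is that a type pattern accepts the boxed value, whereas a reflection-style check against the primitive class double rejects it. That the failing matcher works roughly like the second check is an assumption based on the error message.

```scala
object BoxingDemo {
  // Statically typed as Double: getClass on a value reports the primitive class.
  val typed: Map[Long, Double] = Map(1L -> 1.75, 2L -> 2.5)

  // The same values viewed as Any (Map is covariant in its value type).
  // getClass on a value now reports the boxed class java.lang.Double.
  val untyped: Map[Long, Any] = typed

  // Type pattern: matches a Double even when it arrives boxed inside an Any.
  def isDouble(v: Any): Boolean = v match {
    case _: Double => true
    case _         => false
  }

  // Reflection-style check against the primitive class `double`.
  // Class.isInstance returns false for every object when the class is primitive.
  def isPrimitiveDoubleInstance(v: Any): Boolean =
    classOf[Double].isInstance(v.asInstanceOf[AnyRef])

  // Recover a statically typed map, keeping only the entries whose value is a Double.
  def onlyDoubles(m: Map[Long, Any]): Map[Long, Double] =
    m.collect { case (k, d: Double) => k -> d }

  def main(args: Array[String]): Unit = {
    println(untyped(1L).getClass)                    // boxed class, as in the thread
    println(typed(1L).getClass)                      // primitive class
    println(isDouble(untyped(1L)))                   // type pattern sees a Double
    println(isPrimitiveDoubleInstance(untyped(1L)))  // reflection check does not
    println(onlyDoubles(untyped))
  }
}
```

If the values are known to be Doubles, declaring the result as Map[Long, Double] rather than Map[Long, Any] keeps the static type and sidesteps the question. Otherwise, checking against java.lang.Double in the test, or using a type pattern as in isDouble, should accept the boxed values.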