Posted to user@spark.apache.org by Aureliano Buendia <bu...@gmail.com> on 2014/04/18 19:25:48 UTC

sc.makeRDD bug with NumericRange

Hi,

I just noticed that sc.makeRDD() does not include all of the values when its
input is a NumericRange. Try this in the Spark shell:


$ MASTER=local[4] bin/spark-shell

scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length

8


The expected length is 11. This works correctly when launching Spark with
only one core:


$ MASTER=local[1] bin/spark-shell

scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length

11


This also works correctly when converting the range with toArray first:

$ MASTER=local[4] bin/spark-shell

scala> sc.makeRDD((0.0 to 1 by 0.1).toArray).collect().length

11
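
Presumably it is the number of partitions, not the master setting itself, that
matters here: makeRDD appears to slice the input sequence once per partition.
A minimal sketch to check this, assuming the makeRDD(seq, numSlices) overload
(I have not rerun these):

scala> sc.makeRDD(0.0 to 1 by 0.1, 1).collect().length  // one slice: should give 11

scala> sc.makeRDD(0.0 to 1 by 0.1, 4).collect().length  // four slices: values go missing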

Re: sc.makeRDD bug with NumericRange

Posted by Aureliano Buendia <bu...@gmail.com>.
Good catch, Daniel. Looks like this is a Scala bug, not a Spark one. Still,
Spark users have to be careful not to use NumericRange.
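
In the meantime, a safe alternative is to build the values from an integer
Range, which slices correctly, and only map to doubles afterwards. A sketch
(untested):

scala> sc.makeRDD((0 to 10).map(_ * 0.1)).collect().length

This should give 11 on any number of cores, since the element count comes from
the integer range rather than from floating-point range arithmetic.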



Re: sc.makeRDD bug with NumericRange

Posted by Daniel Darabos <da...@lynxanalytics.com>.
To make up for mocking Scala, I've filed a bug (
https://issues.scala-lang.org/browse/SI-8518) and will try to patch this.



Re: sc.makeRDD bug with NumericRange

Posted by Daniel Darabos <da...@lynxanalytics.com>.
Looks like NumericRange in Scala is just a joke.

scala> val x = 0.0 to 1.0 by 0.1
x: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0, 0.1,
0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999,
0.8999999999999999, 0.9999999999999999)

scala> x.take(3)
res1: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0,
0.1, 0.2)

scala> x.drop(3)
res2: scala.collection.immutable.NumericRange[Double] =
NumericRange(0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999,
0.8999999999999999, 0.9999999999999999)

So far so good.

scala> x.drop(3).take(3)
res3: scala.collection.immutable.NumericRange[Double] =
NumericRange(0.30000000000000004, 0.4)

Why only two values? Where's 0.5?

scala> x.drop(6)
res4: scala.collection.immutable.NumericRange[Double] =
NumericRange(0.6000000000000001, 0.7000000000000001, 0.8, 0.9)

And where did the last value go this time?

You have to approach Scala with a healthy amount of distrust. You're on the
right track with toArray.
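
My guess at the mechanism: drop(6) rebuilds the range with a recomputed start
(start + 6 * step), and the new range's length is then derived from
(end - start) / step. The rounding error in the recomputed start pushes that
quotient just below 4, so the element near 1.0 falls off. A sketch of the
arithmetic, assuming the length really is computed this way:

scala> val newStart = 0.0 + 6 * 0.1  // matches the start that drop(6) printed
newStart: Double = 0.6000000000000001

scala> (1.0 - newStart) / 0.1 < 4.0  // just under 4 steps fit before the end
res5: Boolean = true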



Re: sc.makeRDD bug with NumericRange

Posted by Mark Hamstra <ma...@clearstorydata.com>.
Please file an issue: Spark Project JIRA <https://issues.apache.org/jira/browse/SPARK>


