Posted to reviews@spark.apache.org by liancheng <gi...@git.apache.org> on 2014/08/13 18:10:54 UTC

[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/1925

    [SQL] Using safe floating-point numbers in doctest

    Test code in `sql.py` compares two floating-point numbers directly, which caused [build failure(s)](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18365/consoleFull).
    
    [Doctest documentation](https://docs.python.org/3/library/doctest.html#warnings) recommends using numbers in the form of `I/2**J` to avoid the precision issue.
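
A minimal sketch of why numbers of the form `I/2**J` are safe: a binary floating-point value stores them exactly, so their printed form does not depend on rounding. The helper name `is_dyadic` and the bound `max_j` are illustrative assumptions, not part of the patch.

```python
from fractions import Fraction


def is_dyadic(x: float, max_j: int = 10) -> bool:
    """Return True if x equals I/2**J exactly for some integer I and 0 <= J <= max_j.

    Hypothetical helper for illustration only.
    """
    # Fraction(x) is the exact rational value stored in the double; its
    # denominator is always a power of two for a finite float.
    return Fraction(x).denominator <= 2 ** max_j


print(is_dyadic(1.5))  # 1.5 is stored exactly as 3/2
print(is_dyadic(2.5))  # 2.5 is stored exactly as 5/2
print(is_dyadic(1.1))  # 1.1 is only approximated, with a very large denominator
```

With the default bound, `1.5` and `2.5` qualify while `1.1` does not, which is the reason the doctest literal was changed.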

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark fix-pysql-fp-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1925
    
----
commit e8059d4115cc286bfe87bad82bbd3b2fe8e16db0
Author: Cheng Lian <li...@gmail.com>
Date:   2014-08-13T16:07:47Z

    Using safe floating-point numbers in doctest

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1925#discussion_r16194633
  
    --- Diff: python/pyspark/sql.py ---
    @@ -1093,8 +1093,8 @@ def applySchema(self, rdd, schema):
             >>> sqlCtx.sql(
             ...   "SELECT byte1 - 1 AS byte1, byte2 + 1 AS byte2, " +
             ...     "short1 + 1 AS short1, short2 - 1 AS short2, int - 1 AS int, " +
    -        ...     "float + 1.1 as float FROM table2").collect()
    -        [Row(byte1=126, byte2=-127, short1=-32767, short2=32766, int=2147483646, float=2.1...)]
    +        ...     "float + 1.5 as float FROM table2").collect()
    +        [Row(byte1=126, byte2=-127, short1=-32767, short2=32766, int=2147483646, float=2.5...)]
    --- End diff --
    
    The ... is not needed any more




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52401882
  
    Thanks, I've merged this into master and 1.1




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52089379
  
    This issue is introduced by converting Float to Double in the JVM:
    
    scala> (1.0f + 1.1).toFloat.toDouble
    res13: Double = 2.0999999046325684
    
    scala> (1.0f + 1.1f).toDouble
    res14: Double = 2.0999999046325684
    
    scala> 1.0 + 1.1f
    res16: Double = 2.100000023841858
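
The same effect can be sketched in Python, assuming `struct` with the `"f"` format is an acceptable stand-in for a JVM `Float` (both are IEEE 754 single precision). The helper name `to_f32` is an assumption made for this illustration.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32, then widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Mirrors (1.0f + 1.1).toFloat.toDouble: add in double, narrow to float, widen again.
narrowed_sum = to_f32(1.0 + 1.1)

# Mirrors 1.0 + 1.1f: only the literal 1.1 is narrowed, the addition runs in double.
widened_literal_sum = 1.0 + to_f32(1.1)

print(narrowed_sum, widened_literal_sum)
# The two results land on opposite sides of 2.1, so no single "2.1..." prefix covers both.
print(narrowed_sum < 2.1 < widened_literal_sum)  # True
```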




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52217840
  
    @davies so does this solution seem reasonable?




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1925#discussion_r16215978
  
    --- Diff: python/pyspark/sql.py ---
    @@ -1093,8 +1093,8 @@ def applySchema(self, rdd, schema):
             >>> sqlCtx.sql(
             ...   "SELECT byte1 - 1 AS byte1, byte2 + 1 AS byte2, " +
             ...     "short1 + 1 AS short1, short2 - 1 AS short2, int - 1 AS int, " +
    -        ...     "float + 1.1 as float FROM table2").collect()
    -        [Row(byte1=126, byte2=-127, short1=-32767, short2=32766, int=2147483646, float=2.1...)]
    +        ...     "float + 1.5 as float FROM table2").collect()
    +        [Row(byte1=126, byte2=-127, short1=-32767, short2=32766, int=2147483646, float=2.5...)]
    --- End diff --
    
    Removed `...`, thanks :)




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52074671
  
    After some thought, I think we still need to use safer floating-point numbers (`I/2**J`), since the `2.1...` pattern doesn't cover evil cases like `2.099999999999`. Reopening this. /cc @davies
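
A small sketch of that failure mode, using the standard library's `doctest.OutputChecker` with the `ELLIPSIS` flag. The wrapper `matches` and the shortened `float=...` strings are assumptions for illustration; the real doctest compares the whole `Row(...)` line.

```python
import doctest

checker = doctest.OutputChecker()


def matches(want: str, got: str) -> bool:
    """Compare actual output `got` with expected output `want`, ELLIPSIS enabled."""
    return checker.check_output(want + "\n", got + "\n", doctest.ELLIPSIS)


# The ellipsis absorbs extra trailing digits after "2.1" ...
print(matches("float=2.1...", "float=2.100000023841858"))   # True
# ... but it cannot match a value that rounds just below 2.1.
print(matches("float=2.1...", "float=2.0999999046325684"))  # False
# An exactly representable result needs no ellipsis at all.
print(matches("float=2.5", "float=2.5"))                    # True
```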




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52224691
  
    Yes, looks good to me.




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by liancheng <gi...@git.apache.org>.
GitHub user liancheng reopened a pull request:

    https://github.com/apache/spark/pull/1925

    [SQL] Using safe floating-point numbers in doctest

    Test code in `sql.py` compares two floating-point numbers directly, which caused [build failure(s)](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18365/consoleFull).
    
    [Doctest documentation](https://docs.python.org/3/library/doctest.html#warnings) recommends using numbers in the form of `I/2**J` to avoid the precision issue.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark fix-pysql-fp-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1925
    
----
commit e8059d4115cc286bfe87bad82bbd3b2fe8e16db0
Author: Cheng Lian <li...@gmail.com>
Date:   2014-08-13T16:07:47Z

    Using safe floating-point numbers in doctest

----




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1925#discussion_r16194004
  
    --- Diff: python/pyspark/sql.py ---
    @@ -1093,8 +1093,8 @@ def applySchema(self, rdd, schema):
             >>> sqlCtx.sql(
             ...   "SELECT byte1 - 1 AS byte1, byte2 + 1 AS byte2, " +
             ...     "short1 + 1 AS short1, short2 - 1 AS short2, int - 1 AS int, " +
    -        ...     "float + 1.1 as float FROM table2").collect()
    -        [Row(byte1=126, byte2=-127, short1=-32767, short2=32766, int=2147483646, float=2.1...)]
    +        ...     "float + 1.5 as float FROM table2").collect()
    +        [Row(byte1=126, byte2=-127, short1=-32767, short2=32766, int=2147483646, float=2.5...)]
    --- End diff --
    
    Why is 1.5 safer than 1.1? I think float + 1.0 -> 2.0 would be safer.




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52071418
  
    QA tests have started for PR 1925. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18443/consoleFull




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52078515
  
    QA results for PR 1925:
    - This patch PASSES unit tests.
    - This patch merges cleanly
    - This patch adds no public classes

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18443/consoleFull




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52082450
  
    QA results for PR 1925:
    - This patch PASSES unit tests.
    - This patch merges cleanly
    - This patch adds no public classes

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18445/consoleFull




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52224810
  
    scala> 1.5f.toDouble
    res1: Double = 1.5
    
    scala> 2.5f.toDouble
    res2: Double = 2.5




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52075512
  
    QA tests have started for PR 1925. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18445/consoleFull




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/1925#issuecomment-52071469
  
    Closing this since it's already fixed in 882da57. /cc @yhuai




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1925




[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng closed the pull request at:

    https://github.com/apache/spark/pull/1925

