Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2017/07/04 23:45:19 UTC

[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/18532

    [SPARK-21263][SQL] Do not allow partially parsing double and floats via NumberFormat in CSV

    ## What changes were proposed in this pull request?
    
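    This PR proposes to stop using `NumberFormat.parse` when converting CSV fields to float and double, so that partially parsed values are no longer accepted. For example, the malformed field `10u12` is currently read as `10.0` even in `FAILFAST` mode: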
    
    ```
    scala> spark.read.schema("a DOUBLE").option("mode", "FAILFAST").csv(Seq("10u12").toDS).show()
    +----+
    |   a|
    +----+
    |10.0|
    +----+
    ```
    
    ## How was this patch tested?
    
    Unit tests added in `UnivocityParserSuite` and `CSVSuite`.
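
The behavior described above can be reproduced outside of Spark. `java.text.NumberFormat.parse(String)` stops at the first character it cannot interpret and returns what it has parsed so far, whereas Scala's `String.toDouble` (which delegates to `java.lang.Double.parseDouble`) rejects the whole string. This is a minimal sketch that assumes `Locale.US`; the helper names `lenientParse` and `strictParse` are hypothetical and are not taken from the patch.

```scala
import java.text.NumberFormat
import java.util.Locale
import scala.util.Try

// Hypothetical helper: lenient parsing via NumberFormat (the approach being removed).
def lenientParse(s: String): Double =
  NumberFormat.getInstance(Locale.US).parse(s).doubleValue()

// Hypothetical helper: strict parsing; None unless the whole string is a valid double.
def strictParse(s: String): Option[Double] = Try(s.toDouble).toOption

val partial = lenientParse("10u12") // parsing stops at 'u', leaving only the leading "10"
val strict = strictParse("10u12")   // the whole field is rejected
val valid = strictParse("10.12")    // a well-formed field still parses
```

With strict parsing the malformed field becomes an error that the CSV read modes can act on, instead of a silently truncated value.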


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-21263

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18532.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18532
    
----
commit 024dfccaf8969edb8f1bc719063d3bc97ebff64b
Author: hyukjinkwon <gu...@gmail.com>
Date:   2017-07-04T23:35:46Z

    Do not allow partially parsing double and floats via NumberFormat in CSV

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79168 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79168/testReport)** for PR 18532 at commit [`32233dd`](https://github.com/apache/spark/commit/32233dd1d5772b5df57919fabc467a197574fa72).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79167 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79167/testReport)** for PR 18532 at commit [`024dfcc`](https://github.com/apache/spark/commit/024dfccaf8969edb8f1bc719063d3bc97ebff64b).


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Thank you @falaki. I just updated.


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Merged to master


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    cc @srowen and @falaki, could you take a look and see if I understood correctly?


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79168/


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18532#discussion_r125764216
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1174,4 +1174,25 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
             }
           }
       }
    +
    +  test("SPARK-21263: Invalid float and double are handled correctly in different modes") {
    +    val exception = intercept[SparkException] {
    +      spark.read.schema("a DOUBLE")
    +        .option("mode", "FAILFAST")
    +        .csv(Seq("10u12").toDS())
    +        .collect()
    +    }
    +    assert(exception.getMessage.contains("10u12"))
    --- End diff --
    
    Can we check for a more specific error message?


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79169 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79169/testReport)** for PR 18532 at commit [`a41c028`](https://github.com/apache/spark/commit/a41c028392acf70600f2e882cafdf59ad50def14).


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79169/


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79168 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79168/testReport)** for PR 18532 at commit [`32233dd`](https://github.com/apache/spark/commit/32233dd1d5772b5df57919fabc467a197574fa72).


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18532#discussion_r125537110
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1174,4 +1174,25 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
             }
           }
       }
    +
    +  test("Do not partially lose data when parsing float and double") {
    --- End diff --
    
    I suggest a more descriptive name for this test, and please include the JIRA number, e.g.
    `SPARK-21263: Invalid float and double are handled correctly in different modes`


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18532#discussion_r125763819
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParserSuite.scala ---
    @@ -130,17 +130,17 @@ class UnivocityParserSuite extends SparkFunSuite {
           DateTimeUtils.millisToDays(DateTimeUtils.stringToTime("2015-01-01").getTime))
       }
     
    -  test("Float and Double Types are cast without respect to platform default Locale") {
    -    val originalLocale = Locale.getDefault
    -    try {
    -      Locale.setDefault(new Locale("fr", "FR"))
    -      // Would parse as 1.0 in fr-FR
    -      val options = new CSVOptions(Map.empty[String, String], "GMT")
    -      assert(parser.makeConverter("_1", FloatType, options = options).apply("1,00") == 100.0)
    -      assert(parser.makeConverter("_1", DoubleType, options = options).apply("1,00") == 100.0)
    -    } finally {
    -      Locale.setDefault(originalLocale)
    -    }
    +  test("Throws exception for casting an invalid string to Float and Double Types") {
    +    val options = new CSVOptions(Map.empty[String, String], "GMT")
    +    var message = intercept[NumberFormatException] {
    +      parser.makeConverter("_1", FloatType, options = options).apply("10u000")
    +    }.getMessage
    +    assert(message.contains("10u000"))
    +
    +    message = intercept[NumberFormatException] {
    +      parser.makeConverter("_1", DoubleType, options = options).apply("10u000")
    +    }.getMessage
    +    assert(message.contains("10u000"))
    --- End diff --
    
    Ditto.
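
For context on the test removed in the diff above: `NumberFormat` is locale-sensitive, so the same field can parse to different values depending on the locale it is given. The sketch below is an illustration only, assuming the JDK's standard number formats for `Locale.US` and `Locale.FRANCE`; the value names are hypothetical.

```scala
import java.text.NumberFormat
import java.util.Locale
import scala.util.Try

val field = "1,00"

// In en-US the comma is a grouping separator, so the field reads as one hundred.
val us = NumberFormat.getInstance(Locale.US).parse(field).doubleValue()

// In fr-FR the comma is the decimal separator, so the same field reads as one.
val fr = NumberFormat.getInstance(Locale.FRANCE).parse(field).doubleValue()

// Strict parsing is locale-independent and rejects the comma altogether.
val strict = Try(field.toDouble).toOption
```

If the converters rely on strict parsing after this change, a field such as `1,00` would presumably be treated as malformed rather than read as `100.0`, which is consistent with the old assertion being dropped.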


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79256 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79256/testReport)** for PR 18532 at commit [`c1967f8`](https://github.com/apache/spark/commit/c1967f80e340b374e2d6b0e7b9759d0d61a9df13).


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79167 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79167/testReport)** for PR 18532 at commit [`024dfcc`](https://github.com/apache/spark/commit/024dfccaf8969edb8f1bc719063d3bc97ebff64b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79256/


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by falaki <gi...@git.apache.org>.
Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18532#discussion_r125763800
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParserSuite.scala ---
    @@ -130,17 +130,17 @@ class UnivocityParserSuite extends SparkFunSuite {
           DateTimeUtils.millisToDays(DateTimeUtils.stringToTime("2015-01-01").getTime))
       }
     
    -  test("Float and Double Types are cast without respect to platform default Locale") {
    -    val originalLocale = Locale.getDefault
    -    try {
    -      Locale.setDefault(new Locale("fr", "FR"))
    -      // Would parse as 1.0 in fr-FR
    -      val options = new CSVOptions(Map.empty[String, String], "GMT")
    -      assert(parser.makeConverter("_1", FloatType, options = options).apply("1,00") == 100.0)
    -      assert(parser.makeConverter("_1", DoubleType, options = options).apply("1,00") == 100.0)
    -    } finally {
    -      Locale.setDefault(originalLocale)
    -    }
    +  test("Throws exception for casting an invalid string to Float and Double Types") {
    +    val options = new CSVOptions(Map.empty[String, String], "GMT")
    +    var message = intercept[NumberFormatException] {
    +      parser.makeConverter("_1", FloatType, options = options).apply("10u000")
    +    }.getMessage
    +    assert(message.contains("10u000"))
    --- End diff --
    
    Is there some more specific error we could check for?
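
One possible answer, sketched here outside of Spark, is to assert on the message that `java.lang.Double.parseDouble` produces, since Scala's `String.toDouble` delegates to it. This assumes the converter surfaces that exception unchanged, which the diff above does not confirm; `messageFor` is a hypothetical helper.

```scala
import scala.util.{Failure, Try}

// Hypothetical helper: the NumberFormatException message for an invalid double, if any.
def messageFor(s: String): Option[String] = Try(s.toDouble) match {
  case Failure(e: NumberFormatException) => Some(e.getMessage)
  case _ => None
}

val message = messageFor("10u000")
val noError = messageFor("10.5") // a valid double yields no message
```

On common JDKs the message has the form `For input string: "10u000"`, so a test could match on the `For input string` prefix as well as on the offending value.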


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79256 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79256/testReport)** for PR 18532 at commit [`c1967f8`](https://github.com/apache/spark/commit/c1967f80e340b374e2d6b0e7b9759d0d61a9df13).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79167/


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18532#discussion_r125537620
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1174,4 +1174,25 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
             }
           }
       }
    +
    +  test("Do not partially lose data when parsing float and double") {
    --- End diff --
    
    Sure, thanks.


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18532


[GitHub] spark issue #18532: [SPARK-21263][SQL] Do not allow partially parsing double...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18532
  
    **[Test build #79169 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79169/testReport)** for PR 18532 at commit [`a41c028`](https://github.com/apache/spark/commit/a41c028392acf70600f2e882cafdf59ad50def14).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #18532: [SPARK-21263][SQL] Do not allow partially parsing...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18532#discussion_r125536928
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala ---
    @@ -111,19 +111,15 @@ class UnivocityParser(
             case options.nanValue => Float.NaN
             case options.negativeInf => Float.NegativeInfinity
             case options.positiveInf => Float.PositiveInfinity
    -        case datum =>
    --- End diff --
    
    BTW, it looks like we are not using `NumberFormat.parse` in schema inference: https://github.com/apache/spark/blob/7e5359be5ca038fdb579712b18e7f226d705c276/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala#L141
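
The linked inference code is not reproduced here. As a rough sketch of the idea only, type inference that relies on strict parsing would classify a partially numeric field as a string rather than a double. `inferType` and its type labels are hypothetical and are not Spark's API.

```scala
import scala.util.Try

// Hypothetical, simplified inference: a field is a double only if the whole string parses.
def inferType(field: String): String =
  if (Try(field.toDouble).isSuccess) "double" else "string"

val inferred = Seq("10.5", "10u12", "1e3").map(inferType)
```

Under this assumption, inference and conversion would agree on which fields count as doubles once conversion also parses strictly.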

