You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/11/24 22:24:07 UTC
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/23132
[SPARK-26163][SQL] Parsing decimals from JSON using locale
## What changes were proposed in this pull request?
In the PR, I propose using of the locale option to parse (and infer) decimals from JSON input. After the changes, `JacksonParser` converts input string to `BigDecimal` and to Spark's Decimal by using `java.text.DecimalFormat`. New behaviour can be switched off via SQL config `spark.sql.legacy.decimalParsing.enabled`.
## How was this patch tested?
Added 2 tests to `JsonExpressionsSuite` for the `en-US`, `ko-KR`, `ru-RU`, `de-DE` locales:
- Inferring decimal type using locale from JSON field values
- Converting JSON field values to specified decimal type using the locales.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 json-decimal-parsing-locale
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23132.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23132
----
commit 506417ed4b643298560a66c043f7b31beb489da3
Author: Maxim Gekk <ma...@...>
Date: 2018-11-10T19:49:12Z
Test for parsing decimals using locale
commit ac25fb6ed1d3d6689ad8841476c025848c87f2a3
Author: Maxim Gekk <ma...@...>
Date: 2018-11-10T19:51:48Z
Parsing decimals using locale
commit b784003078270a46aaf8aceb2d86dd9f13f3500c
Author: Maxim Gekk <ma...@...>
Date: 2018-11-10T19:54:00Z
Updating the migration guide
commit d0522093dfe1823f91100cbec3ef5d6c8a372f27
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T19:32:45Z
Merge remote-tracking branch 'origin/master' into json-decimal-parsing-locale
# Conflicts:
# sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
commit 722e135cf3b94e22079fd9a0fe1d309a04e76a64
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T19:58:39Z
Add SQL config spark.sql.legacy.decimalParsing.enabled
commit dc6c0ac0a97b31441d99ff1dd71608ae5e2eca73
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T20:00:31Z
Updating the migration guide
commit f15b1817fb51d453487665122473855712214692
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T20:38:24Z
Added a test for parsing
commit ab781d54e4f7604d64c72c3c383c549abab0a9a9
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T20:52:03Z
Fix test
commit 163a8b9d7d017409ae4dfa40e492680bf0e4f935
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T20:52:26Z
Create getDecimalParser
commit 8fb65c0db85f4bd2f76d473c5e31e772ff0d4c1d
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T22:11:42Z
Add a test for inferring decimals
commit 7e3a2906a96894cadc58771131d07d06ba265382
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T22:12:35Z
Change JsonSuite to adopt it for JsonInferSchema class
commit 83920b25f586dc242841ff0a73105ae9e43295ed
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T22:13:01Z
Inferring decimals from JSON
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r237091555
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide
## Upgrading From Spark SQL 2.4 to 3.0
+ - In Spark version 2.4 and earlier, accepted format of decimals parsed from JSON is an optional sign ('+' or '-'), followed by a sequence of zero or more decimal digits, optionally followed by a fraction, optionally followed by an exponent. Any commas were removed from the input before parsing. Since Spark 3.0, format varies and depends on locale which can be set via JSON option `locale`. The default locale is `en-US`. To switch back to previous behavior, set `spark.sql.legacy.decimalParsing.enabled` to `true`.
--- End diff --
> Is there any performance regression with DecimalFormat?
I haven't benchmarked the changes. Looking at `JSONBenchmarks`, we even don't check decimals there.
> Do we need the DecimalFormat when locale is en-US?
No, we don't need it.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r237247452
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1610,6 +1610,13 @@ object SQLConf {
""" "... N more fields" placeholder.""")
.intConf
.createWithDefault(25)
+
+ val LEGACY_DECIMAL_PARSING_ENABLED = buildConf("spark.sql.legacy.decimalParsing.enabled")
+ .doc("If it is set to false, it enables parsing decimals in locale specific formats. " +
+ "To switch back to previous behaviour when parsing was performed by java.math.BigDecimal " +
+ "and all commas were removed from the input, set the flag to true.")
+ .booleanConf
+ .createWithDefault(false)
--- End diff --
I want to distiguish adding new implementation from enforcing it as a default. Your new implementation will already be in 3.0 release, isn't it? All users can taste it technically.
BTW, we don't have a regular release plan for major release. So if you are thinking of 4.0, I don't think that's feasible option for this PR. Our next feature(minor) release is 3.1. Enabling this in 3.1 sounds natural to me. We already enabled many things in 2.4.0.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23132
@MaxGekk, mind fixing PR description accordingly?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99396/testReport)** for PR 23132 at commit [`1ec56e5`](https://github.com/apache/spark/commit/1ec56e53ec7dfc79d4714f55a1166f3b66e6f03b).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/23132
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99417 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99417/testReport)** for PR 23132 at commit [`1ec56e5`](https://github.com/apache/spark/commit/1ec56e53ec7dfc79d4714f55a1166f3b66e6f03b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99417 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99417/testReport)** for PR 23132 at commit [`1ec56e5`](https://github.com/apache/spark/commit/1ec56e53ec7dfc79d4714f55a1166f3b66e6f03b).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23132
> mind fixing PR description accordingly?
@HyukjinKwon fixed
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99232/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99232/testReport)** for PR 23132 at commit [`83920b2`](https://github.com/apache/spark/commit/83920b25f586dc242841ff0a73105ae9e43295ed).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5470/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5488/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23132
jenkins, retest this, please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99417/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r236873488
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide
## Upgrading From Spark SQL 2.4 to 3.0
+ - In Spark version 2.4 and earlier, accepted format of decimals parsed from JSON is an optional sign ('+' or '-'), followed by a sequence of zero or more decimal digits, optionally followed by a fraction, optionally followed by an exponent. Any commas were removed from the input before parsing. Since Spark 3.0, format varies and depends on locale which can be set via JSON option `locale`. The default locale is `en-US`. To switch back to previous behavior, set `spark.sql.legacy.decimalParsing.enabled` to `true`.
--- End diff --
Is there any performance regression with `DecimalFormat`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23132
@cloud-fan I did the same for CSV: https://github.com/apache/spark/pull/22979
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23132
@dongjoon-hyun @cloud-fan Please, take a look at the PR.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r237239829
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide
## Upgrading From Spark SQL 2.4 to 3.0
+ - In Spark version 2.4 and earlier, accepted format of decimals parsed from JSON is an optional sign ('+' or '-'), followed by a sequence of zero or more decimal digits, optionally followed by a fraction, optionally followed by an exponent. Any commas were removed from the input before parsing. Since Spark 3.0, format varies and depends on locale which can be set via JSON option `locale`. The default locale is `en-US`. To switch back to previous behavior, set `spark.sql.legacy.decimalParsing.enabled` to `true`.
--- End diff --
I removed SQL config and record in the migration guid. Also I applied `DecimalFormat` only for not `en-US` locales.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5321/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99232/testReport)** for PR 23132 at commit [`83920b2`](https://github.com/apache/spark/commit/83920b25f586dc242841ff0a73105ae9e43295ed).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r237120812
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide
## Upgrading From Spark SQL 2.4 to 3.0
+ - In Spark version 2.4 and earlier, accepted format of decimals parsed from JSON is an optional sign ('+' or '-'), followed by a sequence of zero or more decimal digits, optionally followed by a fraction, optionally followed by an exponent. Any commas were removed from the input before parsing. Since Spark 3.0, format varies and depends on locale which can be set via JSON option `locale`. The default locale is `en-US`. To switch back to previous behavior, set `spark.sql.legacy.decimalParsing.enabled` to `true`.
--- End diff --
since the default value is `en-US`, can we skip `DecimalFormat` when locale is `en-US`? Then there is nothing changes by default, and we don't even need a config.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r237058331
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide
## Upgrading From Spark SQL 2.4 to 3.0
+ - In Spark version 2.4 and earlier, accepted format of decimals parsed from JSON is an optional sign ('+' or '-'), followed by a sequence of zero or more decimal digits, optionally followed by a fraction, optionally followed by an exponent. Any commas were removed from the input before parsing. Since Spark 3.0, format varies and depends on locale which can be set via JSON option `locale`. The default locale is `en-US`. To switch back to previous behavior, set `spark.sql.legacy.decimalParsing.enabled` to `true`.
--- End diff --
I have the same question. Do we need the `DecimalFormat` when locale is `en-US`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99318/testReport)** for PR 23132 at commit [`72ebd34`](https://github.com/apache/spark/commit/72ebd340e5c3130cf5174d7f03b3cc93f7900f9d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r237091495
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1610,6 +1610,13 @@ object SQLConf {
""" "... N more fields" placeholder.""")
.intConf
.createWithDefault(25)
+
+ val LEGACY_DECIMAL_PARSING_ENABLED = buildConf("spark.sql.legacy.decimalParsing.enabled")
+ .doc("If it is set to false, it enables parsing decimals in locale specific formats. " +
+ "To switch back to previous behaviour when parsing was performed by java.math.BigDecimal " +
+ "and all commas were removed from the input, set the flag to true.")
+ .booleanConf
+ .createWithDefault(false)
--- End diff --
> Can we have true by default?
Yes, sure.
> I usually recommend to keep the existing behavior during one next release for safety.
Would it be better to switch on new implementation in major release (3.0) neither in minor one?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23132
thanks, merging to master!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99396/testReport)** for PR 23132 at commit [`1ec56e5`](https://github.com/apache/spark/commit/1ec56e53ec7dfc79d4714f55a1166f3b66e6f03b).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99318/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99396/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23132
LGTM, does CSV need to do the same?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/23132
`spark.sql.legacy.decimalParsing.enabled` is still shown in the PR description and commit messages.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23132
**[Test build #99318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99318/testReport)** for PR 23132 at commit [`72ebd34`](https://github.com/apache/spark/commit/72ebd340e5c3130cf5174d7f03b3cc93f7900f9d).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23132: [SPARK-26163][SQL] Parsing decimals from JSON usi...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/23132#discussion_r236874565
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1610,6 +1610,13 @@ object SQLConf {
""" "... N more fields" placeholder.""")
.intConf
.createWithDefault(25)
+
+ val LEGACY_DECIMAL_PARSING_ENABLED = buildConf("spark.sql.legacy.decimalParsing.enabled")
+ .doc("If it is set to false, it enables parsing decimals in locale specific formats. " +
+ "To switch back to previous behaviour when parsing was performed by java.math.BigDecimal " +
+ "and all commas were removed from the input, set the flag to true.")
+ .booleanConf
+ .createWithDefault(false)
--- End diff --
Can we have `true` by default? I usually recommend to keep the existing behavior during one next release for safety. But, @gatorsmile might have different opinion for this.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5400/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23132: [SPARK-26163][SQL] Parsing decimals from JSON using loca...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23132
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org