Posted to reviews@spark.apache.org by cnZach <gi...@git.apache.org> on 2018/06/27 03:40:16 UTC
[GitHub] spark pull request #21647: [apache/spark] [SPARK-21335] [DOC] doc changes fo...
GitHub user cnZach opened a pull request:
https://github.com/apache/spark/pull/21647
[apache/spark] [SPARK-21335] [DOC] doc changes for disallowed un-aliased subquery use case
## What changes were proposed in this pull request?
Document the behavior change for the un-aliased subquery use case, to address the last question in PR #18559:
https://github.com/apache/spark/pull/18559#issuecomment-316884858
## How was this patch tested?
This is a documentation-only change; it does not affect tests.
Please review http://spark.apache.org/contributing.html before opening a pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cnZach/spark doc_change_for_SPARK-20690_SPARK-21335
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21647.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21647
----
commit c611a113dce56600928cdbada69b6ec832903043
Author: Yuexin Zhang <za...@...>
Date: 2018-06-27T03:33:14Z
add documentaion for disallowed un-aliased subquery use case
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21647
That was quite a while ago. Anyway, this document looks fine to me.
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21647
thanks, merging to master!
---
[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21647#discussion_r198364566
--- Diff: docs/sql-programming-guide.md ---
@@ -2017,6 +2017,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
- Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
- The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which means the new behavior described here; if set to `false`, Spark uses previous rules, ie. it doesn't adjust the needed scale to represent the values and it returns NULL if an exact representation of the value is not possible.
- In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
+ - Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`. Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for details.
--- End diff --
for details. -> for more details.
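
For readers following the thread, the behavior described in the diff above can be sketched as follows. This is only an illustration, assuming a table or view named `v` with a column `i` as in the PR's own example; the alias `t` is hypothetical and does not appear in the PR.

```sql
-- Assumes a table or view named v with a column i (as in the PR's example).

-- Rejected since Spark 2.3 with an analysis exception: the qualifier v
-- belongs to the inner query and is not visible outside the subquery.
SELECT v.i FROM (SELECT i FROM v);

-- Still accepted: refer to the column without a qualifier.
SELECT i FROM (SELECT i FROM v);

-- Also accepted: alias the subquery (t is a hypothetical alias)
-- and qualify the column with that alias.
SELECT t.i FROM (SELECT i FROM v) t;
```

Queries that relied on the old behavior can usually be migrated by dropping the qualifier or by naming the subquery explicitly, as in the last two statements.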
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21647
ok to test
---
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Can one of the admins verify this patch?
---
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by cnZach <gi...@git.apache.org>.
Github user cnZach commented on the issue:
https://github.com/apache/spark/pull/21647
@viirya @cloud-fan, could you please help review this? Thanks.
---
[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21647#discussion_r198364542
Also consider:
Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery.
->
The cases throw an analysis exception because users should not be able to use the qualifier inside a subquery.
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92371/
Test PASSed.
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21647
**[Test build #92371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92371/testReport)** for PR 21647 at commit [`bebc3a8`](https://github.com/apache/spark/commit/bebc3a8117c5d8c9b75d1b90f0ae1594fd3e55cc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/21647
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21647
**[Test build #92370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92370/testReport)** for PR 21647 at commit [`c611a11`](https://github.com/apache/spark/commit/c611a113dce56600928cdbada69b6ec832903043).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21647
**[Test build #92371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92371/testReport)** for PR 21647 at commit [`bebc3a8`](https://github.com/apache/spark/commit/bebc3a8117c5d8c9b75d1b90f0ae1594fd3e55cc).
---
[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21647#discussion_r198364360
Not a big deal but please consider:
Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`
->
Un-aliased subquery's semantic has not been well defined with confusing behaviors. Since Spark 2.3, we invalidate such confusing cases, for example, `SELECT v.i from (SELECT i FROM v)`.
---
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21647
LGTM except the title issue pointed out by @viirya
---
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Can one of the admins verify this patch?
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by cnZach <gi...@git.apache.org>.
Github user cnZach commented on the issue:
https://github.com/apache/spark/pull/21647
okay, changed the PR title. Thanks. @cloud-fan @viirya
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92370/
Test PASSed.
---
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Can one of the admins verify this patch?
---
[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21647
We don't need the `[apache/spark]` prefix in the PR title. Can you remove it?
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Merged build finished. Test PASSed.
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21647
**[Test build #92370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92370/testReport)** for PR 21647 at commit [`c611a11`](https://github.com/apache/spark/commit/c611a113dce56600928cdbada69b6ec832903043).
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by cnZach <gi...@git.apache.org>.
Github user cnZach commented on the issue:
https://github.com/apache/spark/pull/21647
@HyukjinKwon updated, thanks.
---
[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21647
Merged build finished. Test PASSed.
---