Posted to reviews@spark.apache.org by cnZach <gi...@git.apache.org> on 2018/06/27 03:40:16 UTC

[GitHub] spark pull request #21647: [apache/spark] [SPARK-21335] [DOC] doc changes fo...

GitHub user cnZach opened a pull request:

    https://github.com/apache/spark/pull/21647

    [apache/spark] [SPARK-21335] [DOC] doc changes for disallowed un-aliased subquery use case

    ## What changes were proposed in this pull request?
    Document the behavior change for the un-aliased subquery use case, to address the last question in PR #18559:
    https://github.com/apache/spark/pull/18559#issuecomment-316884858
    
    
    ## How was this patch tested?
    This is a documentation change; it does not affect tests.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cnZach/spark doc_change_for_SPARK-20690_SPARK-21335

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21647.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21647
    
----
commit c611a113dce56600928cdbada69b6ec832903043
Author: Yuexin Zhang <za...@...>
Date:   2018-06-27T03:33:14Z

    add documentation for disallowed un-aliased subquery use case

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    That was quite a while ago. Anyway, this documentation looks fine to me.


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    thanks, merging to master!


---



[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21647#discussion_r198364566
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -2017,6 +2017,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
         - Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
         - The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which means the new behavior described here; if set to `false`, Spark uses previous rules, ie. it doesn't adjust the needed scale to represent the values and it returns NULL if an exact representation of the value is not possible.
       - In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
    +  - Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`. Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for details.
    --- End diff --
    
    for details. -> for more details.


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    ok to test


---



[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by cnZach <gi...@git.apache.org>.
Github user cnZach commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    @viirya @cloud-fan, could you please help review this? Thanks.


---



[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21647#discussion_r198364542
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -2017,6 +2017,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
         - Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
         - The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which means the new behavior described here; if set to `false`, Spark uses previous rules, ie. it doesn't adjust the needed scale to represent the values and it returns NULL if an exact representation of the value is not possible.
       - In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
    +  - Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`. Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for details.
    --- End diff --
    
    Also consider:
    
    Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery.
    
    ->
    
    The cases throw an analysis exception because users should not be able to use the qualifier inside a subquery.
    



---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92371/


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    **[Test build #92371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92371/testReport)** for PR 21647 at commit [`bebc3a8`](https://github.com/apache/spark/commit/bebc3a8117c5d8c9b75d1b90f0ae1594fd3e55cc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21647


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    **[Test build #92370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92370/testReport)** for PR 21647 at commit [`c611a11`](https://github.com/apache/spark/commit/c611a113dce56600928cdbada69b6ec832903043).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    **[Test build #92371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92371/testReport)** for PR 21647 at commit [`bebc3a8`](https://github.com/apache/spark/commit/bebc3a8117c5d8c9b75d1b90f0ae1594fd3e55cc).


---



[GitHub] spark pull request #21647: [SPARK-21335] [DOC] doc changes for disallowed un...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21647#discussion_r198364360
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -2017,6 +2017,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
         - Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
         - The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which means the new behavior described here; if set to `false`, Spark uses previous rules, ie. it doesn't adjust the needed scale to represent the values and it returns NULL if an exact representation of the value is not possible.
       - In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
    +  - Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`. Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for details.
    --- End diff --
    
    Not a big deal but please consider:
    
    Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`
    
    ->
    
    Un-aliased subquery's semantic has not been well defined with confusing behaviors. Since Spark 2.3, we invalidate such confusing cases, for example, `SELECT v.i from (SELECT i FROM v)`.
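
The scoping rule discussed above (a table qualifier used inside a subquery is not visible outside it) can be tried without a Spark installation. The following is only a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in engine, not Spark itself; the table `v`, its column `i`, the alias `t`, and the helper `run` are illustrative assumptions. Per the documentation text under review, Spark 2.3 reports the first query as an analysis exception, whereas SQLite reports its own `OperationalError`.

```python
import sqlite3


def run(conn, query):
    """Return the rows for `query`, or an error string if the engine rejects it."""
    try:
        return conn.execute(query).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


# In-memory database with a hypothetical table `v` holding one column `i`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v (i INTEGER)")
conn.execute("INSERT INTO v VALUES (1), (2)")

# The qualifier `v` only exists inside the subquery, so the outer `v.i` cannot be resolved.
rejected = run(conn, "SELECT v.i FROM (SELECT i FROM v)")

# Dropping the qualifier works with the un-aliased subquery.
unqualified = run(conn, "SELECT i FROM (SELECT i FROM v) ORDER BY i")

# Giving the subquery an explicit alias makes a qualified reference valid again.
aliased = run(conn, "SELECT t.i FROM (SELECT i FROM v) AS t ORDER BY i")

print(rejected)
print(unqualified)
print(aliased)
```

Either rewrite (dropping the qualifier or aliasing the subquery) avoids relying on a name that is only in scope inside the subquery.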
    



---



[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    LGTM except the title issue pointed out by @viirya 


---



[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by cnZach <gi...@git.apache.org>.
Github user cnZach commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    okay, changed the PR title. Thanks. @cloud-fan @viirya  


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92370/


---



[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21647: [apache/spark] [SPARK-21335] [DOC] doc changes for disal...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    We don't need the `[apache/spark]` prefix in the PR title. Can you remove it?


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    **[Test build #92370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92370/testReport)** for PR 21647 at commit [`c611a11`](https://github.com/apache/spark/commit/c611a113dce56600928cdbada69b6ec832903043).


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by cnZach <gi...@git.apache.org>.
Github user cnZach commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    @HyukjinKwon updated, thanks.


---



[GitHub] spark issue #21647: [SPARK-21335] [DOC] doc changes for disallowed un-aliase...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21647
  
    Merged build finished. Test PASSed.


---
