Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/24 15:16:08 UTC

[GitHub] [spark] gengliangwang opened a new pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

gengliangwang opened a new pull request #31954:
URL: https://github.com/apache/spark/pull/31954


   
   ### What changes were proposed in this pull request?
   Allow casting complex types as string type in ANSI mode.
   
   ### Why are the changes needed?
   Currently, casting complex types to string type is not allowed in ANSI mode. This breaks the DataFrame.show() API. E.g.:
   ```
   scala> sql("select array(1, 2, 2)").show(false)
   org.apache.spark.sql.AnalysisException: cannot resolve 'CAST(`array(1, 2, 2)` AS STRING)' due to data type mismatch:
    cannot cast array<int> to string with ANSI mode on.
   ```
   We should allow the conversion as an extension to the ANSI SQL standard, so that DataFrame.show() still works in ANSI mode.
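
   A minimal reproduction sketch, assuming a local Spark build with spark-sql on the classpath. The object name `AnsiShowRepro` and the session settings are illustrative and not part of this PR; `spark.sql.ansi.enabled` is the SQL config key that turns ANSI mode on. The call is wrapped in `Try` so the sketch reports either outcome rather than assuming a particular Spark version:

   ```scala
   import org.apache.spark.sql.SparkSession
   import scala.util.Try

   // Hypothetical driver object, used only for this illustration.
   object AnsiShowRepro {
     def main(args: Array[String]): Unit = {
       // Local session with ANSI mode switched on.
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("ansi-show-repro")
         .config("spark.sql.ansi.enabled", "true")
         .getOrCreate()

       // show() renders each column as a string, so the array column is cast to string.
       val result = Try(spark.sql("select array(1, 2, 2)").show(false))

       // Per the description above: a Failure wrapping AnalysisException
       // before this change, a Success once the cast is allowed.
       println(result)

       spark.stop()
     }
   }
   ```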
   ### Does this PR introduce _any_ user-facing change?
   Yes, casting complex types as string type is now allowed in ANSI mode.
   
   ### How was this patch tested?
   Unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gengliangwang commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807026661


   Thanks for the review.  Merging to master.




[GitHub] [spark] dongjoon-hyun edited a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806082699


   This is a documented behavior, https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html.
   >  If a user can trigger an unexpected failure with public APIs, that's a bug to me.
   
   So, this is not an unexpected failure. To me, this is a well-documented limitation of Apache Spark 3.1.x with explicit test coverage.




[GitHub] [spark] AmplabJenkins commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807319951


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136524/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r600787480



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -1873,6 +1873,8 @@ object AnsiCast {
 
     case (NullType, _) => true
 
+    case (_, StringType) => true

Review comment:
       @cloud-fan . Usually, we take an explicit allow-list approach for types. Is this change okay?
   If this PR targets `complex type` only, why don't we add them explicitly instead of allowing this so broadly?
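
   To make the two options concrete, here is a dependency-free sketch. The types below are simplified stand-ins, not the real Catalyst classes, and `CastRules`, `canCastCatchAll` and `canCastAllowList` are hypothetical names; the real rule list in `AnsiCast` has many more cases, which this model reduces to an identity check:

   ```scala
   // Simplified stand-ins for Catalyst data types; NOT the real Spark classes.
   sealed trait DataType
   case object IntegerType extends DataType
   case object StringType extends DataType
   // Stand-in for a type added later that nobody has listed explicitly yet.
   case object YearMonthIntervalType extends DataType
   final case class ArrayType(elementType: DataType) extends DataType
   final case class MapType(keyType: DataType, valueType: DataType) extends DataType
   final case class StructType(fields: List[DataType]) extends DataType

   object CastRules {
     // Catch-all rule, in the spirit of the line added in this diff:
     // any source type may be cast to string.
     def canCastCatchAll(from: DataType, to: DataType): Boolean = (from, to) match {
       case (_, StringType) => true
       case _               => from == to // all other real rules are omitted here
     }

     // Explicit allow-list alternative: only complex types are admitted to string.
     def canCastAllowList(from: DataType, to: DataType): Boolean = (from, to) match {
       case (_: ArrayType | _: MapType | _: StructType, StringType) => true
       case _                                                       => from == to
     }
   }
   ```

   Under these assumptions both rules accept `ArrayType(IntegerType)` to `StringType`, but only the catch-all accepts a type that nobody has listed yet, such as the stand-in `YearMonthIntervalType`. That is the trade-off being asked about: the allow-list is stricter, while the catch-all keeps working for types added later.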






[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806838834


   **[Test build #136524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136524/testReport)** for PR 31954 at commit [`1707885`](https://github.com/apache/spark/commit/1707885814cf846d567fdbca65a2ded88643d679).




[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806999324


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41107/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807319951


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136524/
   




[GitHub] [spark] gengliangwang commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r600663383



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -1873,6 +1873,8 @@ object AnsiCast {
 
     case (NullType, _) => true
 
+    case (_, StringType) => true

Review comment:
       Yes






[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806055447


   @cloud-fan . It's not fixed. The new implementation is disabled. :)




[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807283386


   **[Test build #136524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136524/testReport)** for PR 31954 at commit [`1707885`](https://github.com/apache/spark/commit/1707885814cf846d567fdbca65a2ded88643d679).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806004045


   New improvements always deliver something that didn't work before. According to the existing test case, this limitation was designed in from the beginning of the Spark 3.1 implementation. To me, it's just a known limitation rather than a bug.




[GitHub] [spark] cloud-fan commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r600672427



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
##########
@@ -851,12 +962,6 @@ abstract class AnsiCastSuiteBase extends CastSuiteBase {
     assert(cast(booleanLiteral, DateType).checkInputDataTypes().isFailure)
   }
 
-  test("ANSI mode: disallow casting complex types as String type") {

Review comment:
       Yea it is designed like this, but if `df.show` can't work, I think it's a bug in the design...






[GitHub] [spark] SparkQA removed a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806838834


   **[Test build #136524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136524/testReport)** for PR 31954 at commit [`1707885`](https://github.com/apache/spark/commit/1707885814cf846d567fdbca65a2ded88643d679).




[GitHub] [spark] gengliangwang commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806757880


   Talked to @cloud-fan offline. It's overkill to find a different fix in the implementation of `df.show()`. Let's use this simple solution and merge to master only.




[GitHub] [spark] cloud-fan commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806012897


   For SPARK-34827, it does expose a bug and we fixed it in 3.1, see https://github.com/apache/spark/pull/31898
   
   I don't think failing in `df.show` is a limitation. It's obviously a bug. Failing with AQE + IO Encryption is also a bug and you merged that fix to 3.1. A limitation can be "ANSI mode is forcibly disabled in df.show", but it's kind of a weird behavior (`df.collect` and `df.show` behave differently) and I'd prefer to just fix this.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806165816


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41054/
   




[GitHub] [spark] AmplabJenkins commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-805961387


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136470/
   




[GitHub] [spark] cloud-fan commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806087197


   I don't see any document saying that `df.show` needs to cast the data to string, and it's unexpected that `df.show` fails with ANSI mode.
   
   I agree that it's tricky to change a documented behavior; maybe it's debatable whether to change the ANSI explicit CAST behavior in 3.1. But `df.show` should be fixed as it's a bug. If you don't agree to change the ANSI explicit CAST behavior, we can try to find a different fix.




[GitHub] [spark] dongjoon-hyun edited a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806055447


   @cloud-fan . It's not fixed. The new implementation is disabled. :)
   Please note that SPARK-34827 is different from SPARK-34790. SPARK-34827 is not fixed in 3.1.




[GitHub] [spark] gengliangwang commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-805995102


   @dongjoon-hyun I think putting it in branch-3.1 makes sense as well. It fixes the broken API `DataFrame.show()` under ANSI mode.
   As the 3.2 release won't be available for months, it doesn't hurt to port this to branch-3.1.




[GitHub] [spark] yaooqinn commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r600637835



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -1873,6 +1873,8 @@ object AnsiCast {
 
     case (NullType, _) => true
 
+    case (_, StringType) => true

Review comment:
       Does this affect the upcoming year-month and day-time interval types?






[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807129079


   Thank you for the decision, @gengliangwang and @cloud-fan .




[GitHub] [spark] MaxGekk commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r601620213



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -1873,6 +1873,8 @@ object AnsiCast {
 
     case (NullType, _) => true
 
+    case (_, StringType) => true

Review comment:
       So far, we don't support such casting. I opened the JIRAs for that: SPARK-34667 and SPARK-34668






[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-805915268


   **[Test build #136470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136470/testReport)** for PR 31954 at commit [`31ab2b6`](https://github.com/apache/spark/commit/31ab2b69c07cf33661da833b0edb322842907f70).




[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806082699


   This is a documented behavior, https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html.
   >  If a user can trigger an unexpected failure with public APIs, that's a bug to me.
   
   So, this is not an unexpected one. To me, this is a well-documented limitation of Apache Spark 3.1.x with explicit test coverage.




[GitHub] [spark] gengliangwang closed pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
gengliangwang closed pull request #31954:
URL: https://github.com/apache/spark/pull/31954


   




[GitHub] [spark] gengliangwang edited a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
gengliangwang edited a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806757880


   Talked to @cloud-fan offline. It's overkill to find a different fix (e.g. creating a new cast expression) in the implementation of `df.show()`. Let's use this simple solution and merge to master only.




[GitHub] [spark] cloud-fan commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806070265


   I don't understand what the difference is here. Both AQE and IO Encryption are disabled by default in 3.1 and we still merged #31898 to 3.1.
   
   I'm OK if you have different ideas to make `df.show` not fail when ANSI mode is turned on. If a user can trigger an unexpected failure with public APIs, that's a bug to me.
   
   I agree we should have more QAs, but it doesn't mean we should stop backporting bug fixes. We have fixed many AQE issues and backported them to 3.0/3.1, although AQE is not turned on by default there. If you don't think this PR is the correct fix, that's a different story and we can discuss more here.




[GitHub] [spark] AmplabJenkins commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806165816


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41054/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r601485468



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -1873,6 +1873,8 @@ object AnsiCast {
 
     case (NullType, _) => true
 
+    case (_, StringType) => true

Review comment:
       `df.show` needs to cast the column to string, so I think we need to support casting from all data types here; otherwise `df.show` may still be broken in some cases.
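
A minimal, self-contained sketch of the idea in the hunk above may help. It does not use Spark's real classes: the `DataType` hierarchy and the `AnsiCastSketch.canCast` function below are hypothetical stand-ins, and the rule set is heavily simplified. It only illustrates how a catch-all `case (_, StringType) => true` lets every source type, including complex ones, pass a cast-permission check when the target is a string.

```scala
// Hypothetical stand-ins for Spark's data types (NOT the real org.apache.spark.sql.types classes).
sealed trait DataType
case object NullType extends DataType
case object StringType extends DataType
case object IntegerType extends DataType
case object BooleanType extends DataType
final case class ArrayType(elementType: DataType) extends DataType
final case class MapType(keyType: DataType, valueType: DataType) extends DataType

object AnsiCastSketch {
  // Simplified permission check loosely modelled on the hunk above.
  // The real rule set in Cast.scala is much larger; this is an assumption-laden sketch.
  def canCast(from: DataType, to: DataType): Boolean = (from, to) match {
    case (f, t) if f == t             => true          // identity cast
    case (NullType, _)                => true          // null can be cast to anything
    case (_, StringType)              => true          // the catch-all discussed in this review
    case (ArrayType(f), ArrayType(t)) => canCast(f, t) // arrays are checked element-wise
    case _                            => false         // everything else is rejected in this sketch
  }
}
```

With the catch-all in place, `AnsiCastSketch.canCast(MapType(StringType, IntegerType), StringType)` is allowed, while the reverse direction, `AnsiCastSketch.canCast(StringType, ArrayType(IntegerType))`, is still rejected.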






[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-805991017


   For me, this is an improvement for Apache Spark 3.2.0, @gengliangwang.
   Let's not backport this to branch-3.1.




[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806920338


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41107/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807033950


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41107/
   




[GitHub] [spark] dongjoon-hyun edited a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806005652


   New features do not work properly with all combinations. Like SPARK-34827 (AQE + IO Encryption), this kind of effort should be done in the new releases.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-805961387


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136470/
   




[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806160546


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41054/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r600787948



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
##########
@@ -1873,6 +1873,8 @@ object AnsiCast {
 
     case (NullType, _) => true
 
+    case (_, StringType) => true

Review comment:
       Also, cc @MaxGekk 






[GitHub] [spark] SparkQA commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806165789


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41054/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #31954:
URL: https://github.com/apache/spark/pull/31954#discussion_r600669928



##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
##########
@@ -851,12 +962,6 @@ abstract class AnsiCastSuiteBase extends CastSuiteBase {
     assert(cast(booleanLiteral, DateType).checkInputDataTypes().isFailure)
   }
 
-  test("ANSI mode: disallow casting complex types as String type") {

Review comment:
       Hi, @cloud-fan .
   According to this test case, it's not a bug because it is designed like this.
   > ANSI explicit CAST is in 3.1 so this is a bug fix of ANSI mode for 3.1 (because df.show is not working with ANSI mode)?






[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806058880


   As we are doing in SPARK-33828 for AQE QA, I believe ANSI mode also needs more QA under an umbrella JIRA issue, @gengliangwang and @cloud-fan.
   Both AQE and ANSI mode have been disabled by default since 3.0, yet we are still receiving more issues. We know that there is no such thing as bug-free.




[GitHub] [spark] cloud-fan commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-805936752


   ANSI explicit CAST is in 3.1 so this is a bug fix of ANSI mode for 3.1 (because `df.show` is not working with ANSI mode)? 




[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806132442


   +1 for that. I guess it will be a fix for the `df.show` function, applicable to both master and 3.1.
   >  If you don't agree to change the ANSI explicit CAST behavior, we can try to find a different fix.




[GitHub] [spark] dongjoon-hyun commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-806005652


   New features do not work properly with all combinations. Like SPARK-34827 (AQE + IO Encryption), this should be done in the new releases.




[GitHub] [spark] AmplabJenkins commented on pull request #31954: [SPARK-34856][SQL] ANSI mode: Allow casting complex types as string type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31954:
URL: https://github.com/apache/spark/pull/31954#issuecomment-807033950


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41107/
   

