Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/03 13:31:22 UTC

[GitHub] [spark] wangyum opened a new pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

wangyum opened a new pull request #30222:
URL: https://github.com/apache/spark/pull/30222


   ### What changes were proposed in this pull request?
   
   This PR simplifies `CaseWhen` with `EqualTo`, for example:
   ```sql
   create table t(a int, b int, c int) using parquet;
   SELECT * 
   FROM   (SELECT CASE 
                    WHEN a = 100 THEN 1 
                    WHEN b > 1000 THEN 2 
                    WHEN c IS NOT NULL THEN 3 
                  END AS x 
           FROM   t) tmp 
   WHERE  x = 2
   ```
   
   Before this PR:
   ```text
   == Physical Plan ==
   *(1) Project [CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END AS x#5]
   +- *(1) Filter (CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END = 2)
      +- *(1) ColumnarToRow
         +- FileScan parquet default.t[a#1,b#2,c#3] Batched: true, DataFilters: [(CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END = 2)], Format: Parquet, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:int,b:int,c:int>
   ```
   
   After this PR:
   ```text
   == Physical Plan ==
   *(1) Project [CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END AS x#0]
   +- *(1) Filter (b#2 > 1000)
      +- *(1) ColumnarToRow
         +- FileScan parquet default.t[a#1,b#2,c#3] Batched: true, DataFilters: [(b#2 > 1000)], Format: Parquet, PartitionFilters: [], PushedFilters: [GreaterThan(b,1000)], ReadSchema: struct<a:int,b:int,c:int>
   
   ```
   
   
   ### Why are the changes needed?
   
   Improve query performance.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Unit test.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722831410


   @wangyum do you know, step by step, how we end up optimizing the plan incorrectly?




[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659058








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722919383








[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r542200066



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -523,6 +523,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case e @ EqualTo(c @ CaseWhen(branches, elseValue), right)
+          if c.deterministic &&
+            right.isInstanceOf[Literal] && branches.forall(_._2.isInstanceOf[Literal]) &&
+            elseValue.forall(_.isInstanceOf[Literal]) =>
+        if ((branches.map(_._2) ++ elseValue).forall(!_.equals(right))) {
+          FalseLiteral

Review comment:
       Let's update the JIRA/PR title, as it's a different optimization now.






[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722289313








[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722957043


   Sorry. This change has a logic issue, for example:
   ```scala
   spark.sql("CREATE TABLE t using parquet AS SELECT if(id % 2 = 7, null, id) AS a FROM range(7)")
   spark.sql(
     """
       |SELECT *
       |  FROM   (SELECT CASE
       |    WHEN a > 1 THEN 1
       |    WHEN a > 3 THEN 3
       |    WHEN a > 5 THEN 5
       |    ELSE 6
       |END AS x
       |FROM t ) t1
       |WHERE x = 3
       |""".stripMargin).show
   ```
   Before this PR, the result is empty; after this PR, the result is not empty.
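
To make the discrepancy concrete, here is a minimal stand-alone sketch in plain Scala (no Spark involved). It assumes column `a` holds the integers 0 through 6, since `id % 2 = 7` is never true for `range(7)`, and it assumes the rewrite turns `x = 3` into the condition of the branch whose value is 3. The object and method names are hypothetical, chosen only for illustration.

```scala
// Hypothetical stand-alone model of the query above; it does not use Spark.
object CaseWhenFirstMatch {
  // CASE WHEN a > 1 THEN 1 WHEN a > 3 THEN 3 WHEN a > 5 THEN 5 ELSE 6 END
  // The first branch whose condition holds determines the value.
  def caseValue(a: Int): Int =
    if (a > 1) 1
    else if (a > 3) 3
    else if (a > 5) 5
    else 6

  // Original predicate: x = 3
  def originalFilter(a: Int): Boolean = caseValue(a) == 3

  // Assumed rewritten predicate: only the condition of the branch with value 3
  def rewrittenFilter(a: Int): Boolean = a > 3

  // Assumed contents of column a (no NULLs are actually produced)
  val rows: Seq[Int] = 0 until 7

  def main(args: Array[String]): Unit = {
    println(rows.filter(originalFilter))
    println(rows.filter(rewrittenFilter))
  }
}
```

Rows 4, 5 and 6 satisfy `a > 3`, but they also satisfy the earlier condition `a > 1`, so the `CASE` expression returns 1 for them and `x = 3` never holds. A rewrite that ignores the earlier branches therefore changes the result.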




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722095011








[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r516165401



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(CaseWhen(branches, _), right)
+        if branches.count(_._2.semanticEquals(right)) == 1 =>

Review comment:
       indentation?

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyConditionalSuite.scala
##########
@@ -199,4 +199,26 @@ class SimplifyConditionalSuite extends PlanTest with ExpressionEvalHelper with P
         If(Factorial(5) > 100L, b, nullLiteral).eval(EmptyRow))
     }
   }
+
+  test("simplify CaseWhen with EqualTo") {

Review comment:
       Shall we use JIRA ID prefix, `test("SPARK-33315: ...`?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112519


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644802


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287303


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130645/
   Test FAILed.




[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517787607



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,14 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(CaseWhen(branches, elseValue), right)
+          if right.foldable && branches.forall(_._2.foldable) =>
+        (branches.filter(_._2.equals(right)).map(_._1) ++

Review comment:
       `equals` is only well implemented in `Literal`, but the condition we use is `.foldable`. Shall we change the condition to `.isInstanceOf[Literal]` and wait for the constant folding rule before running this rule?






[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517095107



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(CaseWhen(branches, _), right)

Review comment:
       As an example, `(CASE WHEN a=1 THEN 1 ELSE b END) = 1` can be true if `a=1` or `b=1`.
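
The example can be checked exhaustively over a small domain with a stand-alone sketch in plain Scala. It assumes non-null integer columns and assumes the rewrite keeps only the condition of the single branch whose value is semantically equal to the literal; the names below are hypothetical and nothing here uses Catalyst.

```scala
// Hypothetical stand-alone model; not Spark code.
object ElseBranchCounterexample {
  // (CASE WHEN a = 1 THEN 1 ELSE b END) = 1, for non-null integers
  def original(a: Int, b: Int): Boolean = (if (a == 1) 1 else b) == 1

  // Assumed rewrite: keep only the condition of the branch whose value is the literal 1
  def rewritten(a: Int, b: Int): Boolean = a == 1

  // All (a, b) pairs in a small domain on which the two predicates disagree
  val disagreements: Seq[(Int, Int)] =
    for {
      a <- 0 to 2
      b <- 0 to 2
      if original(a, b) != rewritten(a, b)
    } yield (a, b)
}
```

Every disagreement has `a != 1` and `b = 1`: the `ELSE` value is not semantically equal to the literal, yet it can still equal it at runtime.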






[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517094786



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(CaseWhen(branches, _), right)

Review comment:
       I'm a bit worried about dropping other branches in CASE WHEN. `a.semanticEquals(b)` means `a` is always equal to `b`. But `!a.semanticEquals(b)` doesn't mean that `a` will never be equal to `b`.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722289313








[GitHub] [spark] wangyum commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r543812718



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -523,6 +523,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case e @ EqualTo(c @ CaseWhen(branches, elseValue), right)
+          if c.deterministic &&
+            right.isInstanceOf[Literal] && branches.forall(_._2.isInstanceOf[Literal]) &&
+            elseValue.forall(_.isInstanceOf[Literal]) =>
+        if ((branches.map(_._2) ++ elseValue).forall(!_.equals(right))) {
+          FalseLiteral

Review comment:
       https://github.com/apache/spark/pull/30790/files






[GitHub] [spark] wangyum closed pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum closed pull request #30222:
URL: https://github.com/apache/spark/pull/30222


   




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743290746


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37273/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517093766



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(CaseWhen(branches, _), right)
+          if branches.count(_._2.semanticEquals(right)) == 1 =>

Review comment:
       If there is more than one match, shall we combine the conditions with `Or`?






[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722620398


   **[Test build #130665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130665/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).




[GitHub] [spark] HyukjinKwon commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722807366


   @wangyum, it's https://github.com/apache/spark/pull/21852 right? Can you file a blocker JIRA?




[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722617489


   Retest this please




[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r542199756



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -523,6 +523,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case e @ EqualTo(c @ CaseWhen(branches, elseValue), right)
+          if c.deterministic &&
+            right.isInstanceOf[Literal] && branches.forall(_._2.isInstanceOf[Literal]) &&
+            elseValue.forall(_.isInstanceOf[Literal]) =>
+        if ((branches.map(_._2) ++ elseValue).forall(!_.equals(right))) {

Review comment:
       Can we use an `EqualTo` expression to compare literals? And how about the null semantics?
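
The null concern can be illustrated with a small stand-alone model of SQL three-valued logic in plain Scala, where `None` stands for NULL. This is only a sketch under that assumption; it does not use Catalyst's `Literal` or `EqualTo`, and all names are hypothetical.

```scala
// Hypothetical stand-alone model of SQL NULL handling; not Spark code.
object NullSemanticsSketch {
  // SQL equality: the result is NULL (None) if either side is NULL.
  def sqlEq(l: Option[Int], r: Option[Int]): Option[Boolean] =
    for (a <- l; b <- r) yield a == b

  // SQL NOT: NOT NULL is NULL.
  def sqlNot(v: Option[Boolean]): Option[Boolean] = v.map(!_)

  // CASE WHEN cond THEN 1 END (no ELSE): NULL unless cond is true.
  def caseWithoutElse(cond: Option[Boolean]): Option[Int] =
    if (cond.contains(true)) Some(1) else None

  // What `CASE ... END = 2` evaluates to when the condition is false.
  val actual: Option[Boolean] = sqlEq(caseWithoutElse(Some(false)), Some(2))

  // What it would evaluate to if folded to a false literal
  // because no branch value equals 2.
  val folded: Option[Boolean] = Some(false)
}
```

In a plain `WHERE` filter, NULL and FALSE both drop the row, but under `NOT` (or in a projection) the two differ: `sqlNot(actual)` stays NULL while `sqlNot(folded)` becomes TRUE. A fold to a false literal would need to rule out a NULL result first.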






[GitHub] [spark] wangyum commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517739136



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(CaseWhen(branches, _), right)
+          if branches.count(_._2.semanticEquals(right)) == 1 =>

Review comment:
       Yes.






[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287119


   **[Test build #130645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130645/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287294


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722110851








[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722079317


   **[Test build #130630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130630/testReport)** for PR 30222 at commit [`b611659`](https://github.com/apache/spark/commit/b6116598203b0a9c81c77bcb2d03ef001b2306a3).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722095011






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722114392


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35237/
   


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722079317


   **[Test build #130630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130630/testReport)** for PR 30222 at commit [`b611659`](https://github.com/apache/spark/commit/b6116598203b0a9c81c77bcb2d03ef001b2306a3).


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659066


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35276/
   Test FAILed.


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722645553


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35276/
   


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907690


   **[Test build #130694 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130694/testReport)** for PR 30222 at commit [`5a90bfc`](https://github.com/apache/spark/commit/5a90bfcee523eb480b41ff0240ea22c3d5f7d931).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720639579


   Also, cc @cloud-fan and @sunchao 


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112436


   **[Test build #130630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130630/testReport)** for PR 30222 at commit [`b611659`](https://github.com/apache/spark/commit/b6116598203b0a9c81c77bcb2d03ef001b2306a3).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907801


   Merged build finished. Test FAILed.


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644645


   **[Test build #130665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130665/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743274036


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37273/
   


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659058


   Merged build finished. Test FAILed.


[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-723291872


   Thank you for your decision, @wangyum and @cloud-fan.


[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722863678


   **[Test build #130694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130694/testReport)** for PR 30222 at commit [`5a90bfc`](https://github.com/apache/spark/commit/5a90bfcee523eb480b41ff0240ea22c3d5f7d931).


[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743363032


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/132669/
   


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722247724


   **[Test build #130645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130645/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).


[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743725618


   @cloud-fan @dongjoon-hyun We can improve the following case by removing the `Union` operator:
   ```sql
   create table t1 using parquet as select * from range(100);
   create table t2 using parquet as select * from range(200);
   
   create temp view v1 as                                                             
   select 'a' as event_type, * from t1                                                
   union all                                                                          
   select CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' end as event_type, * from t2;
   
   explain select * from v1 where event_type = 'a';
   == Physical Plan ==
   Union
   :- *(1) Project [a AS event_type#8, id#10L]
   :  +- *(1) ColumnarToRow
   :     +- FileScan parquet default.t1[id#10L] Batched: true, DataFilters: [], Format: Parquet,
   +- *(2) Project [CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END AS event_type#9, id#11L]
      +- *(2) Filter (CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = a)
         +- *(2) ColumnarToRow
            +- FileScan parquet default.t2[id#11L] Batched: true, DataFilters: [(CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = a)], Format: Parquet
   
   
   explain select * from v1 where event_type = 'b';
   == Physical Plan ==
   *(1) Project [CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END AS event_type#8, id#11L AS id#10L]
   +- *(1) Filter (CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = b)
      +- *(1) ColumnarToRow
         +- FileScan parquet default.t2[id#11L] Batched: true, DataFilters: [(CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = b)], Format: Parquet
   
   ```
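
   Why the second `Union` child is prunable for `event_type = 'a'` can be sketched as follows. This is a toy model in plain Scala, not Catalyst code: `CaseWhenLit` and `canNeverBeTrue` are hypothetical names, and the sketch assumes every branch value is a literal.

   ```scala
   // Toy model only; none of these names exist in Spark.
   object CaseWhenEqualToSketch {
     // A CASE WHEN reduced to its branch values (all assumed to be literals).
     // elseValue = None models a missing ELSE, which evaluates to NULL.
     final case class CaseWhenLit(branchValues: Seq[String], elseValue: Option[String])

     // `CASE ... END = literal` can only be TRUE if some branch (or the ELSE)
     // yields `literal`. Otherwise every row evaluates to FALSE or NULL,
     // and a Filter drops both.
     def canNeverBeTrue(cw: CaseWhenLit, literal: String): Boolean =
       !cw.branchValues.contains(literal) && !cw.elseValue.contains(literal)

     def main(args: Array[String]): Unit = {
       // CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' END, as in view v1
       val eventType = CaseWhenLit(Seq("b", "c"), None)
       println(canNeverBeTrue(eventType, "a")) // true: no branch yields 'a'
       println(canNeverBeTrue(eventType, "b")) // false: the first branch yields 'b'
     }
   }
   ```

   When the check returns true, the filter on that child rejects every row, so an optimizer could in principle treat the child as empty and leave only the scan of `t1`. When a branch does match, the predicate still depends on the earlier branch conditions not being true, which this sketch does not attempt to model.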


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112524


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130630/
   Test FAILed.


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659035


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35276/
   


[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743304380


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37273/
   


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743240790


   **[Test build #132669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132669/testReport)** for PR 30222 at commit [`312c613`](https://github.com/apache/spark/commit/312c6139ff209472a5cea6f4fe5bd1fdc2040a08).


[GitHub] [spark] wangyum closed pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum closed pull request #30222:
URL: https://github.com/apache/spark/pull/30222


   


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722095486


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35235/
   


[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743356753


   **[Test build #132669 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132669/testReport)** for PR 30222 at commit [`312c613`](https://github.com/apache/spark/commit/312c6139ff209472a5cea6f4fe5bd1fdc2040a08).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722077202


   **[Test build #130629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130629/testReport)** for PR 30222 at commit [`ee5e6dd`](https://github.com/apache/spark/commit/ee5e6ddfbc25e879ed92ea0a1a6c3470ebd52214).




[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r518539170



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,15 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
         } else {
           e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
         }
+
+      case EqualTo(c @ CaseWhen(branches, elseValue), right)
+          if c.deterministic &&

Review comment:
       More precisely, I think we only need to make sure the skipped branches are all deterministic.
   ```
   val (picked, skipped) = branches.partition(_._2.equals(right))
   if (skipped.forall(_._1.deterministic)) {
     ...
   } else {
     original
   }
   ```
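
   A minimal sketch of the guard suggested above, using toy stand-ins rather than the real Catalyst classes (`Expr`, `Lit`, `Cond`, and `pickBranches` are hypothetical names for illustration). It only models the determinism check on the skipped branches, not whether skipping them preserves the query result.

   ```scala
   // Toy stand-ins for Catalyst expressions; NOT the real Spark classes.
   object SkippedBranchCheck {
     sealed trait Expr { def deterministic: Boolean }
     final case class Lit(value: Int) extends Expr { val deterministic: Boolean = true }
     final case class Cond(text: String, deterministic: Boolean) extends Expr

     // (condition, value), mirroring the shape of CaseWhen branches.
     type Branch = (Expr, Expr)

     // Returns the picked branches when every skipped condition is deterministic,
     // or None to signal "keep the original expression".
     def pickBranches(branches: Seq[Branch], right: Expr): Option[Seq[Branch]] = {
       val (picked, skipped) = branches.partition(_._2 == right)
       if (skipped.forall(_._1.deterministic)) Some(picked) else None
     }
   }
   ```

   With a non-deterministic condition among the skipped branches, `pickBranches` returns `None`, which corresponds to the `original` arm in the snippet above.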






[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722751973


   It seems the failure is caused by the **deterministic** check. cc @viirya
   ```
   == Analyzed Logical Plan ==
   label: double, features: vector, fold: int
   Filter (UDF(fold#14) AND NOT (fold#14 = 2))
   +- Repartition 2, true
      +- Project [label#3, features#4, fold#14]
         +- Project [label#3, features#4, random#10, CASE WHEN (random#10 < 0.33) THEN 0 WHEN (random#10 < 0.66) THEN 1 ELSE 2 END AS fold#14]
            +- Project [label#3, features#4, rand(100) AS random#10]
               +- Repartition 1, true
                  +- SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.ml.feature.LabeledPoint, true])).label AS label#3, newInstance(class org.apache.spark.ml.linalg.VectorUDT).serialize AS features#4]
                     +- ExternalRDD [obj#2]
   
   == Optimized Logical Plan ==
   LocalRelation <empty>, [label#3, features#4, fold#14]
   ```




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722094276


   **[Test build #130629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130629/testReport)** for PR 30222 at commit [`ee5e6dd`](https://github.com/apache/spark/commit/ee5e6ddfbc25e879ed92ea0a1a6c3470ebd52214).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722620398


   **[Test build #130665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130665/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).




[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720280772








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907809


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130694/
   Test FAILed.




[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743240790


   **[Test build #132669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132669/testReport)** for PR 30222 at commit [`312c613`](https://github.com/apache/spark/commit/312c6139ff209472a5cea6f4fe5bd1fdc2040a08).




[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722914471


   This still seems to fail.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123386


   Merged build finished. Test FAILed.




[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-721474620


   Hive optimizes it to `predicate: CASE WHEN ((a = 100)) THEN (false) WHEN ((b > 1000)) THEN (true) WHEN (c is not null) THEN (false) ELSE (null) END (type: boolean)`, but this condition cannot be pushed down. We can optimize it to `b > 1000` and push it down.
   ```
   hive> explain SELECT *
       > FROM   (SELECT CASE
       >                  WHEN a = 100 THEN 1
       >                  WHEN b > 1000 THEN 2
       >                  WHEN c IS NOT NULL THEN 3
       >                END AS x
       >         FROM   t) tmp
       > WHERE  x = 2;
   OK
   STAGE DEPENDENCIES:
     Stage-0 is a root stage
   
   STAGE PLANS:
     Stage: Stage-0
       Fetch Operator
         limit: -1
         Processor Tree:
           TableScan
             alias: t
             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
             Filter Operator
               predicate: CASE WHEN ((a = 100)) THEN (false) WHEN ((b > 1000)) THEN (true) WHEN (c is not null) THEN (false) ELSE (null) END (type: boolean)
               Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
               Select Operator
                 expressions: CASE WHEN ((a = 100)) THEN (1) WHEN ((b > 1000)) THEN (2) WHEN (c is not null) THEN (3) ELSE (null) END (type: int)
                 outputColumnNames: _col0
                 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                 ListSink
   ```
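
   As a rough illustration of the rewrite visible in the Hive plan above, the comparison `= 2` can be moved into each `THEN` value, giving a boolean-valued `CASE`. The sketch below is a toy model on plain Scala values, not Hive or Spark code; `Branch` and `rewrite` are hypothetical names, and `Option` stands in for SQL `NULL`.

   ```scala
   object PushEqualityIntoBranches {
     // (condition text, THEN value); None models a NULL value.
     type Branch = (String, Option[Int])

     // Rewrites `CASE ... END = literal` into a boolean-valued CASE by
     // comparing every THEN value with the literal. NULL stays NULL.
     def rewrite(branches: Seq[Branch], literal: Int): Seq[(String, Option[Boolean])] =
       branches.map { case (cond, value) => (cond, value.map(_ == literal)) }
   }
   ```

   Applied to the three branches of the query with the literal `2`, this yields `false`, `true`, `false` for the `THEN` values, matching the predicate Hive prints.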




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722077202


   **[Test build #130629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130629/testReport)** for PR 30222 at commit [`ee5e6dd`](https://github.com/apache/spark/commit/ee5e6ddfbc25e879ed92ea0a1a6c3470ebd52214).




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720280772








[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722919375


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35304/
   




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722863678


   **[Test build #130694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130694/testReport)** for PR 30222 at commit [`5a90bfc`](https://github.com/apache/spark/commit/5a90bfcee523eb480b41ff0240ea22c3d5f7d931).




[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720320498


   retest this please.




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123377


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35237/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123396


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35237/
   Test FAILed.




[GitHub] [spark] cloud-fan commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722970648


   I see, the `CASE WHEN` conditions are not orthogonal, so we can't skip any of them.
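
   To make the overlap concrete, here is a small sketch on plain Scala values, assuming non-null `a` and `b` for simplicity (`Row`, `x`, `original`, `simplified`, and `guarded` are hypothetical names, not Spark APIs). A row with `a = 100` and `b > 1000` is taken by the first branch, so `x = 2` is false even though `b > 1000` holds.

   ```scala
   object CaseWhenOverlap {
     final case class Row(a: Int, b: Int, c: Option[Int])

     // CASE WHEN a = 100 THEN 1 WHEN b > 1000 THEN 2 WHEN c IS NOT NULL THEN 3 END
     // Branches are tried in order; None models the implicit ELSE NULL.
     def x(r: Row): Option[Int] =
       if (r.a == 100) Some(1)
       else if (r.b > 1000) Some(2)
       else if (r.c.isDefined) Some(3)
       else None

     // The filter as written in the query: x = 2.
     def original(r: Row): Boolean = x(r).contains(2)

     // Keeping only the condition of the matching branch.
     def simplified(r: Row): Boolean = r.b > 1000

     // Also requiring that the earlier branch does not fire.
     def guarded(r: Row): Boolean = r.a != 100 && r.b > 1000
   }
   ```

   For `Row(100, 2000, None)`, `original` and `guarded` are false while `simplified` is true, so the simplified filter would keep a row the query should drop.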




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722289297


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35255/
   




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722110837


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35235/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743304380


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37273/
   




[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722271155


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35255/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112519








[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722909905


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35304/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644809


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130665/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123386








[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907801








[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo

Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722858604


   We can reproduce it by:
   ```scala
   spark.sql("CREATE TABLE t(a int, b int, c int) using parquet")
   spark.sql(
     """
       |SELECT *
       |  FROM   (SELECT CASE
       |    WHEN rd > 1 THEN 1
       |    WHEN b > 1000 THEN 2
       |    WHEN c < 100 THEN 3
       |    ELSE 4
       |END AS x
       |FROM (SELECT *, rand(100) as rd FROM t) t1) t2
       |WHERE  x = 2
       |""".stripMargin).explain
   ```
   
   1. `Alias.toAttribute` constructs an `AttributeReference`, which uses the default value of `deterministic`, that is, true:
   https://github.com/apache/spark/blob/ca2cfd4185586993f981cfd2f1aff30ee6b2294e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala#L181
   
   2. Therefore, `deterministic` is true, and `SimplifyConditionals` can simplify it:
   ![image](https://user-images.githubusercontent.com/5399861/98330987-9aa8ab00-2036-11eb-8acf-93f1a2b9f404.png)
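
   The effect described in the two steps above can be sketched with a toy expression tree. This is an assumption-laden model, not the real Catalyst classes: `Expr`, `Rand`, `AttrRef`, `Alias`, `GreaterThan`, and `Lit` are simplified stand-ins, and the rule "deterministic if all children are" is modelled directly.

   ```scala
   object AliasDeterminism {
     sealed trait Expr {
       def children: Seq[Expr]
       // Toy rule: an expression is deterministic if all of its children are.
       def deterministic: Boolean = children.forall(_.deterministic)
     }
     final case class Rand(seed: Long) extends Expr {
       val children: Seq[Expr] = Nil
       override def deterministic: Boolean = false
     }
     // A leaf with no children, so the default rule makes it deterministic.
     final case class AttrRef(name: String) extends Expr {
       val children: Seq[Expr] = Nil
     }
     final case class Lit(value: Double) extends Expr {
       val children: Seq[Expr] = Nil
     }
     final case class Alias(child: Expr, name: String) extends Expr {
       val children: Seq[Expr] = Seq(child)
       // The reference keeps only the name, not the child's determinism.
       def toAttribute: AttrRef = AttrRef(name)
     }
     final case class GreaterThan(left: Expr, right: Expr) extends Expr {
       val children: Seq[Expr] = Seq(left, right)
     }
   }
   ```

   In this model `Alias(Rand(100), "rd")` is non-deterministic, but `rd > 1` built on `toAttribute` reports `deterministic == true`, which is why a determinism guard on the `CaseWhen` alone does not catch the reproduction above.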
   

