Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/03 13:31:22 UTC
[GitHub] [spark] wangyum opened a new pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
wangyum opened a new pull request #30222:
URL: https://github.com/apache/spark/pull/30222
### What changes were proposed in this pull request?
This PR simplifies `CaseWhen` with `EqualTo`, for example:
```sql
create table t(a int, b int, c int) using parquet;
SELECT *
FROM (SELECT CASE
               WHEN a = 100 THEN 1
               WHEN b > 1000 THEN 2
               WHEN c IS NOT NULL THEN 3
             END AS x
      FROM t) tmp
WHERE x = 2
```
Before this PR:
```scala
== Physical Plan ==
*(1) Project [CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END AS x#5]
+- *(1) Filter (CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END = 2)
   +- *(1) ColumnarToRow
      +- FileScan parquet default.t[a#1,b#2,c#3] Batched: true, DataFilters: [(CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END = 2)], Format: Parquet, PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:int,b:int,c:int>
```
After this PR:
```scala
== Physical Plan ==
*(1) Project [CASE WHEN (a#1 = 100) THEN 1 WHEN (b#2 > 1000) THEN 2 WHEN isnotnull(c#3) THEN 3 END AS x#0]
+- *(1) Filter (b#2 > 1000)
   +- *(1) ColumnarToRow
      +- FileScan parquet default.t[a#1,b#2,c#3] Batched: true, DataFilters: [(b#2 > 1000)], Format: Parquet, PartitionFilters: [], PushedFilters: [GreaterThan(b,1000)], ReadSchema: struct<a:int,b:int,c:int>
```
### Why are the changes needed?
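Improve query performance: once the `CASE WHEN` comparison is reduced to a simple predicate, it can be pushed down to the Parquet scan (see `PushedFilters` in the plans above).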
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit test.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722831410
@wangyum do you know, step by step, how the plan ends up being optimized wrongly?
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659058
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722919383
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r542200066
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -523,6 +523,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case e @ EqualTo(c @ CaseWhen(branches, elseValue), right)
+ if c.deterministic &&
+ right.isInstanceOf[Literal] && branches.forall(_._2.isInstanceOf[Literal]) &&
+ elseValue.forall(_.isInstanceOf[Literal]) =>
+ if ((branches.map(_._2) ++ elseValue).forall(!_.equals(right))) {
+ FalseLiteral
Review comment:
Let's update the JIRA/PR title, as it's a different optimization now.
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722289313
[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722957043
Sorry, this change has a logic issue. For example:
```scala
spark.sql("CREATE TABLE t using parquet AS SELECT if(id % 2 = 7, null, id) AS a FROM range(7)")
spark.sql(
"""
|SELECT *
| FROM (SELECT CASE
| WHEN a > 1 THEN 1
| WHEN a > 3 THEN 3
| WHEN a > 5 THEN 5
| ELSE 6
|END AS x
|FROM t ) t1
|WHERE x = 3
|""".stripMargin).show
```
Before this PR the result is empty; after this PR the result is not empty.
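The original result is empty because `CASE WHEN` returns the value of the first branch whose condition holds, so any row with `a > 3` has already matched `a > 1`. A minimal sketch in plain Scala (no Spark dependency) that models this first-match behaviour is below. The names `caseWhen`, `original` and `naiveRewrite` are hypothetical, nulls are left out, and `naiveRewrite` is only an assumed shape of the incorrect rewrite, consistent with the "After this PR" plan in the description:

```scala
object CaseWhenFirstMatch {
  // First-match semantics: return the value of the first branch whose
  // condition holds, otherwise the ELSE value.
  def caseWhen(a: Long, branches: Seq[(Long => Boolean, Int)], elseValue: Int): Int =
    branches.collectFirst { case (cond, value) if cond(a) => value }.getOrElse(elseValue)

  // The CASE expression from the query above.
  val branches: Seq[(Long => Boolean, Int)] = Seq(
    ((a: Long) => a > 1, 1),
    ((a: Long) => a > 3, 3),
    ((a: Long) => a > 5, 5)
  )

  // WHERE x = 3, evaluated faithfully against the whole CASE expression.
  def original(a: Long): Boolean = caseWhen(a, branches, 6) == 3

  // Assumed incorrect rewrite: keep only the condition of the branch
  // whose value equals 3 and ignore the branches before it.
  def naiveRewrite(a: Long): Boolean = a > 3

  def main(args: Array[String]): Unit = {
    // `id % 2 = 7` never holds, so column a simply holds the values of range(7).
    val rows = 0L until 7L
    println(rows.filter(a => original(a)).toList)     // rows kept by the real predicate
    println(rows.filter(a => naiveRewrite(a)).toList) // rows kept by the rewrite
  }
}
```

In this model the faithful predicate keeps no rows, while the rewritten one keeps the rows where `a` is 4, 5 or 6, which matches the empty versus non-empty behaviour reported above.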
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722095011
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r516165401
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(CaseWhen(branches, _), right)
+ if branches.count(_._2.semanticEquals(right)) == 1 =>
Review comment:
indentation?
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyConditionalSuite.scala
##########
@@ -199,4 +199,26 @@ class SimplifyConditionalSuite extends PlanTest with ExpressionEvalHelper with P
If(Factorial(5) > 100L, b, nullLiteral).eval(EmptyRow))
}
}
+
+ test("simplify CaseWhen with EqualTo") {
Review comment:
Shall we use JIRA ID prefix, `test("SPARK-33315: ...`?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112519
Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644802
Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287303
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130645/
Test FAILed.
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517787607
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,14 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(CaseWhen(branches, elseValue), right)
+ if right.foldable && branches.forall(_._2.foldable) =>
+ (branches.filter(_._2.equals(right)).map(_._1) ++
Review comment:
`equals` is only well implemented in `Literal`, but the condition we use is `.foldable`. Shall we change the condition to `.isInstanceOf[Literal]` and wait for the constant folding rule before running this rule?
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517095107
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(CaseWhen(branches, _), right)
Review comment:
As an example, `(CASE WHEN a=1 THEN 1 ELSE b END) = 1` can be true if `a=1` or `b=1`.
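A small sketch of that example in plain Scala, assuming non-null integer columns. The object and method names are hypothetical and only illustrate why keeping just the `WHEN` condition loses rows that reach the `ELSE` value:

```scala
object CaseWhenElseBranch {
  // Models (CASE WHEN a = 1 THEN 1 ELSE b END) = 1 for non-null a and b.
  def original(a: Int, b: Int): Boolean = (if (a == 1) 1 else b) == 1

  // Keeping only the condition of the matching WHEN branch ignores the ELSE value.
  def branchOnly(a: Int): Boolean = a == 1

  def main(args: Array[String]): Unit = {
    // a = 2, b = 1: the ELSE branch yields 1, so the original predicate holds
    // even though the WHEN condition does not.
    println(original(2, 1))  // true
    println(branchOnly(2))   // false
  }
}
```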
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517094786
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(CaseWhen(branches, _), right)
Review comment:
I'm a bit worried about dropping other branches in CASE WHEN. `a.semanticEquals(b)` means `a` is always equal to `b`. But `!a.semanticEquals(b)` doesn't mean that `a` will never be equal to `b`.
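To illustrate the asymmetry, here is a toy model in plain Scala. The `Expr` types and `sameTree` are hypothetical stand-ins for a structural comparison of expression trees; Catalyst's actual `semanticEquals` is not reproduced here. Two trees that differ structurally can still evaluate to the same value on a particular row:

```scala
object StructuralVsRuntimeEquality {
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Col(name: String) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Hypothetical structural comparison: true only when the trees are identical.
  def sameTree(x: Expr, y: Expr): Boolean = x == y

  // Evaluate an expression against one row, given as column name -> value.
  def eval(e: Expr, row: Map[String, Int]): Int = e match {
    case Lit(v)    => v
    case Col(n)    => row(n)
    case Add(l, r) => eval(l, row) + eval(r, row)
  }

  def main(args: Array[String]): Unit = {
    val b: Expr = Col("b")
    val one: Expr = Lit(1)
    val row = Map("b" -> 1)
    println(sameTree(b, one))               // false: the trees differ
    println(eval(b, row) == eval(one, row)) // true: equal on this row
  }
}
```

So a failed structural match is not evidence that a branch value can never equal `right`, which is why dropping the non-matching branches is unsafe in general.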
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722289313
[GitHub] [spark] wangyum commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r543812718
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -523,6 +523,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case e @ EqualTo(c @ CaseWhen(branches, elseValue), right)
+ if c.deterministic &&
+ right.isInstanceOf[Literal] && branches.forall(_._2.isInstanceOf[Literal]) &&
+ elseValue.forall(_.isInstanceOf[Literal]) =>
+ if ((branches.map(_._2) ++ elseValue).forall(!_.equals(right))) {
+ FalseLiteral
Review comment:
https://github.com/apache/spark/pull/30790/files
[GitHub] [spark] wangyum closed pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum closed pull request #30222:
URL: https://github.com/apache/spark/pull/30222
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743290746
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37273/
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517093766
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(CaseWhen(branches, _), right)
+ if branches.count(_._2.semanticEquals(right)) == 1 =>
Review comment:
If there is more than one match, shall we combine the conditions with `Or`?
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722620398
**[Test build #130665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130665/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
[GitHub] [spark] HyukjinKwon commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722807366
@wangyum, it's https://github.com/apache/spark/pull/21852 right? Can you file a blocker JIRA?
[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722617489
Retest this please
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r542199756
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -523,6 +523,16 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case e @ EqualTo(c @ CaseWhen(branches, elseValue), right)
+ if c.deterministic &&
+ right.isInstanceOf[Literal] && branches.forall(_._2.isInstanceOf[Literal]) &&
+ elseValue.forall(_.isInstanceOf[Literal]) =>
+ if ((branches.map(_._2) ++ elseValue).forall(!_.equals(right))) {
Review comment:
Can we use an `EqualTo` expression to compare literals? And how about the null semantics?
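A rough sketch of why the null question matters, in plain Scala with `Option` standing in for SQL NULL. All names here are hypothetical and none of this is Catalyst code. In SQL, `=` with a NULL operand yields NULL rather than false, so folding such a comparison to `FalseLiteral` is only equivalent when the result is used directly as a filter:

```scala
object NullAwareEquality {
  // SQL-style `=`: None models NULL; any NULL operand makes the result NULL.
  def sqlEq(x: Option[Int], y: Option[Int]): Option[Boolean] =
    for (a <- x; b <- y) yield a == b

  // SQL-style NOT: NULL stays NULL.
  def sqlNot(p: Option[Boolean]): Option[Boolean] = p.map(!_)

  // A WHERE clause keeps a row only when the predicate is definitely true.
  def keptByFilter(p: Option[Boolean]): Boolean = p.contains(true)

  def main(args: Array[String]): Unit = {
    val comparison: Option[Boolean] = sqlEq(None, Some(2)) // NULL = 2 is NULL
    val foldedToFalse: Option[Boolean] = Some(false)       // result after folding to false

    // Used directly as a filter, both drop the row.
    println(keptByFilter(comparison) == keptByFilter(foldedToFalse)) // true
    // Once negated they differ: NOT NULL is NULL, NOT false is true.
    println(keptByFilter(sqlNot(comparison)))    // false
    println(keptByFilter(sqlNot(foldedToFalse))) // true
  }
}
```

One place this could show up is a `CASE WHEN` without an `ELSE` branch, which yields NULL when no condition matches.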
[GitHub] [spark] wangyum commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r517739136
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,10 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(CaseWhen(branches, _), right)
+ if branches.count(_._2.semanticEquals(right)) == 1 =>
Review comment:
Yes.
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287119
**[Test build #130645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130645/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287294
Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722110851
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722079317
**[Test build #130630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130630/testReport)** for PR 30222 at commit [`b611659`](https://github.com/apache/spark/commit/b6116598203b0a9c81c77bcb2d03ef001b2306a3).
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722095011
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722114392
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35237/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722079317
**[Test build #130630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130630/testReport)** for PR 30222 at commit [`b611659`](https://github.com/apache/spark/commit/b6116598203b0a9c81c77bcb2d03ef001b2306a3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659066
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35276/
Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722919383
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722645553
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35276/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743363032
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/132669/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907690
**[Test build #130694 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130694/testReport)** for PR 30222 at commit [`5a90bfc`](https://github.com/apache/spark/commit/5a90bfcee523eb480b41ff0240ea22c3d5f7d931).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720639579
Also, cc @cloud-fan and @sunchao
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722247724
**[Test build #130645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130645/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112436
**[Test build #130630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130630/testReport)** for PR 30222 at commit [`b611659`](https://github.com/apache/spark/commit/b6116598203b0a9c81c77bcb2d03ef001b2306a3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907801
Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644645
**[Test build #130665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130665/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722287294
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743274036
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37273/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659058
Merged build finished. Test FAILed.
[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-723291872
Thank you for your decision, @wangyum and @cloud-fan .
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722863678
**[Test build #130694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130694/testReport)** for PR 30222 at commit [`5a90bfc`](https://github.com/apache/spark/commit/5a90bfcee523eb480b41ff0240ea22c3d5f7d931).
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743363032
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/132669/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722247724
**[Test build #130645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130645/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722110851
[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743725618
@cloud-fan @dongjoon-hyun We could improve the following case to eliminate the unnecessary `Union` operator:
```sql
create table t1 using parquet as select * from range(100);
create table t2 using parquet as select * from range(200);
create temp view v1 as
select 'a' as event_type, * from t1
union all
select CASE WHEN id = 1 THEN 'b' WHEN id = 3 THEN 'c' end as event_type, * from t2;
explain select * from v1 where event_type = 'a';
== Physical Plan ==
Union
:- *(1) Project [a AS event_type#8, id#10L]
:  +- *(1) ColumnarToRow
:     +- FileScan parquet default.t1[id#10L] Batched: true, DataFilters: [], Format: Parquet,
+- *(2) Project [CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END AS event_type#9, id#11L]
   +- *(2) Filter (CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = a)
      +- *(2) ColumnarToRow
         +- FileScan parquet default.t2[id#11L] Batched: true, DataFilters: [(CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = a)], Format: Parquet
explain select * from v1 where event_type = 'b';
== Physical Plan ==
*(1) Project [CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END AS event_type#8, id#11L AS id#10L]
+- *(1) Filter (CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = b)
   +- *(1) ColumnarToRow
      +- FileScan parquet default.t2[id#11L] Batched: true, DataFilters: [(CASE WHEN (id#11L = 1) THEN b WHEN (id#11L = 3) THEN c END = b)], Format: Parquet
```
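One way to read the first plan: no branch of the `CASE WHEN` over `t2` can produce `'a'`, so the filter on that side of the `Union` can never evaluate to true. The following is only a minimal sketch of that check on a toy model, not Spark code. `NeverMatches` and `canNeverEqual` are hypothetical names, branch values are assumed to be string literals, `None` stands for SQL NULL (including the implicit `ELSE NULL`), and branch conditions are assumed to be deterministic.

```scala
object NeverMatches {
  // Returns true when `CASE ... END = literal` can never be true:
  // no branch value (and no ELSE value) equals the literal.
  // A NULL result (None) compares to NULL, which a filter also drops.
  def canNeverEqual(
      branchValues: Seq[Option[String]],
      elseValue: Option[String],
      literal: String): Boolean =
    (branchValues :+ elseValue).forall(v => !v.contains(literal))
}
```

For the view above, the branch values are `'b'` and `'c'` with no `ELSE`, so the check holds for `'a'` and fails for `'b'`, which matches the two `explain` queries.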
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720308627
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720308627
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644802
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112524
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130630/
Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722659035
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35276/
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743304380
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37273/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743240790
**[Test build #132669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132669/testReport)** for PR 30222 at commit [`312c613`](https://github.com/apache/spark/commit/312c6139ff209472a5cea6f4fe5bd1fdc2040a08).
[GitHub] [spark] wangyum closed pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum closed pull request #30222:
URL: https://github.com/apache/spark/pull/30222
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722095486
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35235/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743356753
**[Test build #132669 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132669/testReport)** for PR 30222 at commit [`312c613`](https://github.com/apache/spark/commit/312c6139ff209472a5cea6f4fe5bd1fdc2040a08).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722077202
**[Test build #130629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130629/testReport)** for PR 30222 at commit [`ee5e6dd`](https://github.com/apache/spark/commit/ee5e6ddfbc25e879ed92ea0a1a6c3470ebd52214).
[GitHub] [spark] cloud-fan commented on a change in pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #30222:
URL: https://github.com/apache/spark/pull/30222#discussion_r518539170
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -510,6 +510,15 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
} else {
e.copy(branches = branches.take(i).map(branch => (branch._1, elseValue)))
}
+
+ case EqualTo(c @ CaseWhen(branches, elseValue), right)
+ if c.deterministic &&
Review comment:
More precisely, I think we only need to make sure the skipped branches are all deterministic.
```
val (picked, skipped) = branches.partition(_._2.equals(right))
if (skipped.forall(_._1.deterministic)) {
...
} else {
original
}
```
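The guard suggested above can be sketched in plain Scala without Catalyst. This is a toy model under stated assumptions: `Expr`, `SkippedBranchGuard`, and `canSkip` are hypothetical names, and `Expr` is only a stand-in that carries a determinism flag. It illustrates the check on the skipped branches only, not whether skipping them preserves the query result.

```scala
object SkippedBranchGuard {
  // Toy stand-in for a Catalyst expression: SQL text plus a determinism flag (assumption).
  final case class Expr(sql: String, deterministic: Boolean = true)

  // Split branches by whether the branch value equals `right`, then check that
  // the condition of every skipped branch is deterministic.
  def canSkip(branches: Seq[(Expr, Expr)], right: Expr): Boolean = {
    val (_, skipped) = branches.partition(_._2 == right)
    skipped.forall(_._1.deterministic)
  }

  // Branches are (condition, value) pairs, mirroring the shape used in the review comment.
  val deterministicBranches: Seq[(Expr, Expr)] = Seq(
    Expr("a = 100") -> Expr("1"),
    Expr("b > 1000") -> Expr("2"))

  val randBranches: Seq[(Expr, Expr)] = Seq(
    Expr("rand(100) < 0.33", deterministic = false) -> Expr("1"),
    Expr("b > 1000") -> Expr("2"))
}
```

With these toy inputs, `canSkip(deterministicBranches, Expr("2"))` holds, while `canSkip(randBranches, Expr("2"))` does not, because the skipped `rand(100)` condition is flagged as non-deterministic.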
[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722751973
The failure seems to be caused by the **deterministic** check: the filter in the plan below is optimized to an empty relation. cc @viirya
```
== Analyzed Logical Plan ==
label: double, features: vector, fold: int
Filter (UDF(fold#14) AND NOT (fold#14 = 2))
+- Repartition 2, true
+- Project [label#3, features#4, fold#14]
+- Project [label#3, features#4, random#10, CASE WHEN (random#10 < 0.33) THEN 0 WHEN (random#10 < 0.66) THEN 1 ELSE 2 END AS fold#14]
+- Project [label#3, features#4, rand(100) AS random#10]
+- Repartition 1, true
+- SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.ml.feature.LabeledPoint, true])).label AS label#3, newInstance(class org.apache.spark.ml.linalg.VectorUDT).serialize AS features#4]
+- ExternalRDD [obj#2]
== Optimized Logical Plan ==
LocalRelation <empty>, [label#3, features#4, fold#14]
```
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722094276
**[Test build #130629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130629/testReport)** for PR 30222 at commit [`ee5e6dd`](https://github.com/apache/spark/commit/ee5e6ddfbc25e879ed92ea0a1a6c3470ebd52214).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722620398
**[Test build #130665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130665/testReport)** for PR 30222 at commit [`7d7eca3`](https://github.com/apache/spark/commit/7d7eca3de165881cc5c040688af476fad0ac7a20).
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720280772
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907809
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130694/
Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743240790
**[Test build #132669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132669/testReport)** for PR 30222 at commit [`312c613`](https://github.com/apache/spark/commit/312c6139ff209472a5cea6f4fe5bd1fdc2040a08).
[GitHub] [spark] dongjoon-hyun commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722914471
This seems to fail still.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123386
Merged build finished. Test FAILed.
[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-721474620
Hive optimizes it to `predicate: CASE WHEN ((a = 100)) THEN (false) WHEN ((b > 1000)) THEN (true) WHEN (c is not null) THEN (false) ELSE (null) END (type: boolean)`, but this condition cannot be pushed down. We can optimize it to `b > 1000` and push it down.
```
hive> explain SELECT *
> FROM (SELECT CASE
> WHEN a = 100 THEN 1
> WHEN b > 1000 THEN 2
> WHEN c IS NOT NULL THEN 3
> END AS x
> FROM t) tmp
> WHERE x = 2;
OK
STAGE DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
TableScan
alias: t
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: CASE WHEN ((a = 100)) THEN (false) WHEN ((b > 1000)) THEN (true) WHEN (c is not null) THEN (false) ELSE (null) END (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: CASE WHEN ((a = 100)) THEN (1) WHEN ((b > 1000)) THEN (2) WHEN (c is not null) THEN (3) ELSE (null) END (type: int)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
ListSink
```
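The Hive predicate above looks like the result of pushing the `= 2` comparison into each branch value. A minimal sketch of that idea in plain Scala follows. `PushEqualToIntoBranches` and `pushDown` are hypothetical names, branch values are modelled as plain `Int` literals, and a missing `ELSE` is modelled as `None` to stand in for SQL `NULL`. This is not Hive's or Spark's implementation.

```scala
object PushEqualToIntoBranches {
  // Compare every branch value with the literal; a missing ELSE (None) stays None,
  // standing in for the NULL that `NULL = literal` evaluates to in SQL.
  def pushDown(
      values: Seq[Int],
      elseValue: Option[Int],
      literal: Int): (Seq[Boolean], Option[Boolean]) =
    (values.map(_ == literal), elseValue.map(_ == literal))

  // Branch values 1, 2, 3 with no ELSE, compared against 2, as in the query above.
  val rewritten: (Seq[Boolean], Option[Boolean]) = pushDown(Seq(1, 2, 3), None, 2)
}
```

Here `rewritten` is `(Seq(false, true, false), None)`, which lines up with the `THEN (false) ... THEN (true) ... THEN (false) ELSE (null)` shape in the Hive plan.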
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722077202
**[Test build #130629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130629/testReport)** for PR 30222 at commit [`ee5e6dd`](https://github.com/apache/spark/commit/ee5e6ddfbc25e879ed92ea0a1a6c3470ebd52214).
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720280772
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722919375
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35304/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722863678
**[Test build #130694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130694/testReport)** for PR 30222 at commit [`5a90bfc`](https://github.com/apache/spark/commit/5a90bfcee523eb480b41ff0240ea22c3d5f7d931).
[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-720320498
retest this please.
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123377
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35237/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123396
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35237/
Test FAILed.
[GitHub] [spark] cloud-fan commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722970648
I see, the `CASE WHEN` conditions are not orthogonal, so we can't skip any of them.
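A small plain-Scala model may make the point concrete. It is a sketch under assumptions: `Row`, `naive`, and `guarded` are hypothetical names, the `CASE` expression is a simplified blend of the queries in this thread, and columns are non-null `Int`s, so SQL `NULL` semantics are ignored.

```scala
object CaseWhenOrthogonality {
  // Toy row standing in for table t(a, b, c); NULLs are not modelled.
  final case class Row(a: Int, b: Int, c: Int)

  // CASE WHEN a = 100 THEN 1 WHEN b > 1000 THEN 2 WHEN c < 100 THEN 3 ELSE 4 END
  def caseWhen(r: Row): Int =
    if (r.a == 100) 1
    else if (r.b > 1000) 2
    else if (r.c < 100) 3
    else 4

  // The original filter: x = 2.
  def original(r: Row): Boolean = caseWhen(r) == 2

  // Unsafe rewrite: keep only the condition of the matching branch.
  def naive(r: Row): Boolean = r.b > 1000

  // In this non-null toy model, earlier branches must also be known not to fire.
  def guarded(r: Row): Boolean = !(r.a == 100) && r.b > 1000

  val rows: Seq[Row] = Seq(Row(100, 2000, 0), Row(1, 2000, 0), Row(1, 5, 0))
}
```

For `Row(100, 2000, 0)` the first branch wins, so `original` rejects the row while `naive` accepts it. `guarded` agrees with `original` on all three sample rows, which is why the earlier conditions cannot simply be dropped.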
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722289297
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35255/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722110837
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35235/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-743304380
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37273/
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722271155
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35255/
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722112519
[GitHub] [spark] SparkQA commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722909905
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35304/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722644809
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130665/
Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722123386
[GitHub] [spark] AmplabJenkins commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722907801
[GitHub] [spark] wangyum commented on pull request #30222: [SPARK-33315][SQL] Simplify CaseWhen with EqualTo
Posted by GitBox <gi...@apache.org>.
wangyum commented on pull request #30222:
URL: https://github.com/apache/spark/pull/30222#issuecomment-722858604
We can reproduce it by:
```scala
spark.sql("CREATE TABLE t(a int, b int, c int) using parquet")
spark.sql(
"""
|SELECT *
| FROM (SELECT CASE
| WHEN rd > 1 THEN 1
| WHEN b > 1000 THEN 2
| WHEN c < 100 THEN 3
| ELSE 4
|END AS x
|FROM (SELECT *, rand(100) as rd FROM t) t1) t2
|WHERE x = 2
|""".stripMargin).explain
```
1. `Alias.toAttribute` constructs an `AttributeReference` with the default `deterministic` value, which is true:
https://github.com/apache/spark/blob/ca2cfd4185586993f981cfd2f1aff30ee6b2294e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala#L181
2. Therefore, `deterministic` is true for the `CaseWhen`, and `SimplifyConditionals` can simplify it:
![image](https://user-images.githubusercontent.com/5399861/98330987-9aa8ab00-2036-11eb-8acf-93f1a2b9f404.png)
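The two steps above can be modelled with a toy expression tree. This is a sketch, not Catalyst code: the types below are simplified, hypothetical stand-ins that only share names with the real classes, and the only behavior modelled is that an attribute created from an alias no longer carries the determinism of the aliased child.

```scala
object AliasDeterminism {
  // Minimal expression model (assumption): each node only reports its determinism.
  sealed trait Expr { def deterministic: Boolean }

  // Stand-in for rand(100): non-deterministic.
  case object Rand extends Expr { val deterministic = false }

  // Stand-in for AttributeReference: a leaf that is always reported as deterministic.
  final case class AttributeRef(name: String) extends Expr { val deterministic = true }

  // Stand-in for Alias: determinism follows the child, but toAttribute drops that link.
  final case class Alias(child: Expr, name: String) extends Expr {
    def deterministic: Boolean = child.deterministic
    def toAttribute: AttributeRef = AttributeRef(name)
  }

  // Stand-in for a condition such as `rd > 1`.
  final case class GreaterThan(left: Expr, right: Int) extends Expr {
    def deterministic: Boolean = left.deterministic
  }

  val rd: Alias = Alias(Rand, "rd")
  val condition: GreaterThan = GreaterThan(rd.toAttribute, 1)
}
```

In this model `rd.deterministic` is false but `condition.deterministic` is true, which mirrors how a branch like `rd > 1` can pass a `deterministic` guard even though `rd` comes from `rand(100)`.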