You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2018/11/26 03:18:18 UTC

[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/23139

    [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullWithFalseInPredicate

    ## What changes were proposed in this pull request?
    
    Based on https://github.com/apache/spark/pull/22857 and https://github.com/apache/spark/pull/23079, this PR did a few updates
    
    - Limit the data types of NULL to Boolean. 
    - Limit the input data type of replaceNullWithFalse to Boolean
    - Create a new file for the rule ReplaceNullWithFalseInPredicate
    - Update the description of this rule. 
    
    ## How was this patch tested?
    N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark followupSpark-25860

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23139
    
----
commit d2c5e6666814439a049b4cb7a28ae4802b49d164
Author: gatorsmile <ga...@...>
Date:   2018-11-26T03:00:27Z

    fix

commit 6b6997d6c5eedb9a75af61345ae808c9d98e6f4d
Author: gatorsmile <ga...@...>
Date:   2018-11-26T03:18:05Z

    fix

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99294 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99294/testReport)** for PR 23139 at commit [`8b0401c`](https://github.com/apache/spark/commit/8b0401c4440136e33d4580a3f8da80164de3d4b4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99262/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236456694
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -79,29 +80,31 @@ object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
        * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
        * `Literal(null, BooleanType)`.
        */
    -  private def replaceNullWithFalse(e: Expression): Expression = {
    -    if (e.dataType != BooleanType) {
    +  private def replaceNullWithFalse(e: Expression): Expression = e match {
    +    case Literal(null, BooleanType) =>
    +      FalseLiteral
    +    case And(left, right) =>
    +      And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case Or(left, right) =>
    +      Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case cw: CaseWhen if cw.dataType == BooleanType =>
    +      val newBranches = cw.branches.map { case (cond, value) =>
    +        replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    +      }
    +      val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    +      CaseWhen(newBranches, newElseValue)
    +    case i @ If(pred, trueVal, falseVal) if i.dataType == BooleanType =>
    +      If(replaceNullWithFalse(pred), replaceNullWithFalse(trueVal), replaceNullWithFalse(falseVal))
    +    case e if e.dataType == BooleanType =>
           e
    -    } else {
    -      e match {
    -        case Literal(null, BooleanType) =>
    -          FalseLiteral
    -        case And(left, right) =>
    -          And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case Or(left, right) =>
    -          Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case cw: CaseWhen =>
    -          val newBranches = cw.branches.map { case (cond, value) =>
    -            replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    -          }
    -          val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    -          CaseWhen(newBranches, newElseValue)
    -        case If(pred, trueVal, falseVal) =>
    -          If(replaceNullWithFalse(pred),
    -            replaceNullWithFalse(trueVal),
    -            replaceNullWithFalse(falseVal))
    -        case _ => e
    +    case e =>
    +      val message = "Expected a Boolean type expression in replaceNullWithFalse, " +
    +        s"but got the type `${e.dataType.catalogString}` in `${e.sql}`."
    +      if (Utils.isTesting) {
    +        throw new IllegalArgumentException(message)
    --- End diff --
    
    Added a test case.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236450239
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -79,29 +80,31 @@ object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
        * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
        * `Literal(null, BooleanType)`.
        */
    -  private def replaceNullWithFalse(e: Expression): Expression = {
    -    if (e.dataType != BooleanType) {
    +  private def replaceNullWithFalse(e: Expression): Expression = e match {
    +    case Literal(null, BooleanType) =>
    +      FalseLiteral
    +    case And(left, right) =>
    +      And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case Or(left, right) =>
    +      Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case cw: CaseWhen if cw.dataType == BooleanType =>
    +      val newBranches = cw.branches.map { case (cond, value) =>
    +        replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    +      }
    +      val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    +      CaseWhen(newBranches, newElseValue)
    +    case i @ If(pred, trueVal, falseVal) if i.dataType == BooleanType =>
    +      If(replaceNullWithFalse(pred), replaceNullWithFalse(trueVal), replaceNullWithFalse(falseVal))
    +    case e if e.dataType == BooleanType =>
           e
    -    } else {
    -      e match {
    -        case Literal(null, BooleanType) =>
    -          FalseLiteral
    -        case And(left, right) =>
    -          And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case Or(left, right) =>
    -          Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case cw: CaseWhen =>
    -          val newBranches = cw.branches.map { case (cond, value) =>
    -            replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    -          }
    -          val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    -          CaseWhen(newBranches, newElseValue)
    -        case If(pred, trueVal, falseVal) =>
    -          If(replaceNullWithFalse(pred),
    -            replaceNullWithFalse(trueVal),
    -            replaceNullWithFalse(falseVal))
    -        case _ => e
    +    case e =>
    +      val message = "Expected a Boolean type expression in replaceNullWithFalse, " +
    +        s"but got the type `${e.dataType.catalogString}` in `${e.sql}`."
    +      if (Utils.isTesting) {
    +        throw new IllegalArgumentException(message)
    --- End diff --
    
    The tests might not catch all the cases if the test coverage is not complete. Such an exception should not block the query execution. Thus, we just throw an exception in our testing mode instead of the production mode. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5347/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99262/testReport)** for PR 23139 at commit [`e416810`](https://github.com/apache/spark/commit/e41681096867cbc6d2556da83ce733092d6df841).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    LGTM.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Thanks. Merged into master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5353/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99270 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99270/testReport)** for PR 23139 at commit [`e416810`](https://github.com/apache/spark/commit/e41681096867cbc6d2556da83ce733092d6df841).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236468109
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,110 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +import org.apache.spark.util.Utils
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    --- End diff --
    
    The extra scope is covered by "Moreover, ..."


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5342/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236118914
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,107 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition.
    + *
    + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition
    + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable
    + * in very limited cases.
    + *
    + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch
    + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions.
    + *
    + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified
    + * into `Project(Literal(2))`.
    + */
    +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case cw @ CaseWhen(branches, _) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        cw.copy(branches = newBranches)
    +      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        af.copy(function = newLambda)
    +      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        ae.copy(function = newLambda)
    +      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        mf.copy(function = newLambda)
    +    }
    +  }
    +
    +  /**
    +   * Recursively traverse the Boolean-type expression to replace
    +   * `Literal(null, BooleanType)` with `FalseLiteral`, if possible.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
    +   * `Literal(null, BooleanType)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = {
    +    if (e.dataType != BooleanType) {
    --- End diff --
    
    do we need this? `And`, `Or`, `If` all return boolean, and we already requires boolean type for literal case.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236157275
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -736,69 +736,3 @@ object CombineConcats extends Rule[LogicalPlan] {
           flattenConcats(concat)
       }
     }
    -
    -/**
    - * A rule that replaces `Literal(null, _)` with `FalseLiteral` for further optimizations.
    - *
    - * This rule applies to conditions in [[Filter]] and [[Join]]. Moreover, it transforms predicates
    - * in all [[If]] expressions as well as branch conditions in all [[CaseWhen]] expressions.
    - *
    - * For example, `Filter(Literal(null, _))` is equal to `Filter(FalseLiteral)`.
    - *
    - * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    - * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    - * `Filter(FalseLiteral)`.
    - *
    - * As this rule is not limited to conditions in [[Filter]] and [[Join]], arbitrary plans can
    - * benefit from it. For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))`
    - * can be simplified into `Project(Literal(2))`.
    - *
    - * As a result, many unnecessary computations can be removed in the query optimization phase.
    - */
    -object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    -    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    -    case p: LogicalPlan => p transformExpressions {
    -      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    -      case cw @ CaseWhen(branches, _) =>
    -        val newBranches = branches.map { case (cond, value) =>
    -          replaceNullWithFalse(cond) -> value
    -        }
    -        cw.copy(branches = newBranches)
    -      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    -        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    -        af.copy(function = newLambda)
    -      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    -        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    -        ae.copy(function = newLambda)
    -      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    -        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    -        mf.copy(function = newLambda)
    -    }
    -  }
    -
    -  /**
    -   * Recursively replaces `Literal(null, _)` with `FalseLiteral`.
    -   *
    -   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    -   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or `Literal(null, _)`.
    -   */
    -  private def replaceNullWithFalse(e: Expression): Expression = e match {
    -    case cw: CaseWhen if cw.dataType == BooleanType =>
    -      val newBranches = cw.branches.map { case (cond, value) =>
    -        replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    -      }
    -      val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    -      CaseWhen(newBranches, newElseValue)
    -    case i @ If(pred, trueVal, falseVal) if i.dataType == BooleanType =>
    -      If(replaceNullWithFalse(pred), replaceNullWithFalse(trueVal), replaceNullWithFalse(falseVal))
    -    case And(left, right) =>
    -      And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -    case Or(left, right) =>
    -      Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -    case Literal(null, _) => FalseLiteral
    --- End diff --
    
    How about this line? What happened if the input data type of `e` is not Boolean? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by aokolnychyi <gi...@git.apache.org>.
Github user aokolnychyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236467423
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,110 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +import org.apache.spark.util.Utils
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    --- End diff --
    
    I am not sure I understand `"which contain an implicit Boolean operator "(search condition) = TRUE""`. Could you, please, elaborate a bit?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/23139


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    cc @dbtsai @aokolnychyi @rednaxelafx @cloud-fan 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236120731
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,107 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition.
    + *
    + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition
    + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable
    + * in very limited cases.
    + *
    + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch
    + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions.
    + *
    + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified
    + * into `Project(Literal(2))`.
    + */
    +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case cw @ CaseWhen(branches, _) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        cw.copy(branches = newBranches)
    +      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        af.copy(function = newLambda)
    +      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        ae.copy(function = newLambda)
    +      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        mf.copy(function = newLambda)
    +    }
    +  }
    +
    +  /**
    +   * Recursively traverse the Boolean-type expression to replace
    +   * `Literal(null, BooleanType)` with `FalseLiteral`, if possible.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
    +   * `Literal(null, BooleanType)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = {
    +    if (e.dataType != BooleanType) {
    --- End diff --
    
    How about the LambdaFunction? My major concern is the future changes might forget to add it?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236394865
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -79,29 +80,31 @@ object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
        * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
        * `Literal(null, BooleanType)`.
        */
    -  private def replaceNullWithFalse(e: Expression): Expression = {
    -    if (e.dataType != BooleanType) {
    +  private def replaceNullWithFalse(e: Expression): Expression = e match {
    +    case Literal(null, BooleanType) =>
    +      FalseLiteral
    +    case And(left, right) =>
    +      And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case Or(left, right) =>
    +      Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case cw: CaseWhen if cw.dataType == BooleanType =>
    +      val newBranches = cw.branches.map { case (cond, value) =>
    +        replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    +      }
    +      val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    +      CaseWhen(newBranches, newElseValue)
    +    case i @ If(pred, trueVal, falseVal) if i.dataType == BooleanType =>
    +      If(replaceNullWithFalse(pred), replaceNullWithFalse(trueVal), replaceNullWithFalse(falseVal))
    +    case e if e.dataType == BooleanType =>
           e
    -    } else {
    -      e match {
    -        case Literal(null, BooleanType) =>
    -          FalseLiteral
    -        case And(left, right) =>
    -          And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case Or(left, right) =>
    -          Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case cw: CaseWhen =>
    -          val newBranches = cw.branches.map { case (cond, value) =>
    -            replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    -          }
    -          val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    -          CaseWhen(newBranches, newElseValue)
    -        case If(pred, trueVal, falseVal) =>
    -          If(replaceNullWithFalse(pred),
    -            replaceNullWithFalse(trueVal),
    -            replaceNullWithFalse(falseVal))
    -        case _ => e
    +    case e =>
    +      val message = "Expected a Boolean type expression in replaceNullWithFalse, " +
    +        s"but got the type `${e.dataType.catalogString}` in `${e.sql}`."
    +      if (Utils.isTesting) {
    +        throw new IllegalArgumentException(message)
    --- End diff --
    
    Test for this? Why not also throw exception in runtime since this should never be hit? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99270/testReport)** for PR 23139 at commit [`e416810`](https://github.com/apache/spark/commit/e41681096867cbc6d2556da83ce733092d6df841).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236463594
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -79,29 +80,31 @@ object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
        * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
        * `Literal(null, BooleanType)`.
        */
    -  private def replaceNullWithFalse(e: Expression): Expression = {
    -    if (e.dataType != BooleanType) {
    +  private def replaceNullWithFalse(e: Expression): Expression = e match {
    +    case Literal(null, BooleanType) =>
    +      FalseLiteral
    +    case And(left, right) =>
    +      And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case Or(left, right) =>
    +      Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    +    case cw: CaseWhen if cw.dataType == BooleanType =>
    +      val newBranches = cw.branches.map { case (cond, value) =>
    +        replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    +      }
    +      val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    +      CaseWhen(newBranches, newElseValue)
    +    case i @ If(pred, trueVal, falseVal) if i.dataType == BooleanType =>
    +      If(replaceNullWithFalse(pred), replaceNullWithFalse(trueVal), replaceNullWithFalse(falseVal))
    +    case e if e.dataType == BooleanType =>
           e
    -    } else {
    -      e match {
    -        case Literal(null, BooleanType) =>
    -          FalseLiteral
    -        case And(left, right) =>
    -          And(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case Or(left, right) =>
    -          Or(replaceNullWithFalse(left), replaceNullWithFalse(right))
    -        case cw: CaseWhen =>
    -          val newBranches = cw.branches.map { case (cond, value) =>
    -            replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    -          }
    -          val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    -          CaseWhen(newBranches, newElseValue)
    -        case If(pred, trueVal, falseVal) =>
    -          If(replaceNullWithFalse(pred),
    -            replaceNullWithFalse(trueVal),
    -            replaceNullWithFalse(falseVal))
    -        case _ => e
    +    case e =>
    +      val message = "Expected a Boolean type expression in replaceNullWithFalse, " +
    +        s"but got the type `${e.dataType.catalogString}` in `${e.sql}`."
    +      if (Utils.isTesting) {
    +        throw new IllegalArgumentException(message)
    --- End diff --
    
    Sounds fair.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99294/testReport)** for PR 23139 at commit [`8b0401c`](https://github.com/apache/spark/commit/8b0401c4440136e33d4580a3f8da80164de3d4b4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Although we are trying to make sure in the caller side to only call `replaceNullWithFalse` when the expression is boolean type, I agree that for safety, we should check it and throw exception for future development. 
    
    LGTM.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5366/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236468395
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,110 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +import org.apache.spark.util.Utils
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    --- End diff --
    
    This is based on the ANSI SQL. All these clauses have the implicit Boolean operator "(search condition) = TRUE". That is why NULL and FALSE do not satisfy the condition in these clauses


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99255/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236150260
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,107 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition.
    + *
    + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition
    + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable
    + * in very limited cases.
    + *
    + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch
    + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions.
    + *
    + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified
    + * into `Project(Literal(2))`.
    + */
    +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case cw @ CaseWhen(branches, _) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        cw.copy(branches = newBranches)
    +      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        af.copy(function = newLambda)
    +      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        ae.copy(function = newLambda)
    +      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        mf.copy(function = newLambda)
    +    }
    +  }
    +
    +  /**
    +   * Recursively traverse the Boolean-type expression to replace
    +   * `Literal(null, BooleanType)` with `FalseLiteral`, if possible.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
    +   * `Literal(null, BooleanType)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = {
    +    if (e.dataType != BooleanType) {
    --- End diff --
    
    We don't handle `LambdaFunction` inside this method, it's caller side.
    
    This method is to deal with boolean expressions, and return the original expression if it's not: https://github.com/apache/spark/pull/23139/files#diff-0bb4fc0a3c867b855f84dd1db8867139R103


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99255 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99255/testReport)** for PR 23139 at commit [`6b6997d`](https://github.com/apache/spark/commit/6b6997d6c5eedb9a75af61345ae808c9d98e6f4d).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by aokolnychyi <gi...@git.apache.org>.
Github user aokolnychyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236466962
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,110 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +import org.apache.spark.util.Utils
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    --- End diff --
    
    I think the scope of this rule is a bit bigger. For example, higher-order functions, all `If` and `CaseWhen` expressions. Would it make sense to replace `"in the search condition of the WHERE/HAVING/ON(JOIN) clauses"` with `"in predicates"`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99294/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5377/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99262 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99262/testReport)** for PR 23139 at commit [`e416810`](https://github.com/apache/spark/commit/e41681096867cbc6d2556da83ce733092d6df841).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99283/testReport)** for PR 23139 at commit [`3501420`](https://github.com/apache/spark/commit/350142028a003e3729cb05f37983e77be548deff).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99283/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236157802
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,107 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition.
    + *
    + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition
    + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable
    + * in very limited cases.
    + *
    + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch
    + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions.
    + *
    + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified
    + * into `Project(Literal(2))`.
    + */
    +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case cw @ CaseWhen(branches, _) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        cw.copy(branches = newBranches)
    +      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        af.copy(function = newLambda)
    +      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        ae.copy(function = newLambda)
    +      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        mf.copy(function = newLambda)
    +    }
    +  }
    +
    +  /**
    +   * Recursively traverse the Boolean-type expression to replace
    +   * `Literal(null, BooleanType)` with `FalseLiteral`, if possible.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
    +   * `Literal(null, BooleanType)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = {
    +    if (e.dataType != BooleanType) {
    --- End diff --
    
    See this line https://github.com/apache/spark/pull/23139/files/e41681096867cbc6d2556da83ce733092d6df841#diff-a1acb054bc8888376603ef510e6d0ee0
    
    My major concern is we should not completely rely on the caller to ensure the data type is Boolean. In the future, the new code changes might not completely follow our current assumption. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236377208
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,107 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition.
    + *
    + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition
    + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable
    + * in very limited cases.
    + *
    + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch
    + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions.
    + *
    + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified
    + * into `Project(Literal(2))`.
    + */
    +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case cw @ CaseWhen(branches, _) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        cw.copy(branches = newBranches)
    +      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        af.copy(function = newLambda)
    +      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        ae.copy(function = newLambda)
    +      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        mf.copy(function = newLambda)
    +    }
    +  }
    +
    +  /**
    +   * Recursively traverse the Boolean-type expression to replace
    +   * `Literal(null, BooleanType)` with `FalseLiteral`, if possible.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
    +   * `Literal(null, BooleanType)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = {
    +    if (e.dataType != BooleanType) {
    --- End diff --
    
    Had an offline discussion with @rednaxelafx . We can issue an exception instead of silently bypass it. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99255 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99255/testReport)** for PR 23139 at commit [`6b6997d`](https://github.com/apache/spark/commit/6b6997d6c5eedb9a75af61345ae808c9d98e6f4d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99270/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    **[Test build #99283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99283/testReport)** for PR 23139 at commit [`3501420`](https://github.com/apache/spark/commit/350142028a003e3729cb05f37983e77be548deff).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23139
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org