You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/10/13 06:23:38 UTC

[GitHub] spark pull request #22711: [SPARK-25714] improve the comment inside BooleanS...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/22711

    [SPARK-25714] improve the comment inside BooleanSimplification rule

    ## What changes were proposed in this pull request?
    
    (Please fill in changes proposed in this fix)
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22711.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22711
    
----
commit 02e61004b9542063bed5a584a9eb5914299ab1c8
Author: Wenchen Fan <we...@...>
Date:   2018-10-13T06:17:49Z

    improve the comment inside BooleanSimplification rule

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    **[Test build #97334 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97334/testReport)** for PR 22711 at commit [`21dab84`](https://github.com/apache/spark/commit/21dab84d2d54491a1cdd63f91ae5e83727f99526).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97342/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22711: [SPARK-25714][SQL][followup] improve the comment ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22711


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3938/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3940/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3945/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    cc @gatorsmile 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97334/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    **[Test build #97332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97332/testReport)** for PR 22711 at commit [`02e6100`](https://github.com/apache/spark/commit/02e61004b9542063bed5a584a9eb5914299ab1c8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    **[Test build #97342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97342/testReport)** for PR 22711 at commit [`21dab84`](https://github.com/apache/spark/commit/21dab84d2d54491a1cdd63f91ae5e83727f99526).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22711: [SPARK-25714][SQL][followup] improve the comment ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22711#discussion_r224951190
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -276,31 +276,37 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           case a And b if a.semanticEquals(b) => a
           case a Or b if a.semanticEquals(b) => a
     
    -      // The following optimization is applicable only when the operands are not nullable,
    +      // The following optimizations are applicable only when the operands are not nullable,
           // since the three-value logic of AND and OR are different in NULL handling.
           // See the chart:
           // +---------+---------+---------+---------+
    -      // |    p    |    q    | p OR q  | p AND q |
    +      // | operand | operand |   OR    |   AND   |
    --- End diff --
    
    AND and OR is commutative, so we can reduce the entries in this table


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22711: [SPARK-25714][SQL][followup] improve the comment ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22711#discussion_r224951223
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -276,31 +276,37 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           case a And b if a.semanticEquals(b) => a
           case a Or b if a.semanticEquals(b) => a
     
    -      // The following optimization is applicable only when the operands are not nullable,
    +      // The following optimizations are applicable only when the operands are not nullable,
           // since the three-value logic of AND and OR are different in NULL handling.
           // See the chart:
           // +---------+---------+---------+---------+
    -      // |    p    |    q    | p OR q  | p AND q |
    +      // | operand | operand |   OR    |   AND   |
           // +---------+---------+---------+---------+
           // | TRUE    | TRUE    | TRUE    | TRUE    |
           // | TRUE    | FALSE   | TRUE    | FALSE   |
    -      // | TRUE    | UNKNOWN | TRUE    | UNKNOWN |
    -      // | FALSE   | TRUE    | TRUE    | FALSE   |
           // | FALSE   | FALSE   | FALSE   | FALSE   |
    -      // | FALSE   | UNKNOWN | UNKNOWN | FALSE   |
           // | UNKNOWN | TRUE    | TRUE    | UNKNOWN |
           // | UNKNOWN | FALSE   | UNKNOWN | FALSE   |
           // | UNKNOWN | UNKNOWN | UNKNOWN | UNKNOWN |
           // +---------+---------+---------+---------+
    +
    +      // This can break if a is null and c is false, so a can't be nullable.
           case a And (b Or c) if !a.nullable && Not(a).semanticEquals(b) => And(a, c)
    +      // This can break if a is null and b is false, so a can't be nullable.
           case a And (b Or c) if !a.nullable && Not(a).semanticEquals(c) => And(a, b)
    -      case (a Or b) And c if !a.nullable && a.semanticEquals(Not(c)) => And(b, c)
    -      case (a Or b) And c if !b.nullable && b.semanticEquals(Not(c)) => And(a, c)
    +      // This can break if c is null and b is false, so c can't be nullable.
    +      case (a Or b) And c if !c.nullable && a.semanticEquals(Not(c)) => And(b, c)
    --- End diff --
    
    `b.nullable` and `c.nullable` are same, because `a.semanticEquals(Not(c))`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    **[Test build #97332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97332/testReport)** for PR 22711 at commit [`02e6100`](https://github.com/apache/spark/commit/02e61004b9542063bed5a584a9eb5914299ab1c8).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22711: [SPARK-25714][SQL][followup] improve the comment ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22711#discussion_r224951215
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -276,31 +276,37 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           case a And b if a.semanticEquals(b) => a
           case a Or b if a.semanticEquals(b) => a
     
    -      // The following optimization is applicable only when the operands are not nullable,
    +      // The following optimizations are applicable only when the operands are not nullable,
           // since the three-value logic of AND and OR are different in NULL handling.
           // See the chart:
           // +---------+---------+---------+---------+
    -      // |    p    |    q    | p OR q  | p AND q |
    +      // | operand | operand |   OR    |   AND   |
           // +---------+---------+---------+---------+
           // | TRUE    | TRUE    | TRUE    | TRUE    |
           // | TRUE    | FALSE   | TRUE    | FALSE   |
    -      // | TRUE    | UNKNOWN | TRUE    | UNKNOWN |
    -      // | FALSE   | TRUE    | TRUE    | FALSE   |
           // | FALSE   | FALSE   | FALSE   | FALSE   |
    -      // | FALSE   | UNKNOWN | UNKNOWN | FALSE   |
           // | UNKNOWN | TRUE    | TRUE    | UNKNOWN |
           // | UNKNOWN | FALSE   | UNKNOWN | FALSE   |
           // | UNKNOWN | UNKNOWN | UNKNOWN | UNKNOWN |
           // +---------+---------+---------+---------+
    +
    +      // This can break if a is null and c is false, so a can't be nullable.
    --- End diff --
    
    explain which input can break it, so it's easier to understand.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    LGTM
    
    Thanks! Merged to master/2.4


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    **[Test build #97342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97342/testReport)** for PR 22711 at commit [`21dab84`](https://github.com/apache/spark/commit/21dab84d2d54491a1cdd63f91ae5e83727f99526).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22711: [SPARK-25714][SQL][followup] improve the comment ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22711#discussion_r224951558
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -276,31 +276,37 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           case a And b if a.semanticEquals(b) => a
           case a Or b if a.semanticEquals(b) => a
     
    -      // The following optimization is applicable only when the operands are not nullable,
    +      // The following optimizations are applicable only when the operands are not nullable,
           // since the three-value logic of AND and OR are different in NULL handling.
           // See the chart:
           // +---------+---------+---------+---------+
    -      // |    p    |    q    | p OR q  | p AND q |
    +      // | operand | operand |   OR    |   AND   |
           // +---------+---------+---------+---------+
           // | TRUE    | TRUE    | TRUE    | TRUE    |
           // | TRUE    | FALSE   | TRUE    | FALSE   |
    -      // | TRUE    | UNKNOWN | TRUE    | UNKNOWN |
    -      // | FALSE   | TRUE    | TRUE    | FALSE   |
           // | FALSE   | FALSE   | FALSE   | FALSE   |
    -      // | FALSE   | UNKNOWN | UNKNOWN | FALSE   |
           // | UNKNOWN | TRUE    | TRUE    | UNKNOWN |
           // | UNKNOWN | FALSE   | UNKNOWN | FALSE   |
           // | UNKNOWN | UNKNOWN | UNKNOWN | UNKNOWN |
           // +---------+---------+---------+---------+
    +
    +      // This can break if a is null and c is false, so a can't be nullable.
    --- End diff --
    
    The current code will not break. Thus, the comment is confusing to the future reader. To make it clear, we can just give the actual value. 
    > (NULL And (NULL Or FALSE)) = NULL, but (NULL And FALSE) = FALSE. Thus, a can't be nullable. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97332/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22711: [SPARK-25714][SQL][followup] improve the comment inside ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22711
  
    **[Test build #97334 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97334/testReport)** for PR 22711 at commit [`21dab84`](https://github.com/apache/spark/commit/21dab84d2d54491a1cdd63f91ae5e83727f99526).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org