You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by gengliangwang <gi...@git.apache.org> on 2018/01/15 15:17:29 UTC

[GitHub] spark pull request #20270: [SPARK-23079] Fix query constraints propagation w...

GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/20270

    [SPARK-23079] Fix query constraints propagation with aliases

    ## What changes were proposed in this pull request?
    
    Previously, PR #19201 fix the problem of non-converging constraints.
    
    After that PR #19149 improve the loop and constraints is inferred only once.
    
    So the problem of non-converging constraints is gone.
    
    Also, in current code, the case below will fail.
    
    ```
    
    spark.range(5).write.saveAsTable("t")
    val t = spark.read.table("t")
    val left = t.withColumn("xid", $"id" + lit(1)).as("x")
    val right = t.withColumnRenamed("id", "xid").as("y")
    val df = left.join(right, "xid").filter("id = 3").toDF()
    checkAnswer(df, Row(4, 3))
    
    ```
    
    Because `aliasMap` replace all the aliased child. See the test case in PR for details.
    
    The PR remove the useless code for preventing the non-converging constraints. Also, infer the constraints with `EqualNullSafe` as well.
     
    
    ## How was this patch tested?
    
    Unit test


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark FixConstraint

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20270.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20270
    
----
commit e9dd769cc96158ba3b1597bd5eb6fb824aeab22a
Author: Wang Gengliang <lt...@...>
Date:   2018-01-15T15:10:57Z

    SPARK-23079: constraints should be inferred correctly with aliases

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86173/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86173/testReport)** for PR 20270 at commit [`8a22e1d`](https://github.com/apache/spark/commit/8a22e1d1705f0aa546b04dca0f5ffc60394b0f24).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20270: [SPARK-23079][SQL] Fix query constraints propagat...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang closed the pull request at:

    https://github.com/apache/spark/pull/20270


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20270: [SPARK-23079] Fix query constraints propagation w...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20270#discussion_r161634560
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala ---
    @@ -94,56 +94,23 @@ trait QueryPlanConstraints { self: LogicalPlan =>
         case _ => Seq.empty[Attribute]
       }
     
    -  // Collect aliases from expressions of the whole tree rooted by the current QueryPlan node, so
    -  // we may avoid producing recursive constraints.
    -  private lazy val aliasMap: AttributeMap[Expression] = AttributeMap(
    -    expressions.collect {
    -      case a: Alias if !a.child.isInstanceOf[Literal] => (a.toAttribute, a.child)
    -    } ++ children.flatMap(_.asInstanceOf[QueryPlanConstraints].aliasMap))
    -    // Note: the explicit cast is necessary, since Scala compiler fails to infer the type.
    -
       /**
        * Infers an additional set of constraints from a given set of equality constraints.
        * For e.g., if an operator has constraints of the form (`a = 5`, `a = b`), this returns an
        * additional constraint of the form `b = 5`.
        */
       private def inferAdditionalConstraints(constraints: Set[Expression]): Set[Expression] = {
    -    val aliasedConstraints = eliminateAliasedExpressionInConstraints(constraints)
    --- End diff --
    
    Let us revert this back?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86176/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86135/testReport)** for PR 20270 at commit [`e9dd769`](https://github.com/apache/spark/commit/e9dd769cc96158ba3b1597bd5eb6fb824aeab22a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86217/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86217/testReport)** for PR 20270 at commit [`aaad66a`](https://github.com/apache/spark/commit/aaad66aef2c1b8349660211a8589e906795ff0e8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86176 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86176/testReport)** for PR 20270 at commit [`aaad66a`](https://github.com/apache/spark/commit/aaad66aef2c1b8349660211a8589e906795ff0e8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86175/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86217/testReport)** for PR 20270 at commit [`aaad66a`](https://github.com/apache/spark/commit/aaad66aef2c1b8349660211a8589e906795ff0e8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86175 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86175/testReport)** for PR 20270 at commit [`05c32bf`](https://github.com/apache/spark/commit/05c32bf632e03b3c1530d3cdff5f2efeaf3707a9).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079][SQL] Fix query constraints propagation wit...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    **[Test build #86135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86135/testReport)** for PR 20270 at commit [`e9dd769`](https://github.com/apache/spark/commit/e9dd769cc96158ba3b1597bd5eb6fb824aeab22a).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20270: [SPARK-23079] Fix query constraints propagation with ali...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20270
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86135/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org