You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jiangxb1987 <gi...@git.apache.org> on 2017/08/31 21:57:47 UTC

[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/19099

    [SPARK-21652][SQL] Fix rule confliction between InferFiltersFromConstraints and ConstantPropagation

    ## What changes were proposed in this pull request?
    
    For the given example below, the predicate added by `InferFiltersFromConstraints` is folded by `ConstantPropagation` later, this leads to unconverged optimize iteration:
    ```
    Seq((1, 1)).toDF("col1", "col2").createOrReplaceTempView("t1")
    Seq(1, 2).toDF("col").createOrReplaceTempView("t2")
    sql("SELECT * FROM t1, t2 WHERE t1.col1 = 1 AND 1 = t1.col2 AND t1.col1 = t2.col AND t1.col2 = t2.col")
    ```
    
    We can fix this by adjusting the indent of the optimize rules.
    
    ## How was this patch tested?
    
    Add test case that would have failed in `SQLQuerySuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark unconverge-optimization

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19099.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19099
    
----
commit 7a364a192f15bc99e362a2615c775730cb11fc24
Author: Xingbo Jiang <xi...@databricks.com>
Date:   2017-08-31T21:49:02Z

    fix rule confliction between InferFiltersFromConstraints and ConstantPropagation.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81299/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81323/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81323 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81323/testReport)** for PR 19099 at commit [`231ca3f`](https://github.com/apache/spark/commit/231ca3f9879cd844fa1bf6f98b2418e5f8137f8f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19099#discussion_r136479686
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -79,11 +79,11 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
           PushProjectionThroughUnion,
           ReorderJoin,
           EliminateOuterJoin,
    +      InferFiltersFromConstraints,
    --- End diff --
    
    Why you changed this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    LGTM 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81418/testReport)** for PR 19099 at commit [`231ca3f`](https://github.com/apache/spark/commit/231ca3f9879cd844fa1bf6f98b2418e5f8137f8f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19099#discussion_r136479931
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2663,4 +2664,31 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         // In unit test, Spark will fail the query if memory leak detected.
         spark.range(100).groupBy("id").count().limit(1).collect()
       }
    +
    +  test("SPARK-21652: rule confliction of InferFiltersFromConstraints and ConstantPropagation") {
    +    // Under test environment, throws Exception if the max iteration number is reached for an
    --- End diff --
    
    Just catching the exception is not enough for this test?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81322 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81322/testReport)** for PR 19099 at commit [`9b1c642`](https://github.com/apache/spark/commit/9b1c642b1ec05b3c7e95f342a59aea6be35a0b35).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81418/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81299 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81299/testReport)** for PR 19099 at commit [`7a364a1`](https://github.com/apache/spark/commit/7a364a192f15bc99e362a2615c775730cb11fc24).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81322/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81322 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81322/testReport)** for PR 19099 at commit [`9b1c642`](https://github.com/apache/spark/commit/9b1c642b1ec05b3c7e95f342a59aea6be35a0b35).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19099#discussion_r136678259
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -79,11 +79,11 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
           PushProjectionThroughUnion,
           ReorderJoin,
           EliminateOuterJoin,
    +      InferFiltersFromConstraints,
    --- End diff --
    
    aha, thanks for the explanation~


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81323 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81323/testReport)** for PR 19099 at commit [`231ca3f`](https://github.com/apache/spark/commit/231ca3f9879cd844fa1bf6f98b2418e5f8137f8f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Thanks! Merging to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19099


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    cc @gatorsmile 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81418/testReport)** for PR 19099 at commit [`231ca3f`](https://github.com/apache/spark/commit/231ca3f9879cd844fa1bf6f98b2418e5f8137f8f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19099#discussion_r136637929
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -79,11 +79,11 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
           PushProjectionThroughUnion,
           ReorderJoin,
           EliminateOuterJoin,
    +      InferFiltersFromConstraints,
    --- End diff --
    
    This is not ideal, but it should resolve the specific failure case and could unblock other PR.
    
    @gatorsmile is working on an alter approach that should perfectly resolve the entire issue of add/remove filters, but it takes time and I don't think it will be ready in a few days.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19099#discussion_r136664387
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -79,11 +79,11 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
           PushProjectionThroughUnion,
           ReorderJoin,
           EliminateOuterJoin,
    +      InferFiltersFromConstraints,
    --- End diff --
    
    Basically, this is to infer the filters at the beginning of this huge batch, and then remove the useless at the end. When it involves multiple rules, we might still hit the max iteration. Eventually, we need a clean fix to split the whole batch to two separate ones. One is to add the filters/constraints; another is to remove the filters/constraints based on the costs.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    **[Test build #81299 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81299/testReport)** for PR 19099 at commit [`7a364a1`](https://github.com/apache/spark/commit/7a364a192f15bc99e362a2615c775730cb11fc24).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19099
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org