You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by viirya <gi...@git.apache.org> on 2017/03/07 05:25:07 UTC

[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/17186

    [SPARK-19846][SQL] Add a flag to disable constraint propagation

    ## What changes were proposed in this pull request?
    
    Constraint propagation can be computation expensive and block the driver execution for long time. For example, the below benchmark needs 30mins.
    
    Compared with previous PRs #16998, #16785, this is a much simpler option: add a flag to disable constraint propagation.
    
    ### Benchmark
    
    Run the following codes locally.
    
        import org.apache.spark.ml.{Pipeline, PipelineStage}
        import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer, VectorAssembler}
        import org.apache.spark.sql.internal.SQLConf
    
        spark.conf.set(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key, false)
    
        val df = (1 to 40).foldLeft(Seq((1, "foo"), (2, "bar"), (3, "baz")).toDF("id", "x0"))((df, i) => df.withColumn(s"x$i", $"x0"))
    
        val indexers = df.columns.tail.map(c => new StringIndexer()
          .setInputCol(c)
          .setOutputCol(s"${c}_indexed")
          .setHandleInvalid("skip"))
    
        val encoders = indexers.map(indexer => new OneHotEncoder()
          .setInputCol(indexer.getOutputCol)
          .setOutputCol(s"${indexer.getOutputCol}_encoded")
          .setDropLast(true))
    
        val stages: Array[PipelineStage] = indexers ++ encoders
        val pipeline = new Pipeline().setStages(stages)
    
        val startTime = System.nanoTime
        pipeline.fit(df).transform(df).show
        val runningTime = System.nanoTime - startTime
    
    Before this patch: 1786001 ms ~= 30 mins
    After this patch: 33105 ms = about half of a minute
    
    Related PRs: #16998, #16785.
    
    ## How was this patch tested?
    
    Jenkins tests.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 add-flag-disable-constraint-propagation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17186.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17186
    
----
commit 44e494bd1725645a2ce13321ec846098d8da6ae4
Author: Liang-Chi Hsieh <vi...@gmail.com>
Date:   2017-03-07T04:59:32Z

    Add a flag to disable constraint propagation.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74588/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74581/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107824144
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -187,6 +187,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    +    .doc("When true, the query optimizer will use constraint propagation in query plans to " +
    --- End diff --
    
    Looks good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74473/testReport)** for PR 17186 at commit [`0e204bc`](https://github.com/apache/spark/commit/0e204bc226cfa520fe76a83e233790153c776522).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74173/testReport)** for PR 17186 at commit [`c863f67`](https://github.com/apache/spark/commit/c863f67022514606f797acff0e5c8d7da591bb14).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105792974
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -606,9 +606,10 @@ object CollapseWindow extends Rule[LogicalPlan] {
      * Note: While this optimization is applicable to all types of join, it primarily benefits Inner and
      * LeftSemi joins.
      */
    -object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelper {
    +case class InferFiltersFromConstraints(conf: CatalystConf)
    +    extends Rule[LogicalPlan] with PredicateHelper {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    +1; this rule is just a no-op if constraints aren't inferred.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107826699
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OuterJoinEliminationSuite.scala ---
    @@ -31,7 +32,7 @@ class OuterJoinEliminationSuite extends PlanTest {
           Batch("Subqueries", Once,
             EliminateSubqueryAliases) ::
           Batch("Outer Join Elimination", Once,
    -        EliminateOuterJoin,
    +        EliminateOuterJoin(SimpleCatalystConf(caseSensitiveAnalysis = true)),
    --- End diff --
    
    Added a test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74171/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74071/testReport)** for PR 17186 at commit [`44e494b`](https://github.com/apache/spark/commit/44e494bd1725645a2ce13321ec846098d8da6ae4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class InferFiltersFromConstraints(conf: CatalystConf)`
      * `case class PruneFilters(conf: CatalystConf) extends Rule[LogicalPlan] with PredicateHelper `
      * `case class EliminateOuterJoin(conf: CatalystConf) extends Rule[LogicalPlan] with PredicateHelper `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74174 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74174/testReport)** for PR 17186 at commit [`3eda726`](https://github.com/apache/spark/commit/3eda726e5694d46b1aee20d88cf4a601fd87c379).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @hvanhovell It is late in local time, I addressed the comments first. I will add test tomorrow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74108/testReport)** for PR 17186 at commit [`ae9f037`](https://github.com/apache/spark/commit/ae9f0370f20dfb90e3abeed22258ce224f43b175).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105805819
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -606,9 +606,10 @@ object CollapseWindow extends Rule[LogicalPlan] {
      * Note: While this optimization is applicable to all types of join, it primarily benefits Inner and
      * LeftSemi joins.
      */
    -object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelper {
    +case class InferFiltersFromConstraints(conf: CatalystConf)
    +    extends Rule[LogicalPlan] with PredicateHelper {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @sameeragarwal Rethink about it, I think let `QueryPlan` returns constraints depending on the flag is more easy to test. I will give it a try.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75043/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74581 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74581/testReport)** for PR 17186 at commit [`d3b0a72`](https://github.com/apache/spark/commit/d3b0a7237c5f70ea64a786ecf63edac13617d284).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merging in master.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Thanks @viirya, this approach makes sense to me. Can you please modify `InferFiltersFromConstraints` and I'll take a closer look.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74071/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107806043
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OuterJoinEliminationSuite.scala ---
    @@ -31,7 +32,7 @@ class OuterJoinEliminationSuite extends PlanTest {
           Batch("Subqueries", Once,
             EliminateSubqueryAliases) ::
           Batch("Outer Join Elimination", Once,
    -        EliminateOuterJoin,
    +        EliminateOuterJoin(SimpleCatalystConf(caseSensitiveAnalysis = true)),
    --- End diff --
    
    Can we add a test for outer join elimination as well? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107804505
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala ---
    @@ -33,7 +34,19 @@ class PruneFiltersSuite extends PlanTest {
             EliminateSubqueryAliases) ::
           Batch("Filter Pushdown and Pruning", Once,
             CombineFilters,
    -        PruneFilters,
    +        PruneFilters(SimpleCatalystConf(caseSensitiveAnalysis = true)),
    +        PushDownPredicate,
    +        PushPredicateThroughJoin) :: Nil
    +  }
    +
    +  object OptimizeDisableConstraintPropagation extends RuleExecutor[LogicalPlan] {
    --- End diff --
    
    nit: same as above


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #75139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75139/testReport)** for PR 17186 at commit [`a02c8cb`](https://github.com/apache/spark/commit/a02c8cbf80a8977e80ce21e0db9b4c0b61753331).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @viirya could you add a test?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @hvanhovell @sameeragarwal Following @sameeragarwal's comment, now instead of disabling every optimization, a new method `getConstraints` in `QueryPlan` will return empty constraints if the flag is disabled, otherwise returning original propagated constraints.
    
    Please take a look. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74253/testReport)** for PR 17186 at commit [`eb200d6`](https://github.com/apache/spark/commit/eb200d618b8e2d86b1d29df2350319bf69e36738).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74171 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74171/testReport)** for PR 17186 at commit [`8318152`](https://github.com/apache/spark/commit/8318152521371744cfc91c3ff810aa28acb5760e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    LGTM, thanks! cc @hvanhovell 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by nsyca <gi...@git.apache.org>.
Github user nsyca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r106166444
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala ---
    @@ -133,4 +146,28 @@ class PruneFiltersSuite extends PlanTest {
         val correctAnswer = testRelation.where(Rand(10) > 5).where(Rand(10) > 5).select('a).analyze
         comparePlans(optimized, correctAnswer)
       }
    +
    +  test("No pruning when constraint propagation is disabled") {
    +    val tr1 = LocalRelation('a.int, 'b.int, 'c.int).subquery('tr1)
    +    val tr2 = LocalRelation('a.int, 'd.int, 'e.int).subquery('tr2)
    +
    +    val query = tr1
    +      .where("tr1.a".attr > 10 || "tr1.c".attr < 10)
    +      .join(tr2.where('d.attr < 100), Inner, Some("tr1.a".attr === "tr2.a".attr))
    +
    +    val queryWithUselessFilter =
    +      query.where(
    +        ("tr1.a".attr > 10 || "tr1.c".attr < 10) &&
    +          'd.attr < 100)
    +
    +    val optimized = OptimizeDisableConstraintPropagation.execute(queryWithUselessFilter.analyze)
    +    // When constraint propagation is disabled, the useless filter won't be pruned.
    +    // It gets pushed down. Because the rule `CombineFilters` runs only once, there are redundant
    +    // and duplicate filters.
    --- End diff --
    
    This behaviour does not make sense to me. If I write a query like
    
    `select * from (select * from t1 where t1.a1 > 1) tx where tx.a1 > 1`
    
    I expect that Spark evaluates the predicate only once. The wording of "constraint propagation" is misleading. In this example, there is no activity of propagation at all. Perhaps we want to distinguish the "constraints" between the ones written originally and the ones that are inferred from relationships with other predicates. When the "propagation" (or perhaps a more meaningful term "predicate inference") is set to OFF, we want to exclude those inferred predicates in the `def constraints`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @sameeragarwal Thanks for the comment. I've updated `InferFiltersFromConstraints`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105806019
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    +    .doc("When true, the query optimizer will use constraint propagation in query plans to " +
    +         "perform optimization. Constraint propagation can be computation expensive for long " +
    --- End diff --
    
    Thanks. Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @sameeragarwal Thanks a lot.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @sameeragarwal To do that, we may need to change `constraints` to a method taking `CatalystConf`.  As `constraints` is public, is it good to do?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107800328
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -615,9 +615,18 @@ object CollapseWindow extends Rule[LogicalPlan] {
      * Note: While this optimization is applicable to all types of join, it primarily benefits Inner and
      * LeftSemi joins.
      */
    -object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelper {
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +case class InferFiltersFromConstraints(conf: CatalystConf)
    +    extends Rule[LogicalPlan] with PredicateHelper {
    +  def apply(plan: LogicalPlan): LogicalPlan = if (conf.constraintPropagationEnabled) {
    +    inferFilters(plan)
    +  } else {
    +    plan
    +  }
    +
    +
    +  private def inferFilters(plan: LogicalPlan): LogicalPlan = plan transform {
         case filter @ Filter(condition, child) =>
    +      val constraintEnabled = conf.constraintPropagationEnabled
    --- End diff --
    
    this is unused


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105793874
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    --- End diff --
    
    This should not be an internal flag, right? cc @sameeragarwal @hvanhovell 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74108/testReport)** for PR 17186 at commit [`ae9f037`](https://github.com/apache/spark/commit/ae9f0370f20dfb90e3abeed22258ce224f43b175).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    ping @sameeragarwal Is it possible this goes in before code-freeze of 2.2? Please let me know that, thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74248/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74253/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105805970
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    --- End diff --
    
    Not sure about it, because constraint propagation is internal details.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #75043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75043/testReport)** for PR 17186 at commit [`da09d9f`](https://github.com/apache/spark/commit/da09d9ff687afdf163d7c82971d70294ae0482df).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Here's another instantiation of the underlying bug: https://issues.apache.org/jira/browse/SPARK-19875


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105793939
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    +    .doc("When true, the query optimizer will use constraint propagation in query plans to " +
    +         "perform optimization. Constraint propagation can be computation expensive for long " +
    --- End diff --
    
    `long` -> `large`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74588/testReport)** for PR 17186 at commit [`d4c9a5e`](https://github.com/apache/spark/commit/d4c9a5ea1c5a102017f522f21c434c5311ef64c7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class JsonDataSource extends Serializable `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @sameeragarwal Btw, another point is, if we do that, we still need to transform the plan even the flag is disabled.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74253 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74253/testReport)** for PR 17186 at commit [`eb200d6`](https://github.com/apache/spark/commit/eb200d618b8e2d86b1d29df2350319bf69e36738).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  public static class LongWrapper `
      * `  public static class IntWrapper `
      * `case class CostBasedJoinReorder(conf: CatalystConf) extends Rule[LogicalPlan] with PredicateHelper `
      * `  case class JoinPlan(itemIds: Set[Int], plan: LogicalPlan, joinConds: Set[Expression], cost: Cost)`
      * `case class Cost(rows: BigInt, size: BigInt) `
      * `abstract class RepartitionOperation extends UnaryNode `
      * `case class FlatMapGroupsWithState(`
      * `class CSVOptions(`
      * `class UnivocityParser(`
      * `trait WatermarkSupport extends UnaryExecNode `
      * `case class FlatMapGroupsWithStateExec(`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r106167725
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala ---
    @@ -133,4 +146,28 @@ class PruneFiltersSuite extends PlanTest {
         val correctAnswer = testRelation.where(Rand(10) > 5).where(Rand(10) > 5).select('a).analyze
         comparePlans(optimized, correctAnswer)
       }
    +
    +  test("No pruning when constraint propagation is disabled") {
    +    val tr1 = LocalRelation('a.int, 'b.int, 'c.int).subquery('tr1)
    +    val tr2 = LocalRelation('a.int, 'd.int, 'e.int).subquery('tr2)
    +
    +    val query = tr1
    +      .where("tr1.a".attr > 10 || "tr1.c".attr < 10)
    +      .join(tr2.where('d.attr < 100), Inner, Some("tr1.a".attr === "tr2.a".attr))
    +
    +    val queryWithUselessFilter =
    +      query.where(
    +        ("tr1.a".attr > 10 || "tr1.c".attr < 10) &&
    +          'd.attr < 100)
    +
    +    val optimized = OptimizeDisableConstraintPropagation.execute(queryWithUselessFilter.analyze)
    +    // When constraint propagation is disabled, the useless filter won't be pruned.
    +    // It gets pushed down. Because the rule `CombineFilters` runs only once, there are redundant
    +    // and duplicate filters.
    --- End diff --
    
    What needs to clarify is, this behaviour is just limited to this test case. That is why I added the comment. In normal optimization, `CombineFilters` will run multiple times and the predicates will be combined.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74173/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Sorry @viirya, I'll review this first thing tomorrow morning


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105818486
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    --- End diff --
    
    Due to the fact there are few users reporting hitting this issue, we may need to expose it as an external flag.
    
    However, I would think that a large portion of external users may not know constraint propagation. It might not be intuitive to link the problem they hit to constraint propagation and to find this config, even it is external.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74248 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74248/testReport)** for PR 17186 at commit [`eb200d6`](https://github.com/apache/spark/commit/eb200d618b8e2d86b1d29df2350319bf69e36738).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107803893
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -187,6 +187,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    +    .doc("When true, the query optimizer will use constraint propagation in query plans to " +
    --- End diff --
    
    nit: 'get around the issue' might sound pretty vague to a non-expert user. How about something along these lines?
    
    ```scala
        .doc("When true, the query optimizer will infer and propagate data constraints in the query " +
          "plan to optimize them. Constraint propagation can sometimes be computationally expensive" +
          "for certain kinds of query plans (such as those with a large number of predicates and " +
          "aliases) which might negatively impact overall runtime.")
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #75139 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75139/testReport)** for PR 17186 at commit [`a02c8cb`](https://github.com/apache/spark/commit/a02c8cbf80a8977e80ce21e0db9b4c0b61753331).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r104664155
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -606,9 +606,10 @@ object CollapseWindow extends Rule[LogicalPlan] {
      * Note: While this optimization is applicable to all types of join, it primarily benefits Inner and
      * LeftSemi joins.
      */
    -object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelper {
    +case class InferFiltersFromConstraints(conf: CatalystConf)
    +    extends Rule[LogicalPlan] with PredicateHelper {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    Just check the flag before you start transforming the tree. That is a lot simpler & faster.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74108/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107804156
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/InferFiltersFromConstraintsSuite.scala ---
    @@ -31,7 +32,17 @@ class InferFiltersFromConstraintsSuite extends PlanTest {
           Batch("InferAndPushDownFilters", FixedPoint(100),
             PushPredicateThroughJoin,
             PushDownPredicate,
    -        InferFiltersFromConstraints,
    +        InferFiltersFromConstraints(SimpleCatalystConf(caseSensitiveAnalysis = true)),
    +        CombineFilters) :: Nil
    +  }
    +
    +  object OptimizeDisableConstraintPropagation extends RuleExecutor[LogicalPlan] {
    --- End diff --
    
    nit: perhaps `OptimizeWithConstraintPropagationDisabled`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17186


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74588/testReport)** for PR 17186 at commit [`d4c9a5e`](https://github.com/apache/spark/commit/d4c9a5ea1c5a102017f522f21c434c5311ef64c7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Btw, test cases are added.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74173/testReport)** for PR 17186 at commit [`c863f67`](https://github.com/apache/spark/commit/c863f67022514606f797acff0e5c8d7da591bb14).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74171 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74171/testReport)** for PR 17186 at commit [`8318152`](https://github.com/apache/spark/commit/8318152521371744cfc91c3ff810aa28acb5760e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74071/testReport)** for PR 17186 at commit [`44e494b`](https://github.com/apache/spark/commit/44e494bd1725645a2ce13321ec846098d8da6ae4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75139/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74174 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74174/testReport)** for PR 17186 at commit [`3eda726`](https://github.com/apache/spark/commit/3eda726e5694d46b1aee20d88cf4a601fd87c379).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r106012488
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    --- End diff --
    
    Constraint propagation is like predicate pushdown, which has already been used in external configurations. We can rename it to make external users easy to understand, e.g., constraints inferences. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74473/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @sameeragarwal Thanks for review! I've addressed all the comments now. Please take a look if it is good for you.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r104665124
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala ---
    @@ -135,7 +136,8 @@ object EliminateOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
       }
     
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case f @ Filter(condition, j @ Join(_, _, RightOuter | LeftOuter | FullOuter, _)) =>
    +    case f @ Filter(condition, j @ Join(_, _, RightOuter | LeftOuter | FullOuter, _))
    +        if conf.constraintPropagationEnabled =>
    --- End diff --
    
    This is far to restrictive. We can still eliminate outer joins without constraint propagation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r105815354
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -190,6 +190,15 @@ object SQLConf {
         .booleanConf
         .createWithDefault(false)
     
    +  val CONSTRAINT_PROPAGATION_ENABLED = buildConf("spark.sql.constraintPropagation.enabled")
    +    .internal()
    --- End diff --
    
    To determine whether a flag is internal or not, we should consider the impact of external users. If users could easily hit this, we might need to expose it as an external flag and document it in [the public document](http://spark.apache.org/docs/latest/sql-programming-guide.html).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    ping @sameeragarwal This is updated according to your previous comment. Can you help review this? Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r106172104
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala ---
    @@ -133,4 +146,28 @@ class PruneFiltersSuite extends PlanTest {
         val correctAnswer = testRelation.where(Rand(10) > 5).where(Rand(10) > 5).select('a).analyze
         comparePlans(optimized, correctAnswer)
       }
    +
    +  test("No pruning when constraint propagation is disabled") {
    +    val tr1 = LocalRelation('a.int, 'b.int, 'c.int).subquery('tr1)
    +    val tr2 = LocalRelation('a.int, 'd.int, 'e.int).subquery('tr2)
    +
    +    val query = tr1
    +      .where("tr1.a".attr > 10 || "tr1.c".attr < 10)
    +      .join(tr2.where('d.attr < 100), Inner, Some("tr1.a".attr === "tr2.a".attr))
    +
    +    val queryWithUselessFilter =
    +      query.where(
    +        ("tr1.a".attr > 10 || "tr1.c".attr < 10) &&
    +          'd.attr < 100)
    +
    +    val optimized = OptimizeDisableConstraintPropagation.execute(queryWithUselessFilter.analyze)
    +    // When constraint propagation is disabled, the useless filter won't be pruned.
    +    // It gets pushed down. Because the rule `CombineFilters` runs only once, there are redundant
    +    // and duplicate filters.
    --- End diff --
    
    This is a workaround in short term. Actually I have proposed another approach to bring new data structure for constraint propagation in #16998. But it is more complex and may need more time to consider and review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    cc @sameeragarwal @hvanhovell 
    
    This is a much simpler option: add a flag to disable constraint propagation, if we are ok for skipping optimizations relying on constraints (`InferFiltersFromConstraints`, `PruneFilters`, `EliminateOuterJoin`) when disabling this flag for the uncommon cases.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #74473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74473/testReport)** for PR 17186 at commit [`0e204bc`](https://github.com/apache/spark/commit/0e204bc226cfa520fe76a83e233790153c776522).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r107824151
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -615,9 +615,18 @@ object CollapseWindow extends Rule[LogicalPlan] {
      * Note: While this optimization is applicable to all types of join, it primarily benefits Inner and
      * LeftSemi joins.
      */
    -object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelper {
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +case class InferFiltersFromConstraints(conf: CatalystConf)
    +    extends Rule[LogicalPlan] with PredicateHelper {
    +  def apply(plan: LogicalPlan): LogicalPlan = if (conf.constraintPropagationEnabled) {
    +    inferFilters(plan)
    +  } else {
    +    plan
    +  }
    +
    +
    +  private def inferFilters(plan: LogicalPlan): LogicalPlan = plan transform {
         case filter @ Filter(condition, child) =>
    +      val constraintEnabled = conf.constraintPropagationEnabled
    --- End diff --
    
    oh. missing it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

Posted by nsyca <gi...@git.apache.org>.
Github user nsyca commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17186#discussion_r106169573
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala ---
    @@ -133,4 +146,28 @@ class PruneFiltersSuite extends PlanTest {
         val correctAnswer = testRelation.where(Rand(10) > 5).where(Rand(10) > 5).select('a).analyze
         comparePlans(optimized, correctAnswer)
       }
    +
    +  test("No pruning when constraint propagation is disabled") {
    +    val tr1 = LocalRelation('a.int, 'b.int, 'c.int).subquery('tr1)
    +    val tr2 = LocalRelation('a.int, 'd.int, 'e.int).subquery('tr2)
    +
    +    val query = tr1
    +      .where("tr1.a".attr > 10 || "tr1.c".attr < 10)
    +      .join(tr2.where('d.attr < 100), Inner, Some("tr1.a".attr === "tr2.a".attr))
    +
    +    val queryWithUselessFilter =
    +      query.where(
    +        ("tr1.a".attr > 10 || "tr1.c".attr < 10) &&
    +          'd.attr < 100)
    +
    +    val optimized = OptimizeDisableConstraintPropagation.execute(queryWithUselessFilter.analyze)
    +    // When constraint propagation is disabled, the useless filter won't be pruned.
    +    // It gets pushed down. Because the rule `CombineFilters` runs only once, there are redundant
    +    // and duplicate filters.
    --- End diff --
    
    I am aware of it. My point is when users turn on this setting in hope of alleviating a lengthy compilation time, they will get this "unintentional" side effect that could lengthen the execution time of evaluating the same predicate twice.
    
    Overall, I agree with your approach but the point I raised could be a followup work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74174/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    @hvanhovell @sameeragarwal Please let me know if you have more thoughts on the new change. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17186
  
    **[Test build #75043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75043/testReport)** for PR 17186 at commit [`da09d9f`](https://github.com/apache/spark/commit/da09d9ff687afdf163d7c82971d70294ae0482df).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org