You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2016/05/02 22:41:52 UTC

[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/12850

    [SPARK-15076][SQL] Improve ConstantFolding optimizer by using integral associative property

    ## What changes were proposed in this pull request?
    
    This issue improves `ConstantFolding` optimizer by taking advantage of integral associative property. Currently, `ConstantFolding` works like the following.
    
    1) Can optimize `1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + a` into `45 + a`.
    2) Cannot optimize `a + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9`.
    
    This issue enables `ConstantFolding` to handle Case 2 for Add/Multiply expression whose data types are `ByteType`, `ShortType`, `IntegerType`, and `LongType`. The followings are the plan comparision between `before` and `after` this issue.
    
    **Before**
    ```
    scala> sql("select a+1+2+3+4+5+6+7+8+9 from (select explode(array(1)) a)").explain
    == Physical Plan ==
    WholeStageCodegen
    :  +- Project [(((((((((a#7 + 1) + 2) + 3) + 4) + 5) + 6) + 7) + 8) + 9) AS (((((((((a + 1) + 2) + 3) + 4) + 5) + 6) + 7) + 8) + 9)#8]
    :     +- INPUT
    +- Generate explode([1]), false, false, [a#7]
       +- Scan OneRowRelation[]
    ```
    
    **After**
    ```
    scala> sql("select a+1+2+3+4+5+6+7+8+9 from (select explode(array(1)) a)").explain
    == Physical Plan ==
    WholeStageCodegen
    :  +- Project [(a#7 + 45) AS (((((((((a + 1) + 2) + 3) + 4) + 5) + 6) + 7) + 8) + 9)#8]
    :     +- INPUT
    +- Generate explode([1]), false, false, [a#7]
       +- Scan OneRowRelation[]
    ```
    
    ## How was this patch tested?
    
    Pass the Jenkins tests including new and updated testcases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-15076

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12850.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12850
    
----
commit 71c3c73ad0f507837539ed58489e63e501bd4c3d
Author: Dongjoon Hyun <do...@apache.org>
Date:   2016-05-02T22:38:31Z

    [SPARK-15076][SQL] Improve ConstantFolding optimizer by using integral associative property.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218922877
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58519/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65484414
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -738,6 +739,42 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     }
     
     /**
    + * Reorder associative integral-type operators and fold all constants into one.
    + */
    +object ReorderAssociativeOperator extends Rule[LogicalPlan] {
    +  private def flattenAdd(e: Expression): Seq[Expression] = e match {
    +    case Add(l, r) => flattenAdd(l) ++ flattenAdd(r)
    +    case other => other :: Nil
    +  }
    +
    +  private def flattenMultiply(e: Expression): Seq[Expression] = e match {
    +    case Multiply(l, r) => flattenMultiply(l) ++ flattenMultiply(r)
    +    case other => other :: Nil
    +  }
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformExpressionsDown {
    --- End diff --
    
    we should do:
    ```
    plan transform {
        case q: LogicalPlan => q transformExpressionsDown {
          ......
        }
    }
    ```
    or here we just optimize the top level plan.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217522469
  
    **[Test build #58006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58006/consoleFull)** for PR 12850 at commit [`a4a3ce3`](https://github.com/apache/spark/commit/a4a3ce31f323e7cdfe4216c63ba156f8226ede52).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65257181
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -742,6 +742,23 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
      * equivalent [[Literal]] values.
      */
     object ConstantFolding extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    --- End diff --
    
    I think this is OK, `BooleanSimplification` is also kind of constant folding but we made a new rule for it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65272379
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -742,6 +742,23 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
      * equivalent [[Literal]] values.
      */
     object ConstantFolding extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    --- End diff --
    
    Thank you. I see!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65127553
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -751,6 +751,16 @@ object ConstantFolding extends Rule[LogicalPlan] {
     
           // Fold expressions that are foldable.
           case e if e.foldable => Literal.create(e.eval(EmptyRow), e.dataType)
    +
    +      // Use associative property for integral type
    +      case e if e.isInstanceOf[BinaryArithmetic] && e.dataType.isInstanceOf[IntegralType]
    +        => e match {
    +        case Add(Add(a, b), c) if b.foldable && c.foldable => Add(a, Add(b, c))
    --- End diff --
    
    Thank you for review, @cloud-fan ! 
    I see. That sounds great.
    Let me think about how to eliminate all constants then.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65465190
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -738,6 +739,49 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     }
     
     /**
    + * Reorder associative integral-type operators and fold all constants into one.
    + */
    +object ReorderAssociativeOperator extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    +    e.deterministic && e.isInstanceOf[BinaryArithmetic] && e.dataType.isInstanceOf[IntegralType] &&
    +      isSingleOperatorExpr(e)
    +
    +  private def isSingleOperatorExpr(e: Expression): Boolean = e.find {
    +    case a: Add if a.getClass == e.getClass => false
    +    case m: Multiply if m.getClass == e.getClass => false
    +    case _: BinaryArithmetic => true
    +    case _ => false
    +  }.isEmpty
    +
    +  private def getOperandList(e: Expression): Seq[Expression] = e match {
    +    case BinaryArithmetic(a, b) => getOperandList(a) ++ getOperandList(b)
    +    case other => other :: Nil
    +  }
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case q: LogicalPlan => q transformExpressionsDown {
    --- End diff --
    
    I see. That could be.
    We also need to add `isSingleOperatorExpr` there.
    Otherwise, `flattenAdd(Add(Multiply(1, 2), 3)) -> (3)`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220859004
  
    **[Test build #59114 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59114/consoleFull)** for PR 12850 at commit [`eeae56d`](https://github.com/apache/spark/commit/eeae56d9b743e24a2657329f38a617ffa163ce5a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    BTW it goes without saying ... if you do decide to merge this, don't merge it in branch-2.0.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216386855
  
    **[Test build #57567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57567/consoleFull)** for PR 12850 at commit [`71c3c73`](https://github.com/apache/spark/commit/71c3c73ad0f507837539ed58489e63e501bd4c3d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-222275618
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    I discussed it with @davies offline and here is our conclusion:
    
    1. This feature is not that important, as users can always do it manually, i.e. change the add/multiply order, which is not a lot effort.
    2. When we have this future, users lose the control of the execution order. e.g. they may add UDFs and other literals together and they want a deterministic execution order.
    3. other corner cases like overflow
    
    In general, we think this feature brings too much nondeterminacy compared to the benefits it brings. What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59824/consoleFull)** for PR 12850 at commit [`3959d57`](https://github.com/apache/spark/commit/3959d57258ed03bf6e9b845569d420c40b70e5c4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218910104
  
    **[Test build #58519 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58519/consoleFull)** for PR 12850 at commit [`18f5a8a`](https://github.com/apache/spark/commit/18f5a8a08ccfa75cb34ba2902883b9b1c0b7b8eb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Thank you for deep discussion on this. I think like this.
    
    For 1), there are **machine-generated** queries by BI tools. This is an important category of queries. In many cases, BIs (or other tools having UI) will generated queries by simple rules and those rule does not care about the output queries. The optimization is the role of **DBMS** or **Spark**. So, static optimizations are always important. This PR also minimizes the size of generated codes, too.
    
    For 2), other optimizers already **remove** or **duplicate** UDFs. Spark dose not give the control of the execution order. As you know, we already made the conclusion to leave an explicit note like the following for this (in SPARK-15282 and https://github.com/apache/spark/pull/13087).
    ```
    Note that the user-defined functions must be deterministic. Due to optimization,
    duplicate invocations may be eliminated or the function may even be invoked more times than
    it is present in the query.
    ```
    
    For 3), could you give some problematic real cases? This PR reordered only **addition** or **multiplications**, but I think this PR does not change the final result value. The following is the behavior of current Spark. (Not this PR. You can see that in the physical plan.)
    
    ```scala
    scala> sql("select 2147483640 + a + 7 from (select explode(array(1,2,3)) a)").explain()
    == Physical Plan ==
    *Project [((2147483640 + a#8) + 7) AS ((2147483640 + a) + 7)#9]
    +- Generate explode([1,2,3]), false, false, [a#8]
       +- Scan OneRowRelation[]
    
    scala> sql("select 2147483640 + a + 7 from (select explode(array(1,2,3)) a)").collect()
    res1: Array[org.apache.spark.sql.Row] = Array([-2147483648], [-2147483647], [-2147483646])
    
    scala>  sql("select a + 2147483647 from (select explode(array(1,2,3)) a)").collect()
    res2: Array[org.apache.spark.sql.Row] = Array([-2147483648], [-2147483647], [-2147483646])
    
    scala> sql("select 214748364 * a from (select explode(array(1,2,3)) a)").collect()
    res3: Array[org.apache.spark.sql.Row] = Array([214748364], [429496728], [644245092])
    
    scala> sql("select 214748364 * a * 10 from (select explode(array(1,2,3)) a)").collect()
    res4: Array[org.apache.spark.sql.Row] = Array([2147483640], [-16], [2147483624])
    
    scala> sql("select a * 2147483640 from (select explode(array(1,2,3)) a)").collect()
    res5: Array[org.apache.spark.sql.Row] = Array([2147483640], [-16], [2147483624])
    ```
    
    Apparently, the optimization of this PR will work like the above.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Hi, @cloud-fan and @davies .
    How do you think about the above?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216400248
  
    **[Test build #57567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57567/consoleFull)** for PR 12850 at commit [`71c3c73`](https://github.com/apache/spark/commit/71c3c73ad0f507837539ed58489e63e501bd4c3d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220862943
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-219502167
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58648/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218256026
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58246/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65485176
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -738,6 +739,42 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     }
     
     /**
    + * Reorder associative integral-type operators and fold all constants into one.
    + */
    +object ReorderAssociativeOperator extends Rule[LogicalPlan] {
    +  private def flattenAdd(e: Expression): Seq[Expression] = e match {
    +    case Add(l, r) => flattenAdd(l) ++ flattenAdd(r)
    +    case other => other :: Nil
    +  }
    +
    +  private def flattenMultiply(e: Expression): Seq[Expression] = e match {
    +    case Multiply(l, r) => flattenMultiply(l) ++ flattenMultiply(r)
    +    case other => other :: Nil
    +  }
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformExpressionsDown {
    --- End diff --
    
    My bad. I changed this in a hurry. I'll fix soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65237290
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -742,6 +742,23 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
      * equivalent [[Literal]] values.
      */
     object ConstantFolding extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    --- End diff --
    
    Oh, that could be. 
    
    There is some difference on level of granulity.
    
    Join-related optimizers might be improved later to cost-based optimizers while ConstantFolder optimizer is just about removing constants on a single expression.
    
    Do you think it is a good idea to put the different levels of concerns together?
    
    I can do this in any way you decide. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220862945
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59114/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59700/consoleFull)** for PR 12850 at commit [`2ebc53c`](https://github.com/apache/spark/commit/2ebc53c9215e2ffb3fbd89ba5d998be394ef87d4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59700/consoleFull)** for PR 12850 at commit [`2ebc53c`](https://github.com/apache/spark/commit/2ebc53c9215e2ffb3fbd89ba5d998be394ef87d4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59788/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65465330
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -738,6 +739,49 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     }
     
     /**
    + * Reorder associative integral-type operators and fold all constants into one.
    + */
    +object ReorderAssociativeOperator extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    +    e.deterministic && e.isInstanceOf[BinaryArithmetic] && e.dataType.isInstanceOf[IntegralType] &&
    +      isSingleOperatorExpr(e)
    +
    +  private def isSingleOperatorExpr(e: Expression): Boolean = e.find {
    +    case a: Add if a.getClass == e.getClass => false
    +    case m: Multiply if m.getClass == e.getClass => false
    +    case _: BinaryArithmetic => true
    +    case _ => false
    +  }.isEmpty
    +
    +  private def getOperandList(e: Expression): Seq[Expression] = e match {
    +    case BinaryArithmetic(a, b) => getOperandList(a) ++ getOperandList(b)
    +    case other => other :: Nil
    +  }
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case q: LogicalPlan => q transformExpressionsDown {
    --- End diff --
    
    `flattenAdd(Add(Multiply(1, 2), 3))` will become `[Multiply(1, 2), 3]`, and we won't get wrong result


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220390947
  
    **[Test build #58875 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58875/consoleFull)** for PR 12850 at commit [`65c7db7`](https://github.com/apache/spark/commit/65c7db78bfcac3ce025a858101ea53383f79a523).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65464470
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReorderAssociativeOperatorSuite.scala ---
    @@ -0,0 +1,59 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class ReorderAssociativeOperatorSuite extends PlanTest {
    +
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("ReorderAssociativeOperator", Once,
    +        ReorderAssociativeOperator) :: Nil
    +  }
    +
    +  val testRelation = LocalRelation('a.int, 'b.int, 'c.int)
    +
    +  test("Reorder associative operators") {
    --- End diff --
    
    also add a test for nondeterministic case please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217751647
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218922875
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59690/consoleFull)** for PR 12850 at commit [`d022904`](https://github.com/apache/spark/commit/d022904b0071d24ddbfd9d291e8766e177264413).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    I added the missing part, `e.deterministic` check in `isAssociativelyFoldable`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-219479063
  
    **[Test build #58648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58648/consoleFull)** for PR 12850 at commit [`0b60464`](https://github.com/apache/spark/commit/0b60464e3eed1ebf6e40e4216b9e521cbf539a52).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217751614
  
    **[Test build #58113 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58113/consoleFull)** for PR 12850 at commit [`3802255`](https://github.com/apache/spark/commit/3802255328481261949976913dfd2df6248a9bb8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220417362
  
    **[Test build #58884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58884/consoleFull)** for PR 12850 at commit [`0ffd004`](https://github.com/apache/spark/commit/0ffd0043e54685885cc5a17947a5ac6281e8cf89).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216980322
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57781/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    UDF is the first thing I came out, and yes, it must be deterministic. But as we have the `deterministic` property in `Expression`, I think it's possible for users to create non-deterministic expressions, e.g. `ScalaUDAF`, or other API we may create in the future, then the execution order matters.
    
    You can still improve this PR to handle non-deterministic cases, but that will make this PR more complex and harder to reason about, which may not worth.
    
    cc @davies 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65126741
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -751,6 +751,16 @@ object ConstantFolding extends Rule[LogicalPlan] {
     
           // Fold expressions that are foldable.
           case e if e.foldable => Literal.create(e.eval(EmptyRow), e.dataType)
    +
    +      // Use associative property for integral type
    +      case e if e.isInstanceOf[BinaryArithmetic] && e.dataType.isInstanceOf[IntegralType]
    +        => e match {
    +        case Add(Add(a, b), c) if b.foldable && c.foldable => Add(a, Add(b, c))
    --- End diff --
    
    what about `a + 1 + b + 2`? I think we need a more general approach, like reordering the `Add` nodes to put all literals together.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59690/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59795/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218255698
  
    **[Test build #58246 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58246/consoleFull)** for PR 12850 at commit [`06e9b36`](https://github.com/apache/spark/commit/06e9b36c0381f6c987147803e894f2e29236deba).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    @cloud-fan .
    According to your advice, I refactored the code and added mixed(addition+multiplication) testcases. Also, the PR description is updated.
    Thank you so much again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-219502166
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217748244
  
    **[Test build #58113 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58113/consoleFull)** for PR 12850 at commit [`3802255`](https://github.com/apache/spark/commit/3802255328481261949976913dfd2df6248a9bb8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59795 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59795/consoleFull)** for PR 12850 at commit [`0acb157`](https://github.com/apache/spark/commit/0acb157111c5557ecdc8ce2189c7f0005431316a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59795 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59795/consoleFull)** for PR 12850 at commit [`0acb157`](https://github.com/apache/spark/commit/0acb157111c5557ecdc8ce2189c7f0005431316a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-222275527
  
    **[Test build #59531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59531/consoleFull)** for PR 12850 at commit [`8c8ea7a`](https://github.com/apache/spark/commit/8c8ea7a571636ee2c8cc70aa1cadced87784060e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Hi, @cloud-fan .
    Could you review again?
    Now, this PR provides a more generalized way to handle all foldable constants.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216980083
  
    **[Test build #57781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57781/consoleFull)** for PR 12850 at commit [`6898e0a`](https://github.com/apache/spark/commit/6898e0a4d1f4ba23dc7c19e73e1621c2a57cd8eb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59691/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220447148
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-222265243
  
    **[Test build #59531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59531/consoleFull)** for PR 12850 at commit [`8c8ea7a`](https://github.com/apache/spark/commit/8c8ea7a571636ee2c8cc70aa1cadced87784060e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-221334943
  
    Hi, @marmbrus and @rxin .
    Could you review this PR when you have some time?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59691/consoleFull)** for PR 12850 at commit [`4e4845c`](https://github.com/apache/spark/commit/4e4845c3ff37eec370371ebee3d80d32bb06be92).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217522739
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58006/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59691 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59691/consoleFull)** for PR 12850 at commit [`4e4845c`](https://github.com/apache/spark/commit/4e4845c3ff37eec370371ebee3d80d32bb06be92).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59690 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59690/consoleFull)** for PR 12850 at commit [`d022904`](https://github.com/apache/spark/commit/d022904b0071d24ddbfd9d291e8766e177264413).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Thank you for reconsidering this PR positively. I'll update soon according to your advice.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Hi, @cloud-fan .
    It's ready for review again.
    Could you review this when you have some time?
    Thank you always!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216980320
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217522736
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217751648
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58113/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59788/consoleFull)** for PR 12850 at commit [`8b7a0bf`](https://github.com/apache/spark/commit/8b7a0bfe21e700729e5dc1b7849e488982312875).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Thank you for feedback. I'm really happy with your attention!
    For the non-deterministic part, we can add a single condition in `isAssociativelyFoldable`.
    If some of operand of the expression is non-deterministic, the whole expression is non-deterministic. It's easy.
    Also, it's **future-proof**. In the future, although we make `NondeterministicScalaUDF` whose `deterministic==false`, this optimizer will not handle the expressions containing it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59645/consoleFull)** for PR 12850 at commit [`8956a1e`](https://github.com/apache/spark/commit/8956a1e0300c162cc8b71e836ee54c182cdd6a50).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218231334
  
    Rebased to see the result on re-enable hive queries.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220416079
  
    Rebased to trigger Jenkins test again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65221659
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -742,6 +742,23 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
      * equivalent [[Literal]] values.
      */
     object ConstantFolding extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    --- End diff --
    
    Similar to `ReorderJoin`, we should have a new rule `ReorderAssociativeOperator` to do this optimization, instead of putting it in `ConstantFolding`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    looks like it's not such difficult to handle all cases, this optimization LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-219501871
  
    **[Test build #58648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58648/consoleFull)** for PR 12850 at commit [`0b60464`](https://github.com/apache/spark/commit/0b60464e3eed1ebf6e40e4216b9e521cbf539a52).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59788/consoleFull)** for PR 12850 at commit [`8b7a0bf`](https://github.com/apache/spark/commit/8b7a0bfe21e700729e5dc1b7849e488982312875).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65464426
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -738,6 +739,49 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     }
     
     /**
    + * Reorder associative integral-type operators and fold all constants into one.
    + */
    +object ReorderAssociativeOperator extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    +    e.deterministic && e.isInstanceOf[BinaryArithmetic] && e.dataType.isInstanceOf[IntegralType] &&
    +      isSingleOperatorExpr(e)
    +
    +  private def isSingleOperatorExpr(e: Expression): Boolean = e.find {
    +    case a: Add if a.getClass == e.getClass => false
    +    case m: Multiply if m.getClass == e.getClass => false
    +    case _: BinaryArithmetic => true
    +    case _ => false
    +  }.isEmpty
    +
    +  private def getOperandList(e: Expression): Seq[Expression] = e match {
    +    case BinaryArithmetic(a, b) => getOperandList(a) ++ getOperandList(b)
    +    case other => other :: Nil
    +  }
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case q: LogicalPlan => q transformExpressionsDown {
    --- End diff --
    
    how about
    ```
    def flattenAdd(e: Expression): Seq[Expression] = e match {
      case Add(l, r) => flattenAdd(l) ++ flattenAdd(r)
      case other => other
    }
    
    ...
    plan transformAllExpressions {
      case a: Add if a.deterministic && a.dataType.isInstanceOf[IntegralType] =>
        val (foldables, others) => flattenAdd(a).partition(_.foldable)
        if (foldables.size > 1) {
          val foldableExpr = foldables.reduce(Add(_, _))
          val c = Literal.create(foldableExpr.eval(), a.dataType)
          if (others.isEmpty) c else Add(others.reduce(Add(_, _)), c)
        } else {
          a
        }
    }
    ```
    
    We can duplicate some code for `Multiply`, and I think this maybe more readable than the current version.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/12850


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-222275620
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59531/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    thanks, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220412842
  
    **[Test build #58875 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58875/consoleFull)** for PR 12850 at commit [`65c7db7`](https://github.com/apache/spark/commit/65c7db78bfcac3ce025a858101ea53383f79a523).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218256024
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216400402
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57567/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59824/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218231707
  
    **[Test build #58246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58246/consoleFull)** for PR 12850 at commit [`06e9b36`](https://github.com/apache/spark/commit/06e9b36c0381f6c987147803e894f2e29236deba).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding optimizer by ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59645/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220862881
  
    **[Test build #59114 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59114/consoleFull)** for PR 12850 at commit [`eeae56d`](https://github.com/apache/spark/commit/eeae56d9b743e24a2657329f38a617ffa163ce5a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    **[Test build #59824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59824/consoleFull)** for PR 12850 at commit [`3959d57`](https://github.com/apache/spark/commit/3959d57258ed03bf6e9b845569d420c40b70e5c4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65464887
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReorderAssociativeOperatorSuite.scala ---
    @@ -0,0 +1,59 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans.PlanTest
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class ReorderAssociativeOperatorSuite extends PlanTest {
    +
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("ReorderAssociativeOperator", Once,
    +        ReorderAssociativeOperator) :: Nil
    +  }
    +
    +  val testRelation = LocalRelation('a.int, 'b.int, 'c.int)
    +
    +  test("Reorder associative operators") {
    +    val originalQuery =
    +      testRelation
    +        .select(
    +          (Literal(3) + ((Literal(1) + 'a) + 2)) + 4,
    +          'b * 1 * 2 * 3 * 4,
    +          'a + 1 + 'b + 2 + 'c + 3,
    +          Rand(0) * 1 * 2 * 3 * 4)
    +
    --- End diff --
    
    I already added non-deterministic case here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220413082
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58875/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    cc @davies , can you take a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220447151
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58884/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216400399
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Hi, @cloud-fan .
    Now, I made a new rule `ReorderAssociativeOperator` as you recommended.
    Jira issue and PR description are updated together, too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220413078
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12850#discussion_r65465517
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -738,6 +739,49 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelpe
     }
     
     /**
    + * Reorder associative integral-type operators and fold all constants into one.
    + */
    +object ReorderAssociativeOperator extends Rule[LogicalPlan] {
    +  private def isAssociativelyFoldable(e: Expression): Boolean =
    +    e.deterministic && e.isInstanceOf[BinaryArithmetic] && e.dataType.isInstanceOf[IntegralType] &&
    +      isSingleOperatorExpr(e)
    +
    +  private def isSingleOperatorExpr(e: Expression): Boolean = e.find {
    +    case a: Add if a.getClass == e.getClass => false
    +    case m: Multiply if m.getClass == e.getClass => false
    +    case _: BinaryArithmetic => true
    +    case _ => false
    +  }.isEmpty
    +
    +  private def getOperandList(e: Expression): Seq[Expression] = e match {
    +    case BinaryArithmetic(a, b) => getOperandList(a) ++ getOperandList(b)
    +    case other => other :: Nil
    +  }
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case q: LogicalPlan => q transformExpressionsDown {
    --- End diff --
    
    Oh, I see. You generalize my PR again! Great!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-216955720
  
    **[Test build #57781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57781/consoleFull)** for PR 12850 at commit [`6898e0a`](https://github.com/apache/spark/commit/6898e0a4d1f4ba23dc7c19e73e1621c2a57cd8eb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-217499612
  
    **[Test build #58006 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58006/consoleFull)** for PR 12850 at commit [`a4a3ce3`](https://github.com/apache/spark/commit/a4a3ce31f323e7cdfe4216c63ba156f8226ede52).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/12850
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59700/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-222646568
  
    **[Test build #59645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59645/consoleFull)** for PR 12850 at commit [`8956a1e`](https://github.com/apache/spark/commit/8956a1e0300c162cc8b71e836ee54c182cdd6a50).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Oh, thank you! @cloud-fan .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-218922758
  
    **[Test build #58519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58519/consoleFull)** for PR 12850 at commit [`18f5a8a`](https://github.com/apache/spark/commit/18f5a8a08ccfa75cb34ba2902883b9b1c0b7b8eb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #12850: [SPARK-15076][SQL] Add ReorderAssociativeOperator optimi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/12850
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15076][SQL] Improve ConstantFolding opt...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/12850#issuecomment-220446855
  
    **[Test build #58884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58884/consoleFull)** for PR 12850 at commit [`0ffd004`](https://github.com/apache/spark/commit/0ffd0043e54685885cc5a17947a5ac6281e8cf89).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org