Posted to reviews@spark.apache.org by SongYadong <gi...@git.apache.org> on 2018/10/12 03:10:40 UTC

[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...

GitHub user SongYadong opened a pull request:

    https://github.com/apache/spark/pull/22706

    [SPARK-25716][SQL][MINOR] remove unnecessary collection operation in valid constraints generation

    ## What changes were proposed in this pull request?
    
    The Project logical operator generates valid constraints using two opposite operations: it subtracts the child constraints from all constraints, then unions the child constraints back in. I think this may not be necessary.
    The Aggregate operator has the same problem as Project.
    
    This PR tries to remove these two opposite collection operations.
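    
    As a minimal sketch of why the two operations cancel out, assuming the full constraint set is built by adding to the child constraints (so it is a superset of them): subtracting the child set and then unioning it back yields the same set. The object and value names below are hypothetical, and plain strings stand in for Spark's `Expression`; this is not the code from the patch.
    
    ```scala
    object ConstraintSetIdentity {
      // Hypothetical stand-ins for constraint expressions (not Spark's Expression type).
      val childConstraints: Set[String] = Set("a > 10", "isnotnull(a)")
      val aliasedConstraints: Set[String] = Set("x > 10", "isnotnull(x)", "x <=> a")
    
      // Assumption: the full set starts from the child constraints and only adds to them.
      val allConstraints: Set[String] = childConstraints ++ aliasedConstraints
    
      // Subtract the child constraints, then union them back in.
      val roundTrip: Set[String] = (allConstraints -- childConstraints) union childConstraints
    
      def main(args: Array[String]): Unit = {
        // Holds whenever childConstraints is a subset of allConstraints.
        println(roundTrip == allConstraints)
      }
    }
    ```
    
    Under that assumption the subtract-then-union round trip is redundant, and the full set can be used directly.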
    
    ## How was this patch tested?
    
    Related unit tests:
    ProjectEstimationSuite
    CollapseProjectSuite
    PushProjectThroughUnionSuite
    UnsafeProjectionBenchmark
    GeneratedProjectionSuite
    CodeGeneratorWithInterpretedFallbackSuite
    TakeOrderedAndProjectSuite
    GenerateUnsafeProjectionSuite
    BucketedRandomProjectionLSHSuite
    RemoveRedundantAliasAndProjectSuite
    AggregateBenchmark
    AggregateOptimizeSuite
    AggregateEstimationSuite
    DecimalAggregatesSuite
    DataFrameAggregateSuite
    ObjectHashAggregateSuite
    TwoLevelAggregateHashMapSuite
    ObjectHashAggregateExecBenchmark
    SingleLevelAggregateHashMapSuite
    TypedImperativeAggregateSuite
    RewriteDistinctAggregatesSuite
    HashAggregationQuerySuite
    HashAggregationQueryWithControlledFallbackSuite
    TwoLevelAggregateHashMapWithVectorizedMapSuite


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/SongYadong/spark generate_constraints

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22706.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22706
    
----
commit fab5faaa838295affdb9a1bfeae1d613eddfb7a1
Author: SongYadong <so...@...>
Date:   2018-10-11T14:12:05Z

    remove unnecessary collection operation in valid constraints generation

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    LGTM
    
    Thanks! Merged to master. 


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    ok to test


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    It makes some sense, but how much difference does it make, performance-wise?


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by maryannxue <gi...@git.apache.org>.
Github user maryannxue commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    @srowen I don't think this would make a big difference performance-wise, but if it's the right change, it just looks cleaner now. Anyone have any idea why it wasn't like this before?


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    cc @maryannxue 


---



[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22706#discussion_r224692031
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
    @@ -152,10 +152,10 @@ abstract class UnaryNode extends LogicalPlan {
       override final def children: Seq[LogicalPlan] = child :: Nil
     
       /**
    -   * Generates an additional set of aliased constraints by replacing the original constraint
    -   * expressions with the corresponding alias
    +   * Generates all valid constraints including an set of aliased constraints by replacing the
    +   * original constraint expressions with the corresponding alias
        */
    -  protected def getAliasedConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
    +  protected def getAllValidConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
    --- End diff --
    
    cc @gatorsmile .


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    **[Test build #97345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97345/testReport)** for PR 22706 at commit [`fab5faa`](https://github.com/apache/spark/commit/fab5faaa838295affdb9a1bfeae1d613eddfb7a1).


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    **[Test build #97345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97345/testReport)** for PR 22706 at commit [`fab5faa`](https://github.com/apache/spark/commit/fab5faaa838295affdb9a1bfeae1d613eddfb7a1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22706


---



[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22706#discussion_r224966692
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
    @@ -152,10 +152,10 @@ abstract class UnaryNode extends LogicalPlan {
       override final def children: Seq[LogicalPlan] = child :: Nil
     
       /**
    -   * Generates an additional set of aliased constraints by replacing the original constraint
    -   * expressions with the corresponding alias
    +   * Generates all valid constraints including an set of aliased constraints by replacing the
    +   * original constraint expressions with the corresponding alias
        */
    -  protected def getAliasedConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
    +  protected def getAllValidConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
    --- End diff --
    
    `getValidConstraints`


---



[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22706
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97345/
    Test PASSed.


---
