Posted to reviews@spark.apache.org by SongYadong <gi...@git.apache.org> on 2018/10/12 03:10:40 UTC
[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...
GitHub user SongYadong opened a pull request:
https://github.com/apache/spark/pull/22706
[SPARK-25716][SQL][MINOR] remove unnecessary collection operation in valid constraints generation
## What changes were proposed in this pull request?
The Project logical operator generates valid constraints using two opposite operations: it subtracts the child constraints from all constraints, then unions the child constraints back in. I think this may not be necessary.
The Aggregate operator has the same problem as Project.
This PR tries to remove these two opposite collection operations.
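To illustrate why the two operations cancel out, here is a minimal sketch using plain Scala sets of strings. It is not the Catalyst code: `ConstraintSketch`, `childConstraints`, `aliasDerived`, `allConstraints`, and `roundTrip` are hypothetical names chosen for this example, and plain `Set[String]` is assumed in place of Spark's expression sets, which compare expressions semantically.

```scala
object ConstraintSketch {
  // Hypothetical stand-ins: each constraint is modelled as a plain string.
  val childConstraints: Set[String] = Set("a > 0", "b IS NOT NULL")

  // Constraints assumed to be derived from an alias such as `a AS x`.
  val aliasDerived: Set[String] = Set("x > 0", "a <=> x")

  // Everything known after walking the project list; it contains the child constraints.
  val allConstraints: Set[String] = childConstraints ++ aliasDerived

  // The two opposite operations: subtract the child constraints, then union them back.
  val roundTrip: Set[String] = (allConstraints -- childConstraints) ++ childConstraints

  def main(args: Array[String]): Unit = {
    // Because childConstraints is a subset of allConstraints, the round trip is a no-op.
    assert(roundTrip == allConstraints)
    println(s"round trip equals direct result: ${roundTrip == allConstraints}")
  }
}
```

Under these assumptions, returning `allConstraints` directly gives the same set while skipping one set difference and one set union.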
## How was this patch tested?
Related unit tests:
ProjectEstimationSuite
CollapseProjectSuite
PushProjectThroughUnionSuite
UnsafeProjectionBenchmark
GeneratedProjectionSuite
CodeGeneratorWithInterpretedFallbackSuite
TakeOrderedAndProjectSuite
GenerateUnsafeProjectionSuite
BucketedRandomProjectionLSHSuite
RemoveRedundantAliasAndProjectSuite
AggregateBenchmark
AggregateOptimizeSuite
AggregateEstimationSuite
DecimalAggregatesSuite
DataFrameAggregateSuite
ObjectHashAggregateSuite
TwoLevelAggregateHashMapSuite
ObjectHashAggregateExecBenchmark
SingleLevelAggregateHashMapSuite
TypedImperativeAggregateSuite
RewriteDistinctAggregatesSuite
HashAggregationQuerySuite
HashAggregationQueryWithControlledFallbackSuite
TwoLevelAggregateHashMapWithVectorizedMapSuite
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/SongYadong/spark generate_constraints
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22706.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22706
----
commit fab5faaa838295affdb9a1bfeae1d613eddfb7a1
Author: SongYadong <so...@...>
Date: 2018-10-11T14:12:05Z
remove unnecessary collection operation in valid constraints generation
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22706
LGTM
Thanks! Merged to master.
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22706
Can one of the admins verify this patch?
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22706
ok to test
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/22706
It makes some sense, but how much difference does it make, performance-wise?
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22706
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22706
Can one of the admins verify this patch?
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by maryannxue <gi...@git.apache.org>.
Github user maryannxue commented on the issue:
https://github.com/apache/spark/pull/22706
@srowen I don't think this would make a big difference performance-wise, but if it's the right change, it just looks cleaner now. Anyone have any idea why it wasn't like this before?
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22706
cc @maryannxue
---
[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22706#discussion_r224692031
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -152,10 +152,10 @@ abstract class UnaryNode extends LogicalPlan {
override final def children: Seq[LogicalPlan] = child :: Nil
/**
- * Generates an additional set of aliased constraints by replacing the original constraint
- * expressions with the corresponding alias
+ * Generates all valid constraints including an set of aliased constraints by replacing the
+ * original constraint expressions with the corresponding alias
*/
- protected def getAliasedConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
+ protected def getAllValidConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
--- End diff --
cc @gatorsmile .
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22706
Can one of the admins verify this patch?
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22706
**[Test build #97345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97345/testReport)** for PR 22706 at commit [`fab5faa`](https://github.com/apache/spark/commit/fab5faaa838295affdb9a1bfeae1d613eddfb7a1).
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22706
**[Test build #97345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97345/testReport)** for PR 22706 at commit [`fab5faa`](https://github.com/apache/spark/commit/fab5faaa838295affdb9a1bfeae1d613eddfb7a1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22706
---
[GitHub] spark pull request #22706: [SPARK-25716][SQL][MINOR] remove unnecessary coll...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22706#discussion_r224966692
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -152,10 +152,10 @@ abstract class UnaryNode extends LogicalPlan {
override final def children: Seq[LogicalPlan] = child :: Nil
/**
- * Generates an additional set of aliased constraints by replacing the original constraint
- * expressions with the corresponding alias
+ * Generates all valid constraints including an set of aliased constraints by replacing the
+ * original constraint expressions with the corresponding alias
*/
- protected def getAliasedConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
+ protected def getAllValidConstraints(projectList: Seq[NamedExpression]): Set[Expression] = {
--- End diff --
`getValidConstraints`
---
[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22706
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97345/
Test PASSed.
---