Posted to issues@spark.apache.org by "Dongjoon Hyun (JIRA)" <ji...@apache.org> on 2016/04/22 02:08:12 UTC
[jira] [Created] (SPARK-14830) Add RemoveRepetitionFromGroupExpressions optimizer
Dongjoon Hyun created SPARK-14830:
-------------------------------------
Summary: Add RemoveRepetitionFromGroupExpressions optimizer
Key: SPARK-14830
URL: https://issues.apache.org/jira/browse/SPARK-14830
Project: Spark
Issue Type: Improvement
Components: Optimizer
Reporter: Dongjoon Hyun
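This issue aims to optimize the grouping expressions of an aggregate by removing repeated expressions, so that a query written with "group by a, a, a" is planned with the single grouping key a.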
**Before**
{code}
scala> sql("select a from (select explode(array(1,2)) a) T group by a, a, a").explain()
== Physical Plan ==
WholeStageCodegen
:  +- TungstenAggregate(key=[a#5,a#5,a#5], functions=[], output=[a#5])
:     +- INPUT
+- Exchange hashpartitioning(a#5, a#5, a#5, 200), None
   +- WholeStageCodegen
      :  +- TungstenAggregate(key=[a#5,a#5,a#5], functions=[], output=[a#5,a#5,a#5])
      :     +- INPUT
      +- Generate explode([1,2]), false, false, [a#5]
         +- Scan OneRowRelation[]
{code}
**After**
{code}
scala> sql("select a from (select explode(array(1,2)) a) T group by a, a, a").explain()
== Physical Plan ==
WholeStageCodegen
:  +- TungstenAggregate(key=[a#5], functions=[], output=[a#5])
:     +- INPUT
+- Exchange hashpartitioning(a#5, 200), None
   +- WholeStageCodegen
      :  +- TungstenAggregate(key=[a#5], functions=[], output=[a#5])
      :     +- INPUT
      +- Generate explode([1,2]), false, false, [a#5]
         +- Scan OneRowRelation[]
{code}
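The rewrite shown in the two plans can be sketched outside Spark with a few toy classes. This is a minimal illustration only, not the proposed patch: Expr, Attr, Aggregate and RemoveRepetitionSketch are hypothetical stand-ins for Catalyst's classes, and duplicates are detected with plain case-class equality, whereas an actual optimizer rule would likely need a semantic comparison of expressions.

```scala
// Hypothetical stand-ins for Catalyst's expression and plan classes (not Spark's API).
sealed trait Expr
final case class Attr(name: String, exprId: Long) extends Expr

// A toy aggregate node: grouping keys plus the output expressions.
final case class Aggregate(groupingExpressions: Seq[Expr], output: Seq[Expr])

object RemoveRepetitionSketch {
  // Keep the first occurrence of each grouping expression and drop the repeats.
  // Assumption: structural (case-class) equality is enough to spot a repeat.
  def apply(agg: Aggregate): Aggregate =
    agg.copy(groupingExpressions = agg.groupingExpressions.distinct)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val a = Attr("a", 5L)
    val before = Aggregate(Seq(a, a, a), Seq(a)) // group by a, a, a
    val after = RemoveRepetitionSketch(before)   // group by a
    println(after.groupingExpressions)           // prints List(Attr(a,5))
  }
}
```

Because the aggregate's output is left untouched, only the grouping keys (and therefore the hash partitioning keys of the exchange) shrink, which matches the difference between the Before and After plans.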