Posted to issues@spark.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2014/06/18 18:14:24 UTC
[jira] [Updated] (SPARK-2176) extra unnecessary exchange operator in group by
[ https://issues.apache.org/jira/browse/SPARK-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai updated SPARK-2176:
----------------------------
Description:
{code}
hql("explain select * from src group by key").collect().foreach(println)
[ExplainCommand [plan#27:0]]
[ Aggregate false, [key#25], [key#25,value#26]]
[  Exchange (HashPartitioning [key#25:0], 200)]
[   Exchange (HashPartitioning [key#25:0], 200)]
[    Aggregate true, [key#25], [key#25]]
[     HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
{code}
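The plan contains two consecutive Exchange operators with identical hash partitioning on key, so the second one is unnecessary.
However, if we run the same query without explain...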
{code}
hql("select * from src group by key")
res4: org.apache.spark.sql.SchemaRDD =
SchemaRDD[8] at RDD at SchemaRDD.scala:100
== Query Plan ==
Aggregate false, [key#8], [key#8,value#9]
 Exchange (HashPartitioning [key#8:0], 200)
  Aggregate true, [key#8], [key#8]
   HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None
{code}
This plan is fine: it contains only one Exchange operator.
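To compare the two plans mechanically, here is a minimal sketch that scans printed plan text for an Exchange line immediately followed by an identical one. The object `RepeatedExchangeCheck` and its methods are hypothetical helpers written for this report, not Spark APIs; the sketch assumes one operator per line and only looks at the printed text, not at Spark's planner.

```scala
// Toy helper for illustration only; it is not part of Spark and
// only inspects printed plan text copied from this report.
object RepeatedExchangeCheck {

  // Plan text from the explain output above.
  val explainPlan: String =
    """[ExplainCommand [plan#27:0]]
      |[ Aggregate false, [key#25], [key#25,value#26]]
      |[ Exchange (HashPartitioning [key#25:0], 200)]
      |[ Exchange (HashPartitioning [key#25:0], 200)]
      |[ Aggregate true, [key#25], [key#25]]
      |[ HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]""".stripMargin

  // Plan text from the SchemaRDD output above (query run without explain).
  val directPlan: String =
    """Aggregate false, [key#8], [key#8,value#9]
      |Exchange (HashPartitioning [key#8:0], 200)
      |Aggregate true, [key#8], [key#8]
      |HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None""".stripMargin

  /** Drops the "[ ... ]" row wrapper (if present) and surrounding whitespace. */
  def normalize(line: String): String = {
    val t = line.trim
    val unwrapped =
      if (t.startsWith("[") && t.endsWith("]")) t.substring(1, t.length - 1) else t
    unwrapped.trim
  }

  /** Returns each Exchange line that is directly followed by an identical line. */
  def repeatedExchanges(plan: String): Seq[String] = {
    val ops = plan.split("\n").map(normalize).filter(_.nonEmpty).toSeq
    ops.zip(ops.drop(1)).collect {
      case (a, b) if a == b && a.startsWith("Exchange") => a
    }
  }

  def main(args: Array[String]): Unit = {
    println("explain plan: " + repeatedExchanges(explainPlan).size + " repeated Exchange")
    println("direct plan:  " + repeatedExchanges(directPlan).size + " repeated Exchange")
  }
}
```

On the two plans above, the check reports one repeated Exchange for the explain output and none for the plan printed by the SchemaRDD, which matches the observation that only the explain path produces the extra operator.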
was:
{code}
hql("explain select * from src group by key").collect().foreach(println)
[ExplainCommand [plan#27:0]]
[ Aggregate false, [key#25], [key#25,value#26]]
[ Exchange (HashPartitioning [key#25:0], 200)]
[ Exchange (HashPartitioning [key#25:0], 200)]
[ Aggregate true, [key#25], [key#25]]
[ HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
{code}
There are two exchange operators.
> extra unnecessary exchange operator in group by
> -----------------------------------------------
>
> Key: SPARK-2176
> URL: https://issues.apache.org/jira/browse/SPARK-2176
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Yin Huai