Posted to issues@spark.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2014/06/18 18:14:24 UTC

[jira] [Updated] (SPARK-2176) extra unnecessary exchange operator in group by

     [ https://issues.apache.org/jira/browse/SPARK-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated SPARK-2176:
----------------------------

    Description: 
{code}
hql("explain select * from src group by key").collect().foreach(println)

[ExplainCommand [plan#27:0]]
[ Aggregate false, [key#25], [key#25,value#26]]
[  Exchange (HashPartitioning [key#25:0], 200)]
[   Exchange (HashPartitioning [key#25:0], 200)]
[    Aggregate true, [key#25], [key#25]]
[     HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
{code}

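The explain output contains two consecutive Exchange operators with identical HashPartitioning on key; the second one is redundant.

However, if we run the same query without explain, the plan contains only one Exchange operator: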
{code}
hql("select * from src group by key")

res4: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[8] at RDD at SchemaRDD.scala:100
== Query Plan ==
Aggregate false, [key#8], [key#8,value#9]
 Exchange (HashPartitioning [key#8:0], 200)
  Aggregate true, [key#8], [key#8]
   HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None
{code}
This plan is fine: it contains a single Exchange operator.

  was:
{code}
hql("explain select * from src group by key").collect().foreach(println)

[ExplainCommand [plan#27:0]]
[ Aggregate false, [key#25], [key#25,value#26]]
[  Exchange (HashPartitioning [key#25:0], 200)]
[   Exchange (HashPartitioning [key#25:0], 200)]
[    Aggregate true, [key#25], [key#25]]
[     HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
{code}

There are two exchange operators.



> extra unnecessary exchange operator in group by
> -----------------------------------------------
>
>                 Key: SPARK-2176
>                 URL: https://issues.apache.org/jira/browse/SPARK-2176
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Yin Huai
>
> {code}
> hql("explain select * from src group by key").collect().foreach(println)
> [ExplainCommand [plan#27:0]]
> [ Aggregate false, [key#25], [key#25,value#26]]
> [  Exchange (HashPartitioning [key#25:0], 200)]
> [   Exchange (HashPartitioning [key#25:0], 200)]
> [    Aggregate true, [key#25], [key#25]]
> [     HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
> {code}
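> The explain output contains two consecutive Exchange operators with identical HashPartitioning on key; the second one is redundant.
> However, if we run the same query without explain, the plan contains only one Exchange operator: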
> {code}
> hql("select * from src group by key")
> res4: org.apache.spark.sql.SchemaRDD = 
> SchemaRDD[8] at RDD at SchemaRDD.scala:100
> == Query Plan ==
> Aggregate false, [key#8], [key#8,value#9]
>  Exchange (HashPartitioning [key#8:0], 200)
>   Aggregate true, [key#8], [key#8]
>    HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None
> {code}
> This plan is fine: it contains a single Exchange operator.


