Posted to dev@spark.apache.org by invkrh <in...@gmail.com> on 2015/06/08 15:33:31 UTC

[SparkSQL ] What is Exchange in physical plan for ?

Hi,

DataFrame.explain() shows the physical plan of a query. I noticed that the
plan contains a lot of `Exchange` operators, like below:

Project [period#20L,categoryName#0,regionName#10,action#15,list_id#16L]
 ShuffledHashJoin [region#18], [regionCode#9], BuildRight
  Exchange (HashPartitioning [region#18], 12)
   Project [categoryName#0,list_id#16L,period#20L,action#15,region#18]
    ShuffledHashJoin [refCategoryID#3], [category#17], BuildRight
     Exchange (HashPartitioning [refCategoryID#3], 12)
      Project [categoryName#0,refCategoryID#3]
       PhysicalRDD [categoryName#0,familyName#1,parentRefCategoryID#2,refCategoryID#3], MapPartitionsRDD[5] at mapPartitions at SQLContext.scala:439
     Exchange (HashPartitioning [category#17], 12)
      Project [timestamp_sec#13L AS period#20L,category#17,region#18,action#15,list_id#16L]
       PhysicalRDD [syslog#12,timestamp_sec#13L,timestamp_usec#14,action#15,list_id#16L,category#17,region#18,expiration_time#19], MapPartitionsRDD[16] at map at SQLContext.scala:394
  Exchange (HashPartitioning [regionCode#9], 12)
   Project [regionName#10,regionCode#9]
    PhysicalRDD [cityName#4,countryCode#5,countryName#6,dptCode#7,dptName#8,regionCode#9,regionName#10,zipCode#11], MapPartitionsRDD[11] at mapPartitions at SQLContext.scala:439

I also found its class:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala

So what does it mean?

Thank you.

Hao.



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SparkSQL-What-is-Exchange-in-physical-plan-for-tp12659.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


RE: [SparkSQL ] What is Exchange in physical plan for ?

Posted by "Cheng, Hao" <ha...@intel.com>.
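It represents a data shuffle, and its arguments show the partitioning strategy. For example, HashPartitioning [region#18], 12 means the rows are hash-partitioned on region#18 into 12 partitions.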
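
To illustrate what such a shuffle accomplishes, here is a minimal plain-Python sketch of hash partitioning followed by a per-partition hash join. It is not Spark's implementation: the function names (partition_id, exchange, shuffled_hash_join), the use of crc32 as the hash, and the sample rows are all assumptions made for illustration. The only values taken from the plan above are the partition count of 12 and the idea of building the hash table from the right side (BuildRight).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 12  # the "12" in HashPartitioning [...], 12


def partition_id(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition index. crc32 is a stand-in hash (assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def exchange(rows, key_fn, num_partitions=NUM_PARTITIONS):
    """Redistribute rows so that equal keys land in the same partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_id(key_fn(row), num_partitions)].append(row)
    return partitions


def shuffled_hash_join(left, right, left_key, right_key):
    """Exchange both sides on their join keys, then join partition by partition."""
    left_parts = exchange(left, left_key)
    right_parts = exchange(right, right_key)
    joined = []
    for pid, left_rows in left_parts.items():
        # Build a hash table from the right side of this partition (BuildRight).
        build = defaultdict(list)
        for r in right_parts.get(pid, []):
            build[right_key(r)].append(r)
        # Probe the table with each left row of the same partition.
        for l in left_rows:
            for r in build.get(left_key(l), []):
                joined.append((l, r))
    return joined


# Hypothetical sample rows: (list_id, region) and (regionCode, regionName).
events = [(1, "R1"), (2, "R2"), (3, "R1")]
regions = [("R1", "North"), ("R2", "South"), ("R3", "East")]

result = shuffled_hash_join(events, regions, lambda e: e[1], lambda r: r[0])
print(sorted(result))
```

Because both inputs are partitioned with the same function on their join keys, matching rows always meet in the same partition, which is why the plan shows one Exchange under each side of every ShuffledHashJoin.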
