Posted to reviews@spark.apache.org by hvanhovell <gi...@git.apache.org> on 2018/06/21 12:40:03 UTC
[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/16677#discussion_r197116936
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala ---
@@ -247,6 +253,10 @@ object ShuffleExchangeExec {
           val projection = UnsafeProjection.create(h.partitionIdExpression :: Nil, outputAttributes)
           row => projection(row).getInt(0)
         case RangePartitioning(_, _) | SinglePartition => identity
    +    case LocalPartitioning(_, _) =>
    +      (row: InternalRow) => {
    +        TaskContext.get().partitionId()
--- End diff --
Can we try to do this once per partition instead of for each row?
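
A rough sketch of what I have in mind, written without Spark so it runs on its own. `currentPartitionId()` is a hypothetical stand-in for `TaskContext.get().partitionId()`, and the rows are plain strings rather than `InternalRow`. It assumes the extractor function is built once per partition and then applied to every row, which may not match how the surrounding code calls it.

```scala
object PartitionIdOncePerPartition {
  // Hypothetical stand-in for TaskContext.get().partitionId().
  // It counts how many times the lookup is performed.
  var lookups = 0
  def currentPartitionId(): Int = { lookups += 1; 3 }

  // Per-row version: the lookup sits inside the returned function,
  // so it runs again for every row.
  def perRowExtractor(): String => Int =
    (row: String) => currentPartitionId()

  // Per-partition version: look the id up once while building the
  // extractor, then return a function that closes over the value.
  def perPartitionExtractor(): String => Int = {
    val partitionId = currentPartitionId()
    (row: String) => partitionId
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq("a", "b", "c")

    lookups = 0
    rows.map(perRowExtractor())
    println(s"per-row lookups: $lookups")        // prints "per-row lookups: 3"

    lookups = 0
    rows.map(perPartitionExtractor())
    println(s"per-partition lookups: $lookups")  // prints "per-partition lookups: 1"
  }
}
```

Both versions return the same id for every row; only the number of lookups differs.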
---