Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/06/08 01:07:38 UTC

[GitHub] [spark] koertkuipers commented on pull request #27986: [SPARK-31220][SQL] repartition obeys initialPartitionNum when adaptiveExecutionEnabled

koertkuipers commented on pull request #27986:
URL: https://github.com/apache/spark/pull/27986#issuecomment-640310420


   Adaptive execution estimates the number of partitions for a shuffle using `spark.sql.adaptive.shuffle.targetPostShuffleInputSize` as its target size per shuffled partition. I was surprised to find, however, that it does not do this for `DataFrame.repartition(...)`. I don't understand why, since under the hood it is also just a shuffle, no different from a `DataFrame.groupBy`.
   Will this pull request fix this issue? From looking at the code I can't tell whether it does; it doesn't look like it to me.
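
   To make the comparison concrete, here is a minimal sketch (not from the PR; an illustration only, assuming a local Spark session and the pre-3.0 config name used above) of the two shuffles being contrasted. The expectation described in this comment is that the `groupBy` shuffle gets its post-shuffle partition count adjusted by adaptive execution, while the `repartition` shuffle does not:

   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.functions.col

   val spark = SparkSession.builder()
     .master("local[*]")
     .config("spark.sql.adaptive.enabled", "true")
     // target bytes per post-shuffle partition used by adaptive execution
     .config("spark.sql.adaptive.shuffle.targetPostShuffleInputSize", "64m")
     .getOrCreate()

   val df = spark.range(0, 1000000).toDF("id")

   // Shuffle introduced by an aggregation: adaptive execution can
   // coalesce the post-shuffle partitions toward the target size.
   val grouped = df.groupBy(col("id") % 100).count()

   // Shuffle introduced by an explicit repartition: per this comment,
   // its partition count is not adjusted the same way.
   val repartitioned = df.repartition(col("id"))
   ```

   Both operations insert an exchange into the physical plan, which is why the comment argues they should be treated uniformly.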


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org