Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/19 08:18:47 UTC

[GitHub] [spark] stczwd opened a new pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

stczwd opened a new pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946
 
 
   ### Why are the changes needed?
   EnsureRequirements inserts a ShuffleExchangeExec (RangePartitioning) under Sort even when a RoundRobinPartitioning exchange already sits below it. This causes two shuffles, and the number of partitions in the final stage is not the number specified by RoundRobinPartitioning.
   Example SQL: select /*+ REPARTITION(5) */ * from test order by a
   Before fix:
   == Physical Plan ==
   *(1) Sort [a#0 ASC NULLS FIRST], true, 0
   +- Exchange rangepartitioning(a#0 ASC NULLS FIRST, 200), true, [id=#11]
      +- Exchange RoundRobinPartitioning(5), false, [id=#9]
         +- Scan hive default.test [a#0, b#1], HiveTableRelation `default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [a#0, b#1]
   After fix:
   == Physical Plan ==
   *(1) Sort [a#0 ASC NULLS FIRST], true, 0
   +- Exchange rangepartitioning(a#0 ASC NULLS FIRST, 5), true, [id=#11]
      +- Scan hive default.test [a#0, b#1], HiveTableRelation `default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [a#0, b#1]
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   Ran the existing test suites and added a new test for this case.
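
   As a rough illustration of the before/after plans above, here is a toy model of the rewrite. It is only a sketch and not the actual change to EnsureRequirements: the `Plan` case classes and `CollapseRepartitionUnderSort` are hypothetical names invented for this example.

   ```scala
   // Toy plan nodes for illustration only; these are NOT Spark's classes.
   sealed trait Plan
   case class Scan(table: String) extends Plan
   case class RoundRobinExchange(numPartitions: Int, child: Plan) extends Plan
   case class RangeExchange(numPartitions: Int, child: Plan) extends Plan
   case class Sort(child: Plan) extends Plan

   object CollapseRepartitionUnderSort {
     // Bottom-up rewrite: a range exchange sitting directly on top of a
     // round-robin exchange absorbs it and takes over its partition count.
     def apply(plan: Plan): Plan = plan match {
       case Sort(child) => Sort(apply(child))
       case RangeExchange(n, child) =>
         apply(child) match {
           case RoundRobinExchange(m, grandChild) => RangeExchange(m, grandChild)
           case other => RangeExchange(n, other)
         }
       case RoundRobinExchange(n, child) => RoundRobinExchange(n, apply(child))
       case scan: Scan => scan
     }

     // Number of exchange (shuffle) nodes in the toy plan.
     def countExchanges(plan: Plan): Int = plan match {
       case Sort(child) => countExchanges(child)
       case RangeExchange(_, child) => 1 + countExchanges(child)
       case RoundRobinExchange(_, child) => 1 + countExchanges(child)
       case _: Scan => 0
     }
   }

   object CollapseDemo {
     def main(args: Array[String]): Unit = {
       // Mirrors the "Before fix" plan: range(200) over round-robin(5).
       val before = Sort(RangeExchange(200, RoundRobinExchange(5, Scan("default.test"))))
       val after = CollapseRepartitionUnderSort(before)
       println(s"exchanges before = ${CollapseRepartitionUnderSort.countExchanges(before)}")
       println(s"exchanges after  = ${CollapseRepartitionUnderSort.countExchanges(after)}")
       println(after)
     }
   }
   ```

   In this model the exchange count drops from 2 to 1, and the surviving range exchange carries the 5 partitions requested by the hint.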
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568695325
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568743128
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115718/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-570570665
 
 
   A join doesn't require `OrderedDistribution`; it requires `HashClusteredDistribution`.
   
   This PR only affects sort.
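
   To make the scope concrete, here is a simplified sketch of the idea. It is a toy model, not Spark's planner: `RequiredDistribution`, `OrderedReq`, `HashClusteredReq`, `SortOp`, `JoinOp` and `RuleScope` are hypothetical names. It assumes a child that was repartitioned to 10 partitions and a default shuffle partition count of 200.

   ```scala
   // Toy stand-ins for the distribution an operator requires of its child.
   sealed trait RequiredDistribution
   case object OrderedReq extends RequiredDistribution        // what a global sort needs
   case object HashClusteredReq extends RequiredDistribution  // what a shuffle join needs

   sealed trait Operator { def required: RequiredDistribution }
   case object SortOp extends Operator { val required: RequiredDistribution = OrderedReq }
   case object JoinOp extends Operator { val required: RequiredDistribution = HashClusteredReq }

   object RuleScope {
     // Partition count of the exchange planned for `op` when its child is a
     // repartition(repartitionNum). Only the ordered requirement reuses the
     // user-specified number; everything else keeps the default.
     def plannedPartitions(op: Operator, repartitionNum: Int, defaultShufflePartitions: Int): Int =
       op.required match {
         case OrderedReq => repartitionNum
         case HashClusteredReq => defaultShufflePartitions
       }
   }

   object RuleScopeDemo {
     def main(args: Array[String]): Unit = {
       println(s"sort: ${RuleScope.plannedPartitions(SortOp, 10, 200)}")
       println(s"join: ${RuleScope.plannedPartitions(JoinOp, 10, 200)}")
     }
   }
   ```

   Under these assumptions the sort ends up with the repartition number and the join with the default, which is the asymmetry discussed elsewhere in this thread.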



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360379564
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,24 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: EnsureRequirements replace Exchange " +
+      "if child has SortExec and RoundRobinPartitioning") {
 
 Review comment:
   Because HashPartitioning also needs to be considered.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568747286
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115729/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568747283
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360796170
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -55,12 +54,12 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
 
     withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> numPartitions.toString) {
       // The default chi-sq value should be low
-      assert(computeChiSquareTest() < 100)
+      assert(computeChiSquareTest() < 10)
 
 Review comment:
   The physical plan is the same as before. What caused this change?



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568709268
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568624156
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20471/
   Test PASSed.



[GitHub] [spark] stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-571034381
 
 
   > Yes it is. But it is similar with outer join. `df.join(df.repartition(10), Seq("id"), "left")` result the 200 partitions, and now `df.repartition(10).sort("id")` result the 10 partitions. Should they be same ?
   > Sorry for the wrong example. I mean user should use the right way to change partition. Obviously df.repartition(10).sort("id") should return the spark sql shuffle partitions.
   
   Thanks for paying attention to this. The main question you raised is whether we should change the partition number for OrderedDistribution. 
   Hm, you added the `REPARTITION` hint in [SPARK-28746](https://github.com/apache/spark/pull/25464), so you may know what it means to users. In other cases, the `REPARTITION` hint changes the result partition number through a shuffle, but it doesn't work with order by, which confuses users. `REPARTITION` is a great way for users to control the final number of result partitions, and we should keep it working for every query.
   Besides, `sort("id")` requires a global OrderedDistribution, which usually produces the final result. It is not easy to set the partition number through defaultShufflePartitions, especially in large queries with multiple shuffles. 
   Finally, `df.repartition(10).sort("id")` is a good way for users to control the shuffle and the final result partition number. Users are unlikely to write `df.repartition(10).sort("id")` unless they want to change the final partition number; it is not a common pattern otherwise.
   
   Correct me if I'm wrong. Thanks



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568709274
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115719/
   Test FAILed.



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360407496
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   The test is meant to exercise a sort with a shuffle. Do we still have a shuffle in this query?



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360924930
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -55,12 +54,12 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
 
     withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> numPartitions.toString) {
       // The default chi-sq value should be low
-      assert(computeChiSquareTest() < 100)
+      assert(computeChiSquareTest() < 10)
 
 Review comment:
   They are not the same: we had two shuffles before, one with RoundRobinPartitioning and the other with RangePartitioning.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568742853
 
 
   **[Test build #115718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115718/testReport)** for PR 26946 at commit [`52ce660`](https://github.com/apache/spark/commit/52ce6603aa4e3695360b54f2481bd9a0f9142016).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567858259
 
 
   Shall we add an end-to-end test for `SELECT /*+ REPARTITION(5) */ * FROM test ORDER BY a`?



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568651737
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115674/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568651732
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-570779371
 
 
   > This PR only affects sort.
   
   Yes, it does. But the outer join case is similar: `df.join(df.repartition(10), Seq("id"), "left")` results in 200 partitions, while `df.repartition(10).sort("id")` now results in 10 partitions. Should they be the same?
   



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568888939
 
 
   retest this please



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568889659
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20559/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567998647
 
 
   **[Test build #115630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115630/testReport)** for PR 26946 at commit [`5915a12`](https://github.com/apache/spark/commit/5915a124d1cba718c15bf861e60fd7c4b4dee472).



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568747179
 
 
   **[Test build #115729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115729/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568697251
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] ulysses-you edited a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
ulysses-you edited a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-570535969
 
 
   Wait, another point. 
   First, this scenario also exists in the `join` and `window` operators; as @viirya said, there are other Distributions. 
   For example, join:
   ```
   val df = spark.range(1, 10, 2)
   df.join(df.repartition(10), Seq("id"), "left").explain(true)
   
   // physical plan like this
   == Physical Plan ==
   *(5) Project [id#0L]
   +- SortMergeJoin [id#0L], [id#83L], LeftOuter
      :- *(2) Sort [id#0L ASC NULLS FIRST], false, 0
      :  +- Exchange hashpartitioning(id#0L, 200), true, [id=#378]
      :     +- *(1) Range (1, 10, step=1, splits=40)
      +- *(4) Sort [id#83L ASC NULLS FIRST], false, 0
         +- Exchange hashpartitioning(id#83L, 200), true, [id=#384]
            +- Exchange RoundRobinPartitioning(10), false, [id=#383]
               +- *(3) Range (1, 10, step=1, splits=40)
   ```
   
   Second, `2 -> 10 -> 200` and `2 -> 10` differ somewhat because the operators have different complexity. Repartition may be a lightweight operator compared with sort, join, or other algorithms, so it's not certain that `2 -> 10` always runs faster than `2 -> 10 -> 200`.
   
   Lastly, if an end user really wants 10 result partitions, they should use `df.sort("id").repartition(10)` rather than `df.repartition(10).sort("id")`. Pruning the shuffle may mislead users.
   
   cc @HyukjinKwon @cloud-fan @maropu 
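
   One way to compare the two orderings is to print the final partition count of each. This is a minimal sketch that assumes a local Spark session; `PartitionCounts` is a hypothetical name, and adaptive execution is disabled here so that partition coalescing does not change the numbers.

   ```scala
   import org.apache.spark.sql.SparkSession

   object PartitionCounts {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[2]")
         .appName("partition-counts-sketch")
         // Assumption: turn adaptive execution off so shuffle partitions are not coalesced.
         .config("spark.sql.adaptive.enabled", "false")
         .getOrCreate()

       val df = spark.range(1, 10, 2)

       // repartition(10) is the last operator, so it decides the final partitioning.
       val sortThenRepartition = df.sort("id").repartition(10)
       // Here the sort's range shuffle is last; its partition count is what this PR is about.
       val repartitionThenSort = df.repartition(10).sort("id")

       println(s"sort then repartition: ${sortThenRepartition.rdd.getNumPartitions}")
       println(s"repartition then sort: ${repartitionThenSort.rdd.getNumPartitions}")

       spark.stop()
     }
   }
   ```

   The second number depends on the Spark version and on whether a change like this one is applied, so no expected value is given for it.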



[GitHub] [spark] cloud-fan closed pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946
 
 
   



[GitHub] [spark] stczwd edited a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd edited a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568383598
 
 
   > We can add an end-to-end test, check the physical plan of a query, and count shuffles.
   
   Sure, I will add some tests for these cases.
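
   As a rough illustration of "check the physical plan and count shuffles", here is a minimal sketch in plain Scala that works on the text printed by `explain`. The object `CountShufflesInPlanText` and its members are hypothetical names, not part of Spark, and a real test in the Spark code base would more likely inspect the plan tree than parse text. The sample plan is the join plan quoted earlier in this thread.

   ```scala
   object CountShufflesInPlanText {
     // Counts plan lines whose operator name is exactly "Exchange" (a shuffle).
     // Only tree-drawing characters (spaces, ':', '+', '-') may precede the name,
     // so operators such as "BroadcastExchange" are not counted.
     def countExchanges(planText: String): Int =
       planText.split("\n").count(_.matches("""^[\s:+\-]*Exchange\b.*"""))

     // Physical plan of df.join(df.repartition(10), Seq("id"), "left"), as quoted in this thread.
     val joinPlan: String =
       """|*(5) Project [id#0L]
          |+- SortMergeJoin [id#0L], [id#83L], LeftOuter
          |   :- *(2) Sort [id#0L ASC NULLS FIRST], false, 0
          |   :  +- Exchange hashpartitioning(id#0L, 200), true, [id=#378]
          |   :     +- *(1) Range (1, 10, step=1, splits=40)
          |   +- *(4) Sort [id#83L ASC NULLS FIRST], false, 0
          |      +- Exchange hashpartitioning(id#83L, 200), true, [id=#384]
          |         +- Exchange RoundRobinPartitioning(10), false, [id=#383]
          |            +- *(3) Range (1, 10, step=1, splits=40)""".stripMargin

     def main(args: Array[String]): Unit =
       println(s"shuffles in join plan: ${countExchanges(joinPlan)}")
   }
   ```

   A test built on this idea would assert on the number returned by `countExchanges` before and after the change.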



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568909404
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360795637
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   In the current master
   ```
   scala> spark.range(0, 10000, 1, 10).sort("id").explain(true)
   == Parsed Logical Plan ==
   'Sort ['id ASC NULLS FIRST], true
   +- Range (0, 10000, step=1, splits=Some(10))
   
   == Analyzed Logical Plan ==
   id: bigint
   Sort [id#8L ASC NULLS FIRST], true
   +- Range (0, 10000, step=1, splits=Some(10))
   
   == Optimized Logical Plan ==
   Range (0, 10000, step=1, splits=Some(10))
   
   == Physical Plan ==
   *(1) Range (0, 10000, step=1, splits=10)
   ```
   
   If we add a shuffle now, then it's a regression and we should fix it.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568655859
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20486/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-569181203
 
 
   Merged to master, I guess :-).



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360987423
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -55,12 +54,12 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
 
     withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> numPartitions.toString) {
       // The default chi-sq value should be low
-      assert(computeChiSquareTest() < 100)
+      assert(computeChiSquareTest() < 10)
 
 Review comment:
   ah i see



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568712637
 
 
   **[Test build #115729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115729/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568889659
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20559/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568697257
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20513/
   Test PASSed.



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r361106110
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,52 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Romove unnecessary RoundRobinPartitioning " +
 
 Review comment:
   done



[GitHub] [spark] ulysses-you edited a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
ulysses-you edited a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-571070694
 
 
   I see you want to provide an easy way to change the number of partitions after a sort.
   
   There is only one thing I am not sure about. If `df.repartition(10).sort("id")` results in 10 partitions, users will assume that `df.repartition(10).(any operator that needs a shuffle)` results in 10 partitions too, but it actually does not. This is special handling for sort.
   
   I do not know what the committers think about it; maybe it is just fine.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568889538
 
 
   **[Test build #115766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115766/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568687705
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-571101291
 
 
   What `.sort` should guarantee is that the output is ordered; users shouldn't care about the number of partitions.
   
   It's more efficient to shuffle only once for the query `df.repartition(10).sort("id")`. This is just an optimization and does not change the semantics.
   
   For `df.join(df.repartition(10), Seq("id"), "left")`, again we don't care about the number of result partitions. If there is a way to save shuffles, please propose it.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568709236
 
 
   **[Test build #115719 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115719/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360712480
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ##########
 @@ -55,6 +55,8 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
         child
       case (child, BroadcastDistribution(mode)) =>
         BroadcastExchangeExec(mode, child)
+      case (ShuffleExchangeExec(partitioning, child, _), distribution: OrderedDistribution) =>
+        ShuffleExchangeExec(distribution.createPartitioning(partitioning.numPartitions), child)
 
 Review comment:
   Sounds reasonable. Any suitable cases?
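
   To make the rule in the hunk above concrete, here is a toy model in plain Scala. It is a sketch under stated assumptions, not Spark's code: `Plan`, `Exchange`, `GlobalSort`, `RoundRobinP`, `RangeP` and `ensureOrdered` are hypothetical stand-ins for the real plan nodes and for `EnsureRequirements`, and the default of 200 partitions is simply the value that appears in the "before fix" plan in the PR description.

   ```scala
   object CollapseShuffleSketch {
     // Toy partitionings (hypothetical, much simpler than Spark's).
     sealed trait Partitioning { def numPartitions: Int }
     final case class RoundRobinP(numPartitions: Int) extends Partitioning
     final case class RangeP(numPartitions: Int) extends Partitioning

     // Toy physical plan nodes.
     sealed trait Plan
     final case class Scan(table: String) extends Plan
     final case class Exchange(partitioning: Partitioning, child: Plan) extends Plan
     final case class GlobalSort(child: Plan) extends Plan

     // Make sure a global sort sits on top of a range-partitioned child.
     def ensureOrdered(plan: Plan, defaultPartitions: Int): Plan = plan match {
       // Already range partitioned: nothing to do.
       case GlobalSort(Exchange(_: RangeP, _)) => plan
       // The child is a shuffle with some other partitioning: replace it with a
       // range shuffle that keeps the requested number of partitions.
       case GlobalSort(Exchange(p, child)) =>
         GlobalSort(Exchange(RangeP(p.numPartitions), child))
       // No shuffle below the sort: add one with the default partition count.
       case GlobalSort(child) =>
         GlobalSort(Exchange(RangeP(defaultPartitions), child))
       case other => other
     }

     // Number of shuffles (Exchange nodes) in a plan.
     def countShuffles(plan: Plan): Int = plan match {
       case Exchange(_, child) => 1 + countShuffles(child)
       case GlobalSort(child)  => countShuffles(child)
       case Scan(_)            => 0
     }

     def main(args: Array[String]): Unit = {
       val hinted = GlobalSort(Exchange(RoundRobinP(5), Scan("test")))
       println(ensureOrdered(hinted, defaultPartitions = 200))
     }
   }
   ```

   In this model the hinted plan keeps a single shuffle with 5 partitions, whereas stacking a range shuffle on top of the round-robin one would give two shuffles and 200 output partitions.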



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568082112
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568687709
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115689/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568651737
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115674/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568709268
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568694977
 
 
   **[Test build #115718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115718/testReport)** for PR 26946 at commit [`52ce660`](https://github.com/apache/spark/commit/52ce6603aa4e3695360b54f2481bd9a0f9142016).



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568695325
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568081472
 
 
   **[Test build #115630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115630/testReport)** for PR 26946 at commit [`5915a12`](https://github.com/apache/spark/commit/5915a124d1cba718c15bf861e60fd7c4b4dee472).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568082116
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115630/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568655856
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-570571809
 
 
   >  if end user really want result partition is 10, should use df.sort("id").repartition(10) instead, not the df.repartition(10).sort("id"). Pruning shuffle may mislead user.
   
   `df.sort("id").repartition(10)` returns the wrong result: the globally sorted output would be repartitioned and end up out of order.
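
   The effect can be illustrated without Spark. The following is a simplified model in plain Scala, not Spark's actual implementation: `roundRobin` deals rows to partitions in turn, `rangeSplit` cuts sorted rows into contiguous ranges, and all names are hypothetical.

   ```scala
   object RoundRobinAfterSort {
     // Deal rows to partitions in turn, roughly what a round-robin repartition does.
     def roundRobin[A](rows: Seq[A], numPartitions: Int): Vector[Vector[A]] =
       rows.zipWithIndex
         .groupBy { case (_, index) => index % numPartitions }
         .toVector
         .sortBy(_._1)
         .map { case (_, part) => part.map(_._1).toVector }

     // Cut already sorted rows into contiguous ranges, roughly what a range
     // partitioning followed by a sort produces.
     def rangeSplit[A](sorted: Seq[A], numPartitions: Int): Vector[Vector[A]] = {
       val size = math.max(1, math.ceil(sorted.size.toDouble / numPartitions).toInt)
       sorted.grouped(size).map(_.toVector).toVector
     }

     // True if reading the partitions one after another yields sorted output.
     def isGloballySorted(parts: Seq[Seq[Int]]): Boolean = {
       val flat = parts.flatten
       flat == flat.sorted
     }

     def main(args: Array[String]): Unit = {
       val sorted = (1 to 9).toVector
       println(roundRobin(sorted, 3)) // sort first, then round-robin repartition
       println(rangeSplit(sorted, 3)) // range partition with the requested count
     }
   }
   ```

   In this model each round-robin partition is still sorted internally, but reading the partitions in order no longer gives a globally sorted result, which is the problem described above.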



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360291694
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,42 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Romove unnecessary RoundRobinPartitioning " +
+      "if SortExec is followed by RoundRobinPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = RoundRobinPartitioning(5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: RoundRobinPartitioning, _, _) => true
+      case _ => false}.isEmpty,
+      "RoundRobinPartitioning should be changed to RangePartitioning")
+  }
+
+  test("SPARK-30036: Remove unnecessary HashPartitioning " +
+    "if SortExec is followed by HashPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = HashPartitioning(Literal(1) :: Nil, 5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: HashPartitioning, _, _) => true
+      case _ => false}.isEmpty,
 
 Review comment:
   ditto



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360795637
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   In the current master:
   ```
   scala> spark.range(0, 10000, 1, 10).sort("id").explain(true)
   == Parsed Logical Plan ==
   'Sort ['id ASC NULLS FIRST], true
   +- Range (0, 10000, step=1, splits=Some(10))
   
   == Analyzed Logical Plan ==
   id: bigint
   Sort [id#8L ASC NULLS FIRST], true
   +- Range (0, 10000, step=1, splits=Some(10))
   
   == Optimized Logical Plan ==
   Range (0, 10000, step=1, splits=Some(10))
   
   == Physical Plan ==
   *(1) Range (0, 10000, step=1, splits=10)
   ```
   
   If we add a shuffle now, then it's a regression and we should fix it.
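
The plan output above shows the `Sort` disappearing because the range already produces ascending ids. A minimal sketch of that idea, assuming a toy plan model (`SortElimination`, `RangeScan`, `Sort` and `removeRedundantSorts` are hypothetical names, not Spark's classes or rules):

```scala
// Toy plan model in plain Scala; it only tracks the ordering each node reports.
object SortElimination {
  sealed trait Direction
  case object Asc extends Direction
  case object Desc extends Direction

  sealed trait Plan { def ordering: Option[Direction] }

  // Assumption: a range with a positive step yields ids in ascending order.
  case class RangeScan(start: Long, end: Long) extends Plan {
    val ordering: Option[Direction] = Some(Asc)
  }

  case class Sort(direction: Direction, child: Plan) extends Plan {
    val ordering: Option[Direction] = Some(direction)
  }

  // Drop a Sort whose child already reports the requested ordering.
  def removeRedundantSorts(plan: Plan): Plan = plan match {
    case Sort(dir, child) =>
      val newChild = removeRedundantSorts(child)
      if (newChild.ordering.contains(dir)) newChild else Sort(dir, newChild)
    case other => other
  }
}
```

Under this model an ascending sort over a range collapses to the range itself, while a descending sort is kept and would still need its input redistributed.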



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568889656
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568693917
 
 
   retest this please



[GitHub] [spark] stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567974480
 
 
   > shall we add an end-to-end test for `SELECT /*+ REPARTITION(5) */ * FROM test ORDER BY a`?
   @cloud-fan This patch only changes the number of partitions in the final stage; it does not change the query result. Is it enough to test the rule, or should the test also check the number of partitions?



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360290729
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   why this change?



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567980903
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20427/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568623945
 
 
   **[Test build #115674 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115674/testReport)** for PR 26946 at commit [`fa03fcb`](https://github.com/apache/spark/commit/fa03fcbbb08c1cdcd3f25bc1b8fb03d7a8535cf0).



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360644536
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   Yes



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568696876
 
 
   **[Test build #115719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115719/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r361093943
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,52 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Remove unnecessary RoundRobinPartitioning " +
+      "if SortExec is followed by RoundRobinPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = RoundRobinPartitioning(5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: RoundRobinPartitioning, _, _) => true
+      case _ => false
+    }.isEmpty,
+      "RoundRobinPartitioning should be changed to RangePartitioning")
+
+    val query = testData.select('key, 'value).repartition(2).sort('key.asc)
+    assert(query.rdd.getNumPartitions == 2)
+    assert(query.rdd.collectPartitions()(0).map(_.get(0)).toSeq == (1 to 50))
+  }
+
+  test("SPARK-30036: Remove unnecessary HashPartitioning " +
+    "if SortExec is followed by HashPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = HashPartitioning(Literal(1) :: Nil, 5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: HashPartitioning, _, _) => true
+      case _ => false
+    }.isEmpty,
+      "HashPartitioning should be changed to RangePartitioning")
+
+    val query = testData.select('key, 'value).repartition(5, 'key).sort('key.asc)
 
 Review comment:
   I'm not very sure about this. `df.repartition` is a low-level API that allows users to hash-partition the data. There is also a `df.repartitionByRange` to do range partitioning. We shouldn't break users' expectations.
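
The difference between the two kinds of partitioning mentioned above can be sketched in plain Scala. This is an illustration under stated assumptions, not Spark's partitioners: `HashVsRange`, `hashPartition` and `rangePartition` are hypothetical names, the hash is simply the key's `hashCode`, and the range bounds are passed in by hand rather than sampled.

```scala
// Toy partitioners in plain Scala; the result maps partition id -> keys.
object HashVsRange {
  // Hash partitioning: the partition depends only on the key's hash,
  // so equal keys land together but partitions have no order among themselves.
  def hashPartition(keys: Seq[Int], numPartitions: Int): Map[Int, Seq[Int]] =
    keys.groupBy(k => Math.floorMod(k.hashCode, numPartitions))

  // Range partitioning: partition i holds the keys up to upperBounds(i),
  // and keys above every bound go to one extra, last partition.
  def rangePartition(keys: Seq[Int], upperBounds: Seq[Int]): Map[Int, Seq[Int]] =
    keys.groupBy { k =>
      val i = upperBounds.indexWhere(k <= _)
      if (i >= 0) i else upperBounds.length
    }
}
```

With keys 1 to 6, the hash version with two partitions interleaves odd and even keys, while the range version with bounds 2 and 4 keeps each partition's keys below those of the next, which is the property a global sort relies on.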



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568743122
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568695334
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20512/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568711049
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20522/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360770543
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   I'm a bit confused. `spark.range` reports `RangePartitioning`, so there shouldn't be any shuffles. What gets changed?



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360788805
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   ```
   == Optimized Logical Plan ==
   Sort [id#8L DESC NULLS LAST], true
   +- Range (0, 100, step=1, splits=Some(10))
   
   == Physical Plan ==
   *(2) Sort [id#8L DESC NULLS LAST], true, 0
   +- Exchange rangepartitioning(id#8L DESC NULLS LAST, 4), true, [id=#33]
      +- *(1) Range (0, 100, step=1, splits=10)
   ```
   The `sort` API sets `global = true` on `SortExec`, just as ORDER BY does, so `SortExec`'s `requiredChildDistribution` becomes an `OrderedDistribution`.
   `EnsureRequirements` then adds a `ShuffleExchangeExec` with `RangePartitioning` to satisfy that `OrderedDistribution`.
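
The mechanism described above can be sketched as a toy rule in plain Scala. This is a simplified model, not the real `EnsureRequirements`: every name in `EnsureRequirementsSketch` is hypothetical, a distribution requirement is reduced to "range-partitioned on the sort column", and the default partition count is a plain parameter.

```scala
// Toy model of inserting a range exchange below a global sort.
object EnsureRequirementsSketch {
  sealed trait Partitioning { def numPartitions: Int }
  case class UnknownPartitioning(numPartitions: Int) extends Partitioning
  case class RoundRobin(numPartitions: Int) extends Partitioning
  case class RangeOn(column: String, numPartitions: Int) extends Partitioning

  sealed trait Plan { def partitioning: Partitioning }
  case class Scan(partitioning: Partitioning) extends Plan
  case class Exchange(partitioning: Partitioning, child: Plan) extends Plan
  case class GlobalSort(column: String, child: Plan) extends Plan {
    // A sort does not move rows between partitions.
    def partitioning: Partitioning = child.partitioning
  }

  // Assumption: a global sort needs its input range-partitioned on the sort column.
  def satisfiesOrdered(p: Partitioning, column: String): Boolean = p match {
    case RangeOn(c, _) => c == column
    case _ => false
  }

  // Add a range exchange under any global sort whose child does not qualify.
  def ensure(plan: Plan, defaultPartitions: Int): Plan = plan match {
    case GlobalSort(col, child) =>
      val c = ensure(child, defaultPartitions)
      if (satisfiesOrdered(c.partitioning, col)) GlobalSort(col, c)
      else GlobalSort(col, Exchange(RangeOn(col, defaultPartitions), c))
    case Exchange(p, child) => Exchange(p, ensure(child, defaultPartitions))
    case scan: Scan => scan
  }
}
```

In this model a global sort over a round-robin exchange ends up with two stacked exchanges, the upper one using the default partition count, which mirrors the double shuffle this PR is about.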



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r361101878
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,52 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Remove unnecessary RoundRobinPartitioning " +
+      "if SortExec is followed by RoundRobinPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = RoundRobinPartitioning(5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: RoundRobinPartitioning, _, _) => true
+      case _ => false
+    }.isEmpty,
+      "RoundRobinPartitioning should be changed to RangePartitioning")
+
+    val query = testData.select('key, 'value).repartition(2).sort('key.asc)
+    assert(query.rdd.getNumPartitions == 2)
+    assert(query.rdd.collectPartitions()(0).map(_.get(0)).toSeq == (1 to 50))
+  }
+
+  test("SPARK-30036: Remove unnecessary HashPartitioning " +
+    "if SortExec is followed by HashPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = HashPartitioning(Literal(1) :: Nil, 5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: HashPartitioning, _, _) => true
+      case _ => false
+    }.isEmpty,
+      "HashPartitioning should be changed to RangePartitioning")
+
+    val query = testData.select('key, 'value).repartition(5, 'key).sort('key.asc)
 
 Review comment:
   `HashPartitioning` or `RangePartitioning` won't take effect if it is followed by a node that requires `OrderedDistribution`. If users really want hash partitioning, they should use `sortWithinPartitions`.
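
What sorting within partitions means for the result can be sketched in plain Scala. This is a toy model only: `SortWithinPartitionsSketch`, `hashPartition` and `sortWithin` are hypothetical names, and the hash is simply the key modulo the partition count.

```scala
// Toy model: keep the hash partitioning and sort each partition locally.
object SortWithinPartitionsSketch {
  // Place each key in partition (key mod numPartitions), keeping arrival order.
  def hashPartition(keys: Seq[Int], numPartitions: Int): Vector[Vector[Int]] =
    Vector.tabulate(numPartitions) { p =>
      keys.filter(k => Math.floorMod(k, numPartitions) == p).toVector
    }

  // Sort every partition on its own; rows never move between partitions.
  def sortWithin(partitions: Vector[Vector[Int]]): Vector[Vector[Int]] =
    partitions.map(_.sorted)
}
```

Each partition comes out sorted and the partition membership is unchanged, but the partitions read back to back are not globally ordered, which is the trade-off between a local and a global sort.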



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568909404
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568696876
 
 
   **[Test build #115719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115719/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).



[GitHub] [spark] maryannxue commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
maryannxue commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-578914513
 
 
   I think we should revert this PR. The change to the `ConfigBehaviorSuite` test cannot be fully justified. Also, this is not the right approach to fixing this kind of issue.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568651732
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568697257
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20513/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568624152
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568655859
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20486/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568655640
 
 
   **[Test build #115689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115689/testReport)** for PR 26946 at commit [`52ce660`](https://github.com/apache/spark/commit/52ce6603aa4e3695360b54f2481bd9a0f9142016).



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568687705
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568712637
 
 
   **[Test build #115729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115729/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360924460
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
 
 Review comment:
   Okay.



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-580861982
 
 
   After more thought, I think it's wrong to use an optimization to fix a bug.
   
   Looking into the bug, the issue is that the `Repartition` operator added by the hint sits under the `Sort` operator, not above it. This is because our parser treats ORDER BY as the last clause, while the hint is associated with the SELECT clause. The parser rule looks like `SELECT ... UNION/INTERSECT SELECT ... ORDER BY`, which is why the `Sort` operator is added at the end.
   
   I think #27096 is the right way to optimize redundant shuffles, but we still need to fix how the parser handles hints.
   
   I'm reverting this. Let's fix the bug in the parser.
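
The plan shape described in this comment can be sketched with a toy model. This is only an illustration under stated assumptions: `Relation`, `Repartition`, `Sort`, and `plan` below are hypothetical stand-ins written for this note, not Spark's parser or its logical plan classes.

```scala
// Toy model only: hypothetical stand-ins, not Spark's real logical plan nodes.
object HintPlacementSketch {
  sealed trait LogicalPlan
  final case class Relation(name: String) extends LogicalPlan
  final case class Repartition(numPartitions: Int, child: LogicalPlan) extends LogicalPlan
  final case class Sort(key: String, child: LogicalPlan) extends LogicalPlan

  /** Build a plan the way the comment describes: the hint belongs to the
    * SELECT clause, and ORDER BY is applied last, on top of everything else. */
  def plan(table: String, repartitionHint: Option[Int], orderBy: Option[String]): LogicalPlan = {
    val select: LogicalPlan = Relation(table)
    // The hint wraps the SELECT result first ...
    val hinted = repartitionHint.fold(select)(n => Repartition(n, select))
    // ... and the Sort is added afterwards, so Repartition ends up under Sort.
    orderBy.fold(hinted)(key => Sort(key, hinted))
  }
}
```

In this model, `HintPlacementSketch.plan("test", Some(5), Some("a"))` yields `Sort(a, Repartition(5, Relation(test)))`, so the sort sits above the repartition rather than below it.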



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568709274
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115719/
   Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568694977
 
 
   **[Test build #115718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115718/testReport)** for PR 26946 at commit [`52ce660`](https://github.com/apache/spark/commit/52ce6603aa4e3695360b54f2481bd9a0f9142016).



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567980897
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568747283
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568909407
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115766/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567980903
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20427/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568687542
 
 
   **[Test build #115689 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115689/testReport)** for PR 26946 at commit [`52ce660`](https://github.com/apache/spark/commit/52ce6603aa4e3695360b54f2481bd9a0f9142016).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] viirya commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360760641
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ##########
 @@ -55,6 +55,8 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
         child
       case (child, BroadcastDistribution(mode)) =>
         BroadcastExchangeExec(mode, child)
+      case (ShuffleExchangeExec(partitioning, child, _), distribution: OrderedDistribution) =>
+        ShuffleExchangeExec(distribution.createPartitioning(partitioning.numPartitions), child)
 
 Review comment:
   I tried a few possible cases but could not construct another concrete case like this one. This may be the only possible case.
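
The effect of the two added lines in the diff above can be sketched with a toy model. Everything here is an assumption made for illustration: `Plan`, `Shuffle`, `RangePart`, and `ensureOrdered` are hypothetical stand-ins rather than Spark's `SparkPlan`, `ShuffleExchangeExec`, or `EnsureRequirements`, and the default of 200 partitions simply mirrors the "before" plan in the PR description.

```scala
// Toy model only: hypothetical stand-ins, not Spark's real physical plan classes.
object OrderedShuffleSketch {
  sealed trait Partitioning { def numPartitions: Int }
  final case class RoundRobin(numPartitions: Int) extends Partitioning
  final case class Hash(keys: List[String], numPartitions: Int) extends Partitioning
  final case class RangePart(ordering: List[String], numPartitions: Int) extends Partitioning

  sealed trait Plan
  final case class Scan(table: String) extends Plan
  final case class Shuffle(partitioning: Partitioning, child: Plan) extends Plan

  // Assumed default shuffle partition count (200 in the PR's "before" plan).
  val defaultShufflePartitions = 200

  /** Make `child` satisfy a global ordering on `ordering`. */
  def ensureOrdered(child: Plan, ordering: List[String]): Plan = child match {
    // Already range partitioned on the requested ordering: keep the child as is.
    case Shuffle(RangePart(o, _), _) if o == ordering => child
    // Analogue of the case added by the PR: replace the existing shuffle and
    // reuse its partition count instead of stacking a second shuffle on top.
    case Shuffle(p, grandChild) => Shuffle(RangePart(ordering, p.numPartitions), grandChild)
    // No shuffle below: add a range shuffle with the default partition count.
    case other => Shuffle(RangePart(ordering, defaultShufflePartitions), other)
  }
}
```

In this model both a round-robin and a hash shuffle under the sort are replaced by a single range shuffle with the same partition count, which corresponds to the two situations exercised by the new `PlannerSuite` tests.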



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r361103312
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,52 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Romove unnecessary RoundRobinPartitioning " +
+      "if SortExec is followed by RoundRobinPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = RoundRobinPartitioning(5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: RoundRobinPartitioning, _, _) => true
+      case _ => false
+    }.isEmpty,
+      "RoundRobinPartitioning should be changed to RangePartitioning")
+
+    val query = testData.select('key, 'value).repartition(2).sort('key.asc)
+    assert(query.rdd.getNumPartitions == 2)
+    assert(query.rdd.collectPartitions()(0).map(_.get(0)).toSeq == (1 to 50))
+  }
+
+  test("SPARK-30036: Romove unnecessary HashPartitioning " +
+    "if SortExec is followed by HashPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = HashPartitioning(Literal(1) :: Nil, 5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: HashPartitioning, _, _) => true
+      case _ => false
+    }.isEmpty,
+      "HashPartitioning should be changed to RangePartitioning")
+
+    val query = testData.select('key, 'value).repartition(5, 'key).sort('key.asc)
 
 Review comment:
   makes sense



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568360846
 
 
   We can add an end-to-end test, check the physical plan of a query, and count shuffles.
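
The "count shuffles" check suggested here can be sketched on a toy plan tree. This is a minimal illustration under stated assumptions: `Plan`, `Exchange`, `Sort`, `Scan`, and `countShuffles` are hypothetical names written for this note, not Spark classes, and the two sample trees only mirror the shape of the "before" and "after" physical plans in the PR description.

```scala
// Toy model only: hypothetical stand-ins, not Spark's real physical plan classes.
object ShuffleCountSketch {
  sealed trait Plan { def children: List[Plan] }
  final case class Scan(table: String) extends Plan {
    val children: List[Plan] = Nil
  }
  final case class Exchange(partitioning: String, child: Plan) extends Plan {
    val children: List[Plan] = List(child)
  }
  final case class Sort(key: String, child: Plan) extends Plan {
    val children: List[Plan] = List(child)
  }

  /** Count every Exchange node in the tree, including nested ones. */
  def countShuffles(plan: Plan): Int = {
    val self = plan match {
      case _: Exchange => 1
      case _ => 0
    }
    self + plan.children.map(countShuffles).sum
  }

  // Shape of the plan before the fix: two exchanges under the sort.
  val before: Plan =
    Sort("a", Exchange("rangepartitioning(a, 200)",
      Exchange("RoundRobinPartitioning(5)", Scan("test"))))

  // Shape of the plan after the fix: a single range exchange.
  val after: Plan = Sort("a", Exchange("rangepartitioning(a, 5)", Scan("test")))
}
```

An end-to-end test against a real query would apply the same idea to the query's executed physical plan and assert on the number of exchange nodes found.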



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567980897
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568687709
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115689/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568747286
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115729/
   Test FAILed.



[GitHub] [spark] viirya commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360760641
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ##########
 @@ -55,6 +55,8 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
         child
       case (child, BroadcastDistribution(mode)) =>
         BroadcastExchangeExec(mode, child)
+      case (ShuffleExchangeExec(partitioning, child, _), distribution: OrderedDistribution) =>
+        ShuffleExchangeExec(distribution.createPartitioning(partitioning.numPartitions), child)
 
 Review comment:
   I tried a few possible cases but could not construct another concrete case like this one. This may be the only possible case, so I think this should be fine.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r361104180
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,52 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Romove unnecessary RoundRobinPartitioning " +
 
 Review comment:
   nit `Romove` -> `Remove`



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-567998647
 
 
   **[Test build #115630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115630/testReport)** for PR 26946 at commit [`5915a12`](https://github.com/apache/spark/commit/5915a124d1cba718c15bf861e60fd7c4b4dee472).



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568711042
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360291470
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
 ##########
 @@ -421,6 +421,42 @@ class PlannerSuite extends SharedSparkSession {
     }
   }
 
+  test("SPARK-30036: Romove unnecessary RoundRobinPartitioning " +
+      "if SortExec is followed by RoundRobinPartitioning") {
+    val distribution = OrderedDistribution(SortOrder(Literal(1), Ascending) :: Nil)
+    val partitioning = RoundRobinPartitioning(5)
+    assert(!partitioning.satisfies(distribution))
+
+    val inputPlan = SortExec(SortOrder(Literal(1), Ascending) :: Nil,
+      global = true,
+      child = ShuffleExchangeExec(
+        partitioning,
+        DummySparkPlan(outputPartitioning = partitioning)))
+    val outputPlan = EnsureRequirements(spark.sessionState.conf).apply(inputPlan)
+    assert(outputPlan.find {
+      case ShuffleExchangeExec(_: RoundRobinPartitioning, _, _) => true
+      case _ => false}.isEmpty,
 
 Review comment:
   nit:
   ```
   ...find {
     case ...
     case ...
   }.isEmpty
   ```



[GitHub] [spark] HyukjinKwon commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568710611
 
 
   retest this please



[GitHub] [spark] stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568383598
 
 
   > We can add an end-to-end test, check the physical plan of a query, and count shuffles.

   Sure, I will add some tests for these cases.
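
A minimal sketch of what such an end-to-end check could look like, not the test that was added in the PR. The object name `ShuffleCountSketch`, the helper `countShuffles`, and the temp view contents are hypothetical; the sketch assumes adaptive query execution is disabled so that the planned physical plan can be traversed directly. It prints the shuffle count and final partition count rather than asserting them, since both depend on the Spark version in use.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec

object ShuffleCountSketch {
  // Hypothetical helper: counts ShuffleExchangeExec nodes in the physical plan of `ds`.
  def countShuffles(ds: Dataset[_]): Int =
    ds.queryExecution.executedPlan.collect {
      case s: ShuffleExchangeExec => s
    }.size

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("shuffle-count-sketch")
      // Assumption: AQE is off, so the plan is not wrapped in an adaptive node.
      .config("spark.sql.adaptive.enabled", "false")
      .getOrCreate()
    try {
      // Stand-in for the `test` table from the PR description (single column `a`).
      spark.range(0, 100).toDF("a").createOrReplaceTempView("test")
      val df = spark.sql("SELECT /*+ REPARTITION(5) */ * FROM test ORDER BY a")
      println(s"shuffles in plan: ${countShuffles(df)}")
      println(s"final partitions: ${df.rdd.getNumPartitions}")
    } finally {
      spark.stop()
    }
  }
}
```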



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568082112
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-570779726
 
 
   > `df.sort("id").repartition(10)` returns the wrong result: the globally sorted output would be repartitioned out of order.
   
   Sorry for the wrong example. I mean that users should use the right way to change the number of partitions. Obviously `df.repartition(10).sort("id")` should return the number of partitions set by `spark.sql.shuffle.partitions`.



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568651626
 
 
   **[Test build #115674 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115674/testReport)** for PR 26946 at commit [`fa03fcb`](https://github.com/apache/spark/commit/fa03fcbbb08c1cdcd3f25bc1b8fb03d7a8535cf0).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360795988
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
 
 Review comment:
   Shall we remove this comment now? It's no longer useful: since we do add a shuffle, the range output partitioning doesn't matter.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568655856
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568711042
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568889538
 
 
   **[Test build #115766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115766/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).



[GitHub] [spark] viirya commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360673289
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ##########
 @@ -55,6 +55,8 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
         child
       case (child, BroadcastDistribution(mode)) =>
         BroadcastExchangeExec(mode, child)
+      case (ShuffleExchangeExec(partitioning, child, _), distribution: OrderedDistribution) =>
+        ShuffleExchangeExec(distribution.createPartitioning(partitioning.numPartitions), child)
 
 Review comment:
   This handles only the special case of OrderedDistribution. More generally, if a ShuffleExchangeExec is followed by any distribution that it does not satisfy, we should always trim the ShuffleExchangeExec and apply the partitioning of that distribution, shouldn't we?
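
To make the generalization concrete, here is a minimal toy model in plain Scala. It does not use Spark's classes: `Plan`, `Scan`, `Shuffle`, the `Partitioning` and `Distribution` types, `defaultPartitions`, and `ensure` are all hypothetical stand-ins, and the `satisfiedBy` rules are much simpler than Spark's real ones. It only illustrates the idea of replacing an unsatisfying shuffle while keeping its partition count, for any distribution rather than just the ordered one.

```scala
object TrimShuffleSketch {
  // Hypothetical stand-ins for partitioning schemes.
  sealed trait Partitioning { def numPartitions: Int }
  case class RoundRobin(numPartitions: Int) extends Partitioning
  case class RangeBy(key: String, numPartitions: Int) extends Partitioning
  case class HashBy(key: String, numPartitions: Int) extends Partitioning

  // Hypothetical stand-ins for required distributions (simplified rules).
  sealed trait Distribution {
    def satisfiedBy(p: Partitioning): Boolean
    def createPartitioning(n: Int): Partitioning
  }
  case class Ordered(key: String) extends Distribution {
    def satisfiedBy(p: Partitioning): Boolean = p match {
      case RangeBy(k, _) => k == key
      case _ => false
    }
    def createPartitioning(n: Int): Partitioning = RangeBy(key, n)
  }
  case class Clustered(key: String) extends Distribution {
    def satisfiedBy(p: Partitioning): Boolean = p match {
      case HashBy(k, _) => k == key
      case _ => false
    }
    def createPartitioning(n: Int): Partitioning = HashBy(key, n)
  }

  sealed trait Plan
  case class Scan(table: String) extends Plan
  case class Shuffle(partitioning: Partitioning, child: Plan) extends Plan

  // Assumption: stands in for the default shuffle partition count.
  val defaultPartitions = 200

  // Makes `child` satisfy `required`, trimming an unsatisfying shuffle if present.
  def ensure(child: Plan, required: Distribution): Plan = child match {
    case Shuffle(p, _) if required.satisfiedBy(p) => child
    // Generalized trimming: swap the partitioning, keep the partition count.
    case Shuffle(p, grandChild) =>
      Shuffle(required.createPartitioning(p.numPartitions), grandChild)
    case other =>
      Shuffle(required.createPartitioning(defaultPartitions), other)
  }

  def main(args: Array[String]): Unit = {
    println(ensure(Shuffle(RoundRobin(5), Scan("test")), Ordered("a")))
    println(ensure(Shuffle(RoundRobin(10), Scan("test")), Clustered("id")))
  }
}
```

In this model the second call turns a round-robin shuffle with 10 partitions into a hash shuffle with 10 partitions instead of the default, which is the kind of partition-count change a generalized rule would introduce.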



[GitHub] [spark] cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-569193555
 
 
   yea merged to master!



[GitHub] [spark] stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
stczwd commented on a change in pull request #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#discussion_r360380206
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala
 ##########
 @@ -39,9 +39,8 @@ class ConfigBehaviorSuite extends QueryTest with SharedSparkSession {
     def computeChiSquareTest(): Double = {
       val n = 10000
       // Trigger a sort
-      // Range has range partitioning in its output now. To have a range shuffle, we
-      // need to run a repartition first.
-      val data = spark.range(0, n, 1, 1).repartition(10).sort($"id".desc)
+      // Range has range partitioning in its output now.
+      val data = spark.range(0, n, 1, 10).sort($"id".desc)
 
 Review comment:
   This test fails because the repartition is pruned, so the partition number is no longer right in this case.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568624152
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-570535969
 
 
   Wait, another point. First, this scenario also exists for the `join` and `window` operators; as @viirya says, it applies to other distributions too.
   For example, with a join:
   ```
   val df = spark.range(1, 10, 2)
   df.join(df.repartition(10), Seq("id"), "left").explain(true)
   
   // physical plan like this
   == Physical Plan ==
   *(5) Project [id#0L]
   +- SortMergeJoin [id#0L], [id#83L], LeftOuter
      :- *(2) Sort [id#0L ASC NULLS FIRST], false, 0
      :  +- Exchange hashpartitioning(id#0L, 200), true, [id=#378]
      :     +- *(1) Range (1, 10, step=1, splits=40)
      +- *(4) Sort [id#83L ASC NULLS FIRST], false, 0
         +- Exchange hashpartitioning(id#83L, 200), true, [id=#384]
            +- Exchange RoundRobinPartitioning(10), false, [id=#383]
               +- *(3) Range (1, 10, step=1, splits=40)
   ```
   
   Second, `2 -> 10 -> 200` and `2 -> 10` differ somewhat because the operators differ in complexity. Repartition may be a lightweight operator compared with sort, join, or other algorithms, so it is not certain that `2 -> 10` always runs faster than `2 -> 10 -> 200`.
   
   Finally, if end users really want 10 result partitions, they should use `df.sort("id").repartition(10)` rather than `df.repartition(10).sort("id")`. Pruning the shuffle may mislead users.
   
   cc @HyukjinKwon @cloud-fan @maropu 



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568624156
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20471/
   Test PASSed.






[GitHub] [spark] ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-571070694
 
 
   I see, you want to provide an easy way to change the number of partitions after a sort.
   
   There is only one thing I am not sure about. If `df.repartition(10).sort("id")` results in 10 partitions, users will think `df.repartition(10).xxx` results in 10 partitions too, but it actually does not. It is special handling for sort.
   
   I do not know what the committers think about it; maybe it's just fine.
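
One way to see the asymmetry described here is to print the resulting partition counts for a few operator chains. This is a rough sketch under assumptions: the object `PartitionCountSketch` and its `counts` helper are hypothetical, adaptive query execution is disabled so that partition coalescing does not hide the planned counts, and the numbers for the `sort` and `groupBy` chains are deliberately not asserted because they depend on the Spark version and on `spark.sql.shuffle.partitions`.

```scala
import org.apache.spark.sql.SparkSession

object PartitionCountSketch {
  // Hypothetical helper: final partition counts for several operator chains.
  def counts(spark: SparkSession): Map[String, Int] = {
    val df = spark.range(0, 1000).toDF("id")
    val base = df.repartition(10)
    Map(
      "repartition(10)" -> base.rdd.getNumPartitions,
      "repartition(10).sort" -> base.sort("id").rdd.getNumPartitions,
      "repartition(10).groupBy" -> base.groupBy("id").count().rdd.getNumPartitions,
      "sort.repartition(10)" -> df.sort("id").repartition(10).rdd.getNumPartitions
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("partition-count-sketch")
      // Assumption: AQE is off so shuffle partitions are not coalesced at runtime.
      .config("spark.sql.adaptive.enabled", "false")
      .getOrCreate()
    try {
      counts(spark).foreach { case (chain, n) => println(s"$chain -> $n partitions") }
    } finally {
      spark.stop()
    }
  }
}
```

Only the chains that end in `repartition(10)` are guaranteed to report 10; whether the others do is exactly the behavior under discussion.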



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568082116
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115630/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568909407
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115766/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568623945
 
 
   **[Test build #115674 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115674/testReport)** for PR 26946 at commit [`fa03fcb`](https://github.com/apache/spark/commit/fa03fcbbb08c1cdcd3f25bc1b8fb03d7a8535cf0).



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568655640
 
 
   **[Test build #115689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115689/testReport)** for PR 26946 at commit [`52ce660`](https://github.com/apache/spark/commit/52ce6603aa4e3695360b54f2481bd9a0f9142016).



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568711049
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20522/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568700651
 
 
   Looks fine to me



[GitHub] [spark] SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568909306
 
 
   **[Test build #115766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115766/testReport)** for PR 26946 at commit [`d2615b6`](https://github.com/apache/spark/commit/d2615b61a6e5f7ed849f826d25d45315953386c9).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568889656
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568743122
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568695334
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/20512/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568697251
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #26946: [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
URL: https://github.com/apache/spark/pull/26946#issuecomment-568743128
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/115718/
   Test PASSed.
