You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@kyuubi.apache.org by GitBox <gi...@apache.org> on 2022/12/12 04:41:07 UTC

[GitHub] [incubator-kyuubi] ulysses-you opened a new pull request, #3962: Add two conditions to decide if add shuffle before writing

ulysses-you opened a new pull request, #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962

   <!--
   Thanks for sending a pull request!
   
   Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/CONTRIBUTING.html
     2. If the PR is related to an issue in https://github.com/apache/incubator-kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
   -->
   
   ### _Why are the changes needed?_
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you add a feature, you can talk about the use case of it.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   add two conditions to decide if we should add shuffle.
   
   1. make sure AQE is enabled, otherwise it is no meaning to add a shuffle
   2. try to reduce the performance regression if add a shuffle
   
   for condition 2: we do not add shuffle if the original plan does not have shuffle
   
   ### _How was this patch tested?_
   - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
   
   - [ ] Add screenshots for manual tests if appropriate
   
   - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] ulysses-you commented on a diff in pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#discussion_r1045607716


##########
docs/extensions/engines/spark/rules.md:
##########
@@ -68,17 +68,18 @@ Now, you can enjoy the Kyuubi SQL Extension.
 

Review Comment:
   good catch !



##########
extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala:
##########
@@ -108,14 +108,37 @@ abstract class RepartitionBeforeWritingHiveBase extends RepartitionBuilder {
   }
 }
 
-trait RepartitionBeforeWriteHelper {
-  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = plan match {
-    case Project(_, child) => canInsertRepartitionByExpression(child)
-    case SubqueryAlias(_, child) => canInsertRepartitionByExpression(child)
-    case Limit(_, _) => false
-    case _: Sort => false
-    case _: RepartitionByExpression => false
-    case _: Repartition => false
-    case _ => true
+trait RepartitionBeforeWriteHelper extends Rule[LogicalPlan] {
+  private def hasBenefit(plan: LogicalPlan): Boolean = {
+    def probablyHasShuffle: Boolean = plan.find {

Review Comment:
   `exists` is ready after branch-3.3, so I use a `find` to be compatible with old version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] yaooqinn commented on a diff in pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
yaooqinn commented on code in PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#discussion_r1045571876


##########
extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala:
##########
@@ -108,14 +108,37 @@ abstract class RepartitionBeforeWritingHiveBase extends RepartitionBuilder {
   }
 }
 
-trait RepartitionBeforeWriteHelper {
-  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = plan match {
-    case Project(_, child) => canInsertRepartitionByExpression(child)
-    case SubqueryAlias(_, child) => canInsertRepartitionByExpression(child)
-    case Limit(_, _) => false
-    case _: Sort => false
-    case _: RepartitionByExpression => false
-    case _: Repartition => false
-    case _ => true
+trait RepartitionBeforeWriteHelper extends Rule[LogicalPlan] {
+  private def hasBenefit(plan: LogicalPlan): Boolean = {
+    def probablyHasShuffle: Boolean = plan.find {
+      case _: Join => true
+      case _: Aggregate => true
+      case _: Distinct => true
+      case _: Deduplicate => true
+      case _: Window => true
+      case s: Sort if s.global => true
+      case _: RepartitionOperation => true
+      case _: GlobalLimit => true
+      case _ => false
+    }.isDefined
+
+    conf.getConf(KyuubiSQLConf.INSERT_REPARTITION_BEFORE_WRITE_IF_NO_SHUFFLE) || probablyHasShuffle
+  }
+
+  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = {
+    def canInsert(p: LogicalPlan): Boolean = p match {
+      case Project(_, child) => canInsert(child)
+      case SubqueryAlias(_, child) => canInsert(child)
+      case Limit(_, _) => false
+      case _: Sort => false
+      case _: RepartitionByExpression => false
+      case _: Repartition => false
+      case _ => true
+    }
+
+    // 1. make sure AQE is enabled, otherwise it is no meaning to add a shuffle
+    // 2. make sure it does not break the semantics of original plan
+    // 3. try to reduce the performance regression if add a shuffle
+    conf.getConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED) && canInsert(plan) && hasBenefit(plan)

Review Comment:
   conf.adaptiveExecutionEnabled?



##########
extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala:
##########
@@ -108,14 +108,37 @@ abstract class RepartitionBeforeWritingHiveBase extends RepartitionBuilder {
   }
 }
 
-trait RepartitionBeforeWriteHelper {
-  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = plan match {
-    case Project(_, child) => canInsertRepartitionByExpression(child)
-    case SubqueryAlias(_, child) => canInsertRepartitionByExpression(child)
-    case Limit(_, _) => false
-    case _: Sort => false
-    case _: RepartitionByExpression => false
-    case _: Repartition => false
-    case _ => true
+trait RepartitionBeforeWriteHelper extends Rule[LogicalPlan] {
+  private def hasBenefit(plan: LogicalPlan): Boolean = {
+    def probablyHasShuffle: Boolean = plan.find {
+      case _: Join => true
+      case _: Aggregate => true
+      case _: Distinct => true
+      case _: Deduplicate => true
+      case _: Window => true
+      case s: Sort if s.global => true
+      case _: RepartitionOperation => true
+      case _: GlobalLimit => true
+      case _ => false
+    }.isDefined
+
+    conf.getConf(KyuubiSQLConf.INSERT_REPARTITION_BEFORE_WRITE_IF_NO_SHUFFLE) || probablyHasShuffle
+  }
+
+  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = {
+    def canInsert(p: LogicalPlan): Boolean = p match {
+      case Project(_, child) => canInsert(child)
+      case SubqueryAlias(_, child) => canInsert(child)
+      case Limit(_, _) => false
+      case _: Sort => false
+      case _: RepartitionByExpression => false
+      case _: Repartition => false
+      case _ => true
+    }
+
+    // 1. make sure AQE is enabled, otherwise it is no meaning to add a shuffle
+    // 2. make sure it does not break the semantics of original plan
+    // 3. try to reduce the performance regression if add a shuffle

Review Comment:
    try to avoid adding a shuffle if it has potential performance regression



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] ulysses-you commented on pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#issuecomment-1346394760

   thanks for review, merging to master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] ulysses-you closed pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
ulysses-you closed pull request #3962: Add two conditions to decide if add shuffle before writing
URL: https://github.com/apache/incubator-kyuubi/pull/3962


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] codecov-commenter commented on pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#issuecomment-1346019506

   # [Codecov](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3962](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (09fc9b2) into [master](https://codecov.io/gh/apache/incubator-kyuubi/commit/4efd4d0bb0ac9cadb1f36408782a6c91dd7139e7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4efd4d0) will **increase** coverage by `0.01%`.
   > The diff coverage is `70.37%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3962      +/-   ##
   ============================================
   + Coverage     51.98%   51.99%   +0.01%     
     Complexity       13       13              
   ============================================
     Files           522      522              
     Lines         28876    28985     +109     
     Branches       3864     3883      +19     
   ============================================
   + Hits          15012    15072      +60     
   - Misses        12494    12533      +39     
   - Partials       1370     1380      +10     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/kyuubi/sql/RepartitionBeforeWritingBase.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZW5zaW9ucy9zcGFyay9reXV1YmktZXh0ZW5zaW9uLXNwYXJrLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zcWwvUmVwYXJ0aXRpb25CZWZvcmVXcml0aW5nQmFzZS5zY2FsYQ==) | `0.00% <0.00%> (ø)` | |
   | [...ha/client/zookeeper/ZookeeperDiscoveryClient.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWhhL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2hhL2NsaWVudC96b29rZWVwZXIvWm9va2VlcGVyRGlzY292ZXJ5Q2xpZW50LnNjYWxh) | `72.10% <0.00%> (+0.37%)` | :arrow_up: |
   | [...g/apache/kyuubi/operation/BatchJobSubmission.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9vcGVyYXRpb24vQmF0Y2hKb2JTdWJtaXNzaW9uLnNjYWxh) | `76.68% <64.28%> (-3.06%)` | :arrow_down: |
   | [...rg/apache/kyuubi/engine/spark/SparkSQLEngine.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZXJuYWxzL2t5dXViaS1zcGFyay1zcWwtZW5naW5lL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUva3l1dWJpL2VuZ2luZS9zcGFyay9TcGFya1NRTEVuZ2luZS5zY2FsYQ==) | `70.40% <75.00%> (+0.29%)` | :arrow_up: |
   | [...g/apache/kyuubi/session/KyuubiSessionManager.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zZXNzaW9uL0t5dXViaVNlc3Npb25NYW5hZ2VyLnNjYWxh) | `94.02% <88.88%> (+0.75%)` | :arrow_up: |
   | [...in/scala/org/apache/kyuubi/sql/KyuubiSQLConf.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZXh0ZW5zaW9ucy9zcGFyay9reXV1YmktZXh0ZW5zaW9uLXNwYXJrLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zcWwvS3l1dWJpU1FMQ29uZi5zY2FsYQ==) | `99.06% <100.00%> (+0.04%)` | :arrow_up: |
   | [...in/scala/org/apache/kyuubi/config/KyuubiConf.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWNvbW1vbi9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9jb25maWcvS3l1dWJpQ29uZi5zY2FsYQ==) | `97.50% <100.00%> (-0.02%)` | :arrow_down: |
   | [...ala/org/apache/kyuubi/session/SessionLimiter.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zZXNzaW9uL1Nlc3Npb25MaW1pdGVyLnNjYWxh) | `82.14% <100.00%> (+4.87%)` | :arrow_up: |
   | [...rg/apache/kyuubi/ctl/cmd/log/LogBatchCommand.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLWN0bC9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9jdGwvY21kL2xvZy9Mb2dCYXRjaENvbW1hbmQuc2NhbGE=) | `60.00% <0.00%> (-7.70%)` | :arrow_down: |
   | [.../kyuubi/server/mysql/constant/MySQLErrorCode.scala](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-a3l1dWJpLXNlcnZlci9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2t5dXViaS9zZXJ2ZXIvbXlzcWwvY29uc3RhbnQvTXlTUUxFcnJvckNvZGUuc2NhbGE=) | `13.84% <0.00%> (-6.16%)` | :arrow_down: |
   | ... and [17 more](https://codecov.io/gh/apache/incubator-kyuubi/pull/3962/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] ulysses-you commented on a diff in pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#discussion_r1045583449


##########
extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala:
##########
@@ -108,14 +108,37 @@ abstract class RepartitionBeforeWritingHiveBase extends RepartitionBuilder {
   }
 }
 
-trait RepartitionBeforeWriteHelper {
-  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = plan match {
-    case Project(_, child) => canInsertRepartitionByExpression(child)
-    case SubqueryAlias(_, child) => canInsertRepartitionByExpression(child)
-    case Limit(_, _) => false
-    case _: Sort => false
-    case _: RepartitionByExpression => false
-    case _: Repartition => false
-    case _ => true
+trait RepartitionBeforeWriteHelper extends Rule[LogicalPlan] {
+  private def hasBenefit(plan: LogicalPlan): Boolean = {
+    def probablyHasShuffle: Boolean = plan.find {
+      case _: Join => true
+      case _: Aggregate => true
+      case _: Distinct => true
+      case _: Deduplicate => true
+      case _: Window => true
+      case s: Sort if s.global => true
+      case _: RepartitionOperation => true
+      case _: GlobalLimit => true
+      case _ => false
+    }.isDefined
+
+    conf.getConf(KyuubiSQLConf.INSERT_REPARTITION_BEFORE_WRITE_IF_NO_SHUFFLE) || probablyHasShuffle
+  }
+
+  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = {
+    def canInsert(p: LogicalPlan): Boolean = p match {
+      case Project(_, child) => canInsert(child)
+      case SubqueryAlias(_, child) => canInsert(child)
+      case Limit(_, _) => false
+      case _: Sort => false
+      case _: RepartitionByExpression => false
+      case _: Repartition => false
+      case _ => true
+    }
+
+    // 1. make sure AQE is enabled, otherwise it is no meaning to add a shuffle
+    // 2. make sure it does not break the semantics of original plan
+    // 3. try to reduce the performance regression if add a shuffle

Review Comment:
   addressed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] cfmcgrady commented on a diff in pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
cfmcgrady commented on code in PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#discussion_r1045588799


##########
docs/extensions/engines/spark/rules.md:
##########
@@ -68,17 +68,18 @@ Now, you can enjoy the Kyuubi SQL Extension.
 

Review Comment:
   > Kyuubi provides SQL extension out of box. Due to the version compatibility with Apache Spark, currently we only support Apache Spark branch-3.1 (i.e 3.1.1 and 3.1.2).
   
   as we are here, can you also correct the supported version in the L21?



##########
extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala:
##########
@@ -108,14 +108,37 @@ abstract class RepartitionBeforeWritingHiveBase extends RepartitionBuilder {
   }
 }
 
-trait RepartitionBeforeWriteHelper {
-  def canInsertRepartitionByExpression(plan: LogicalPlan): Boolean = plan match {
-    case Project(_, child) => canInsertRepartitionByExpression(child)
-    case SubqueryAlias(_, child) => canInsertRepartitionByExpression(child)
-    case Limit(_, _) => false
-    case _: Sort => false
-    case _: RepartitionByExpression => false
-    case _: Repartition => false
-    case _ => true
+trait RepartitionBeforeWriteHelper extends Rule[LogicalPlan] {
+  private def hasBenefit(plan: LogicalPlan): Boolean = {
+    def probablyHasShuffle: Boolean = plan.find {

Review Comment:
   nit: `plan.find {...}.isDefined` -> `plan.exists {...}`



##########
extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala:
##########
@@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand
 import org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand
 import org.apache.spark.sql.hive.execution.{CreateHiveTableAsSelectCommand, InsertIntoHiveTable, OptimizedCreateHiveTableAsSelectCommand}
-import org.apache.spark.sql.internal.StaticSQLConf
+import org.apache.spark.sql.internal.{SQLConf, StaticSQLConf}

Review Comment:
   nit: unused import



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [incubator-kyuubi] ulysses-you commented on pull request #3962: Add two conditions to decide if add shuffle before writing

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #3962:
URL: https://github.com/apache/incubator-kyuubi/pull/3962#issuecomment-1345950911

   cc @cfmcgrady @yaooqinn


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org