Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/14 14:11:10 UTC

[GitHub] [spark] aof00 opened a new pull request #30376: change 'spark.sql.adaptive.skewedPartitionThresholdInBytes' to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' #SPARK-33451

aof00 opened a new pull request #30376:
URL: https://github.com/apache/spark/pull/30376


   JIRA Issue: https://issues.apache.org/jira/browse/SPARK-33451
   
   In the 'Optimizing Skew Join' section of the following two pages:
   1. [https://spark.apache.org/docs/3.0.0/sql-performance-tuning.html](https://spark.apache.org/docs/3.0.0/sql-performance-tuning.html)
   2. [https://spark.apache.org/docs/3.0.1/sql-performance-tuning.html](https://spark.apache.org/docs/3.0.1/sql-performance-tuning.html)
   
   The configuration 'spark.sql.adaptive.skewedPartitionThresholdInBytes' should be changed to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes'. The former is missing the 'skewJoin' segment.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #30376: [SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' in documentation

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #30376:
URL: https://github.com/apache/spark/pull/30376


   




[GitHub] [spark] janekdb commented on a change in pull request #30376: change 'spark.sql.adaptive.skewedPartitionThresholdInBytes' to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' #SPARK-33451

Posted by GitBox <gi...@apache.org>.
janekdb commented on a change in pull request #30376:
URL: https://github.com/apache/spark/pull/30376#discussion_r523472097



##########
File path: docs/sql-performance-tuning.md
##########
@@ -280,7 +280,7 @@ Data skew can severely downgrade the performance of join queries. This feature d
        <td><code>spark.sql.adaptive.skewJoin.skewedPartitionFactor</code></td>
        <td>10</td>
        <td>
-         A partition is considered as skewed if its size is larger than this factor multiplying the median partition size and also larger than <code>spark.sql.adaptive.skewedPartitionThresholdInBytes</code>.
+         A partition is considered as skewed if its size is larger than this factor multiplying the median partition size and also larger than <code>spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes</code>.

Review comment:
       LGTM
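
For readers of this thread, the rule in the corrected documentation line can be sketched in a few lines of Python. This is only an illustration of the two conditions as worded in the diff, not Spark's implementation. The function name `is_skewed` and the sample sizes are hypothetical; the factor default of 10 is taken from the table row in the diff, and the 256 MB threshold default is an assumption about the documented default of 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes'.

```python
from statistics import median

# Config keys as they appear in the corrected documentation.
FACTOR_KEY = "spark.sql.adaptive.skewJoin.skewedPartitionFactor"
THRESHOLD_KEY = "spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes"

MB = 1024 * 1024


def is_skewed(size_bytes, all_sizes, factor=10, threshold_bytes=256 * MB):
    """Return True if a partition counts as skewed under the documented rule.

    Hypothetical helper: the partition must be larger than `factor` times
    the median partition size AND larger than the absolute byte threshold.
    The 256 MB default is an assumption, not taken from this thread.
    """
    median_size = median(all_sizes)
    return size_bytes > factor * median_size and size_bytes > threshold_bytes


# Nine 100 MB partitions and one 2 GB partition (made-up sizes).
sizes = [100 * MB] * 9 + [2048 * MB]
big_is_skewed = is_skewed(2048 * MB, sizes)
small_is_skewed = is_skewed(100 * MB, sizes)

# A partition far above the median but below the byte threshold.
tiny_sizes = [1 * MB] * 9 + [100 * MB]
below_threshold_is_skewed = is_skewed(100 * MB, tiny_sizes)
```

Both conditions must hold, which is why the misspelled key mattered: a reader who set the key without 'skewJoin' would not be changing the threshold that the rule refers to.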






[GitHub] [spark] HyukjinKwon commented on pull request #30376: [SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' in documentation

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727681809


   Merged to master and branch-3.0.




[GitHub] [spark] AmplabJenkins commented on pull request #30376: change 'spark.sql.adaptive.skewedPartitionThresholdInBytes' to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' #SPARK-33451

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727213008


   Can one of the admins verify this patch?




[GitHub] [spark] HyukjinKwon commented on pull request #30376: [SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' in documentation

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727681686


   @aof00, I fixed the PR title and description but please keep the GitHub PR template and fix the title according to http://spark.apache.org/contributing.html next time.




[GitHub] [spark] aof00 commented on pull request #30376: [SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' in documentation

Posted by GitBox <gi...@apache.org>.
aof00 commented on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727688223


   > @aof00, I fixed the PR title and description but please keep the GitHub PR template and fix the title according to http://spark.apache.org/contributing.html next time.
   
   OK, I'll remember that next time! Thanks!




[GitHub] [spark] HyukjinKwon removed a comment on pull request #30376: [SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' in documentation

Posted by GitBox <gi...@apache.org>.
HyukjinKwon removed a comment on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727681752


   Merged to master and branch-3.0.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30376: change 'spark.sql.adaptive.skewedPartitionThresholdInBytes' to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' #SPARK-33451

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727213008


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins commented on pull request #30376: change 'spark.sql.adaptive.skewedPartitionThresholdInBytes' to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' #SPARK-33451

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727213182


   Can one of the admins verify this patch?




[GitHub] [spark] HyukjinKwon commented on pull request #30376: [SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' in documentation

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #30376:
URL: https://github.com/apache/spark/pull/30376#issuecomment-727681752


   Merged to master and branch-3.0.

