Posted to issues@uniffle.apache.org by GitBox <gi...@apache.org> on 2022/09/07 12:39:53 UTC
[GitHub] [incubator-uniffle] leixm commented on pull request #190: [Improvement][AQE] Avoid calling getShuffleResult multiple times
leixm commented on PR #190:
URL: https://github.com/apache/incubator-uniffle/pull/190#issuecomment-1239337208
### Environment
Shuffle Server Num : 5
Shuffle Write: 48G
Configuration: --conf spark.sql.shuffle.partitions=5000 --conf spark.sql.adaptive.enabled=true --conf spark.sql.adaptive.shuffle.targetPostShuffleInputSize=64MB
We measure the performance of get_shuffle_result with the following metrics:
- get_shuffle_result_times: Number of calls to the get_shuffle_result interface
- get_shuffle_result_cost: Total time spent in the get_shuffle_result interface
- get_shuffle_result_for_multi_part_times: Number of calls to the get_shuffle_result_for_multi_part interface
- get_shuffle_result_for_multi_part_cost: Total time spent in the get_shuffle_result_for_multi_part interface
### Test Results
Before issue_136
| serverId | get_shuffle_result_times | get_shuffle_result_cost(ms) |
| -------- | ------------------------ | --------------------------- |
| Server1 | 1000 | 157614 |
| Server2 | 1000 | 426897 |
| Server3 | 1000 | 269488 |
| Server4 | 1000 | 906758 |
| Server5 | 1001 | 123217 |
| sum | 5001 | 1883974 |
After issue_136
| serverId | get_shuffle_result_for_multi_part_times | get_shuffle_result_for_multi_part_cost(ms) |
| -------- | --------------------------------------- | ------------------------------------------ |
| Server1 | 833 | 870720 |
| Server2 | 833 | 260865 |
| Server3 | 834 | 333202 |
| Server4 | 833 | 90277 |
| Server5 | 835 | 94113 |
| sum | 4168 | 1649177 |
### Summary
The number of interface requests drops from 5001 to 4168 (about 16.7%), and the total time drops from 1883974 ms to 1649177 ms (about 12.5%). If we assign consecutive partitions to a server, the improvement will be more obvious.
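
For anyone who wants to check the percentages against the tables, here is a small sketch that recomputes the totals and the relative reductions. The per-server numbers are copied from the two tables above; the helper names `totals` and `reduction_pct` are made up for this example and are not part of Uniffle.

```python
# Per-server (call count, total cost in ms), copied from the tables above.
before = {
    "Server1": (1000, 157614),
    "Server2": (1000, 426897),
    "Server3": (1000, 269488),
    "Server4": (1000, 906758),
    "Server5": (1001, 123217),
}
after = {
    "Server1": (833, 870720),
    "Server2": (833, 260865),
    "Server3": (834, 333202),
    "Server4": (833, 90277),
    "Server5": (835, 94113),
}


def totals(rows):
    """Sum call counts and costs over all servers (hypothetical helper)."""
    calls = sum(c for c, _ in rows.values())
    cost_ms = sum(t for _, t in rows.values())
    return calls, cost_ms


def reduction_pct(old, new):
    """Relative reduction from old to new, in percent (hypothetical helper)."""
    return (old - new) / old * 100


calls_before, cost_before = totals(before)
calls_after, cost_after = totals(after)

print(f"calls: {calls_before} -> {calls_after} "
      f"({reduction_pct(calls_before, calls_after):.1f}% fewer)")
print(f"cost:  {cost_before} ms -> {cost_after} ms "
      f"({reduction_pct(cost_before, cost_after):.1f}% less)")
```

Note that these are sums of per-call time across servers, so they say nothing about how the change affects end-to-end job latency.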
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org