Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/22 07:12:36 UTC
[GitHub] [spark] caican00 opened a new pull request, #37608: update
caican00 opened a new pull request, #37608:
URL: https://github.com/apache/spark/pull/37608
### What changes were proposed in this pull request?
`Traversable.toMap` is replaced with `collection.breakOut`, which eliminates the creation of an intermediate collection of tuples.
This optimization follows the approach taken in https://github.com/apache/spark/pull/18693.
An introduction to `collection.breakOut` can be found in this [Stack Overflow article](https://stackoverflow.com/questions/1715681/scala-2-8-breakout).
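
For readers unfamiliar with `breakOut`, here is a minimal sketch of the general technique. It is not the code changed in this PR: the object name `BreakOutSketch` and the sample data are hypothetical, and it assumes Scala 2.12 or earlier, since `scala.collection.breakOut` was removed in Scala 2.13.

```scala
// Assumption: Scala 2.12 or earlier (breakOut does not exist in 2.13+).
import scala.collection.breakOut

object BreakOutSketch {
  // Hypothetical sample data, for illustration only.
  val words: Seq[String] = Seq("spark", "scala", "map")

  // Two steps: map builds an intermediate Seq[(String, Int)],
  // then toMap copies every tuple into a new Map.
  val viaToMap: Map[String, Int] =
    words.map(w => w -> w.length).toMap

  // One step: breakOut supplies a CanBuildFrom chosen from the expected
  // result type, so map writes straight into a Map builder and the
  // intermediate Seq of tuples is never created.
  val viaBreakOut: Map[String, Int] =
    words.map(w => w -> w.length)(breakOut)
}
```

Both values hold the same entries; only the number of collections allocated along the way differs. The explicit `Map[String, Int]` annotation matters in the second form, because `breakOut` relies on the expected type to pick the builder.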
### Why are the changes needed?
When `DeserializeToObject` is executed, converting `Tuple2` values to a Scala `Map` via `.toMap` takes a lot of CPU time.
![image](https://user-images.githubusercontent.com/94670132/185860416-f147ddd7-65b3-4dcb-b9d6-9a872015e003.png)
![image](https://user-images.githubusercontent.com/94670132/185860432-2aec4c48-898a-4d66-8d34-2221ab7e9408.png)
### How was this patch tested?
The existing unit tests were run.
No performance tests have been performed yet.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] caican00 closed pull request #37608: [SPARK-40175][SQL]Speed up conversion of Tuple2 to Scala Map
Posted by GitBox <gi...@apache.org>.
caican00 closed pull request #37608: [SPARK-40175][SQL]Speed up conversion of Tuple2 to Scala Map
URL: https://github.com/apache/spark/pull/37608
[GitHub] [spark] caican00 commented on pull request #37608: update
Posted by GitBox <gi...@apache.org>.
caican00 commented on PR #37608:
URL: https://github.com/apache/spark/pull/37608#issuecomment-1221948497
> @caican00 mind creating a JIRA, and fix the PR title? See also https://spark.apache.org/contributing.html.
>
> Also, we should probably fix it in the `master` branch instead of `branch-3.3`. Otherwise, looks pretty good
Thanks, I will close this PR and open a new one against the `master` branch.