Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/22 07:12:36 UTC

[GitHub] [spark] caican00 opened a new pull request, #37608: update

caican00 opened a new pull request, #37608:
URL: https://github.com/apache/spark/pull/37608

   ### What changes were proposed in this pull request?
   Replaced `Traversable.toMap` with `collection.breakOut`, which eliminates the creation of an intermediate tuple collection.
   This optimization follows the approach taken in https://github.com/apache/spark/pull/18693.
   An introduction to `collection.breakOut` can be found in this [Stack Overflow article](https://stackoverflow.com/questions/1715681/scala-2-8-breakout).
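
A minimal sketch of the idea, not the actual Spark change: the object name, the sample keys, and the `k.length` values are hypothetical stand-ins. `collection.breakOut` only exists up to Scala 2.12 (it was removed in 2.13), so it appears here as a comment, and the runnable version uses an explicit `Map` builder, which likewise skips the intermediate tuple collection.

```scala
object ToMapSketch {
  // Hypothetical stand-in for the keys being deserialized.
  val keys: Seq[String] = Seq("a", "bb", "ccc")

  // Baseline: map builds an intermediate Seq[(String, Int)],
  // then toMap copies those tuples into a Map.
  def viaToMap(ks: Seq[String]): Map[String, Int] =
    ks.map(k => k -> k.length).toMap

  // Scala 2.12 and earlier only:
  //   val m: Map[String, Int] = ks.map(k => k -> k.length)(collection.breakOut)

  // Version-independent equivalent: feed each pair straight into a Map builder,
  // so no intermediate collection of tuples is materialized.
  def viaBuilder(ks: Seq[String]): Map[String, Int] = {
    val builder = Map.newBuilder[String, Int]
    ks.foreach(k => builder += (k -> k.length))
    builder.result()
  }
}
```

Both methods should return the same `Map`; only the amount of intermediate allocation differs.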
   
   ### Why are the changes needed?
   When `DeserializeToObject` is executed, converting `Tuple2` elements to a Scala `Map` via `.toMap` takes a lot of CPU time.
   ![image](https://user-images.githubusercontent.com/94670132/185860416-f147ddd7-65b3-4dcb-b9d6-9a872015e003.png)
   ![image](https://user-images.githubusercontent.com/94670132/185860432-2aec4c48-898a-4d66-8d34-2221ab7e9408.png)
   
   
   ### How was this patch tested?
   The unit tests were run.
   No performance tests have been run yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] caican00 closed pull request #37608: [SPARK-40175][SQL]Speed up conversion of Tuple2 to Scala Map

Posted by GitBox <gi...@apache.org>.
caican00 closed pull request #37608: [SPARK-40175][SQL]Speed up conversion of Tuple2 to Scala Map
URL: https://github.com/apache/spark/pull/37608




[GitHub] [spark] caican00 commented on pull request #37608: update

Posted by GitBox <gi...@apache.org>.
caican00 commented on PR #37608:
URL: https://github.com/apache/spark/pull/37608#issuecomment-1221948497

   > @caican00 mind creating a JIRA, and fix the PR title? See also https://spark.apache.org/contributing.html.
   > 
   > Also, we should probably fix it in the `master` branch instead of `branch-3.3`. Otherwise, looks pretty good
   
   Thanks, I will close this PR and open a new PR against the master branch.

