
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/27 06:21:35 UTC

[GitHub] [spark] LuciferYang commented on pull request #37876: [SPARK-40175][CORE][SQL][MLLIB][DSTREAM][R] Optimize the performance of `keys.zip(values).toMap` code pattern

LuciferYang commented on PR #37876:
URL: https://github.com/apache/spark/pull/37876#issuecomment-1259032970

   @caican00 Yes, it was already clear that when the collection size is greater than 500, there is no significant performance improvement.
   
   In fact, according to the test results from GA, when the collection size is between 100 and 500, the gain is about 10%, and this is only a local optimization. I don't think it will significantly improve overall performance, but the performance is similar to `keys.zip(values)(collection.breakOut[From, T, To])`.
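
For context, here is a minimal sketch of the kind of single-pass alternative to `keys.zip(values).toMap` that this discussion is about. The names `ZipToMapSketch` and `zipToMap` are hypothetical and chosen for illustration only; this is not necessarily the helper used in the PR, and it makes no claim about the measured numbers above.

```scala
object ZipToMapSketch {
  // Hypothetical helper (name is an assumption, not taken from the PR).
  // Builds the map in a single pass with a builder, so the intermediate
  // collection of tuples that keys.zip(values) creates is never materialized.
  def zipToMap[K, V](keys: Iterable[K], values: Iterable[V]): Map[K, V] = {
    val builder = Map.newBuilder[K, V]
    val keyIter = keys.iterator
    val valueIter = values.iterator
    // Stop at the shorter input, which matches the truncation behavior of zip.
    while (keyIter.hasNext && valueIter.hasNext) {
      builder += ((keyIter.next(), valueIter.next()))
    }
    builder.result()
  }

  def main(args: Array[String]): Unit = {
    val keys = Seq("a", "b", "c")
    val values = Seq(1, 2, 3)
    // The sketch should agree with the original code pattern.
    assert(zipToMap(keys, values) == keys.zip(values).toMap)
    println(zipToMap(keys, values))
  }
}
```

Unlike the `collection.breakOut` form, which is only available up to Scala 2.12, a builder-based loop like this compiles on both Scala 2.12 and 2.13.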
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org