Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/26 19:45:13 UTC

[GitHub] [spark] JoshRosen commented on issue #24709: [SPARK-27841][SQL] Improve UTF8String to/fromString()/numBytesForFirstByte() performance

URL: https://github.com/apache/spark/pull/24709#issuecomment-496026174
 
 
   By the way, if I were to prioritize these changes for inclusion or consideration, I'd rank them as follows:
   
   1. `numBytesForFirstByte()`
   2. `fromString()`
   3. `toString()`
   
   The `fromString()` changes have a much larger impact than the `toString()` changes because they reduce garbage creation far more.
   
   Since this is all just an experiment, I'd be totally cool with spinning off a subset of these changes into a separate, much smaller PR if we decide that only some of them are worthwhile.
