Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/26 19:45:13 UTC
[GitHub] [spark] JoshRosen commented on issue #24709: [SPARK-27841][SQL]
Improve UTF8String to/fromString()/numBytesForFirstByte() performance
URL: https://github.com/apache/spark/pull/24709#issuecomment-496026174
By the way, if I were to prioritize these changes for inclusion / consideration, I'd rank them as follows:
1. `numBytesForFirstByte()`
2. `fromString()`
3. `toString()`
The `fromString()` changes have a significantly larger impact than the `toString()` changes because they eliminate far more garbage creation.
Since this is all just an experiment, I'd be totally cool with spinning off a subset of these changes to a separate, much tinier PR in case we decide that only some of these are worthwhile.
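For readers unfamiliar with item 1, here is a minimal sketch of the kind of work a `numBytesForFirstByte()`-style helper does: it reads the high bits of a UTF-8 lead byte to decide how long the encoded character is. This is an illustration only, not the code in this PR or in Spark's `UTF8String`; the class name `Utf8LeadByteSketch`, the `countCodePoints` helper, and the choice to return 1 for continuation or invalid bytes are all assumptions made for the example. It also does no validation (for instance, it does not reject overlong encodings).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not Spark's UTF8String implementation.
public final class Utf8LeadByteSketch {

    // Length of a UTF-8 sequence, judged only from the high bits of its first byte.
    static int numBytesForFirstByte(byte b) {
        int v = b & 0xFF;                  // treat the byte as unsigned
        if ((v & 0x80) == 0x00) return 1;  // 0xxxxxxx: single-byte (ASCII)
        if ((v & 0xE0) == 0xC0) return 2;  // 110xxxxx: two-byte sequence
        if ((v & 0xF0) == 0xE0) return 3;  // 1110xxxx: three-byte sequence
        if ((v & 0xF8) == 0xF0) return 4;  // 11110xxx: four-byte sequence
        // Assumption for this sketch: step over continuation or invalid
        // bytes one at a time instead of failing.
        return 1;
    }

    // Counts code points by hopping from lead byte to lead byte.
    static int countCodePoints(byte[] utf8) {
        int count = 0;
        for (int i = 0; i < utf8.length; i += numBytesForFirstByte(utf8[i])) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'h', U+00E9, 'l', 'l', 'o', U+20AC encoded as UTF-8
        byte[] bytes = "h\u00e9llo\u20ac".getBytes(StandardCharsets.UTF_8);
        System.out.println(countCodePoints(bytes));
    }
}
```

A helper like this sits on the hot path of any operation that walks a string character by character, which is one plausible reason a small speedup there would rank first.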