Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/05 21:50:12 UTC

[GitHub] [arrow-datafusion] tustvold commented on issue #2148: Research performance improvements in N-way merging

tustvold commented on issue #2148:
URL: https://github.com/apache/arrow-datafusion/issues/2148#issuecomment-1089403444

   One other thing to throw into the mix would be to optimise sorts of dictionary-encoded columns. If the dictionary is sorted, the savings could be significant, as you only need to compare the integer keys rather than the string values.
   
   Even if the dictionary isn't sorted, it might be faster to sort the dictionary first, remap the keys to the sorted dictionary, and then sort on the remapped integer keys.
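
   To make the idea concrete, here is a minimal sketch using only the Rust standard library. `DictColumn`, `with_sorted_dictionary` and `sort_rows_by_key` are hypothetical names for illustration, not arrow-rs or DataFusion APIs, and the sketch assumes no nulls and no duplicate dictionary values.

```rust
/// Hypothetical stand-in for a dictionary-encoded string column:
/// row `i` holds the string `values[keys[i]]`. Not the arrow-rs API.
pub struct DictColumn {
    pub values: Vec<String>,
    pub keys: Vec<u32>,
}

/// Returns an equivalent column whose dictionary is sorted, so that
/// comparing two keys gives the same result as comparing the strings.
/// Assumes the dictionary has no duplicate values.
pub fn with_sorted_dictionary(col: &DictColumn) -> DictColumn {
    // Permutation that sorts the dictionary: order[new_key] = old_key.
    let mut order: Vec<u32> = (0..col.values.len() as u32).collect();
    order.sort_by(|&a, &b| col.values[a as usize].cmp(&col.values[b as usize]));

    // Inverse permutation: rank[old_key] = new_key.
    let mut rank = vec![0u32; order.len()];
    for (new_key, &old_key) in order.iter().enumerate() {
        rank[old_key as usize] = new_key as u32;
    }

    DictColumn {
        // Dictionary values in sorted order.
        values: order.iter().map(|&k| col.values[k as usize].clone()).collect(),
        // Every row's key rewritten to point into the sorted dictionary.
        keys: col.keys.iter().map(|&k| rank[k as usize]).collect(),
    }
}

/// Row indices in sorted order, comparing only the integer keys.
/// Only meaningful when `col.values` is already sorted.
pub fn sort_rows_by_key(col: &DictColumn) -> Vec<usize> {
    let mut rows: Vec<usize> = (0..col.keys.len()).collect();
    rows.sort_by_key(|&r| col.keys[r]); // stable sort on u32 keys
    rows
}
```

   The string comparisons happen once per dictionary entry during the dictionary sort; the per-row sort then only touches `u32` keys. Whether this wins in practice depends on the ratio of distinct values to rows, which would need benchmarking.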
   
   Just an idea, as at least in the case of IOx we will only be sorting on dictionary-encoded string columns and not plain columns.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org