Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/05 05:01:36 UTC

[GitHub] [arrow-datafusion] yjshen commented on pull request #2132: Reduce SortExec memory usage by void constructing single huge batch

yjshen commented on PR #2132:
URL: https://github.com/apache/arrow-datafusion/pull/2132#issuecomment-1088269824

   Thanks @alamb @Dandandan for your review!
   
   > Is it fair to say that this PR's core change is to only copy data for the "sort keys" (rather than all of the columns)? If so, I think this is a good approach and state-of-the-art.
   
   Yes, that's the change in this PR.
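
To illustrate the idea of copying only the sort keys, here is a minimal sketch in plain Rust. It is not the code from this PR: the `Batch` struct and the `sort_by_key_only` function are hypothetical stand-ins for Arrow record batches and the real `SortExec` logic, and it assumes a single integer sort key. Only the keys, tagged with their batch and row position, are copied and sorted; the remaining columns are read once, in sorted order, when the output is assembled.

```rust
/// Hypothetical stand-in for an Arrow RecordBatch:
/// one sort-key column plus one (potentially wide) payload column.
struct Batch {
    key: Vec<i64>,
    payload: Vec<String>,
}

/// Sort rows across all batches by key, copying only the keys while sorting.
fn sort_by_key_only(batches: &[Batch]) -> Vec<(i64, String)> {
    // Step 1: copy just the sort keys, tagged with (batch index, row index).
    let mut keyed: Vec<(i64, usize, usize)> = Vec::new();
    for (b, batch) in batches.iter().enumerate() {
        for (r, &k) in batch.key.iter().enumerate() {
            keyed.push((k, b, r));
        }
    }

    // Step 2: sort the small key/index tuples. The payload is not touched.
    // `sort_by_key` is stable, so rows with equal keys keep their input order.
    keyed.sort_by_key(|&(k, _, _)| k);

    // Step 3: gather payload values in sorted order, one row at a time.
    keyed
        .into_iter()
        .map(|(k, b, r)| (k, batches[b].payload[r].clone()))
        .collect()
}
```

In this sketch the payload is never concatenated into one large intermediate batch; each payload value is copied exactly once, directly into the output.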
   
   > I also tested it out using the IOx suite and that passed 🎉 : https://github.com/influxdata/influxdb_iox/pull/4230
   
   I've updated the implementation to use the `extend with slice` approach that you and Daniël mentioned. Would you mind re-triggering the tests in InfluxDB IOx for more coverage?
   
   > Maybe we can eventually write a blog about your sorting adventures (in the vein of https://duckdb.org/2021/08/27/external-sorting.html) -- you have just as much good stuff to report.
   
   Sounds great!

