Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/23 10:21:14 UTC

[GitHub] [arrow] amol- commented on pull request #14369: ARROW-14656: [Python] Add sort helper function for Array, ChunkedArray and StructArray

amol- commented on PR #14369:
URL: https://github.com/apache/arrow/pull/14369#issuecomment-1324832068

   > > Are you saying we should get those done first and measure it before we think about merging this in case it turns out to be slower than the current implementation?
   > 
   > Yes, this would be better. Otherwise I would suggest directly reusing the current Table/Batch sorting for Struct array sorting (the one thing missing being handling the top-level null bitmap).
   
   To avoid mixing too many concerns in a single activity, I'd like to propose that we focus this task on providing the capability and then tackle performance improvements and benchmarking in a dedicated activity. I agree we don't want to introduce any slowdown, but as long as we are adding a capability that wasn't previously supported, I think we can live for some time with struct field sorting being slower than table sorting while we work on performance improvements for it.
   
   This feature has been in the works for months at this point, and I think it would provide immediate value to users while giving us time to refine and improve it.
   
   Obviously, any work on migrating Tables and RecordBatches to the NestedValueComparator (or whatever will come) should be contingent on ensuring the comparator is on par with the current implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org