You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/21 08:16:34 UTC

[GitHub] [arrow-datafusion] richox opened a new issue, #4300: Introduce tournament tree to achieve better k-way sort-merging

richox opened a new issue, #4300:
URL: https://github.com/apache/arrow-datafusion/issues/4300

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   `SortPreservingMergeStream` currently uses a binary heap for k-way merging, which gets `O(NlogS)` time complexity (S=num_streams). however, when a top record is taken from the heap, we need to perform a pop/push operation and are likely to take `2*logS` comparisons.
   
   an improved way is to use a Tournament Tree (aka Loser Tree) to do the selection. when the top record is taken, the tree structure is not modified, and only the path from bottom to top is visited. the number of comparisons is always `logS`.
   
   reference: https://en.wikipedia.org/wiki/K-way_merge_algorithm#Tournament_Tree
   
   **Describe the solution you'd like**
   implement the loser tree algorithm in `SortPreservingMergeStream`.
   
   **Describe alternatives you've considered**
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #4300: Introduce tournament tree to achieve better k-way sort-merging

Posted by GitBox <gi...@apache.org>.
alamb closed issue #4300: Introduce tournament tree to achieve better k-way sort-merging
URL: https://github.com/apache/arrow-datafusion/issues/4300


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org