Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/10 16:36:24 UTC

[GitHub] [pinot] klsince commented on pull request #8879: [Draft] extract sort fields into a temp file for better data locality for sorting

klsince commented on PR #8879:
URL: https://github.com/apache/pinot/pull/8879#issuecomment-1152542139

   > For dedup mode the sort is over all dimensions, so will this help in that case?
   
   Not much help for dedup, or basically any case where many fields are used for sorting. For dedup or rollup, I'll consider using a hashmap without sorting.
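
   As a rough illustration of the hashmap idea (not Pinot code; `dedup`, `rollup`, and the row-as-dict layout are hypothetical names chosen for this sketch), rows can be grouped by their key tuple in one pass with no sort at all. The trade-off is that the map has to hold every distinct key in memory.

```python
from collections import defaultdict


def dedup(rows, key_fields):
    """Keep the first row seen for each distinct key tuple; no sorting involved."""
    seen = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        # setdefault only stores the row if the key has not been seen before.
        seen.setdefault(key, row)
    # Dicts preserve insertion order, so rows come back in first-seen order.
    return list(seen.values())


def rollup(rows, dim_fields, metric_field):
    """Sum a single metric per distinct dimension tuple."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[f] for f in dim_fields)] += row[metric_field]
    return dict(totals)
```

   Both functions make a single pass over the input, so the cost is driven by hashing the key tuple rather than by comparing whole rows repeatedly during a sort.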
   
   If sorting on multiple fields is wanted anyway, one quick thought is to sort the data file in chunks (with the current method of quicksort plus an auxiliary index array), copy each sorted chunk into its own temp file, and then merge those temp files for the subsequent processing. Compared with running quicksort directly over a large data file, this might reduce page faults, provided the chunk size and the number of chunks are tuned.
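
   The chunk-then-merge idea could look roughly like the sketch below. It is only an illustration under several assumptions: rows are in-memory Python objects rather than a memory-mapped data file, `chunked_sort` and its helpers are hypothetical names, Python's built-in `sorted` (Timsort) stands in for quicksort, and `pickle` stands in for whatever row format the real temp files would use.

```python
import heapq
import os
import pickle
import tempfile


def _spill(sorted_rows):
    """Write one sorted chunk to its own temp file and return the file path."""
    fd, path = tempfile.mkstemp(suffix=".chunk")
    with os.fdopen(fd, "wb") as f:
        for row in sorted_rows:
            pickle.dump(row, f)
    return path


def _read(path):
    """Stream rows back from a chunk file, deleting the file once it is drained."""
    try:
        with open(path, "rb") as f:
            while True:
                try:
                    yield pickle.load(f)
                except EOFError:
                    break
    finally:
        os.remove(path)


def chunked_sort(rows, sort_key, chunk_size):
    """Sort each chunk independently, spill it, then k-way merge the chunk files."""
    paths = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        # Sort an auxiliary index array rather than moving the rows themselves.
        order = sorted(range(len(chunk)), key=lambda i: sort_key(chunk[i]))
        # Copy the rows out in sorted order, one temp file per chunk.
        paths.append(_spill(chunk[i] for i in order))
    # heapq.merge reads each chunk file sequentially and yields rows lazily.
    return heapq.merge(*(_read(p) for p in paths), key=sort_key)
```

   In this sketch the random access is confined to one chunk at a time, and the merge phase only reads each temp file front to back. `chunk_size` is the knob that corresponds to the chunk size and chunk count tuning mentioned above. Note that the temp files are removed only as the merged result is consumed.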


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

