Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/02/14 18:44:29 UTC

[GitHub] justinborromeo commented on issue #7036: [Proposal] K-Way Merge for Time-Ordered Scans

URL: https://github.com/apache/incubator-druid/issues/7036#issuecomment-463743611
 
 
   > Does this mean, the merged result in historicals would be a sequence of a single `ScanResultValue` which contains all time-ordered events of input segments? I understand the merge needs to block for sorting, but how does it work with stream merge in the broker? I guess the broker would also wait until it gets all events in `ScanResultValue`?
   
   Yes. The relevant code is in `CachingClusteredClient.SpecificQueryRunnable#run()`:
   
   ```
   return new LazySequence<>(() -> {
     List<Sequence<T>> sequencesByInterval =
         new ArrayList<>(alreadyCachedResults.size() + segmentsByServer.size());
     addSequencesFromCache(sequencesByInterval, alreadyCachedResults);
     addSequencesFromServer(sequencesByInterval, segmentsByServer);
     // Merge the per-interval sequences using the query's result ordering.
     return Sequences
         .simple(sequencesByInterval)
         .flatMerge(seq -> seq, query.getResultOrdering());
   });
   ```
   
   From what I understand, `flatMerge()` is called on a Sequence of already-ordered sequences (here, the list wrapped by `Sequences.simple()`) together with the ordering to merge by, and performs a k-way merge. Since the Sequences returned from the Historicals are assumed to be already sorted, only the current first element of each sequence needs to be materialized at any point during the merge, so the streaming nature of the Scan query should be maintained.
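
To illustrate why only the head of each input has to be held in memory, here is a minimal sketch of a lazy k-way merge over plain Java iterators. It is not Druid's actual merge implementation: the class `KWayMergeSketch`, the helper `Head`, and the method `merge` are hypothetical names made up for this example, and it assumes every input is already sorted by the given comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical illustration only; this is not Druid code.
class KWayMergeSketch {

  // The single materialized element of one input, plus the rest of that input.
  private static final class Head<T> {
    final T value;
    final Iterator<T> rest;

    Head(T value, Iterator<T> rest) {
      this.value = value;
      this.rest = rest;
    }
  }

  // Lazily merges inputs that are each assumed to be sorted by `ordering`.
  static <T> Iterator<T> merge(List<Iterator<T>> sortedInputs, Comparator<? super T> ordering) {
    Comparator<Head<T>> byHeadValue = (a, b) -> ordering.compare(a.value, b.value);
    PriorityQueue<Head<T>> heap = new PriorityQueue<>(byHeadValue);

    // Pull exactly one element from each non-empty input.
    for (Iterator<T> input : sortedInputs) {
      if (input.hasNext()) {
        heap.add(new Head<>(input.next(), input));
      }
    }

    return new Iterator<T>() {
      @Override
      public boolean hasNext() {
        return !heap.isEmpty();
      }

      @Override
      public T next() {
        Head<T> smallest = heap.poll();
        if (smallest == null) {
          throw new NoSuchElementException();
        }
        // Refill the heap from the input the emitted element came from.
        if (smallest.rest.hasNext()) {
          heap.add(new Head<>(smallest.rest.next(), smallest.rest));
        }
        return smallest.value;
      }
    };
  }

  public static void main(String[] args) {
    List<Iterator<Integer>> inputs = new ArrayList<>();
    inputs.add(Arrays.asList(1, 4, 7).iterator());
    inputs.add(Arrays.asList(2, 5, 8).iterator());
    inputs.add(Arrays.asList(3, 6, 9).iterator());

    Iterator<Integer> merged = merge(inputs, Comparator.<Integer>naturalOrder());
    List<Integer> out = new ArrayList<>();
    while (merged.hasNext()) {
      out.add(merged.next());
    }
    System.out.println(out); // [1, 2, 3, 4, 5, 6, 7, 8, 9]
  }
}
```

In this sketch the heap never holds more than one element per input, so memory use grows with the number of inputs rather than the number of rows, which is the property that would let the merge stay streaming.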
   
