Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/02/08 02:24:46 UTC

[GitHub] justinborromeo opened a new issue #7036: [Proposal] K-Way Merge for Time-Ordered Scans

URL: https://github.com/apache/incubator-druid/issues/7036
 
 
   # Motivation
   
   The motivation is to support time ordering for Scan queries that return more than 100,000 rows.  PR #7024 (currently in review) added the ability to time-order small result sets (<100K rows) in memory.  It would be nice to extend this capability to larger result sets so that Select queries can be avoided entirely (time ordering is the only advantage Select has over Scan).  However, sorting larger result sets in memory runs the risk of Broker nodes encountering OOMEs, so a different approach is needed.  See #6088 for more information.
   
   # Proposed Changes
   
   ## Overview
   
   Scan queries return rows in a streaming manner.  Each Historical node executes a query with a single-threaded sequential scan of the target segments, and the query runner streams rows to the client/Broker as they are read.
   
   ## Required changes
   
   To properly implement this feature, a k-way merge needs to happen at both the Historical level (to merge time-ordered rows from each segment) and the Broker level (to merge time-ordered rows from each Historical and from the cache) so that time-ordered results can be returned to the client.
   
   ## Principles
   
   - The change should not affect the performance of non-time-ordered scan queries.
   - The performance impact of sorting should be as small as possible.
   
   ## Implementation
   
   ### Historical-level
   
   Since the existing scan query runner on the Historical looks at one segment at a time, it cannot perform a k-way merge across segments.  I propose creating a new query runner factory, called something like `OrderedScanQueryRunnerFactory`, which would return a single runner when mergeRunners() is called.  This runner would open each target segment, perform a k-way merge across them, and return a single Sequence of ScanResultValues in timestamp order.  The number of segments queried would have to be capped at some value to prevent the server from running out of memory, since each open segment contributes buffered rows to the merge.
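   To make the idea concrete, here is a minimal sketch of the heap-based k-way merge described above.  The `Row` and `Cursor` classes and the `merge` method are hypothetical stand-ins (real Druid code would operate on ScanResultValues via the Sequence abstraction); the point is only to show the algorithm and its bounded memory profile: for n total rows across k segments the merge is O(n log k) and holds only k rows in the heap at a time.

```java
import java.util.*;
import java.util.stream.*;

public class KWayMergeSketch {
    // Minimal stand-in for a scanned row; real code would use ScanResultValue.
    static final class Row {
        final long timestamp;
        final String value;
        Row(long timestamp, String value) { this.timestamp = timestamp; this.value = value; }
    }

    // Cursor pairing a segment's iterator with its current (peeked) row.
    static final class Cursor {
        final Iterator<Row> it;
        Row current;
        Cursor(Iterator<Row> it) { this.it = it; this.current = it.next(); }
        boolean advance() {
            if (it.hasNext()) { current = it.next(); return true; }
            return false;
        }
    }

    // k-way merge: each input list is assumed already time-ordered,
    // as rows within a single segment would be.
    static List<Row> merge(List<List<Row>> segments) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.current.timestamp));
        for (List<Row> seg : segments) {
            if (!seg.isEmpty()) heap.add(new Cursor(seg.iterator()));
        }
        List<Row> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();               // cursor with the earliest row
            out.add(c.current);
            if (c.advance()) heap.add(c);         // re-insert with the segment's next row
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Row>> segments = List.of(
            List.of(new Row(1, "a"), new Row(4, "d")),
            List.of(new Row(2, "b"), new Row(5, "e")),
            List.of(new Row(3, "c")));
        String merged = merge(segments).stream()
            .map(r -> r.value)
            .collect(Collectors.joining());
        System.out.println(merged);  // rows interleaved into global time order
    }
}
```

   In the real implementation the `out` list would of course be replaced by a lazily-evaluated Sequence so that rows keep streaming to the Broker.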
   
   ### Broker-level
   
   The k-way merge at the Broker level is easier to implement, since the existing Sequence.flatMerge() can be used to merge the sorted streams returned by each Historical.
   
   # Changed Interfaces
   
   - No breaking API changes.
   - Rows will not be returned with their segmentId (same as in #7024).
   - New properties:
   
       - druid.query.scan.maxSegmentsToTimeOrder
            - Description: The maximum number of segments that a Historical is allowed to scan in a time-ordered query.  The query fails if the number of segments exceeds this limit.
           - Where: Historical 
           - Default:  TBD
           - Format: int
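   If the property lands as proposed, it would presumably go in each Historical's runtime.properties.  The value below is purely illustrative, since the default is still TBD:

```properties
# Illustrative only (default TBD): fail time-ordered scan queries that
# would touch more than 500 segments on this Historical.
druid.query.scan.maxSegmentsToTimeOrder=500
```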
   
   # Migration / Operational impact
   
   No migration needed.
   
   # Future work
   
   A file-based (external) k-way merge will be needed for large queries that touch more segments than the threshold allows.
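   For reference, the file-based fallback could look roughly like a classic external merge: spill each sorted run to disk, then apply the same heap-based merge over file readers so that only one buffered row per run is held in memory.  Everything below is a hypothetical sketch (timestamps serialized one per line; real rows would need a proper serde and batching):

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

public class ExternalMergeSketch {

    // Spill one already-sorted run of timestamps to a temp file, one per line.
    static Path spillRun(List<Long> sortedTimestamps) throws IOException {
        Path run = Files.createTempFile("scan-run-", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(run)) {
            for (long t : sortedTimestamps) {
                w.write(Long.toString(t));
                w.newLine();
            }
        }
        return run;
    }

    // Merge the runs; memory use is one buffered line per open file.
    static List<Long> mergeRuns(List<Path> runs) throws IOException {
        List<BufferedReader> readers = new ArrayList<>();
        // Heap entries are {currentTimestamp, readerIndex}.
        PriorityQueue<long[]> heap = new PriorityQueue<>(Comparator.comparingLong(e -> e[0]));
        try {
            for (int i = 0; i < runs.size(); i++) {
                BufferedReader r = Files.newBufferedReader(runs.get(i));
                readers.add(r);
                String line = r.readLine();
                if (line != null) heap.add(new long[]{Long.parseLong(line), i});
            }
            List<Long> out = new ArrayList<>();
            while (!heap.isEmpty()) {
                long[] top = heap.poll();
                out.add(top[0]);
                String next = readers.get((int) top[1]).readLine();
                if (next != null) heap.add(new long[]{Long.parseLong(next), top[1]});
            }
            return out;
        } finally {
            for (BufferedReader r : readers) r.close();
        }
    }

    public static void main(String[] args) throws IOException {
        List<Path> runs = List.of(
            spillRun(List.of(1L, 4L, 7L)),
            spillRun(List.of(2L, 5L)),
            spillRun(List.of(3L, 6L)));
        System.out.println(mergeRuns(runs));  // timestamps in global order
        for (Path p : runs) Files.deleteIfExists(p);
    }
}
```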
   
   
   
   Feel free to let me know if something I wrote is incorrect from a technical point of view.
