
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/10/06 16:15:38 UTC

[GitHub] [druid] 599166320 commented on pull request #13168: ScanQuery supports multi column orderBy queries

599166320 commented on PR #13168:
URL: https://github.com/apache/druid/pull/13168#issuecomment-1270349887

@paul-rogers
This commit contains the following changes:

1. When sorting on ordinary columns, the ScanQuery object no longer passes the orderByLimit parameter to the cursor while traversing a segment (that is, the simplest solution mentioned above).

2. Improved the existing unit tests and added more.

3. Applied the performance optimization mentioned above.
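
To illustrate why item 1 keeps the limit away from the cursor, here is a minimal sketch. It is not Druid code: the class `OrderByLimitSketch` and the method `topN` are hypothetical names, and plain integers stand in for rows. The idea is that when the data is not stored in the requested order, every row has to be visited before the first N results are known, so the limit can only be applied while (or after) ordering, not while reading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Druid.
class OrderByLimitSketch {
    // Visits every row and keeps only the `limit` best rows according to `order`.
    static <T> List<T> topN(Iterable<T> rows, Comparator<T> order, int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        // Reversed comparator: the heap head is the worst row currently kept.
        PriorityQueue<T> heap = new PriorityQueue<>(limit, order.reversed());
        for (T row : rows) {
            if (heap.size() < limit) {
                heap.add(row);
            } else if (order.compare(row, heap.peek()) < 0) {
                // The new row beats the worst kept row, so replace it.
                heap.poll();
                heap.add(row);
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(order);
        return result;
    }

    public static void main(String[] args) {
        // Rows arrive in storage order, not in the requested sort order.
        List<Integer> rows = Arrays.asList(5, 1, 4, 2, 3);
        // Cutting the input to 2 rows first would yield [1, 5] after sorting.
        System.out.println(topN(rows, Comparator.<Integer>naturalOrder(), 2)); // prints [1, 2]
    }
}
```

The bounded heap is just one way to keep memory proportional to the limit; the PR may well buffer and sort differently.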

The following points should be noted:

I added an `n-way merge strategy` to ScanQueryRunnerFactory. In the end, I did not apply a limit to the generated Sequence<ScanResultValue>, as nWayMergeAndLimit does. I don't think it is necessary.

This is because the merge operation on both the data nodes `(Historical/Realtime)` and the `brokers` will be implemented in ScanQueryOrderByLimitRowIterator.
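
As a rough sketch of the split described above, the following merges several already-sorted inputs without applying any limit, and leaves truncation to a separate downstream step. This is not the code in the PR: `NWayMergeSketch`, `merge`, and `take` are hypothetical names, and plain Java iterators stand in for Druid's Sequence<ScanResultValue>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Druid.
class NWayMergeSketch {
    // One entry per sorted input: its current head row plus the remaining rows.
    private static final class Source<T> {
        T head;
        final Iterator<T> rest;

        Source(Iterator<T> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    // Lazily merges already-sorted inputs into one sorted iterator. No limit here.
    static <T> Iterator<T> merge(List<? extends Iterable<T>> inputs, Comparator<T> order) {
        Comparator<Source<T>> byHead = (a, b) -> order.compare(a.head, b.head);
        PriorityQueue<Source<T>> heap = new PriorityQueue<>(Math.max(1, inputs.size()), byHead);
        for (Iterable<T> input : inputs) {
            Iterator<T> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Source<>(it));
            }
        }
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return !heap.isEmpty();
            }

            @Override
            public T next() {
                Source<T> s = heap.poll();
                if (s == null) {
                    throw new NoSuchElementException();
                }
                T out = s.head;
                if (s.rest.hasNext()) {
                    // Advance this input and put it back in the heap.
                    s.head = s.rest.next();
                    heap.add(s);
                }
                return out;
            }
        };
    }

    // Stand-in for the downstream stage that applies the limit to the merged rows.
    static <T> List<T> take(Iterator<T> rows, int limit) {
        List<T> out = new ArrayList<>();
        while (out.size() < limit && rows.hasNext()) {
            out.add(rows.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> perSegment = Arrays.asList(
            Arrays.asList(1, 4, 7), Arrays.asList(2, 5), Arrays.asList(3, 6));
        Iterator<Integer> merged = merge(perSegment, Comparator.<Integer>naturalOrder());
        System.out.println(take(merged, 4)); // prints [1, 2, 3, 4]
    }
}
```

Because the merge is lazy, a consumer that stops after N rows never pulls the remaining rows from the inputs, which is one reason a second limit inside the merge itself can be redundant.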

Another thing to note is that I kept the sorting of `__time` separate from the sorting of ordinary columns, because `__time` is special: in Druid, `__time` is effectively a special index. Sorting on an ordinary column, by contrast, must traverse all of the data.

When you review again, please let me know if there is anything else to improve.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: commits-help@druid.apache.org