You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "walterddr (via GitHub)" <gi...@apache.org> on 2023/02/07 21:26:25 UTC

[GitHub] [pinot] walterddr opened a new pull request, #10241: [multistage] optimize limit without order by

walterddr opened a new pull request, #10241:
URL: https://github.com/apache/pinot/pull/10241

   Currently running limit without order by still runs a priority queue, this is not necessary. Move to use a direct list append.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10241: [multistage] optimize limit without order by

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10241:
URL: https://github.com/apache/pinot/pull/10241#discussion_r1099366991


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java:
##########
@@ -137,8 +153,15 @@ private void consumeInputBlocks() {
         }
 
         List<Object[]> container = block.getContainer();
-        for (Object[] row : container) {
-          SelectionOperatorUtils.addToPriorityQueue(row, _rows, _numRowsToKeep);
+        if (_priorityQueue == null) {
+          // TODO: when push-down properly, we shouldn't get more than _numRowsToKeep
+          if (_rows.size() <= _numRowsToKeep) {
+            _rows.addAll(container);

Review Comment:
   yeah sounds good. i put the trimming at return block instead of insert but both are correct



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on pull request #10241: [multistage] optimize limit without order by

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on PR #10241:
URL: https://github.com/apache/pinot/pull/10241#issuecomment-1423004960

   actually convert it back to ready. we will follow up with the rule changes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10241: [multistage] optimize limit without order by

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10241:
URL: https://github.com/apache/pinot/pull/10241#issuecomment-1421519301

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#10241](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cfdcaf7) into [master](https://codecov.io/gh/apache/pinot/commit/c6b7a9ef95c188369defa8f2abfceb177be3bf06?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c6b7a9e) will **decrease** coverage by `54.71%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #10241       +/-   ##
   =============================================
   - Coverage     68.42%   13.72%   -54.71%     
   + Complexity     5091      178     -4913     
   =============================================
     Files          2016     1961       -55     
     Lines        109081   106612     -2469     
     Branches      16581    16292      -289     
   =============================================
   - Hits          74640    14633    -60007     
   - Misses        29125    90830    +61705     
   + Partials       5316     1149     -4167     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `13.72% <0.00%> (+0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...che/pinot/query/runtime/operator/SortOperator.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9Tb3J0T3BlcmF0b3IuamF2YQ==) | `0.00% <0.00%> (-95.84%)` | :arrow_down: |
   | [...src/main/java/org/apache/pinot/sql/FilterKind.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcWwvRmlsdGVyS2luZC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ain/java/org/apache/pinot/core/data/table/Key.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL0tleS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ain/java/org/apache/pinot/common/CustomObject.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vQ3VzdG9tT2JqZWN0LmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...n/java/org/apache/pinot/core/data/table/Table.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL1RhYmxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/pinot/core/data/table/Record.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL1JlY29yZC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/pinot/core/util/GroupByUtils.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL0dyb3VwQnlVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1546 more](https://codecov.io/gh/apache/pinot/pull/10241?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10241: [multistage] optimize limit without order by

Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10241:
URL: https://github.com/apache/pinot/pull/10241#discussion_r1099365213


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java:
##########
@@ -137,8 +153,15 @@ private void consumeInputBlocks() {
         }
 
         List<Object[]> container = block.getContainer();
-        for (Object[] row : container) {
-          SelectionOperatorUtils.addToPriorityQueue(row, _rows, _numRowsToKeep);
+        if (_priorityQueue == null) {
+          // TODO: when push-down properly, we shouldn't get more than _numRowsToKeep
+          if (_rows.size() <= _numRowsToKeep) {
+            _rows.addAll(container);

Review Comment:
   should we add only `_numRowsToKeep  - _rows.size()` number of rows vs all of container ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10241: [multistage] optimize limit without order by

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10241:
URL: https://github.com/apache/pinot/pull/10241#discussion_r1099667163


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java:
##########
@@ -107,16 +114,25 @@ private TransferableBlock produceSortedBlock() {
     }
 
     if (!_isSortedBlockConstructed) {
-      LinkedList<Object[]> rows = new LinkedList<>();
-      while (_rows.size() > _offset) {
-        Object[] row = _rows.poll();
-        rows.addFirst(row);
-      }
       _isSortedBlockConstructed = true;
-      if (rows.size() == 0) {
-        return TransferableBlockUtils.getEndOfStreamTransferableBlock();
+      if (_priorityQueue == null) {

Review Comment:
   it doesn't need to be as the entire operator is single-threaded. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on pull request #10241: [multistage] optimize limit without order by

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on PR #10241:
URL: https://github.com/apache/pinot/pull/10241#issuecomment-1422881055

   realize that there are several more issues to address: limit without order by is still using single node reduce. which is bad - change PinotSortExchange rule
   
   converting it into a draft and incorporate this fix in as well


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr merged pull request #10241: [multistage] optimize limit without order by

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr merged PR #10241:
URL: https://github.com/apache/pinot/pull/10241


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10241: [multistage] optimize limit without order by

Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10241:
URL: https://github.com/apache/pinot/pull/10241#discussion_r1099367353


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java:
##########
@@ -70,8 +72,13 @@ public SortOperator(MultiStageOperator upstreamOperator, List<RexExpression> col
     _upstreamErrorBlock = null;
     _isSortedBlockConstructed = false;
     _numRowsToKeep = _fetch > 0 ? _fetch + _offset : defaultHolderCapacity;
-    _rows = new PriorityQueue<>(_numRowsToKeep,
-        new SortComparator(collationKeys, collationDirections, dataSchema, false));
+    if (collationKeys.isEmpty()) {
+      _priorityQueue = null;
+    } else {
+      _priorityQueue = new PriorityQueue<>(_numRowsToKeep,
+          new SortComparator(collationKeys, collationDirections, dataSchema, false));
+    }
+    _rows = new ArrayList<>();

Review Comment:
   I don't think this initialization is needed for both. Only for `non ORDER BY` scenario imo ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10241: [multistage] optimize limit without order by

Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10241:
URL: https://github.com/apache/pinot/pull/10241#discussion_r1099375564


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java:
##########
@@ -107,16 +114,25 @@ private TransferableBlock produceSortedBlock() {
     }
 
     if (!_isSortedBlockConstructed) {
-      LinkedList<Object[]> rows = new LinkedList<>();
-      while (_rows.size() > _offset) {
-        Object[] row = _rows.poll();
-        rows.addFirst(row);
-      }
       _isSortedBlockConstructed = true;
-      if (rows.size() == 0) {
-        return TransferableBlockUtils.getEndOfStreamTransferableBlock();
+      if (_priorityQueue == null) {

Review Comment:
   Not related to changes in this PR but just wondering if `_isSortedBlockConstructed` be volatile ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org