You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/09/01 12:09:09 UTC

[GitHub] [pinot] gortiz commented on a diff in pull request #8979: optimize `order by sorted ASC, unsorted` and `order by DESC` cases

gortiz commented on code in PR #8979:
URL: https://github.com/apache/pinot/pull/8979#discussion_r960574343


##########
pinot-core/src/main/java/org/apache/pinot/core/plan/SelectionPlanNode.java:
##########
@@ -50,67 +52,113 @@ public Operator<IntermediateResultsBlock> run() {
     List<ExpressionContext> expressions = SelectionOperatorUtils.extractExpressions(_queryContext, _indexSegment);
     int limit = _queryContext.getLimit();
 
-    if (limit > 0) {
-      List<OrderByExpressionContext> orderByExpressions = _queryContext.getOrderByExpressions();
-      if (orderByExpressions == null) {
-        // Selection only
-        TransformOperator transformOperator = new TransformPlanNode(_indexSegment, _queryContext, expressions,
-            Math.min(limit, DocIdSetPlanNode.MAX_DOC_PER_CALL)).run();
-        return new SelectionOnlyOperator(_indexSegment, _queryContext, expressions, transformOperator);
-      } else {
-        // Selection order-by
-        if (isAllOrderByColumnsSorted(orderByExpressions)) {
-          // All order-by columns are sorted, no need to sort the records
-          TransformOperator transformOperator = new TransformPlanNode(_indexSegment, _queryContext, expressions,
-              Math.min(limit + _queryContext.getOffset(), DocIdSetPlanNode.MAX_DOC_PER_CALL)).run();
-          return new SelectionOrderByOperator(_indexSegment, _queryContext, expressions, transformOperator, true);
-        } else if (orderByExpressions.size() == expressions.size()) {
-          // All output expressions are ordered
-          TransformOperator transformOperator =
-              new TransformPlanNode(_indexSegment, _queryContext, expressions, DocIdSetPlanNode.MAX_DOC_PER_CALL).run();
-          return new SelectionOrderByOperator(_indexSegment, _queryContext, expressions, transformOperator, false);
-        } else {
-          // Not all output expressions are ordered, only fetch the order-by expressions and docId to avoid the
-          // unnecessary data fetch
-          List<ExpressionContext> expressionsToTransform = new ArrayList<>(orderByExpressions.size());
-          for (OrderByExpressionContext orderByExpression : orderByExpressions) {
-            expressionsToTransform.add(orderByExpression.getExpression());
-          }
-          TransformOperator transformOperator =
-              new TransformPlanNode(_indexSegment, _queryContext, expressionsToTransform,
-                  DocIdSetPlanNode.MAX_DOC_PER_CALL).run();
-          return new SelectionOrderByOperator(_indexSegment, _queryContext, expressions, transformOperator, false);
-        }
-      }
-    } else {
+    if (limit == 0) {
       // Empty selection (LIMIT 0)
       TransformOperator transformOperator = new TransformPlanNode(_indexSegment, _queryContext, expressions, 0).run();
       return new EmptySelectionOperator(_indexSegment, expressions, transformOperator);
     }
+    List<OrderByExpressionContext> orderByExpressions = _queryContext.getOrderByExpressions();
+
+    if (orderByExpressions == null) {
+      // Selection only
+      // ie: SELECT ... FROM Table WHERE ... LIMIT 10
+      int actualLimit = Math.min(limit, DocIdSetPlanNode.MAX_DOC_PER_CALL);
+      TransformPlanNode planNode = new TransformPlanNode(_indexSegment, _queryContext, expressions, actualLimit);
+      TransformOperator transformOperator = planNode.run();
+
+      return new SelectionOnlyOperator(_indexSegment, _queryContext, expressions, transformOperator);
+    }
+    int numOrderByExpressions = orderByExpressions.size();
+    // Although it is a break of abstraction, some code, specially merging, assumes that if there is an order by
+    // expression the operator will return a block whose selection result is a priority queue.
+    int sortedColumnsPrefixSize = getSortedColumnsPrefix(orderByExpressions);

Review Comment:
   Let me verify if I correctly understand it. Given a table `T` with three rows where with a single column called `col` whose values are `1`, `null` and `2`.
   
   1. `col` is considered sorted.
   2. If the default null value of `col` is null, `select col from T order by T` should return `1`, `2`, `null`
   
   Is that right? And if it is... why do we considered `col` a sorted column if, in fact, it is not sorted in the segment? More formally, given for each d1 and d2 such as d1 < d2 then it is not guaranteed that segment[d1][col] <= segment[d2][col]



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org