You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2023/01/16 10:06:47 UTC

[GitHub] [doris] Gabriel39 commented on a diff in pull request #15642: [Enhancement](topn) support two phase read for topn query

Gabriel39 commented on code in PR #15642:
URL: https://github.com/apache/doris/pull/15642#discussion_r1071033878


##########
fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java:
##########
@@ -591,6 +591,33 @@ public void analyze(Analyzer analyzer) throws UserException {
         analyzeAggregation(analyzer);
         createAnalyticInfo(analyzer);
         eliminatingSortNode();
+        if (checkEnableTwoPhaseRead()) {
+            // If optimize enabled, we try our best to read less columns from ScanNode,
+            // here we analyze conjunct exprs and ordering exprs before resultExprs,
+            // rest of resultExprs will be marked as `INVALID`, such columns will
+            // be prevent from reading from ScanNode.Those columns will be finally
+            // read by the second fetch phase
+            LOG.info("two phase read optimize enabled");

Review Comment:
   ```suggestion
               LOG.debug("two phase read optimize enabled");
   ```



##########
gensrc/thrift/PlanNodes.thrift:
##########
@@ -891,6 +894,8 @@ struct TExchangeNode {
   2: optional TSortInfo sort_info
   // This is tHe number of rows to skip before returning results
   3: optional i64 offset
+  // used for second phase fetch

Review Comment:
   ```suggestion
     // Nodes in this cluster, used for second phase fetch
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java:
##########
@@ -619,6 +646,39 @@ public void analyze(Analyzer analyzer) throws UserException {
         }
     }
 
+    // Check whether enable two phase read optimize, if enabled query will be devieded into two phase read:
+    // 1. read conjuncts columns and order by columns along with an extra RowId column from ScanNode
+    // 2. sort and filter data, and get final RowId column, spawn RPC to other BE to fetch final data
+    // 3. final matrialize all data
+    public boolean checkEnableTwoPhaseRead() {
+        // Only handle the simplest `SELECT ... FROM <tbl> WHERE ... ORDER BY ... LIMIT ...` query
+        if (getAggInfo() != null
+                || getHavingPred() != null
+                || getWithClause() != null) {
+            return false;
+        }
+        // single olap table
+        List<TableRef> tblRefs = getTableRefs();
+        if (tblRefs.size() != 1 || !(tblRefs.get(0) instanceof BaseTableRef)) {
+            return false;
+        }
+        TableRef tbl = tblRefs.get(0);
+        if (tbl.getTable().getType() != Table.TableType.OLAP) {
+            return false;
+        }
+        // need enable light schema change

Review Comment:
   Why is light schema change required?



##########
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java:
##########
@@ -1052,6 +1060,10 @@ protected void toThrift(TPlanNode msg) {
         if (pushDownAggNoGroupingOp != null) {
             msg.olap_scan_node.setPushDownAggTypeOpt(pushDownAggNoGroupingOp);
         }
+
+        if (orderingExprs != null) {

Review Comment:
   Is this always false?



##########
fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java:
##########
@@ -619,6 +646,39 @@ public void analyze(Analyzer analyzer) throws UserException {
         }
     }
 
+    // Check whether enable two phase read optimize, if enabled query will be devieded into two phase read:
+    // 1. read conjuncts columns and order by columns along with an extra RowId column from ScanNode
+    // 2. sort and filter data, and get final RowId column, spawn RPC to other BE to fetch final data
+    // 3. final matrialize all data
+    public boolean checkEnableTwoPhaseRead() {

Review Comment:
   Add a session variable to turn off this feature



##########
fe/fe-core/src/main/java/org/apache/doris/planner/SortNode.java:
##########
@@ -142,6 +144,15 @@ public void setUseTopnOpt(boolean useTopnOpt) {
         this.useTopnOpt = useTopnOpt;
     }
 
+    public List<Expr> getResolvedTupleExprs() {
+        return resolvedTupleExprs;
+    }
+
+    public boolean isUseTopNTwoPhaseOptimize() {

Review Comment:
   ```suggestion
       public boolean useTwoPhaseTopNOptimization() {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org