You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "walterddr (via GitHub)" <gi...@apache.org> on 2023/11/02 16:53:25 UTC

[PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

walterddr opened a new pull request, #11937:
URL: https://github.com/apache/pinot/pull/11937

   alternative PR to #11843


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385973343


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/server/ServerOpChainExecutionContext.java:
##########
@@ -0,0 +1,86 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan.server;
+
+import java.util.List;
+import java.util.concurrent.ExecutorService;
+import org.apache.pinot.common.request.PinotQuery;
+import org.apache.pinot.core.query.executor.QueryExecutor;
+import org.apache.pinot.core.query.request.ServerQueryRequest;
+import org.apache.pinot.query.planner.plannode.PlanNode;
+import org.apache.pinot.query.runtime.plan.DistributedStagePlan;
+import org.apache.pinot.query.runtime.plan.OpChainExecutionContext;
+
+
+/**
+ * Extension of the {@link OpChainExecutionContext} for {@link DistributedStagePlan} runs on leaf-stage.
+ *
+ * The difference is that: on a leaf-stage server node, {@link PlanNode} are split into {@link PinotQuery} part and
+ * {@link org.apache.pinot.query.runtime.operator.OpChain} part and are connected together in this context.
+ */
+public class ServerOpChainExecutionContext extends OpChainExecutionContext {

Review Comment:
   yeah i need to extend `OpChainExecutionContext`. b/c the visitor pattern of `PhysicalPlanVisitor` expects an `OpChainExecutionContext`
   
   I am not particularly happy about this but in order to reuse the logic in PhysicalPlanVisitor this is the only option i can think of 
   
   we can obviously create an interface of the execution context and let both visitor depend on a utility class that constructs the operator. but that's too much refactoring in this PR. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1386013270


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/LeafStageTransferableBlockOperator.java:
##########
@@ -113,6 +113,9 @@ public String toExplainString() {
 
   @Override
   protected TransferableBlock getNextBlock() {
+    if (_isFinished) {

Review Comment:
   fixed in https://github.com/apache/pinot/pull/11970. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1387469626


##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotRuleUtils.java:
##########
@@ -68,19 +69,28 @@ public static boolean isAggregate(RelNode rel) {
   }
 
   // TODO: optimize this part out as it is not efficient to scan the entire subtree for exchanges.
-  public static boolean noExchangeInSubtree(RelNode relNode) {
-    if (relNode instanceof HepRelVertex) {
-      relNode = ((HepRelVertex) relNode).getCurrentRel();
-    }
-    if (relNode instanceof Exchange) {
+  public static boolean canPushDynamicBroadcastToLeaf(RelNode relNode) {

Review Comment:
   change this to positive-search: only allow nodes that doesn't change row-semantics 
   - JOIN/PROJECT/FILTER/TABLE_SCAN
   - technically WINDOW and SORT also doesn't change row-semantics but SORT could have limit and WINDOW is not implemented on leaf. will follow up later



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1388238978


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/server/ServerOpChainExecutionContext.java:
##########
@@ -0,0 +1,86 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan.server;
+
+import java.util.List;
+import java.util.concurrent.ExecutorService;
+import org.apache.pinot.common.request.PinotQuery;
+import org.apache.pinot.core.query.executor.QueryExecutor;
+import org.apache.pinot.core.query.request.ServerQueryRequest;
+import org.apache.pinot.query.planner.plannode.PlanNode;
+import org.apache.pinot.query.runtime.plan.DistributedStagePlan;
+import org.apache.pinot.query.runtime.plan.OpChainExecutionContext;
+
+
+/**
+ * Extension of the {@link OpChainExecutionContext} for {@link DistributedStagePlan} runs on leaf-stage.
+ *
+ * The difference is that: on a leaf-stage server node, {@link PlanNode} are split into {@link PinotQuery} part and
+ * {@link org.apache.pinot.query.runtime.operator.OpChain} part and are connected together in this context.
+ */
+public class ServerOpChainExecutionContext extends OpChainExecutionContext {

Review Comment:
   discussed offline, we reverted the server extension on physical plan visitor and use the base logic instead. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr merged PR #11937:
URL: https://github.com/apache/pinot/pull/11937


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385969096


##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinToDynamicBroadcastRule.java:
##########
@@ -142,7 +142,7 @@ public boolean matches(RelOptRuleCall call) {
     RelNode right = join.getRight() instanceof HepRelVertex ? ((HepRelVertex) join.getRight()).getCurrentRel()
         : join.getRight();
     return left instanceof Exchange && right instanceof Exchange
-        && PinotRuleUtils.noExchangeInSubtree(left.getInput(0))
+        && PinotRuleUtils.canPushDynamicBroadcastToLeaf(left.getInput(0))

Review Comment:
   yes i think this is in the TODO item for this function. unfortunately, there's not too much we can do here.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #11937:
URL: https://github.com/apache/pinot/pull/11937#issuecomment-1791238059

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#11937](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (51eb13b) into [master](https://app.codecov.io/gh/apache/pinot/commit/8d49b112c203e6e6f46ee9f4142cfd1616fc6470?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (8d49b11) will **decrease** coverage by `61.41%`.
   > Report is 5 commits behind head on master.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #11937       +/-   ##
   =============================================
   - Coverage     61.40%    0.00%   -61.41%     
   =============================================
     Files          2378     2304       -74     
     Lines        128851   125194     -3657     
     Branches      19926    19385      -541     
   =============================================
   - Hits          79127        0    -79127     
   - Misses        44002   125194    +81192     
   + Partials       5722        0     -5722     
   ```
   
   | [Flag](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [integration](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (-0.01%)` | :arrow_down: |
   | [integration1](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [integration2](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (ø)` | |
   | [java-11](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (-34.75%)` | :arrow_down: |
   | [java-21](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (-61.30%)` | :arrow_down: |
   | [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (-61.40%)` | :arrow_down: |
   | [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [temurin](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (-61.41%)` | :arrow_down: |
   | [unittests](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [unittests1](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [unittests2](https://app.codecov.io/gh/apache/pinot/pull/11937/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Files](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...ery/optimizer/filter/NumericalFilterOptimizer.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9vcHRpbWl6ZXIvZmlsdGVyL051bWVyaWNhbEZpbHRlck9wdGltaXplci5qYXZh) | `0.00% <ø> (-81.96%)` | :arrow_down: |
   | [...va/org/apache/pinot/query/runtime/QueryRunner.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9RdWVyeVJ1bm5lci5qYXZh) | `0.00% <0.00%> (-80.71%)` | :arrow_down: |
   | [...ot/common/request/context/RequestContextUtils.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVxdWVzdC9jb250ZXh0L1JlcXVlc3RDb250ZXh0VXRpbHMuamF2YQ==) | `0.00% <0.00%> (-67.64%)` | :arrow_down: |
   | [...ot/common/request/context/predicate/Predicate.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVxdWVzdC9jb250ZXh0L3ByZWRpY2F0ZS9QcmVkaWNhdGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...src/main/java/org/apache/pinot/sql/FilterKind.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcWwvRmlsdGVyS2luZC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/core/plan/FilterPlanNode.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL0ZpbHRlclBsYW5Ob2RlLmphdmE=) | `0.00% <0.00%> (-82.23%)` | :arrow_down: |
   | [...n/request/context/predicate/ConstantPredicate.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVxdWVzdC9jb250ZXh0L3ByZWRpY2F0ZS9Db25zdGFudFByZWRpY2F0ZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ot/query/runtime/plan/OpChainExecutionContext.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9wbGFuL09wQ2hhaW5FeGVjdXRpb25Db250ZXh0LmphdmE=) | `0.00% <0.00%> (-96.16%)` | :arrow_down: |
   | [...ime/plan/server/ServerOpChainExecutionContext.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9wbGFuL3NlcnZlci9TZXJ2ZXJPcENoYWluRXhlY3V0aW9uQ29udGV4dC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...runtime/plan/server/ServerPhysicalPlanVisitor.java](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9wbGFuL3NlcnZlci9TZXJ2ZXJQaHlzaWNhbFBsYW5WaXNpdG9yLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [2 more](https://app.codecov.io/gh/apache/pinot/pull/11937?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | |
   
   ... and [1976 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/11937/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1393023033


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/server/ServerPlanRequestUtils.java:
##########
@@ -118,13 +118,6 @@ public static OpChain compileLeafStage(OpChainExecutionContext executionContext,
   private static void constructPinotQueryPlan(ServerPlanRequestContext serverContext) {
     DistributedStagePlan stagePlan = serverContext.getStagePlan();
     PinotQuery pinotQuery = serverContext.getPinotQuery();
-    Integer leafNodeLimit =

Review Comment:
   refactor was wrong here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385819787


##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotRuleUtils.java:
##########
@@ -68,19 +69,28 @@ public static boolean isAggregate(RelNode rel) {
   }
 
   // TODO: optimize this part out as it is not efficient to scan the entire subtree for exchanges.
-  public static boolean noExchangeInSubtree(RelNode relNode) {
-    if (relNode instanceof HepRelVertex) {
-      relNode = ((HepRelVertex) relNode).getCurrentRel();
-    }
-    if (relNode instanceof Exchange) {
+  public static boolean canPushDynamicBroadcastToLeaf(RelNode relNode) {
+    relNode = PinotRuleUtils.unboxRel(relNode);
+    if (relNode instanceof TableScan) {
+      // reaching table means it is plannable.
+      return true;
+    } else if (relNode instanceof Exchange) {
+      // we do not allow any exchanges in between join and table scan.
       return false;
-    }
-    for (RelNode child : relNode.getInputs()) {
-      if (!noExchangeInSubtree(child)) {
+    } else if (relNode instanceof Join) {
+      // always check only the left child for dynamic broadcast
+      return canPushDynamicBroadcastToLeaf(((Join) relNode).getLeft());

Review Comment:
   Do we have guarantee this is semi join instead of inner join?



##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotRuleUtils.java:
##########
@@ -68,19 +69,28 @@ public static boolean isAggregate(RelNode rel) {
   }
 
   // TODO: optimize this part out as it is not efficient to scan the entire subtree for exchanges.
-  public static boolean noExchangeInSubtree(RelNode relNode) {
-    if (relNode instanceof HepRelVertex) {
-      relNode = ((HepRelVertex) relNode).getCurrentRel();
-    }
-    if (relNode instanceof Exchange) {
+  public static boolean canPushDynamicBroadcastToLeaf(RelNode relNode) {
+    relNode = PinotRuleUtils.unboxRel(relNode);
+    if (relNode instanceof TableScan) {
+      // reaching table means it is plannable.
+      return true;
+    } else if (relNode instanceof Exchange) {
+      // we do not allow any exchanges in between join and table scan.
       return false;
-    }
-    for (RelNode child : relNode.getInputs()) {
-      if (!noExchangeInSubtree(child)) {
+    } else if (relNode instanceof Join) {
+      // always check only the left child for dynamic broadcast
+      return canPushDynamicBroadcastToLeaf(((Join) relNode).getLeft());
+    } else if (relNode instanceof Aggregate) {
+      // if there's aggregation before join, join cannot be planned as dynamic broadcast.
+      return false;

Review Comment:
   How do we differentiate aggregate after join vs join after aggregate?



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -37,11 +37,13 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
   protected final String _operatorId;
   protected final OpChainStats _opChainStats;
   protected boolean _isEarlyTerminated;
+  protected boolean _isFinished;

Review Comment:
   This shouldn't be added to the base



##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinToDynamicBroadcastRule.java:
##########
@@ -142,7 +142,7 @@ public boolean matches(RelOptRuleCall call) {
     RelNode right = join.getRight() instanceof HepRelVertex ? ((HepRelVertex) join.getRight()).getCurrentRel()
         : join.getRight();
     return left instanceof Exchange && right instanceof Exchange
-        && PinotRuleUtils.noExchangeInSubtree(left.getInput(0))
+        && PinotRuleUtils.canPushDynamicBroadcastToLeaf(left.getInput(0))

Review Comment:
   Not introduced in this PR, but can we break these checks into multiple steps? We should avoid doing the expensive `canPushDynamicBroadcastToLeaf()` unless all other checks pass



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/LeafStageTransferableBlockOperator.java:
##########
@@ -113,6 +113,9 @@ public String toExplainString() {
 
   @Override
   protected TransferableBlock getNextBlock() {
+    if (_isFinished) {

Review Comment:
   Why do we need this? We shouldn't need this unless the upstream operator breaks the contract of not calling next block when EOS is already returned



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/server/ServerOpChainExecutionContext.java:
##########
@@ -0,0 +1,86 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan.server;
+
+import java.util.List;
+import java.util.concurrent.ExecutorService;
+import org.apache.pinot.common.request.PinotQuery;
+import org.apache.pinot.core.query.executor.QueryExecutor;
+import org.apache.pinot.core.query.request.ServerQueryRequest;
+import org.apache.pinot.query.planner.plannode.PlanNode;
+import org.apache.pinot.query.runtime.plan.DistributedStagePlan;
+import org.apache.pinot.query.runtime.plan.OpChainExecutionContext;
+
+
+/**
+ * Extension of the {@link OpChainExecutionContext} for {@link DistributedStagePlan} runs on leaf-stage.
+ *
+ * The difference is that: on a leaf-stage server node, {@link PlanNode} are split into {@link PinotQuery} part and
+ * {@link org.apache.pinot.query.runtime.operator.OpChain} part and are connected together in this context.
+ */
+public class ServerOpChainExecutionContext extends OpChainExecutionContext {

Review Comment:
   Suggest not introducing this as subclass but keep the existing `ServerPlanRequestContext`. If you find these extra properties apply to `OpChain` in general, we may directly add them into `OpChainExecutionContext`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385973343


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/server/ServerOpChainExecutionContext.java:
##########
@@ -0,0 +1,86 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan.server;
+
+import java.util.List;
+import java.util.concurrent.ExecutorService;
+import org.apache.pinot.common.request.PinotQuery;
+import org.apache.pinot.core.query.executor.QueryExecutor;
+import org.apache.pinot.core.query.request.ServerQueryRequest;
+import org.apache.pinot.query.planner.plannode.PlanNode;
+import org.apache.pinot.query.runtime.plan.DistributedStagePlan;
+import org.apache.pinot.query.runtime.plan.OpChainExecutionContext;
+
+
+/**
+ * Extension of the {@link OpChainExecutionContext} for {@link DistributedStagePlan} runs on leaf-stage.
+ *
+ * The difference is that: on a leaf-stage server node, {@link PlanNode} are split into {@link PinotQuery} part and
+ * {@link org.apache.pinot.query.runtime.operator.OpChain} part and are connected together in this context.
+ */
+public class ServerOpChainExecutionContext extends OpChainExecutionContext {

Review Comment:
   yeah i need to extend `OpChainExecutionContext`. b/c the visitor pattern of `PhysicalPlanVisitor` expects an `OpChainExecutionContext`
   
   I am not particularly happy about this but in order to reuse the logic in PhysicalPlanVisitor this is the only option i can think of 
   
   we can obviously create an interface of the execution context and refactor the logic in PhysicalPlanVisitorvisitor to a utility class. but that's too much refactoring in this PR IMO
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385969803


##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotRuleUtils.java:
##########
@@ -68,19 +69,28 @@ public static boolean isAggregate(RelNode rel) {
   }
 
   // TODO: optimize this part out as it is not efficient to scan the entire subtree for exchanges.
-  public static boolean noExchangeInSubtree(RelNode relNode) {
-    if (relNode instanceof HepRelVertex) {
-      relNode = ((HepRelVertex) relNode).getCurrentRel();
-    }
-    if (relNode instanceof Exchange) {
+  public static boolean canPushDynamicBroadcastToLeaf(RelNode relNode) {
+    relNode = PinotRuleUtils.unboxRel(relNode);
+    if (relNode instanceof TableScan) {
+      // reaching table means it is plannable.
+      return true;
+    } else if (relNode instanceof Exchange) {
+      // we do not allow any exchanges in between join and table scan.
       return false;
-    }
-    for (RelNode child : relNode.getInputs()) {
-      if (!noExchangeInSubtree(child)) {
+    } else if (relNode instanceof Join) {
+      // always check only the left child for dynamic broadcast
+      return canPushDynamicBroadcastToLeaf(((Join) relNode).getLeft());
+    } else if (relNode instanceof Aggregate) {
+      // if there's aggregation before join, join cannot be planned as dynamic broadcast.
+      return false;

Review Comment:
   there's no need to differentiate. from a top-down walk. the moment you reach agg, you are guarantee that you have join on top of an aggregate (b/c the starting point of this call is a join



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385969488


##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotRuleUtils.java:
##########
@@ -68,19 +69,28 @@ public static boolean isAggregate(RelNode rel) {
   }
 
   // TODO: optimize this part out as it is not efficient to scan the entire subtree for exchanges.
-  public static boolean noExchangeInSubtree(RelNode relNode) {
-    if (relNode instanceof HepRelVertex) {
-      relNode = ((HepRelVertex) relNode).getCurrentRel();
-    }
-    if (relNode instanceof Exchange) {
+  public static boolean canPushDynamicBroadcastToLeaf(RelNode relNode) {
+    relNode = PinotRuleUtils.unboxRel(relNode);
+    if (relNode instanceof TableScan) {
+      // reaching table means it is plannable.
+      return true;
+    } else if (relNode instanceof Exchange) {
+      // we do not allow any exchanges in between join and table scan.
       return false;
-    }
-    for (RelNode child : relNode.getInputs()) {
-      if (!noExchangeInSubtree(child)) {
+    } else if (relNode instanceof Join) {
+      // always check only the left child for dynamic broadcast
+      return canPushDynamicBroadcastToLeaf(((Join) relNode).getLeft());

Review Comment:
   only semi join can trigger dynamic broadcast as of now (so no inner join will be pushed to leaf to begin with)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage][feature] leaf planning with multi-semi join support [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11937:
URL: https://github.com/apache/pinot/pull/11937#discussion_r1385972559


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/LeafStageTransferableBlockOperator.java:
##########
@@ -113,6 +113,9 @@ public String toExplainString() {
 
   @Override
   protected TransferableBlock getNextBlock() {
+    if (_isFinished) {

Review Comment:
   correct. i think there's a bug in the behavior of Window & Aggregate Operator that causes this. previously this was handled b/c MailboxReceivedOperator will return multiple EOS.
   
   I can correct that behavior but decided to move that to a separate PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org