You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/03/25 15:04:53 UTC

[GitHub] [pinot] walterddr opened a new pull request #8412: add pinot-query-runtime

walterddr opened a new pull request #8412:
URL: https://github.com/apache/pinot/pull/8412


   Adding pinot-query-runtime
   
   Design Doc
   ===
   See: [PEP pinot-query-runtime](https://docs.google.com/document/d/10-vL_bUrI-Pi2oYudWyUlQl9Kf0cLrW-Z8hGczkCPik/edit#heading=h.7at2eb9ro4b2)
   
   Details
   ===
   add pinot-query-runtime interfaces and its initial implementations. 
       - inter-stage exchange interface
       - worker executor interface
       - dispatcher and dispatch-receiving server interface
       - row-based runtime operator (incomplete)
   
   several component that's need in order to access current pinot-server functionality
       - converter between stage plan into existing PinotQuery/QueryContext
       - after-execution exchange sender connected to InstanceResponseOperator


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   70.69%           
     Complexity                          ?     4436           
   ===========================================================
     Files                               ?     1704           
     Lines                               ?    87696           
     Branches                            ?    13162           
   ===========================================================
     Hits                                ?    61995           
     Misses                              ?    21432           
     Partials                            ?     4269           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.64% <0.00%> (?)` | |
   | integration2 | `27.25% <0.00%> (?)` | |
   | unittests1 | `66.92% <0.00%> (?)` | |
   | unittests2 | `14.85% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...082f5b5](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837147388



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+

Review comment:
       Can we also include an enum `RpcType` or something similar ?
   
   Also, I was thinking if this will evolve to something like the following then can we at least name the files accordingly and separate the message definitions ?
   
   - RPC and message definition between brokers and server (BrokerServerRPC.proto)
   - RPC and message definition between servers (ServerRPC.proto)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837200274



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;

Review comment:
       Why do we need stageId and instanceId ?
   
   `QueryPlan` comprises of multiple stages so these properties are applicable to a particular `StageNode` and not the entire `QueryPlan` imo




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   27.30%           
   ===========================================================
     Files                               ?     1642           
     Lines                               ?    86111           
     Branches                            ?    12999           
   ===========================================================
     Hits                                ?    23515           
     Misses                              ?    60384           
     Partials                            ?     2212           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration2 | `27.30% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...8d22da8](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837170999



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.

Review comment:
       You meant to use a different class name in javadoc ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837102803



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;

Review comment:
       optional ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837235644



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/DataTableBlock.java
##########
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.blocks;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Block;
+import org.apache.pinot.core.common.BlockDocIdSet;
+import org.apache.pinot.core.common.BlockDocIdValueSet;
+import org.apache.pinot.core.common.BlockMetadata;
+import org.apache.pinot.core.common.BlockValSet;
+import org.apache.pinot.core.operator.blocks.InstanceResponseBlock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * A {@code DataTableBlock} is a row-based data block backed by a {@link DataTable}.
+ */
+public class DataTableBlock implements Block {

Review comment:
       We should discuss this a little bit more before going down the **row-oriented** path
   
   I think we will end up rewriting all the existing **columnar** operators into their corresponding **row-wise** counterparts in this new multi-stage query engine and that is probably a lot of unnecessary / avoidable work if we could just use a columnar / record batch like wire format for exchange messages between executors for shuffle / partition / broadcast and at least try to use existing columnar operators. 
   
   I think it is ok to see if reusing existing columnar operators is possible in the new engine or may be just copy them over here and do minimal changes as needed. So it's probably ok to have 2 different versions of columnar operators (if reuse is absolutely impossible) -- one for existing engine and one for new engine
   
   What is more concerning to me is that since we are columnar (storage format, inmemory format and execution wise) but here we are kind of regressing by doing something that is not favorable for OLAP imo. I think we should stick with columnar for performance (and related benefits in OLAP) reasons because otherwise we will find ourselves doing multiple rewrites
   
   - rewrite all current columnar operators into rowwise for multi-engine to work
   - rewrite all multi-engine row-wise operators at a later point in future into columnar counterparts to improve performance (I am certain this will be one of the things targeted later to improve performance)
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   69.65%           
     Complexity                          ?     4437           
   ===========================================================
     Files                               ?     1704           
     Lines                               ?    87695           
     Branches                            ?    13162           
   ===========================================================
     Hits                                ?    61081           
     Misses                              ?    22372           
     Partials                            ?     4242           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.72% <0.00%> (?)` | |
   | unittests1 | `66.93% <0.00%> (?)` | |
   | unittests2 | `14.82% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...3eedf44](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   14.85%           
     Complexity                          ?      246           
   ===========================================================
     Files                               ?     1659           
     Lines                               ?    85815           
     Branches                            ?    12959           
   ===========================================================
     Hits                                ?    12752           
     Misses                              ?    72130           
     Partials                            ?      933           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests2 | `14.85% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...082f5b5](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837101660



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;

Review comment:
       I think these should be INTs as discussed in previous PR ? or may be we should settle on StageNode serialization format first ? Once it goes in here, will remain forever




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
walterddr commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r838632119



##########
File path: pinot-common/src/main/proto/mailbox.proto
##########
@@ -0,0 +1,55 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotMailbox {
+  rpc open(stream MailboxContent) returns (stream MailboxStatus);
+}
+
+message MailboxStatus {
+  string mailboxId = 1;

Review comment:
       it is for debuggability purpose. we could use a hash function to compute from the mailboxID to a hash if performance is of concern.

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;

Review comment:
       yes StageId should be INT. I will make that change along with this PR.
   instanceId cannot be as it is a stringify format of the `ServerInstance`

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;
+  repeated string scannedTables = 2;
+  map<string, SegmentMetadata> segmentMetadata = 3;
+}

Review comment:
       this is the same as the current server instance to segment list map. 
   however i can't add a `map<string, repeated string>` in proto so I need to wrap the list of segment names into `SegmentMetadata`.
   
   maybe I should rename it to `SegmentNameList`?

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;

Review comment:
       here QueryPlan is actually StagePlan. renaming it to QueryStagePlan. 

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;

Review comment:
       not really optional since some of the physical info has to go over metadata (like broker address / metrics prefix) but yeah I can make it optional since technically you can run without the metadata.

##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/DataTableBlock.java
##########
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.blocks;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Block;
+import org.apache.pinot.core.common.BlockDocIdSet;
+import org.apache.pinot.core.common.BlockDocIdValueSet;
+import org.apache.pinot.core.common.BlockMetadata;
+import org.apache.pinot.core.common.BlockValSet;
+import org.apache.pinot.core.operator.blocks.InstanceResponseBlock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * A {@code DataTableBlock} is a row-based data block backed by a {@link DataTable}.
+ */
+public class DataTableBlock implements Block {

Review comment:
       good idea. let's put a TODO (or design discussion section on this)

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+

Review comment:
       so I purposefully not use Broker/Server concept but Worker/Mailbox because.
   1. technically speaking a Worker can be Broker (Worker without local segments) or Server (Worker with local segments). 
   2. It makes more sense to differentiate whether a proto defines controll flow (Worker) or data flow (Mailbox)
   
   I will add more comment to the proto files. 

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;

Review comment:
       not sure what this mean. could you clarify a bit more regarding this comment? 
   
   As in my design, once the job has been dispatched. all communication are between executors (of different machines)

##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {
+  private final String _stageId;
+  private ServerInstance _serverInstance;
+  private StageNode _stageRoot;
+  private Map<String, StageMetadata> _metadataMap;

Review comment:
       > I think we need to revisit this abstraction. It looks very confusing here
   > 
   > Multi-stage `QueryPlan` built at broker comprises of multiple stages. Since each `StageNode` will be executed on some executor / server, that information must already be available at the time of building the QueryPlan.
   
   you are absolutely right. 
   so QueryPlan already contains the entire "physical" plan of the multi-stage execution. 
   
   
   > 
   > So, then why should the information about `serverInstance` be tracked separately ? In fact, we are tracking it in both `StageMetadata` and then again outside here in the deserialized plan.
   
   DistributedQueryPlan is actually "ONE STAGE OF" the QueryPlan (or I think we should rename it DistributableStagePlan). and yes again you are right this should be in some sort of serializable format. 
   
   > 
   > The `StageNode` should be self-contained ideally
   > 
   > * root or leaf stage
   > * stageId
   > * operator tree
   > * executor assignment
   
   all the 4 components you describe above is assigned to a stage, not a stageNode. thus the split between StageNode _root; and StageMetadata. 
   
   
   

##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/BroadcastJoinOperator.java
##########
@@ -0,0 +1,162 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.BaseOperator;
+import org.apache.pinot.core.query.selection.SelectionOperatorUtils;
+import org.apache.pinot.query.planner.partitioning.KeySelector;
+import org.apache.pinot.query.runtime.blocks.DataTableBlock;
+import org.apache.pinot.query.runtime.blocks.DataTableBlockUtils;
+
+
+/**
+ * This basic {@code BroadcastJoinOperator} implement a basic broadcast join algorithm.
+ *
+ * <p>It takes the right table as the broadcast side and materialize a hash table. Then for each of the left table row,
+ * it looks up for the corresponding row(s) from the hash table and create a joint row.
+ *
+ * <p>For each of the data block received from the left table, it will generate a joint data block.
+ */
+public class BroadcastJoinOperator extends BaseOperator<DataTableBlock> {
+
+  private final HashMap<Object, List<Object[]>> _broadcastHashTable;
+  private final BaseOperator<DataTableBlock> _leftTableOperator;
+  private final BaseOperator<DataTableBlock> _rightTableOperator;
+
+  private DataSchema _leftTableSchema;
+  private DataSchema _rightTableSchema;
+  private int _resultRowSize;
+  private boolean _isHashTableBuilt;
+  private KeySelector<Object[], Object> _leftKeySelector;
+  private KeySelector<Object[], Object> _rightKeySelector;
+
+  public BroadcastJoinOperator(BaseOperator<DataTableBlock> leftTableOperator,
+      BaseOperator<DataTableBlock> rightTableOperator, KeySelector<Object[], Object> leftKeySelector,
+      KeySelector<Object[], Object> rightKeySelector) {
+    // TODO: this assumes right table is broadcast.
+    _leftKeySelector = leftKeySelector;
+    _rightKeySelector = rightKeySelector;
+    _leftTableOperator = leftTableOperator;
+    _rightTableOperator = rightTableOperator;
+    _isHashTableBuilt = false;
+    _broadcastHashTable = new HashMap<>();
+  }
+
+  @Override
+  public String getOperatorName() {
+    return null;

Review comment:
       good catch will update 

##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.

Review comment:
       actually importing the class ServerQueryRequest but only use it in javadoc causes checkstyle to complain. let me see what can be changed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   69.63%           
     Complexity                          ?     4436           
   ===========================================================
     Files                               ?     1704           
     Lines                               ?    87696           
     Branches                            ?    13162           
   ===========================================================
     Hits                                ?    61064           
     Misses                              ?    22393           
     Partials                            ?     4239           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.64% <0.00%> (?)` | |
   | unittests1 | `66.92% <0.00%> (?)` | |
   | unittests2 | `14.85% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...082f5b5](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   63.98%           
     Complexity                          ?     4437           
   ===========================================================
     Files                               ?     1659           
     Lines                               ?    85814           
     Branches                            ?    12959           
   ===========================================================
     Hits                                ?    54907           
     Misses                              ?    26963           
     Partials                            ?     3944           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `66.93% <0.00%> (?)` | |
   | unittests2 | `14.82% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...3eedf44](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837181518



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {
+  private final String _stageId;

Review comment:
       This should be encapsulated within the `_stageRoot`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837102135



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;

Review comment:
       (nit) endpoints or executors ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837206863



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/BroadcastJoinOperator.java
##########
@@ -0,0 +1,162 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.BaseOperator;
+import org.apache.pinot.core.query.selection.SelectionOperatorUtils;
+import org.apache.pinot.query.planner.partitioning.KeySelector;
+import org.apache.pinot.query.runtime.blocks.DataTableBlock;
+import org.apache.pinot.query.runtime.blocks.DataTableBlockUtils;
+
+
+/**
+ * This basic {@code BroadcastJoinOperator} implement a basic broadcast join algorithm.
+ *
+ * <p>It takes the right table as the broadcast side and materialize a hash table. Then for each of the left table row,
+ * it looks up for the corresponding row(s) from the hash table and create a joint row.
+ *
+ * <p>For each of the data block received from the left table, it will generate a joint data block.
+ */
+public class BroadcastJoinOperator extends BaseOperator<DataTableBlock> {
+
+  private final HashMap<Object, List<Object[]>> _broadcastHashTable;
+  private final BaseOperator<DataTableBlock> _leftTableOperator;
+  private final BaseOperator<DataTableBlock> _rightTableOperator;
+
+  private DataSchema _leftTableSchema;
+  private DataSchema _rightTableSchema;
+  private int _resultRowSize;
+  private boolean _isHashTableBuilt;
+  private KeySelector<Object[], Object> _leftKeySelector;
+  private KeySelector<Object[], Object> _rightKeySelector;
+
+  public BroadcastJoinOperator(BaseOperator<DataTableBlock> leftTableOperator,
+      BaseOperator<DataTableBlock> rightTableOperator, KeySelector<Object[], Object> leftKeySelector,
+      KeySelector<Object[], Object> rightKeySelector) {
+    // TODO: this assumes right table is broadcast.
+    _leftKeySelector = leftKeySelector;
+    _rightKeySelector = rightKeySelector;
+    _leftTableOperator = leftTableOperator;
+    _rightTableOperator = rightTableOperator;
+    _isHashTableBuilt = false;
+    _broadcastHashTable = new HashMap<>();
+  }
+
+  @Override
+  public String getOperatorName() {
+    return null;

Review comment:
       Why null ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837200855



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {
+  private final String _stageId;
+  private ServerInstance _serverInstance;
+  private StageNode _stageRoot;
+  private Map<String, StageMetadata> _metadataMap;

Review comment:
       I think we need to revisit this abstraction. It looks very confusing here
   
   Multi-stage `QueryPlan` built at broker comprises of multiple stages. Since each `StageNode` will be executed on some executor / server, that information must already be available at the time of building the QueryPlan.
   
   So, then why should the information about `serverInstance` be tracked separately ? In fact, we are tracking it in both `StageMetadata` and then again outside here in the deserialized plan.
   
   The `StageNode` should be self-contained ideally
   
   - root or leaf stage
   - stageId
   - operator tree
   - executor assignment
   
   May be it's ok to track the assignment inside metadata but not sure why we need list of instances since a given stage will be executed exactly once
   
   ```
   // used for assigning server/worker nodes.
     private List<ServerInstance> _serverInstances;
   ```
   
   Let's discuss this briefly please




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837187426



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {

Review comment:
       I saw the next class and makes sense that this is deserialized from protobuf message. But I still wonder why this implements serializable ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
walterddr commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r840921968



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/StringMailboxIdentifier.java
##########
@@ -0,0 +1,108 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.mailbox;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+
+
+public class StringMailboxIdentifier implements MailboxIdentifier {
+  private static final Joiner JOINER = Joiner.on(':');
+
+  private final String _mailboxIdString;
+  private final String _jobId;
+  private final String _partitionKey;
+  private final String _fromHost;
+  private final int _fromPort;
+  private final String _toHost;
+  private final int _toPort;
+
+  public StringMailboxIdentifier(String jobId, String partitionKey, String fromHost, int fromPort, String toHost,
+      int toPort) {
+    _jobId = jobId;
+    _partitionKey = partitionKey;
+    _fromHost = fromHost;
+    _fromPort = fromPort;
+    _toHost = toHost;
+    _toPort = toPort;
+    _mailboxIdString = JOINER.join(_jobId, _partitionKey, _fromHost, _fromPort, _toHost, _toPort);
+  }
+
+  public StringMailboxIdentifier(String mailboxId) {
+    _mailboxIdString = mailboxId;
+    String[] splits = mailboxId.split(":");
+    Preconditions.checkState(splits.length == 6);
+    _jobId = splits[0];
+    _partitionKey = splits[1];

Review comment:
       technically partitionKey shouldn't be part of the mailboxID. it should be a stateless function used by mailbox nodes to convert keySelector result (e.g. `uuid` to one of the group of receiving channels `fromHost:fromPort:toHost:toPort`)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837153804



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;

Review comment:
       I suggest using int32




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837235644



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/DataTableBlock.java
##########
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.blocks;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Block;
+import org.apache.pinot.core.common.BlockDocIdSet;
+import org.apache.pinot.core.common.BlockDocIdValueSet;
+import org.apache.pinot.core.common.BlockMetadata;
+import org.apache.pinot.core.common.BlockValSet;
+import org.apache.pinot.core.operator.blocks.InstanceResponseBlock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * A {@code DataTableBlock} is a row-based data block backed by a {@link DataTable}.
+ */
+public class DataTableBlock implements Block {

Review comment:
       We should discuss this a little bit more before going down the row oriented path
   
   I think we will end up rewriting all the existing columnar operators into their corresponding row-wise counterparts in this new multi-stage query engine and that is probably a lot of unnecessary / avoidable work if we could just use a columnar / record batch like wire format for exchange messages between executors for shuffle / partition / broadcast and at least try to use existing columnar operators. 
   
   I think it is ok to see if reusing existing columnar operators is possible in the new engine or may be just copy them over here and do minimal changes as needed. So it's probably ok to have 2 different versions of columnar operators (if reuse is absolutely impossible) -- one for existing engine and one for new engine
   
   What is more concerning to me is that since we are columnar (storage format, inmemory format and execution wise), I think we should stick with that for performance (and related benefits in OLAP) reasons because otherwise we will find ourselves doing another rewrite
   
   - rewrite all current columnar operators into rowwise for multi-engine to work
   - rewrite all multi-engine row-wise operators at a later point in future into columnar counterparts to improve performance (I am certain this will be one of the things targeted later to improve performance)
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1081984610


   Will go through few remaining files today


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
walterddr commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r841091062



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;

Review comment:
       See: https://github.com/walterddr/pinot/pull/6. I added a field-encoder based serde 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   38.16%           
     Complexity                          ?      250           
   ===========================================================
     Files                               ?     1703           
     Lines                               ?    87694           
     Branches                            ?    13162           
   ===========================================================
     Hits                                ?    33467           
     Misses                              ?    51598           
     Partials                            ?     2629           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.64% <0.00%> (?)` | |
   | integration2 | `27.35% <0.00%> (?)` | |
   | unittests2 | `14.87% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...5317a8b](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837170772



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {

Review comment:
       I am confused by the purpose of this class.
   
   Does this represent the final distributed multi-stage plan built at the broker in QueryEnvironment ? If so, then this should be defined as a protobuf message




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837155996



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;
+  repeated string scannedTables = 2;

Review comment:
       (nit) `dataSources` is a better name considering the fact that only leaf stages will scan from actual "tables" and intermediary stages will take input from `exchange` output ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837235644



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/DataTableBlock.java
##########
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.blocks;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Block;
+import org.apache.pinot.core.common.BlockDocIdSet;
+import org.apache.pinot.core.common.BlockDocIdValueSet;
+import org.apache.pinot.core.common.BlockMetadata;
+import org.apache.pinot.core.common.BlockValSet;
+import org.apache.pinot.core.operator.blocks.InstanceResponseBlock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * A {@code DataTableBlock} is a row-based data block backed by a {@link DataTable}.
+ */
+public class DataTableBlock implements Block {

Review comment:
       We should discuss this a little bit more before going down the **row-oriented** path
   
   I think we will end up rewriting all the existing columnar operators into their corresponding row-wise counterparts in this new multi-stage query engine and that is probably a lot of unnecessary / avoidable work if we could just use a columnar / record batch like wire format for exchange messages between executors for shuffle / partition / broadcast and at least try to use existing columnar operators. 
   
   I think it is ok to see if reusing existing columnar operators is possible in the new engine or may be just copy them over here and do minimal changes as needed. So it's probably ok to have 2 different versions of columnar operators (if reuse is absolutely impossible) -- one for existing engine and one for new engine
   
   What is more concerning to me is that since we are columnar (storage format, inmemory format and execution wise), I think we should stick with that for performance (and related benefits in OLAP) reasons because otherwise we will find ourselves doing another rewrite
   
   - rewrite all current columnar operators into rowwise for multi-engine to work
   - rewrite all multi-engine row-wise operators at a later point in future into columnar counterparts to improve performance (I am certain this will be one of the things targeted later to improve performance)
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837207994



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/BroadcastJoinOperator.java
##########
@@ -0,0 +1,162 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.BaseOperator;
+import org.apache.pinot.core.query.selection.SelectionOperatorUtils;
+import org.apache.pinot.query.planner.partitioning.KeySelector;
+import org.apache.pinot.query.runtime.blocks.DataTableBlock;
+import org.apache.pinot.query.runtime.blocks.DataTableBlockUtils;
+
+
+/**
+ * This basic {@code BroadcastJoinOperator} implement a basic broadcast join algorithm.
+ *
+ * <p>It takes the right table as the broadcast side and materialize a hash table. Then for each of the left table row,
+ * it looks up for the corresponding row(s) from the hash table and create a joint row.
+ *
+ * <p>For each of the data block received from the left table, it will generate a joint data block.
+ */
+public class BroadcastJoinOperator extends BaseOperator<DataTableBlock> {
+
+  private final HashMap<Object, List<Object[]>> _broadcastHashTable;
+  private final BaseOperator<DataTableBlock> _leftTableOperator;
+  private final BaseOperator<DataTableBlock> _rightTableOperator;
+
+  private DataSchema _leftTableSchema;
+  private DataSchema _rightTableSchema;
+  private int _resultRowSize;
+  private boolean _isHashTableBuilt;
+  private KeySelector<Object[], Object> _leftKeySelector;
+  private KeySelector<Object[], Object> _rightKeySelector;
+
+  public BroadcastJoinOperator(BaseOperator<DataTableBlock> leftTableOperator,
+      BaseOperator<DataTableBlock> rightTableOperator, KeySelector<Object[], Object> leftKeySelector,
+      KeySelector<Object[], Object> rightKeySelector) {
+    // TODO: this assumes right table is broadcast.
+    _leftKeySelector = leftKeySelector;
+    _rightKeySelector = rightKeySelector;
+    _leftTableOperator = leftTableOperator;
+    _rightTableOperator = rightTableOperator;
+    _isHashTableBuilt = false;
+    _broadcastHashTable = new HashMap<>();
+  }
+
+  @Override
+  public String getOperatorName() {
+    return null;
+  }
+
+  @Override
+  public List<Operator> getChildOperators() {
+    return null;

Review comment:
       Same here. This should not be null imo. Please leave a TODO at least
   
   This interface was added to enforce this on all new operators and implicitly they become part of EXPLAIN PLAN 

##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/BroadcastJoinOperator.java
##########
@@ -0,0 +1,162 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.operator;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Operator;
+import org.apache.pinot.core.operator.BaseOperator;
+import org.apache.pinot.core.query.selection.SelectionOperatorUtils;
+import org.apache.pinot.query.planner.partitioning.KeySelector;
+import org.apache.pinot.query.runtime.blocks.DataTableBlock;
+import org.apache.pinot.query.runtime.blocks.DataTableBlockUtils;
+
+
+/**
+ * This basic {@code BroadcastJoinOperator} implement a basic broadcast join algorithm.
+ *
+ * <p>It takes the right table as the broadcast side and materialize a hash table. Then for each of the left table row,
+ * it looks up for the corresponding row(s) from the hash table and create a joint row.
+ *
+ * <p>For each of the data block received from the left table, it will generate a joint data block.
+ */
+public class BroadcastJoinOperator extends BaseOperator<DataTableBlock> {
+
+  private final HashMap<Object, List<Object[]>> _broadcastHashTable;
+  private final BaseOperator<DataTableBlock> _leftTableOperator;
+  private final BaseOperator<DataTableBlock> _rightTableOperator;
+
+  private DataSchema _leftTableSchema;
+  private DataSchema _rightTableSchema;
+  private int _resultRowSize;
+  private boolean _isHashTableBuilt;
+  private KeySelector<Object[], Object> _leftKeySelector;
+  private KeySelector<Object[], Object> _rightKeySelector;
+
+  public BroadcastJoinOperator(BaseOperator<DataTableBlock> leftTableOperator,
+      BaseOperator<DataTableBlock> rightTableOperator, KeySelector<Object[], Object> leftKeySelector,
+      KeySelector<Object[], Object> rightKeySelector) {
+    // TODO: this assumes right table is broadcast.
+    _leftKeySelector = leftKeySelector;
+    _rightKeySelector = rightKeySelector;
+    _leftTableOperator = leftTableOperator;
+    _rightTableOperator = rightTableOperator;
+    _isHashTableBuilt = false;
+    _broadcastHashTable = new HashMap<>();
+  }
+
+  @Override
+  public String getOperatorName() {
+    return null;
+  }
+
+  @Override
+  public List<Operator> getChildOperators() {
+    return null;
+  }
+
+  @Nullable
+  @Override
+  public String toExplainString() {
+    return null;

Review comment:
       Same here 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837100852



##########
File path: pinot-common/src/main/proto/mailbox.proto
##########
@@ -0,0 +1,55 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotMailbox {
+  rpc open(stream MailboxContent) returns (stream MailboxStatus);
+}
+
+message MailboxStatus {
+  string mailboxId = 1;

Review comment:
       Why string ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837156869



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;
+  repeated string scannedTables = 2;
+  map<string, SegmentMetadata> segmentMetadata = 3;
+}

Review comment:
       Not sure I follow this. This is a mapping from `segmentName` to what ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837153804



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;

Review comment:
       I suggest using `int32`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   > :exclamation: Current head e2c08f6 differs from pull request most recent head 7486c22. Consider uploading reports for the commit 7486c22 to get more accurate results
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   66.96%           
     Complexity                          ?     4194           
   ===========================================================
     Files                               ?     1252           
     Lines                               ?    63028           
     Branches                            ?     9837           
   ===========================================================
     Hits                                ?    42209           
     Misses                              ?    17805           
     Partials                            ?     3014           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `66.96% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...7486c22](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837151834



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {

Review comment:
       Can you clarify if this StagePlan is a sub-plan / sub-tree specific to a worker ? May be add brief comment please




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837170772



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {

Review comment:
       I am confused by the purpose of this class.
   
   Does this represent the final distributed multi-stage plan built at the broker in `QueryEnvironment` ? If so, then this should be defined as a protobuf message imo




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837170999



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.

Review comment:
       The javadoc is probably for some other class ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
walterddr commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r840949126



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/DataTableBlock.java
##########
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.blocks;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Block;
+import org.apache.pinot.core.common.BlockDocIdSet;
+import org.apache.pinot.core.common.BlockDocIdValueSet;
+import org.apache.pinot.core.common.BlockMetadata;
+import org.apache.pinot.core.common.BlockValSet;
+import org.apache.pinot.core.operator.blocks.InstanceResponseBlock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * A {@code DataTableBlock} is a row-based data block backed by a {@link DataTable}.
+ */
+public class DataTableBlock implements Block {

Review comment:
       we should probably estimate whether it is better to send data over the wire using columnar format
   1. there's tradeoff when encoding data in row vs. columnar format when key-partitioned block split operation is done (for example distributed hash-join)
   2. there's tradeoff when data arrived at the receiving end and being processed (obviously columnar format wins in many of the situation)
   in this case let's come up with a columnar format as well as a row-based format (for the data block sent over mailbox)
   
   because if columnar format turns up performing better, we don't need to reinvent a row-base operator code




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837195252



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {
+  private final String _stageId;

Review comment:
       My understanding is that we are essentially setting up a sub-tree comprising of one or more `StageNode` on a particular executor / server. Hence, stageId should be part of the root stage imo




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   > :exclamation: Current head e2c08f6 differs from pull request most recent head 7486c22. Consider uploading reports for the commit 7486c22 to get more accurate results
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   64.04%           
     Complexity                          ?     4444           
   ===========================================================
     Files                               ?     1657           
     Lines                               ?    85812           
     Branches                            ?    12959           
   ===========================================================
     Hits                                ?    54959           
     Misses                              ?    26901           
     Partials                            ?     3952           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `66.96% <0.00%> (?)` | |
   | unittests2 | `14.85% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...7486c22](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837155996



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;
+  repeated string scannedTables = 2;

Review comment:
       (nit) dataSources is a better name considering the fact that only leaf stages will scan from actual "tables" and intermediary stages will take input from exchange output ?

##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {
+  bytes serializedStagePlan = 1;
+}
+
+message StageMetadata {
+  repeated string instances = 1;
+  repeated string scannedTables = 2;

Review comment:
       (nit) `dataSources` is a better name considering the fact that only leaf stages will scan from actual "tables" and intermediary stages will take input from exchange output ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   70.72%           
     Complexity                          ?     4444           
   ===========================================================
     Files                               ?     1703           
     Lines                               ?    87694           
     Branches                            ?    13162           
   ===========================================================
     Hits                                ?    62019           
     Misses                              ?    21414           
     Partials                            ?     4261           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.63% <0.00%> (?)` | |
   | integration2 | `27.20% <0.00%> (?)` | |
   | unittests1 | `66.97% <0.00%> (?)` | |
   | unittests2 | `14.88% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...5db80b5](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#issuecomment-1079168510


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > :exclamation: No coverage uploaded for pull request base (`multi_stage_query_engine@5984af9`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   ```diff
   @@                     Coverage Diff                     @@
   ##             multi_stage_query_engine    #8412   +/-   ##
   ===========================================================
     Coverage                            ?   64.01%           
     Complexity                          ?     4436           
   ===========================================================
     Files                               ?     1659           
     Lines                               ?    85815           
     Branches                            ?    12959           
   ===========================================================
     Hits                                ?    54931           
     Misses                              ?    26932           
     Partials                            ?     3952           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `66.92% <0.00%> (?)` | |
   | unittests2 | `14.85% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5984af9...082f5b5](https://codecov.io/gh/apache/pinot/pull/8412?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
walterddr commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r840926240



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,56 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+import "plan.proto";
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  StagePlan stagePlan = 2;
+}
+
+message QueryResponse {

Review comment:
       queryresponse is a bad word. we should make it a dispatchResponse because the actual data doesnt' come back from this result




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837151834



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;
+  StagePlan stagePlan = 3;
+  map<string, StageMetadata> stageMetadata = 4;
+}
+
+message StagePlan {

Review comment:
       Can you clarify if this StagePlan is a sub-plan specific to a worker ? May be add brief comment please




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837200855



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/DistributedQueryPlan.java
##########
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.plan;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.pinot.core.transport.ServerInstance;
+import org.apache.pinot.query.planner.StageMetadata;
+import org.apache.pinot.query.planner.nodes.StageNode;
+
+
+/**
+ * WorkerQueryRequest is the extended version of the {@link org.apache.pinot.core.query.request.ServerQueryRequest}.
+ */
+public class DistributedQueryPlan implements Serializable {
+  private final String _stageId;
+  private ServerInstance _serverInstance;
+  private StageNode _stageRoot;
+  private Map<String, StageMetadata> _metadataMap;

Review comment:
       I think we need to revisit this abstraction. It looks very confusing here
   
   Multi-stage `QueryPlan` built at broker comprises of multiple stages. Since each StageNode will be executed on some node, that information must already be available at the time of building the QueryPlan.
   
   So, then why should the information about `serverInstance` be tracked separately ? In fact, we are tracking it in both StageMetadata and then again outside here in the deserialized plan.
   
   The StageNode should be self-contained ideally
   
   - root or leaf stage
   - stageId
   - operator tree
   - executor assignment
   
   May be it's ok to track the assignment inside metadata but not sure why we need list of instances since a given stage will be executed exactly once
   
   ```
   // used for assigning server/worker nodes.
     private List<ServerInstance> _serverInstances;
   ```
   
   Let's discuss this briefly please




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837235644



##########
File path: pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/blocks/DataTableBlock.java
##########
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.query.runtime.blocks;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.Block;
+import org.apache.pinot.core.common.BlockDocIdSet;
+import org.apache.pinot.core.common.BlockDocIdValueSet;
+import org.apache.pinot.core.common.BlockMetadata;
+import org.apache.pinot.core.common.BlockValSet;
+import org.apache.pinot.core.operator.blocks.InstanceResponseBlock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+/**
+ * A {@code DataTableBlock} is a row-based data block backed by a {@link DataTable}.
+ */
+public class DataTableBlock implements Block {

Review comment:
       We should discuss this a little bit more before going down the **row-oriented** path
   
   I think we will end up rewriting all the existing **columnar** operators into their corresponding **row-wise** counterparts in this new multi-stage query engine and that is probably a lot of unnecessary / avoidable work if we could just use a columnar / record batch like wire format for exchange messages between executors for shuffle / partition / broadcast and at least try to use existing columnar operators. 
   
   I think it is ok to see if reusing existing columnar operators is possible in the new engine or may be just copy them over here and do minimal changes as needed. So it's probably ok to have 2 different versions of columnar operators (if reuse is absolutely impossible) -- one for existing engine and one for new engine
   
   What is more concerning to me is that since we are columnar (storage format, inmemory format and execution wise), I think we should stick with that for performance (and related benefits in OLAP) reasons because otherwise we will find ourselves doing another rewrite
   
   - rewrite all current columnar operators into rowwise for multi-engine to work
   - rewrite all multi-engine row-wise operators at a later point in future into columnar counterparts to improve performance (I am certain this will be one of the things targeted later to improve performance)
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #8412: add pinot-query-runtime

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #8412:
URL: https://github.com/apache/pinot/pull/8412#discussion_r837153804



##########
File path: pinot-common/src/main/proto/worker.proto
##########
@@ -0,0 +1,75 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+//
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+package org.apache.pinot.common.proto;
+
+service PinotQueryWorker {
+  rpc Submit(QueryRequest) returns (QueryResponse);
+}
+
+message QueryRequest {
+  map<string, string> metadata = 1;
+  QueryPlan queryPlan = 2;
+}
+
+message QueryResponse {
+  map<string, string> metadata = 1;
+  bytes payload = 2;
+}
+
+message QueryPlan {
+  string stageId = 1;
+  string instanceId = 2;

Review comment:
       I suggest using `int32` as discussed in couple of places in the previous PR https://github.com/apache/pinot/pull/8340/




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org