You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "Jackie-Jiang (via GitHub)" <gi...@apache.org> on 2023/03/10 01:56:19 UTC

[GitHub] [pinot] Jackie-Jiang opened a new pull request, #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Jackie-Jiang opened a new pull request, #10405:
URL: https://github.com/apache/pinot/pull/10405

   Both `Projection` and `Transform` in pinot are SQL project operation.
   This PR introduce the `BaseProjectOperator` base class that represent the executor for SQL project, which generates the `ValueBlock`.
   The `BaseProjectOperator` is designed in a way that it can chain itself (take `ValueBlock` as both input and output).
   With the change, pass-through transform is no longer needed because high level operator (e.g. selection/aggregation/group-by) can directly take the output of `ProjectionOperator`.
   In the future, we may introduce more project operators (e.g. filter on transform, local join), and make flexible query plan to solve more complex queries.
   
   ## Incompatible (compile)
   Several interfaces are changed to use the new interface/class


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang merged pull request #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #10405:
URL: https://github.com/apache/pinot/pull/10405


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #10405:
URL: https://github.com/apache/pinot/pull/10405#discussion_r1134726150


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/query/AggregationOperator.java:
##########
@@ -63,25 +62,25 @@ protected AggregationResultsBlock getNextBlock() {
     } else {
       aggregationExecutor = new DefaultAggregationExecutor(_aggregationFunctions);
     }
-    TransformBlock transformBlock;
-    while ((transformBlock = _transformOperator.nextBlock()) != null) {
-      _numDocsScanned += transformBlock.getNumDocs();
-      aggregationExecutor.aggregate(transformBlock);
+    ValueBlock valueBlock;
+    while ((valueBlock = _projectOperator.nextBlock()) != null) {
+      _numDocsScanned += valueBlock.getNumDocs();
+      aggregationExecutor.aggregate(valueBlock);
     }
 
     // Build intermediate result block based on aggregation result from the executor
     return new AggregationResultsBlock(_aggregationFunctions, aggregationExecutor.getResult());
   }
 
   @Override
-  public List<Operator> getChildOperators() {
-    return Collections.singletonList(_transformOperator);
+  public List<BaseProjectOperator<?>> getChildOperators() {

Review Comment:
   IDE is showing warning because the generic type is not included



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #10405:
URL: https://github.com/apache/pinot/pull/10405#discussion_r1134727652


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/BaseProjectOperator.java:
##########
@@ -0,0 +1,44 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator;
+
+import java.util.Map;
+import org.apache.pinot.common.request.context.ExpressionContext;
+import org.apache.pinot.core.operator.blocks.ValueBlock;
+
+
+public abstract class BaseProjectOperator<T extends ValueBlock> extends BaseOperator<T> {

Review Comment:
   We don't right now, but I do want to make it more specific since `ValueBlock` is not a concrete class



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #10405:
URL: https://github.com/apache/pinot/pull/10405#discussion_r1134665549


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/ExpressionFilterOperator.java:
##########


Review Comment:
   Shall we consider make ExpressionDocIdSet also take columnContextMap instead of _dataSourceMap?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10405:
URL: https://github.com/apache/pinot/pull/10405#issuecomment-1464803397

   ## [Codecov](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#10405](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8bfe328) into [master](https://codecov.io/gh/apache/pinot/commit/2607c0ee1198daed32845ac25b14f76c7170e43c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2607c0e) will **increase** coverage by `36.22%`.
   > The diff coverage is `81.11%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #10405       +/-   ##
   =============================================
   + Coverage     32.16%   68.39%   +36.22%     
   - Complexity      273     6074     +5801     
   =============================================
     Files          2051     2049        -2     
     Lines        111229   111225        -4     
     Branches      16913    16916        +3     
   =============================================
   + Hits          35774    76068    +40294     
   + Misses        72261    29725    -42536     
   - Partials       3194     5432     +2238     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `24.50% <31.11%> (+0.01%)` | :arrow_up: |
   | unittests1 | `67.96% <78.20%> (?)` | |
   | unittests2 | `13.89% <0.00%> (+0.03%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ospatial/transform/function/StPolygonFunction.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9nZW9zcGF0aWFsL3RyYW5zZm9ybS9mdW5jdGlvbi9TdFBvbHlnb25GdW5jdGlvbi5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...t/core/operator/filter/CombinedFilterOperator.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvQ29tYmluZWRGaWx0ZXJPcGVyYXRvci5qYXZh) | `64.28% <0.00%> (+64.28%)` | :arrow_up: |
   | [...ator/streaming/StreamingSelectionOnlyOperator.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nU2VsZWN0aW9uT25seU9wZXJhdG9yLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../function/SelectTupleElementTransformFunction.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vU2VsZWN0VHVwbGVFbGVtZW50VHJhbnNmb3JtRnVuY3Rpb24uamF2YQ==) | `86.04% <ø> (+86.04%)` | :arrow_up: |
   | [...he/pinot/core/plan/StreamingSelectionPlanNode.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ1NlbGVjdGlvblBsYW5Ob2RlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...e/query/aggregation/groupby/GroupKeyGenerator.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L0dyb3VwS2V5R2VuZXJhdG9yLmphdmE=) | `100.00% <ø> (ø)` | |
   | [.../transform/function/ArrayMaxTransformFunction.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQXJyYXlNYXhUcmFuc2Zvcm1GdW5jdGlvbi5qYXZh) | `27.77% <13.33%> (+27.77%)` | :arrow_up: |
   | [.../transform/function/ArrayMinTransformFunction.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQXJyYXlNaW5UcmFuc2Zvcm1GdW5jdGlvbi5qYXZh) | `27.77% <13.33%> (+27.77%)` | :arrow_up: |
   | [...he/pinot/core/operator/blocks/ProjectionBlock.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9ibG9ja3MvUHJvamVjdGlvbkJsb2NrLmphdmE=) | `68.75% <33.33%> (+23.92%)` | :arrow_up: |
   | [...nsform/function/ArrayAverageTransformFunction.java](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci90cmFuc2Zvcm0vZnVuY3Rpb24vQXJyYXlBdmVyYWdlVHJhbnNmb3JtRnVuY3Rpb24uamF2YQ==) | `43.47% <40.00%> (+43.47%)` | :arrow_up: |
   | ... and [109 more](https://codecov.io/gh/apache/pinot/pull/10405?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ... and [1078 files with indirect coverage changes](https://codecov.io/gh/apache/pinot/pull/10405/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10405: [Refactor] Introduce BaseProjectOperator and ValueBlock

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10405:
URL: https://github.com/apache/pinot/pull/10405#discussion_r1134721850


##########
pinot-core/src/main/java/org/apache/pinot/core/geospatial/transform/function/BaseBinaryGeoTransformFunction.java:
##########
@@ -46,17 +46,15 @@ public abstract class BaseBinaryGeoTransformFunction extends BaseTransformFuncti
   private double[] _doubleResults;
 
   @Override
-  public void init(List<TransformFunction> arguments, Map<String, DataSource> dataSourceMap) {
-    Preconditions
-        .checkArgument(arguments.size() == 2, "2 arguments are required for transform function: %s", getName());
+  public void init(List<TransformFunction> arguments, Map<String, ColumnContext> columnContextMap) {

Review Comment:
   probably include `ColumnContext` in the PR description for the refactor/rename



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/BaseProjectOperator.java:
##########
@@ -0,0 +1,44 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator;
+
+import java.util.Map;
+import org.apache.pinot.common.request.context.ExpressionContext;
+import org.apache.pinot.core.operator.blocks.ValueBlock;
+
+
+public abstract class BaseProjectOperator<T extends ValueBlock> extends BaseOperator<T> {

Review Comment:
   do we need the generic here? seems like the usages are always ValueBlock instead of the generic extension of `T` right? any chance we want to use it differently going forward?



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/query/AggregationOperator.java:
##########
@@ -63,25 +62,25 @@ protected AggregationResultsBlock getNextBlock() {
     } else {
       aggregationExecutor = new DefaultAggregationExecutor(_aggregationFunctions);
     }
-    TransformBlock transformBlock;
-    while ((transformBlock = _transformOperator.nextBlock()) != null) {
-      _numDocsScanned += transformBlock.getNumDocs();
-      aggregationExecutor.aggregate(transformBlock);
+    ValueBlock valueBlock;
+    while ((valueBlock = _projectOperator.nextBlock()) != null) {
+      _numDocsScanned += valueBlock.getNumDocs();
+      aggregationExecutor.aggregate(valueBlock);
     }
 
     // Build intermediate result block based on aggregation result from the executor
     return new AggregationResultsBlock(_aggregationFunctions, aggregationExecutor.getResult());
   }
 
   @Override
-  public List<Operator> getChildOperators() {
-    return Collections.singletonList(_transformOperator);
+  public List<BaseProjectOperator<?>> getChildOperators() {

Review Comment:
   This API needs not to be change right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org