You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "xiangfu0 (via GitHub)" <gi...@apache.org> on 2023/07/06 22:14:34 UTC

[GitHub] [pinot] xiangfu0 opened a new pull request, #11048: [multistage] Register Pinot Transform Functions into Calcite

xiangfu0 opened a new pull request, #11048:
URL: https://github.com/apache/pinot/pull/11048

   - Introduce PinotSqlTransformFunction
   - Use jsonExtractScalar as first example
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257600731


##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/MultiStageEngineJsonPathClusterIntegrationTest.java:
##########
@@ -0,0 +1,77 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.integration.tests;
+
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.node.ArrayNode;
+import com.google.common.collect.ImmutableMap;
+import java.util.Properties;
+import org.apache.pinot.client.Connection;
+import org.apache.pinot.client.ConnectionFactory;
+import org.testng.Assert;
+import org.testng.annotations.Test;
+
+
+public class MultiStageEngineJsonPathClusterIntegrationTest extends JsonPathClusterIntegrationTest {

Review Comment:
   This test can be simplified once https://github.com/apache/pinot/pull/11064 is merged.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257383027


##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -207,6 +222,49 @@ public final void initNoDuplicate() {
     }
   }
 
+  /**
+   * This registers transform functions defined in {@link org.apache.pinot.common.function.TransformFunctionType}
+   * which are multistage enabled.
+   */
+  private void initTransformFunctions() {

Review Comment:
   ack



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1259117729


##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -207,6 +222,49 @@ public final void initNoDuplicate() {
     }
   }
 
+  /**
+   * This registers transform functions defined in {@link org.apache.pinot.common.function.TransformFunctionType}
+   * which are multistage enabled.
+   */
+  private void initTransformFunctions() {
+    // Walk through all the Pinot transform types and register those that are supported in multistage.
+    for (TransformFunctionType transformFunctionType : TransformFunctionType.values()) {
+      if (transformFunctionType.getSqlKind() == null) {
+        // Skip registering functions which are standard Calcite functions and functions which are not yet supported.
+        continue;
+      }
+
+      // Register the aggregation function with Calcite along with all alternative names
+      List<PinotSqlTransformFunction> sqlTransformFunctions = new ArrayList<>();

Review Comment:
   this list is not necessary, you can simply register this one-by-one



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr merged pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr merged PR #11048:
URL: https://github.com/apache/pinot/pull/11048


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #11048:
URL: https://github.com/apache/pinot/pull/11048#issuecomment-1624478671

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11048?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#11048](https://app.codecov.io/gh/apache/pinot/pull/11048?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (e98be57) into [master](https://app.codecov.io/gh/apache/pinot/commit/878f61357b6868d98a3eb8c84ec1dfb414c44b6f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (878f613) will **decrease** coverage by `0.01%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #11048      +/-   ##
   ==========================================
   - Coverage    0.11%    0.11%   -0.01%     
   ==========================================
     Files        2200     2200              
     Lines      118822   118833      +11     
     Branches    18002    17993       -9     
   ==========================================
     Hits          137      137              
   - Misses     118665   118676      +11     
     Partials       20       20              
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `0.00% <0.00%> (ø)` | |
   | integration1temurin17 | `0.00% <0.00%> (ø)` | |
   | integration1temurin20 | `0.00% <0.00%> (ø)` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `0.00% <0.00%> (ø)` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin11 | `?` | |
   | unittests1temurin17 | `0.00% <0.00%> (ø)` | |
   | unittests1temurin20 | `0.00% <0.00%> (ø)` | |
   | unittests2temurin11 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin17 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin20 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/11048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...e/pinot/common/function/TransformFunctionType.java](https://app.codecov.io/gh/apache/pinot/pull/11048?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vVHJhbnNmb3JtRnVuY3Rpb25UeXBlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   
   ... and [9 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/11048/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257379976


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java:
##########
@@ -298,24 +298,32 @@ public boolean isSuperTypeOf(ColumnDataType subTypeCandidate) {
     public DataType toDataType() {
       switch (this) {
         case INT:
+        case INT_ARRAY:

Review Comment:
   The issue is that json_extract_key returns array.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1259200192


##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -207,6 +222,49 @@ public final void initNoDuplicate() {
     }
   }
 
+  /**
+   * This registers transform functions defined in {@link org.apache.pinot.common.function.TransformFunctionType}
+   * which are multistage enabled.
+   */
+  private void initTransformFunctions() {
+    // Walk through all the Pinot transform types and register those that are supported in multistage.
+    for (TransformFunctionType transformFunctionType : TransformFunctionType.values()) {
+      if (transformFunctionType.getSqlKind() == null) {
+        // Skip registering functions which are standard Calcite functions and functions which are not yet supported.
+        continue;
+      }
+
+      // Register the aggregation function with Calcite along with all alternative names
+      List<PinotSqlTransformFunction> sqlTransformFunctions = new ArrayList<>();

Review Comment:
   done.



##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -207,6 +222,49 @@ public final void initNoDuplicate() {
     }
   }
 
+  /**
+   * This registers transform functions defined in {@link org.apache.pinot.common.function.TransformFunctionType}
+   * which are multistage enabled.
+   */
+  private void initTransformFunctions() {
+    // Walk through all the Pinot transform types and register those that are supported in multistage.
+    for (TransformFunctionType transformFunctionType : TransformFunctionType.values()) {
+      if (transformFunctionType.getSqlKind() == null) {
+        // Skip registering functions which are standard Calcite functions and functions which are not yet supported.
+        continue;
+      }
+
+      // Register the aggregation function with Calcite along with all alternative names
+      List<PinotSqlTransformFunction> sqlTransformFunctions = new ArrayList<>();
+      PinotSqlTransformFunction aggFunction = generatePinotSqlTransformFunction(transformFunctionType.getName(),
+          transformFunctionType, false);
+      sqlTransformFunctions.add(aggFunction);
+      List<String> alternativeFunctionNames = transformFunctionType.getAliases();
+      if (alternativeFunctionNames == null || alternativeFunctionNames.size() == 0) {
+        // If no alternative function names are specified, generate one which converts camel case to have underscores
+        // as delimiters instead. E.g. boolAnd -> BOOL_AND
+        String alternativeFunctionName =
+            convertCamelCaseToUseUnderscores(transformFunctionType.getName());

Review Comment:
   removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257225029


##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -248,4 +316,16 @@ private boolean isPinotAggregationFunction(String name) {
     }
     return aggFunctionType != null && !aggFunctionType.isNativeCalciteAggregationFunctionType();
   }
+
+  private boolean isPinotTransformFunction(String name) {
+    TransformFunctionType transformFunctionType = null;
+    if (isTransformFunctionRegisteredWithOperatorTable(name)) {
+      try {
+        transformFunctionType = TransformFunctionType.valueOf(name);
+      } catch (IllegalArgumentException e) {
+        // Ignore

Review Comment:
   Expect null if function is not recognized.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257379055


##########
pinot-common/src/main/java/org/apache/pinot/common/function/TransformFunctionType.java:
##########
@@ -76,8 +89,13 @@ public enum TransformFunctionType {
   CAST("cast"),
 
   // string functions
-  JSONEXTRACTSCALAR("jsonExtractScalar"),
-  JSONEXTRACTKEY("jsonExtractKey"),
+  JSONEXTRACTSCALAR("jsonExtractScalar", ReturnTypes.cascade(
+      opBinding -> inferJsonExtractScalarExplicitTypeSpec(opBinding).orElse(
+          opBinding.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 2000)), SqlTypeTransforms.FORCE_NULLABLE),

Review Comment:
   we dont need to limit varchar to 2000 here



##########
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionRegistry.java:
##########
@@ -187,7 +187,7 @@ public static Object jsonExtractScalar(String jsonFieldName, String jsonPath, St
     }
 
     @ScalarFunction(names = {"jsonExtractKey", "json_extract_key"}, isPlaceholder = true)
-    public static String jsonExtractKey(String jsonFieldName, String jsonPath) {
+    public static Object jsonExtractKey(String jsonFieldName, String jsonPath) {

Review Comment:
   JsonExtractKey always returns String right? a JSON key cannot be any other type?



##########
pinot-common/src/main/java/org/apache/pinot/common/function/TransformFunctionType.java:
##########
@@ -180,4 +210,46 @@ public String getName() {
   public List<String> getAliases() {
     return _aliases;
   }
+
+  public SqlReturnTypeInference getSqlReturnTypeInference() {
+    return _sqlReturnTypeInference;
+  }
+
+  public SqlOperandTypeChecker getSqlOperandTypeChecker() {
+    return _sqlOperandTypeChecker;
+  }
+
+  /** Returns the optional explicit returning type specification. */
+  private static Optional<RelDataType> inferJsonExtractScalarExplicitTypeSpec(SqlOperatorBinding opBinding) {
+    if (opBinding.getOperandCount() > 2
+        && opBinding.isOperandLiteral(2, false)) {
+      String operandType = opBinding.getOperandLiteralValue(2, String.class).toUpperCase();
+      return Optional.of(inferExplicitTypeSpec(operandType, opBinding.getTypeFactory()));
+    }
+    return Optional.empty();
+  }
+
+  private static RelDataType inferExplicitTypeSpec(String operandType, RelDataTypeFactory typeFactory) {
+    switch (operandType) {
+      case "INT":
+      case "INTEGER":

Review Comment:
   INTEGER is valid SQL type name



##########
pinot-common/src/main/java/org/apache/pinot/common/function/TransformFunctionType.java:
##########
@@ -180,4 +210,46 @@ public String getName() {
   public List<String> getAliases() {
     return _aliases;
   }
+
+  public SqlReturnTypeInference getSqlReturnTypeInference() {
+    return _sqlReturnTypeInference;
+  }
+
+  public SqlOperandTypeChecker getSqlOperandTypeChecker() {
+    return _sqlOperandTypeChecker;
+  }
+
+  /** Returns the optional explicit returning type specification. */
+  private static Optional<RelDataType> inferJsonExtractScalarExplicitTypeSpec(SqlOperatorBinding opBinding) {
+    if (opBinding.getOperandCount() > 2
+        && opBinding.isOperandLiteral(2, false)) {
+      String operandType = opBinding.getOperandLiteralValue(2, String.class).toUpperCase();
+      return Optional.of(inferExplicitTypeSpec(operandType, opBinding.getTypeFactory()));
+    }
+    return Optional.empty();
+  }
+
+  private static RelDataType inferExplicitTypeSpec(String operandType, RelDataTypeFactory typeFactory) {

Review Comment:
   ```suggestion
     private static RelDataType inferExplicitTypeSpec(String operandTypeStr, RelDataTypeFactory typeFactory) {
   ```



##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -248,4 +316,16 @@ private boolean isPinotAggregationFunction(String name) {
     }
     return aggFunctionType != null && !aggFunctionType.isNativeCalciteAggregationFunctionType();
   }
+
+  private boolean isPinotTransformFunction(String name) {
+    TransformFunctionType transformFunctionType = null;
+    if (isTransformFunctionRegisteredWithOperatorTable(name)) {
+      try {
+        transformFunctionType = TransformFunctionType.valueOf(name);
+      } catch (IllegalArgumentException e) {
+        // Ignore

Review Comment:
   this error shouldn't occur if the isTransform... method is correctly capture the name



##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -207,6 +222,49 @@ public final void initNoDuplicate() {
     }
   }
 
+  /**
+   * This registers transform functions defined in {@link org.apache.pinot.common.function.TransformFunctionType}
+   * which are multistage enabled.
+   */
+  private void initTransformFunctions() {

Review Comment:
   see my change in https://github.com/apache/pinot/pull/11068 
   this method can be greatly simplified



##########
pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java:
##########
@@ -298,24 +298,32 @@ public boolean isSuperTypeOf(ColumnDataType subTypeCandidate) {
     public DataType toDataType() {
       switch (this) {
         case INT:
+        case INT_ARRAY:

Review Comment:
   why this change? let's dont mix array support in here if we can make that separate



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on PR #11048:
URL: https://github.com/apache/pinot/pull/11048#issuecomment-1626392152

   Can we add the [UTs](https://github.com/apache/pinot/blob/aee992c9a79b1651de0d37979b8c1a56bf2cea53/pinot-query-runtime/src/test/resources/queries/JsonType.json) for the jsonExtractScalar? Thanks.
   
   Let me know; I can push the UT changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257381062


##########
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionRegistry.java:
##########
@@ -187,7 +187,7 @@ public static Object jsonExtractScalar(String jsonFieldName, String jsonPath, St
     }
 
     @ScalarFunction(names = {"jsonExtractKey", "json_extract_key"}, isPlaceholder = true)
-    public static String jsonExtractKey(String jsonFieldName, String jsonPath) {
+    public static Object jsonExtractKey(String jsonFieldName, String jsonPath) {

Review Comment:
   ah.... i see



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] abhioncbr commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "abhioncbr (via GitHub)" <gi...@apache.org>.
abhioncbr commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1256666209


##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -248,4 +316,16 @@ private boolean isPinotAggregationFunction(String name) {
     }
     return aggFunctionType != null && !aggFunctionType.isNativeCalciteAggregationFunctionType();
   }
+
+  private boolean isPinotTransformFunction(String name) {
+    TransformFunctionType transformFunctionType = null;
+    if (isTransformFunctionRegisteredWithOperatorTable(name)) {
+      try {
+        transformFunctionType = TransformFunctionType.valueOf(name);
+      } catch (IllegalArgumentException e) {
+        // Ignore

Review Comment:
   Any particular reason for ignoring this exception?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1259118145


##########
pinot-query-planner/src/main/java/org/apache/calcite/sql/fun/PinotOperatorTable.java:
##########
@@ -207,6 +222,49 @@ public final void initNoDuplicate() {
     }
   }
 
+  /**
+   * This registers transform functions defined in {@link org.apache.pinot.common.function.TransformFunctionType}
+   * which are multistage enabled.
+   */
+  private void initTransformFunctions() {
+    // Walk through all the Pinot transform types and register those that are supported in multistage.
+    for (TransformFunctionType transformFunctionType : TransformFunctionType.values()) {
+      if (transformFunctionType.getSqlKind() == null) {
+        // Skip registering functions which are standard Calcite functions and functions which are not yet supported.
+        continue;
+      }
+
+      // Register the aggregation function with Calcite along with all alternative names
+      List<PinotSqlTransformFunction> sqlTransformFunctions = new ArrayList<>();
+      PinotSqlTransformFunction aggFunction = generatePinotSqlTransformFunction(transformFunctionType.getName(),
+          transformFunctionType, false);
+      sqlTransformFunctions.add(aggFunction);
+      List<String> alternativeFunctionNames = transformFunctionType.getAliases();
+      if (alternativeFunctionNames == null || alternativeFunctionNames.size() == 0) {
+        // If no alternative function names are specified, generate one which converts camel case to have underscores
+        // as delimiters instead. E.g. boolAnd -> BOOL_AND
+        String alternativeFunctionName =
+            convertCamelCaseToUseUnderscores(transformFunctionType.getName());

Review Comment:
   this convert should be moved to transform function type. this is the way we going to handle `_` separated vs camel case unless specific naming is specified. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1254983729


##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/JsonPathClusterIntegrationTest.java:
##########
@@ -260,15 +260,19 @@ public void testComplexQueries()
       Assert.assertEquals(k4.get("k4-k3"), "value-k4-k3-" + seqId);
       Assert.assertEquals(Double.parseDouble(k4.get("met").toString()), Double.parseDouble(seqId));
     }
+  }
 
+  @Test
+  public void testComplexQueries2()
+      throws Exception {
     //Group By Query
-    query = "Select" + " jsonExtractScalar(complexMapStr,'$.k1','STRING'),"
+    String query = "Select" + " jsonExtractScalar(complexMapStr,'$.k1','STRING'),"
         + " sum(jsonExtractScalar(complexMapStr,'$.k4.met','INT'))" + " from " + DEFAULT_TABLE_NAME
         + " group by jsonExtractScalar(complexMapStr,'$.k1','STRING')"
         + " order by sum(jsonExtractScalar(complexMapStr,'$.k4.met','INT')) DESC";
-    pinotResponse = postQuery(query);
+    JsonNode pinotResponse = postQuery(query);
     Assert.assertNotNull(pinotResponse.get("resultTable").get("rows"));
-    rows = (ArrayNode) pinotResponse.get("resultTable").get("rows");
+    ArrayNode rows = (ArrayNode) pinotResponse.get("resultTable").get("rows");
     for (int i = 0; i < rows.size(); i++) {
       String seqId = _sortedSequenceIds.get(NUM_TOTAL_DOCS - 1 - i);

Review Comment:
   Actually V1 results is wrong, when it's order by `sum(jsonExtractScalar(complexMapStr,'$.k4.met','INT'))`, the result is not in order of numbers, but lex.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257381275


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java:
##########
@@ -298,24 +298,32 @@ public boolean isSuperTypeOf(ColumnDataType subTypeCandidate) {
     public DataType toDataType() {
       switch (this) {
         case INT:
+        case INT_ARRAY:

Review Comment:
   i see. not sure if there's going to be any side effect of this change going forward with ARRAY supports. 
   can you add support like
   `SELECT cardinality(JSONEXTRACTKEY(...))` and see if that works?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1254983729


##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/JsonPathClusterIntegrationTest.java:
##########
@@ -260,15 +260,19 @@ public void testComplexQueries()
       Assert.assertEquals(k4.get("k4-k3"), "value-k4-k3-" + seqId);
       Assert.assertEquals(Double.parseDouble(k4.get("met").toString()), Double.parseDouble(seqId));
     }
+  }
 
+  @Test
+  public void testComplexQueries2()
+      throws Exception {
     //Group By Query
-    query = "Select" + " jsonExtractScalar(complexMapStr,'$.k1','STRING'),"
+    String query = "Select" + " jsonExtractScalar(complexMapStr,'$.k1','STRING'),"
         + " sum(jsonExtractScalar(complexMapStr,'$.k4.met','INT'))" + " from " + DEFAULT_TABLE_NAME
         + " group by jsonExtractScalar(complexMapStr,'$.k1','STRING')"
         + " order by sum(jsonExtractScalar(complexMapStr,'$.k4.met','INT')) DESC";
-    pinotResponse = postQuery(query);
+    JsonNode pinotResponse = postQuery(query);
     Assert.assertNotNull(pinotResponse.get("resultTable").get("rows"));
-    rows = (ArrayNode) pinotResponse.get("resultTable").get("rows");
+    ArrayNode rows = (ArrayNode) pinotResponse.get("resultTable").get("rows");
     for (int i = 0; i < rows.size(); i++) {
       String seqId = _sortedSequenceIds.get(NUM_TOTAL_DOCS - 1 - i);

Review Comment:
   Actually V1 results is wrong, when it's order by `sum(jsonExtractScalar(complexMapStr,'$.k4.met','INT'))`, the result is not in order of numbers, but LEX.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on PR #11048:
URL: https://github.com/apache/pinot/pull/11048#issuecomment-1627525393

   > Can we add the [UTs](https://github.com/apache/pinot/blob/aee992c9a79b1651de0d37979b8c1a56bf2cea53/pinot-query-runtime/src/test/resources/queries/JsonType.json) for the jsonExtractScalar? Thanks.
   > 
   > Let me know; I can push the UT changes.
   
   I will add. Meanwhile, I will wait for 
   
   
   > Can we add the [UTs](https://github.com/apache/pinot/blob/aee992c9a79b1651de0d37979b8c1a56bf2cea53/pinot-query-runtime/src/test/resources/queries/JsonType.json) for the jsonExtractScalar? Thanks.
   > 
   > Let me know; I can push the UT changes.
   
   Added support for jsonExtractKey


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257379893


##########
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionRegistry.java:
##########
@@ -187,7 +187,7 @@ public static Object jsonExtractScalar(String jsonFieldName, String jsonPath, St
     }
 
     @ScalarFunction(names = {"jsonExtractKey", "json_extract_key"}, isPlaceholder = true)
-    public static String jsonExtractKey(String jsonFieldName, String jsonPath) {
+    public static Object jsonExtractKey(String jsonFieldName, String jsonPath) {

Review Comment:
   From the test, seem the result type is an array of string



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #11048: [multistage] Register Pinot Transform Functions into Calcite

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #11048:
URL: https://github.com/apache/pinot/pull/11048#discussion_r1257440473


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/DataSchema.java:
##########
@@ -298,24 +298,32 @@ public boolean isSuperTypeOf(ColumnDataType subTypeCandidate) {
     public DataType toDataType() {
       switch (this) {
         case INT:
+        case INT_ARRAY:

Review Comment:
   Added a sample query in unit test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org