You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "xiangfu0 (via GitHub)" <gi...@apache.org> on 2023/07/20 23:20:54 UTC

[GitHub] [pinot] xiangfu0 opened a new pull request, #11143: Register distinctCountThetaSketch in v2

xiangfu0 opened a new pull request, #11143:
URL: https://github.com/apache/pinot/pull/11143

   - Register `distinctCountThetaSketch` and `distinctCountRawThetaSketch` in `AggregationFunctionType`
   - Enable `ThetaSketchIntegrationTest` for v2 query engine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11143: [multistage] Register theta sketch aggregation functions in v2 query engine

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11143:
URL: https://github.com/apache/pinot/pull/11143#discussion_r1271539937


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java:
##########
@@ -88,8 +88,14 @@ public enum AggregationFunctionType {
   DISTINCTCOUNTRAWHLL("distinctCountRawHLL"),
   DISTINCTCOUNTSMARTHLL("distinctCountSmartHLL"),
   FASTHLL("fastHLL"),
-  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch"),
-  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch"),
+  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch", ImmutableList.of("DISTINCT_COUNT_THETA_SKETCH"),
+      SqlKind.OTHER_FUNCTION, SqlFunctionCategory.USER_DEFINED_FUNCTION,
+      OperandTypes.family(ImmutableList.of(SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER), ordinal -> ordinal > 0),
+      ReturnTypes.BIGINT, ReturnTypes.explicit(SqlTypeName.OTHER)),
+  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch", ImmutableList.of("DISTINCT_COUNT_RAW_THETA_SKETCH"),
+      SqlKind.OTHER_FUNCTION, SqlFunctionCategory.USER_DEFINED_FUNCTION,
+      OperandTypes.family(ImmutableList.of(SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER), ordinal -> ordinal > 0),
+      ReturnTypes.BIGINT, ReturnTypes.explicit(SqlTypeName.OTHER)),

Review Comment:
   RAW return type is a base64 encoded string
   ```suggestion
         ReturnTypes.VARCHAR, ReturnTypes.explicit(SqlTypeName.OTHER)),
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] [multistage] Register theta sketch aggregation functions in v2 query engine [pinot]

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11143:
URL: https://github.com/apache/pinot/pull/11143#discussion_r1448214895


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java:
##########
@@ -88,8 +88,14 @@ public enum AggregationFunctionType {
   DISTINCTCOUNTRAWHLL("distinctCountRawHLL"),
   DISTINCTCOUNTSMARTHLL("distinctCountSmartHLL"),
   FASTHLL("fastHLL"),
-  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch"),
-  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch"),
+  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch", ImmutableList.of("DISTINCT_COUNT_THETA_SKETCH"),
+      SqlKind.OTHER_FUNCTION, SqlFunctionCategory.USER_DEFINED_FUNCTION,
+      OperandTypes.family(ImmutableList.of(SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER), ordinal -> ordinal > 0),
+      ReturnTypes.BIGINT, ReturnTypes.explicit(SqlTypeName.OTHER)),
+  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch", ImmutableList.of("DISTINCT_COUNT_RAW_THETA_SKETCH"),
+      SqlKind.OTHER_FUNCTION, SqlFunctionCategory.USER_DEFINED_FUNCTION,
+      OperandTypes.family(ImmutableList.of(SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER), ordinal -> ordinal > 0),
+      ReturnTypes.VARCHAR_2000, ReturnTypes.explicit(SqlTypeName.OTHER)),

Review Comment:
   this ReturnType is suppose to be VARBINARY right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11143: [multistage] Register theta sketch aggregation functions in v2 query engine

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11143:
URL: https://github.com/apache/pinot/pull/11143#discussion_r1270124518


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java:
##########
@@ -88,8 +88,12 @@ public enum AggregationFunctionType {
   DISTINCTCOUNTRAWHLL("distinctCountRawHLL"),
   DISTINCTCOUNTSMARTHLL("distinctCountSmartHLL"),
   FASTHLL("fastHLL"),
-  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch"),
-  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch"),
+  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch", ImmutableList.of("DISTINCT_COUNT_THETA_SKETCH"),
+      SqlKind.OTHER_FUNCTION, SqlFunctionCategory.USER_DEFINED_FUNCTION, OperandTypes.VARIADIC, ReturnTypes.BIGINT,

Review Comment:
   VARIADIC is problematic. we can only run these in v1. i don't know if we should do this and allow the problematic syntax in v2.
   
   we should only allow
   1. single column (1op)
   2. single column + theta sketch config (2op)
   
   and not allow the variating filter + set operation in v2



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #11143: [multistage] Register theta sketch aggregation functions in v2 query engine

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #11143:
URL: https://github.com/apache/pinot/pull/11143#discussion_r1270124518


##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java:
##########
@@ -88,8 +88,12 @@ public enum AggregationFunctionType {
   DISTINCTCOUNTRAWHLL("distinctCountRawHLL"),
   DISTINCTCOUNTSMARTHLL("distinctCountSmartHLL"),
   FASTHLL("fastHLL"),
-  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch"),
-  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch"),
+  DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch", ImmutableList.of("DISTINCT_COUNT_THETA_SKETCH"),
+      SqlKind.OTHER_FUNCTION, SqlFunctionCategory.USER_DEFINED_FUNCTION, OperandTypes.VARIADIC, ReturnTypes.BIGINT,

Review Comment:
   VARIADIC is problematic. we can only run these in v1. i don't know if we should do this and allow the problematic syntax in v2



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #11143: [multistage] Register theta sketch aggregation functions in v2 query engine

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #11143:
URL: https://github.com/apache/pinot/pull/11143#issuecomment-1644801520

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11143?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#11143](https://app.codecov.io/gh/apache/pinot/pull/11143?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (d408ed7) into [master](https://app.codecov.io/gh/apache/pinot/commit/7e782ddd8be23e4968fc327206679f2304b42ed5?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (7e782dd) will **increase** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #11143     +/-   ##
   =========================================
     Coverage    0.11%    0.11%             
   =========================================
     Files        2205     2150     -55     
     Lines      118320   115808   -2512     
     Branches    17907    17601    -306     
   =========================================
     Hits          137      137             
   + Misses     118163   115651   -2512     
     Partials       20       20             
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `?` | |
   | integration1temurin17 | `?` | |
   | integration1temurin20 | `?` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin11 | `?` | |
   | unittests1temurin17 | `?` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin17 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin20 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/11143?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...che/pinot/segment/spi/AggregationFunctionType.java](https://app.codecov.io/gh/apache/pinot/pull/11143?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL0FnZ3JlZ2F0aW9uRnVuY3Rpb25UeXBlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   
   ... and [57 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/11143/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr merged pull request #11143: [multistage] Register theta sketch aggregation functions in v2 query engine

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr merged PR #11143:
URL: https://github.com/apache/pinot/pull/11143


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org