You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/11/03 07:31:21 UTC

[GitHub] [pinot] atris opened a new pull request #7684: Support FST Index using Native FST Library

atris opened a new pull request #7684:
URL: https://github.com/apache/pinot/pull/7684


   This PR introduces a new index type which is built using the native FST library. The new index is used for serving regexp queries (using REGEXP_LIKE) and LIKE operator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#issuecomment-959092832


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#7684](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (726f33a) into [master](https://codecov.io/gh/apache/pinot/commit/2107b2cd1880ec74934ec5412dbee8c874f05610?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2107b2c) will **decrease** coverage by `3.13%`.
   > The diff coverage is `3.90%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/7684/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7684      +/-   ##
   ==========================================
   - Coverage   30.79%   27.66%   -3.14%     
   ==========================================
     Files        1570     1572       +2     
     Lines       80024    80153     +129     
     Branches    11904    11923      +19     
   ==========================================
   - Hits        24644    22174    -2470     
   - Misses      53274    55969    +2695     
   + Partials     2106     2010      -96     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `27.66% <3.90%> (-0.05%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...inot/core/operator/filter/FilterOperatorUtils.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvRmlsdGVyT3BlcmF0b3JVdGlscy5qYXZh) | `74.32% <0.00%> (-3.76%)` | :arrow_down: |
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...l/realtime/converter/RealtimeSegmentConverter.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvUmVhbHRpbWVTZWdtZW50Q29udmVydGVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...ent/local/realtime/impl/RealtimeSegmentConfig.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9pbXBsL1JlYWx0aW1lU2VnbWVudENvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...gment/index/column/IntermediateIndexContainer.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2NvbHVtbi9JbnRlcm1lZGlhdGVJbmRleENvbnRhaW5lci5qYXZh) | `0.00% <ø> (ø)` | |
   | [...ent/index/column/PhysicalColumnIndexContainer.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2NvbHVtbi9QaHlzaWNhbENvbHVtbkluZGV4Q29udGFpbmVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...ndex/converter/SegmentV1V2ToV3FormatConverter.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2NvbnZlcnRlci9TZWdtZW50VjFWMlRvVjNGb3JtYXRDb252ZXJ0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...local/segment/index/datasource/BaseDataSource.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2RhdGFzb3VyY2UvQmFzZURhdGFTb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ocal/segment/index/datasource/EmptyDataSource.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2RhdGFzb3VyY2UvRW1wdHlEYXRhU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [179 more](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [2107b2c...726f33a](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r744856664



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/FieldConfig.java
##########
@@ -85,7 +85,7 @@ public FieldConfig(@JsonProperty(value = "name", required = true) String name,
 
   // If null, there won't be any index
   public enum IndexType {
-    INVERTED, SORTED, TEXT, FST, H3, JSON, RANGE
+    INVERTED, SORTED, TEXT, FST, NATIVE_FST, H3, JSON, RANGE

Review comment:
       I do not think that would work, since we do not allow specifying sub config per field in IndexingConfig?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#issuecomment-959092832


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#7684](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (726f33a) into [master](https://codecov.io/gh/apache/pinot/commit/2107b2cd1880ec74934ec5412dbee8c874f05610?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2107b2c) will **decrease** coverage by `3.13%`.
   > The diff coverage is `3.90%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/7684/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7684      +/-   ##
   ==========================================
   - Coverage   30.79%   27.66%   -3.14%     
   ==========================================
     Files        1570     1572       +2     
     Lines       80024    80153     +129     
     Branches    11904    11923      +19     
   ==========================================
   - Hits        24644    22174    -2470     
   - Misses      53274    55969    +2695     
   + Partials     2106     2010      -96     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `27.66% <3.90%> (-0.05%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...inot/core/operator/filter/FilterOperatorUtils.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvRmlsdGVyT3BlcmF0b3JVdGlscy5qYXZh) | `74.32% <0.00%> (-3.76%)` | :arrow_down: |
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...l/realtime/converter/RealtimeSegmentConverter.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvUmVhbHRpbWVTZWdtZW50Q29udmVydGVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...ent/local/realtime/impl/RealtimeSegmentConfig.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9pbXBsL1JlYWx0aW1lU2VnbWVudENvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...gment/index/column/IntermediateIndexContainer.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2NvbHVtbi9JbnRlcm1lZGlhdGVJbmRleENvbnRhaW5lci5qYXZh) | `0.00% <ø> (ø)` | |
   | [...ent/index/column/PhysicalColumnIndexContainer.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2NvbHVtbi9QaHlzaWNhbENvbHVtbkluZGV4Q29udGFpbmVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...ndex/converter/SegmentV1V2ToV3FormatConverter.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2NvbnZlcnRlci9TZWdtZW50VjFWMlRvVjNGb3JtYXRDb252ZXJ0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...local/segment/index/datasource/BaseDataSource.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2RhdGFzb3VyY2UvQmFzZURhdGFTb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ocal/segment/index/datasource/EmptyDataSource.java](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2RhdGFzb3VyY2UvRW1wdHlEYXRhU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [179 more](https://codecov.io/gh/apache/pinot/pull/7684/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [2107b2c...726f33a](https://codecov.io/gh/apache/pinot/pull/7684?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
kishoreg commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742080623



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       should we call this FSTIndex? seems like leaking implementation detail

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       is this a formatting change?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       we should probably have a logical name and FST can be one such implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       what is the difference between FSTIndex and NativeFSTIndex?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       should we call this FSTIndex? seems like leaking implementation detail

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       is this a formatting change?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       we should probably have a logical name and FST can be one such implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       what is the difference between FSTIndex and NativeFSTIndex?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#issuecomment-963929033


   Superseded by #7729 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742158076



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       We already have a FSTIndex -- one of the key principles of this PR is to not touch the existing implementation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#issuecomment-959092832






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
kishoreg commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742080623



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       should we call this FSTIndex? seems like leaking implementation detail

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       is this a formatting change?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       we should probably have a logical name and FST can be one such implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       what is the difference between FSTIndex and NativeFSTIndex?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742158076



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       We already have a FSTIndex -- one of the key principles of this PR is to not touch the existing implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       FSTIndex is built on Lucene and NativeFSTIndex is built on Pinot's native FST

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       No, it adds a new parameter in the constructor

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       We already have a FSTIndex -- one of the key principles of this PR is to not touch the existing implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       FSTIndex is built on Lucene and NativeFSTIndex is built on Pinot's native FST

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       No, it adds a new parameter in the constructor




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r744071304



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/FieldConfig.java
##########
@@ -85,7 +85,7 @@ public FieldConfig(@JsonProperty(value = "name", required = true) String name,
 
   // If null, there won't be any index
   public enum IndexType {
-    INVERTED, SORTED, TEXT, FST, H3, JSON, RANGE
+    INVERTED, SORTED, TEXT, FST, NATIVE_FST, H3, JSON, RANGE

Review comment:
       Suggest treating native FST as FST, and add an enum in `IndexingConfig` for the FST version (`LUCENE` or `NATIVE`)

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/datasource/DataSource.java
##########
@@ -76,6 +76,12 @@
   @Nullable
   TextIndexReader getFSTIndex();
 
+  /**
+   * Returns the Native FST index for the column if exists, or {@code null} if not.
+   */
+  @Nullable
+  TextIndexReader getNativeFSTIndex();

Review comment:
       Since native FST and lucene FST are both FST, just different implementation, and I don't think they should co-exist, let's reuse the same `getFSTIndex()` for both of them. We shouldn't need to change any code on the query execution side. The loader should be able to tell which version of FST exists and load it as the FST index.

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java
##########
@@ -73,6 +73,7 @@
   private final List<String> _invertedIndexCreationColumns = new ArrayList<>();
   private final List<String> _textIndexCreationColumns = new ArrayList<>();
   private final List<String> _fstIndexCreationColumns = new ArrayList<>();
+  private final List<String> _nativeFSTIndexCreationColumns = new ArrayList<>();

Review comment:
       I'd suggest adding an enum for the FST type (can be `LUCENE` or `NATIVE` instead of keeping separate lists




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r744942806



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/SegmentGeneratorConfig.java
##########
@@ -73,6 +73,7 @@
   private final List<String> _invertedIndexCreationColumns = new ArrayList<>();
   private final List<String> _textIndexCreationColumns = new ArrayList<>();
   private final List<String> _fstIndexCreationColumns = new ArrayList<>();
+  private final List<String> _nativeFSTIndexCreationColumns = new ArrayList<>();

Review comment:
       Would that not limit the ability to create per column FST indices within same table?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742160195



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       FSTIndex is built on Lucene and NativeFSTIndex is built on Pinot's native FST

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       No, it adds a new parameter in the constructor




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
kishoreg commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742080623



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       should we call this FSTIndex? seems like leaking implementation detail

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       is this a formatting change?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       we should probably have a logical name and FST can be one such implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       what is the difference between FSTIndex and NativeFSTIndex?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r742158076



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -63,10 +63,10 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic
       }
       return new ScanBasedFilterOperator(predicateEvaluator, dataSource, numDocs);
     } else if (predicateType == Predicate.Type.REGEXP_LIKE) {
-      if (dataSource.getFSTIndex() != null && dataSource.getDataSourceMetadata().isSorted()) {
+      if (isFSTIndexPresent(dataSource) && dataSource.getDataSourceMetadata().isSorted()) {

Review comment:
       We already have a FSTIndex -- one of the key principles of this PR is to not touch the existing implementation

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -201,4 +201,13 @@ private static int getScanBasedFilterPriority(ScanBasedFilterOperator scanBasedF
       return basePriority + 1;
     }
   }
+
+  /**
+   * Perform a composite check to see if there is a FST index present
+   * @param dataSource Datasource to be checked
+   * @return
+   */
+  private static boolean isFSTIndexPresent(DataSource dataSource) {
+    return (dataSource.getFSTIndex() != null || dataSource.getNativeFSTIndex() != null);

Review comment:
       FSTIndex is built on Lucene and NativeFSTIndex is built on Pinot's native FST

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java
##########
@@ -291,8 +291,8 @@ public void run() {
           RealtimeSegmentConverter converter =
               new RealtimeSegmentConverter(_realtimeSegment, tempSegmentFolder.getAbsolutePath(), schema,

Review comment:
       No, it adds a new parameter in the constructor




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris commented on a change in pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #7684:
URL: https://github.com/apache/pinot/pull/7684#discussion_r744857774



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/datasource/DataSource.java
##########
@@ -76,6 +76,12 @@
   @Nullable
   TextIndexReader getFSTIndex();
 
+  /**
+   * Returns the Native FST index for the column if exists, or {@code null} if not.
+   */
+  @Nullable
+  TextIndexReader getNativeFSTIndex();

Review comment:
       The problem is during creation -- there is no way to specify at a per field level as to which FST type needs to be used. Hence the need for the extra index




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] atris closed pull request #7684: Support FST Index using Native FST Library

Posted by GitBox <gi...@apache.org>.
atris closed pull request #7684:
URL: https://github.com/apache/pinot/pull/7684


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org