You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "itschrispeck (via GitHub)" <gi...@apache.org> on 2023/07/28 07:45:15 UTC

[GitHub] [pinot] itschrispeck opened a new pull request, #11204: add mv support for native text index

itschrispeck opened a new pull request, #11204:
URL: https://github.com/apache/pinot/pull/11204

   This change adds native text index support for multi-value columns
   
   
   tag: `feature`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] gortiz commented on a diff in pull request #11204: add multi-value support for native text index

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #11204:
URL: https://github.com/apache/pinot/pull/11204#discussion_r1280421788


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/invertedindex/NativeMutableTextIndex.java:
##########
@@ -63,9 +63,22 @@ public NativeMutableTextIndex(String column) {
 
   @Override
   public void add(String document) {
-    Iterable<String> tokens;
+    addHelper(document);
+    _nextDocId++;
+  }
+
+  @Override
+  public void add(String[] documents) {
+    for (String document : documents) {
+      addHelper(document);
+    }
+    _nextDocId++;
+  }
 
+  private void addHelper(String document) {
+    Iterable<String> tokens;
     tokens = analyze(document);
+

Review Comment:
   nit: Can we merge declaration and assignation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang merged pull request #11204: add multi-value support for native text index

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #11204:
URL: https://github.com/apache/pinot/pull/11204


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #11204: add multi-value support for native text index

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #11204:
URL: https://github.com/apache/pinot/pull/11204#issuecomment-1655266048

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#11204](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (ca7b2fb) into [master](https://app.codecov.io/gh/apache/pinot/commit/a86ba9c42fa94703988a477620966eb5f31f6d89?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (a86ba9c) will **increase** coverage by `0.00%`.
   > Report is 15 commits behind head on master.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #11204     +/-   ##
   =========================================
     Coverage    0.11%    0.11%             
   =========================================
     Files        2218     2169     -49     
     Lines      119138   116695   -2443     
     Branches    18022    17721    -301     
   =========================================
     Hits          137      137             
   + Misses     118981   116538   -2443     
     Partials       20       20             
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `?` | |
   | integration1temurin17 | `?` | |
   | integration1temurin20 | `?` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin11 | `?` | |
   | unittests1temurin17 | `?` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `?` | |
   | unittests2temurin17 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin20 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Files Changed](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...ime/impl/invertedindex/NativeMutableTextIndex.java](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9pbXBsL2ludmVydGVkaW5kZXgvTmF0aXZlTXV0YWJsZVRleHRJbmRleC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ment/creator/impl/text/NativeTextIndexCreator.java](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC90ZXh0L05hdGl2ZVRleHRJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ment/index/readers/text/NativeTextIndexReader.java](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L3JlYWRlcnMvdGV4dC9OYXRpdmVUZXh0SW5kZXhSZWFkZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...egment/local/segment/index/text/TextIndexType.java](https://app.codecov.io/gh/apache/pinot/pull/11204?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L3RleHQvVGV4dEluZGV4VHlwZS5qYXZh) | `0.00% <ø> (ø)` | |
   
   ... and [150 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/11204/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] itschrispeck commented on a diff in pull request #11204: add multi-value support for native text index

Posted by "itschrispeck (via GitHub)" <gi...@apache.org>.
itschrispeck commented on code in PR #11204:
URL: https://github.com/apache/pinot/pull/11204#discussion_r1280148413


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/invertedindex/NativeMutableTextIndex.java:
##########
@@ -114,8 +121,8 @@ public void close()
   private List<String> analyze(String document) {
     List<String> tokens = new ArrayList<>();
     try (TokenStream tokenStream = _analyzer.tokenStream(_column, document)) {
-      tokenStream.reset();
       CharTermAttribute attribute = tokenStream.getAttribute(CharTermAttribute.class);
+      tokenStream.reset();

Review Comment:
   this doesn't change behavior, but follows the intended flow from the [docs](https://lucene.apache.org/core/4_7_0/core/org/apache/lucene/analysis/TokenStream.html)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org