Posted to commits@pinot.apache.org by "swaminathanmanish (via GitHub)" <gi...@apache.org> on 2024/02/29 06:44:33 UTC

[PR] (WIP...) Adding record reader config/context param to record transformer [pinot]

swaminathanmanish opened a new pull request, #12520:
URL: https://github.com/apache/pinot/pull/12520

   **What's in the PR:**
   Adds the record reader config/context to the transform API so that a transformer can use it while transforming a record.
   **Why it's needed:**
   Custom transformers need to perform transformations based on record reader configs (e.g. file path). This change enables such transformations.
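
   A minimal sketch of the kind of transformer this is meant to enable. It uses simplified stand-in types rather than Pinot's actual classes: `Row`, `ReaderContext`, `SourceFileTransformer`, the `sourceFile` field, and the `/quarantine/` rule are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a record: a simple field-name -> value map (not Pinot's GenericRow).
class Row {
  private final Map<String, Object> fields = new HashMap<>();

  void putValue(String name, Object value) {
    fields.put(name, value);
  }

  Object getValue(String name) {
    return fields.get(name);
  }
}

// Hypothetical reader context that exposes the path of the file being read.
class ReaderContext {
  private final String filePath;

  ReaderContext(String filePath) {
    this.filePath = filePath;
  }

  String getFilePath() {
    return filePath;
  }
}

// Hypothetical custom transformer whose behaviour depends on the reader context.
class SourceFileTransformer {
  // Returns the record tagged with its source file, or null to drop the record.
  Row transform(Row record, ReaderContext context) {
    String path = context.getFilePath();
    if (path.contains("/quarantine/")) {
      return null; // illustrative rule: skip records read from quarantined files
    }
    record.putValue("sourceFile", path);
    return record;
  }
}
```

   The point of the sketch is only that the decision depends on where the record came from, which a transformer that sees the record alone cannot know.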


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] (WIP...) Adding record reader config/context param to record transformer [pinot]

Posted by "swaminathanmanish (via GitHub)" <gi...@apache.org>.
swaminathanmanish commented on PR #12520:
URL: https://github.com/apache/pinot/pull/12520#issuecomment-1970507677

   @snleee, @klsince - based on our discussion, please take a look.




Re: [PR] Adding record reader config/context param to record transformer [pinot]

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #12520:
URL: https://github.com/apache/pinot/pull/12520#discussion_r1508106925


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformer.java:
##########
@@ -43,4 +43,15 @@ default boolean isNoOp() {
    */
   @Nullable
   GenericRow transform(GenericRow record);
+
+  /**
+   * Transforms a record based on some custom rules using record reader context.
+   * @param record Record to transform
+   * @return Transformed record, or {@code null} if the record does not follow certain rules.
+   */
+  @Nullable
+  default GenericRow transform(GenericRow record,

Review Comment:
   nit: format? looks like the two params can be put on one line :p 





Re: [PR] (WIP...) Adding record reader config/context param to record transformer [pinot]

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #12520:
URL: https://github.com/apache/pinot/pull/12520#issuecomment-1970582482

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/12520?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   Attention: Patch coverage is `85.71429%`; `1 line` in your changes is missing coverage. Please review.
   > Project coverage is 61.55%. Comparing base [(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) to head [(`e82c597`)](https://app.codecov.io/gh/apache/pinot/pull/12520?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 29 commits behind head on master.
   
   | [Files](https://app.codecov.io/gh/apache/pinot/pull/12520?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Patch % | Lines |
   |---|---|---|
   | [.../core/segment/processing/mapper/SegmentMapper.java](https://app.codecov.io/gh/apache/pinot/pull/12520?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvbWFwcGVyL1NlZ21lbnRNYXBwZXIuamF2YQ==) | 75.00% | [1 Missing :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12520?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   
   <details><summary>Additional details and impacted files</summary>
   
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #12520      +/-   ##
   ============================================
   - Coverage     61.75%   61.55%   -0.20%     
     Complexity      207      207              
   ============================================
     Files          2436     2450      +14     
     Lines        133233   133507     +274     
     Branches      20636    20684      +48     
   ============================================
   - Hits          82274    82180      -94     
   - Misses        44911    45229     +318     
   - Partials       6048     6098      +50     
   ```
   
   | [Flag](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [integration](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | [integration1](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | [integration2](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (ø)` | |
   | [java-11](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.67% <42.85%> (-34.04%)` | :arrow_down: |
   | [java-21](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.54% <85.71%> (-0.08%)` | :arrow_down: |
   | [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.55% <85.71%> (-0.20%)` | :arrow_down: |
   | [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.67% <42.85%> (-0.06%)` | :arrow_down: |
   | [temurin](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.55% <85.71%> (-0.20%)` | :arrow_down: |
   | [unittests](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.55% <85.71%> (-0.20%)` | :arrow_down: |
   | [unittests1](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.65% <42.85%> (-0.24%)` | :arrow_down: |
   | [unittests2](https://app.codecov.io/gh/apache/pinot/pull/12520/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.68% <42.85%> (-0.05%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   </details>
   
   [:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/pinot/pull/12520?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).   
   :loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   




Re: [PR] Adding record reader config/context param to record transformer [pinot]

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #12520:
URL: https://github.com/apache/pinot/pull/12520#discussion_r1508078762


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformer.java:
##########
@@ -43,4 +44,15 @@ default boolean isNoOp() {
    */
   @Nullable
   GenericRow transform(GenericRow record);
+
+  /**
+   * Transforms a record based on some custom rules using record reader context.
+   * @param record Record to transform
+   * @return Transformed record, or {@code null} if the record does not follow certain rules.
+   */
+  @Nullable
+  default GenericRow transformUsingRecordReaderContext(GenericRow record,

Review Comment:
   Just call it `transform(record, config)`, as it's easy to tell them apart by parameter list.
   
   I'd suggest passing in RecordReaderConfig (the interface) and making RecordReaderFileConfig implement RecordReaderConfig (which is an empty interface right now).
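
   A rough sketch of the shape suggested here, under stated assumptions. The type names follow the comment, but every class below is a simplified stand-in, not the real Pinot source: the `filePath` field and getter, the default body that delegates to the one-argument `transform`, and the `LegacyTransformer` and `FilePathTransformer` classes are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a record (assumption: only put/get are needed here).
class GenericRow {
  private final Map<String, Object> fields = new HashMap<>();

  void putValue(String name, Object value) {
    fields.put(name, value);
  }

  Object getValue(String name) {
    return fields.get(name);
  }
}

// Empty marker interface, as described in the comment.
interface RecordReaderConfig {
}

// File-based config implementing the marker interface; the field is hypothetical.
class RecordReaderFileConfig implements RecordReaderConfig {
  private final String filePath;

  RecordReaderFileConfig(String filePath) {
    this.filePath = filePath;
  }

  String getFilePath() {
    return filePath;
  }
}

interface RecordTransformer {
  GenericRow transform(GenericRow record);

  // Overload told apart by parameter list. Assumed default: ignore the config
  // and delegate, so existing implementations keep working unchanged.
  default GenericRow transform(GenericRow record, RecordReaderConfig config) {
    return transform(record);
  }
}

// An existing-style transformer that only knows the one-argument method.
class LegacyTransformer implements RecordTransformer {
  @Override
  public GenericRow transform(GenericRow record) {
    record.putValue("seen", Boolean.TRUE);
    return record;
  }
}

// A config-aware transformer that overrides the two-argument overload.
class FilePathTransformer implements RecordTransformer {
  @Override
  public GenericRow transform(GenericRow record) {
    return record;
  }

  @Override
  public GenericRow transform(GenericRow record, RecordReaderConfig config) {
    if (config instanceof RecordReaderFileConfig) {
      record.putValue("filePath", ((RecordReaderFileConfig) config).getFilePath());
    }
    return record;
  }
}
```

   Accepting the interface rather than the file-specific class keeps the transformer signature independent of any one reader type; implementations that need file details can check for the concrete type, as `FilePathTransformer` does in this sketch.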





Re: [PR] Adding record reader config/context param to record transformer [pinot]

Posted by "swaminathanmanish (via GitHub)" <gi...@apache.org>.
swaminathanmanish commented on code in PR #12520:
URL: https://github.com/apache/pinot/pull/12520#discussion_r1508091484


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformer.java:
##########
@@ -43,4 +44,15 @@ default boolean isNoOp() {
    */
   @Nullable
   GenericRow transform(GenericRow record);
+
+  /**
+   * Transforms a record based on some custom rules using record reader context.
+   * @param record Record to transform
+   * @return Transformed record, or {@code null} if the record does not follow certain rules.
+   */
+  @Nullable
+  default GenericRow transformUsingRecordReaderContext(GenericRow record,

Review Comment:
   Yes, this makes sense to me.





Re: [PR] Adding record reader config/context param to record transformer [pinot]

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince merged PR #12520:
URL: https://github.com/apache/pinot/pull/12520

