Posted to commits@pinot.apache.org by "KKcorps (via GitHub)" <gi...@apache.org> on 2023/07/04 08:29:00 UTC

[GitHub] [pinot] KKcorps opened a new pull request, #11027: Preserve non-null comparison column values during segment commit

KKcorps opened a new pull request, #11027:
URL: https://github.com/apache/pinot/pull/11027

   This PR completes the bug fix introduced in https://github.com/apache/pinot/pull/10704
   
   With the earlier fix, we merge the comparison column values but still mark the original columns as null in the null bitmap, because that helps during restart.
   
   However, during the segment build we filter out every column that has null set in the bitmap, so the merged values are lost.
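   
   The data loss can be illustrated with a minimal sketch. This is not Pinot's record reader: `NullBitmapSketch`, `readRow` and the `preserveColumns` parameter are hypothetical names, and a `Set<String>` stands in for the per-document null bitmap. It assumes a stored row whose `mtime_2` value was merged from the earlier record while its null flag is still set.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   import java.util.Set;
   
   public class NullBitmapSketch {
     // Hypothetical stand-in for reading one row back during segment build.
     // A column flagged null is emitted as null unless it is listed in
     // preserveColumns, in which case its stored value is kept.
     static Map<String, Object> readRow(Map<String, Object> stored,
                                        Set<String> nullColumns,
                                        Set<String> preserveColumns) {
       Map<String, Object> out = new HashMap<>();
       for (Map.Entry<String, Object> entry : stored.entrySet()) {
         String column = entry.getKey();
         boolean flaggedNull = nullColumns.contains(column);
         if (flaggedNull && !preserveColumns.contains(column)) {
           out.put(column, null); // stored value is dropped here
         } else {
           out.put(column, entry.getValue());
         }
       }
       return out;
     }
   
     public static void main(String[] args) {
       Map<String, Object> stored = new HashMap<>();
       stored.put("event_id", "E12345");
       stored.put("mtime", 1687341314100L);
       stored.put("mtime_2", 1687341314322L); // merged from the earlier record
   
       // Assumption: the null flag for mtime_2 is still set after the merge.
       Set<String> nullColumns = Set.of("mtime_2");
   
       Map<String, Object> naive = readRow(stored, nullColumns, Set.of());
       Map<String, Object> preserved =
           readRow(stored, nullColumns, Set.of("mtime", "mtime_2"));
   
       System.out.println("naive mtime_2 = " + naive.get("mtime_2"));
       System.out.println("preserved mtime_2 = " + preserved.get("mtime_2"));
     }
   }
   ```
   
   With an empty `preserveColumns` set the merged `mtime_2` value comes back as null, which mirrors the loss described above; listing the comparison columns keeps it.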
   
   ## Reproducing the bug
   
   * Create a partial upsert table with two comparison columns, `mtime` and `mtime_2`.
   
   * Publish the following two records to the table:
   
   ```jsonl
   { "rsvp_count": 23, "venue_name": "Venue A", "event_id": "E12345", "event_time": 1688372645422, "group_city": "San Francisco", "group_country": "USA", "group_id": 1234567890, "group_name": "OpenAI enthusiasts", "group_lat": 37.7749, "group_lon": -122.4194, "mtime_2": 1687341314322 }
   
   
   { "rsvp_count": 23, "venue_name": "Venue A", "event_id": "E12345", "event_time": 1688372645422, "group_city": "San Francisco", "group_country": "USA", "group_id": 1234567890, "group_name": "OpenAI enthusiasts", "group_lat": 37.7749, "group_lon": -122.4194, "mtime": 1687341314100 }
   ```
   
   * Verify that the records show up correctly in the table, then either reload the table or force a commit.
   
   * No records show up in the table now. With `skipUpsert(true)`, though, all of them are visible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] egalpin commented on pull request #11027: Preserve non-null comparison column values during segment commit

Posted by "egalpin (via GitHub)" <gi...@apache.org>.
egalpin commented on PR #11027:
URL: https://github.com/apache/pinot/pull/11027#issuecomment-1622295892

   The motivation for keeping nullness encoded is to be able to easily distinguish a defaultValue representation of null from an actual null, so that comparing two defaultValues does not produce a comparison result of `0` and therefore trigger an upsert.
   
   I don't feel that this "guard" is required anymore, though; it may have been needed at one point, but the algorithms for multiple comparison column upsert changed a lot during implementation. We guard against any newly ingested record having all null comparison columns[1], so by the time we reach `compareTo` the values being compared would "at worst" be the `<defaultValue>` of the previously ingested record and the non-null (i.e. non-defaultValue) value of the newly ingested record. Such a comparison would have the same result as the current implementation, which checks whether the previous record's value at the same comparison index is null and, if so, accepts the new record as a valid upsert.
   
   All that said, if we don't encode nullness then we lose the ability to run valuable queries like "show me all records that have null `comparisonColumnX`" (i.e. "show me all records which have never had data written by a producer that uses `comparisonColumnX`").
   
   [1] https://github.com/egalpin/pinot/blob/68fdfa4a12926c7ce45dadc6a86d973ce5ff3669/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java#L597 (this method has moved, I think as of https://github.com/apache/pinot/pull/10703)
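   
   A rough sketch of the two comparison styles discussed above, for illustration only. `ComparisonSketch`, `acceptsWithNulls`, `acceptsWithDefaults` and the use of `Long.MIN_VALUE` as the default value are assumptions, not Pinot's `compareTo` implementation. The sketch also assumes that the decision is made on the first comparison column the incoming record actually sets, and it folds the all-null guard into the same method.
   
   ```java
   public class ComparisonSketch {
     // Hypothetical default value standing in for a null comparison value.
     static final long DEFAULT_VALUE = Long.MIN_VALUE;
   
     // Null-aware variant: a null previous value at the compared index means
     // the incoming record is accepted as a valid upsert.
     static boolean acceptsWithNulls(Long[] previous, Long[] incoming) {
       for (int i = 0; i < incoming.length; i++) {
         if (incoming[i] == null) {
           continue; // this comparison column was not set by the producer
         }
         if (previous[i] == null) {
           return true;
         }
         return incoming[i].compareTo(previous[i]) >= 0;
       }
       return false; // guard: a record with all null comparison columns is rejected
     }
   
     // Default-value variant: nullness is not encoded, so the previous value
     // is at worst DEFAULT_VALUE, which any non-default value compares above.
     static boolean acceptsWithDefaults(long[] previous, long[] incoming) {
       for (int i = 0; i < incoming.length; i++) {
         if (incoming[i] == DEFAULT_VALUE) {
           continue;
         }
         return Long.compare(incoming[i], previous[i]) >= 0;
       }
       return false; // same guard as above
     }
   
     public static void main(String[] args) {
       Long[] previousNulls = {null, 1687341314322L};
       Long[] incomingNulls = {1687341314100L, null};
       long[] previousDefaults = {DEFAULT_VALUE, 1687341314322L};
       long[] incomingDefaults = {1687341314100L, DEFAULT_VALUE};
   
       System.out.println(acceptsWithNulls(previousNulls, incomingNulls));
       System.out.println(acceptsWithDefaults(previousDefaults, incomingDefaults));
       // Two default values compare as equal, which is why the guard matters.
       System.out.println(Long.compare(DEFAULT_VALUE, DEFAULT_VALUE));
     }
   }
   ```
   
   Under these assumptions both variants reach the same decision whenever the incoming record sets at least one comparison column; they only differ in whether nullness remains visible afterwards.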




[GitHub] [pinot] Jackie-Jiang commented on pull request #11027: Preserve non-null comparison column values during segment commit

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #11027:
URL: https://github.com/apache/pinot/pull/11027#issuecomment-1624253572

   Picked the solution in #11044 




[GitHub] [pinot] codecov-commenter commented on pull request #11027: Preserve non-null comparison column values during segment commit

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #11027:
URL: https://github.com/apache/pinot/pull/11027#issuecomment-1619850625

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11027?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#11027](https://app.codecov.io/gh/apache/pinot/pull/11027?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (8a5299f) into [master](https://app.codecov.io/gh/apache/pinot/commit/0821e3b967f442ea6e9cb3d9b5b11f95d151020d?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (0821e3b) will **increase** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #11027     +/-   ##
   =========================================
     Coverage    0.11%    0.11%             
   =========================================
     Files        2197     2145     -52     
     Lines      118596   116241   -2355     
     Branches    17980    17715    -265     
   =========================================
     Hits          137      137             
   + Misses     118439   116084   -2355     
     Partials       20       20             
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `?` | |
   | integration1temurin17 | `?` | |
   | integration1temurin20 | `?` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin11 | `?` | |
   | unittests1temurin17 | `?` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin17 | `?` | |
   | unittests2temurin20 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/11027?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://app.codecov.io/gh/apache/pinot/pull/11027?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ocal/segment/readers/PinotSegmentRecordReader.java](https://app.codecov.io/gh/apache/pinot/pull/11027?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L3JlYWRlcnMvUGlub3RTZWdtZW50UmVjb3JkUmVhZGVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/pinot/segment/spi/IndexSegment.java](https://app.codecov.io/gh/apache/pinot/pull/11027?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL0luZGV4U2VnbWVudC5qYXZh) | `0.00% <0.00%> (ø)` | |
   
   ... and [59 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/11027/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   




[GitHub] [pinot] Jackie-Jiang commented on pull request #11027: Preserve non-null comparison column values during segment commit

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #11027:
URL: https://github.com/apache/pinot/pull/11027#issuecomment-1621078008

   I don't follow the special handling of comparison columns (still treating them as null values) in #10704, and I feel we should just remove that special handling. Handling it in the record reader is a very hacky solution.
   cc @egalpin for more context on that special handling




[GitHub] [pinot] Jackie-Jiang closed pull request #11027: Preserve non-null comparison column values during segment commit

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang closed pull request #11027: Preserve non-null comparison column values during segment commit
URL: https://github.com/apache/pinot/pull/11027

