Posted to commits@pinot.apache.org by "ankitsultana (via GitHub)" <gi...@apache.org> on 2024/02/10 23:37:56 UTC

[I] Partial Upsert Table Data Inconsistency Issues [pinot]

ankitsultana opened a new issue, #12396:
URL: https://github.com/apache/pinot/issues/12396

   We were seeing a pattern with some of our Partial Upsert tables where CRCs of replicas of the same segment were going out of sync. I took a deep-dive and found a bunch of issues which need to be addressed. Will use this ticket as a top-level ticket for tracking the individual issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [I] [partial-upsert] Partial Upsert Table Data Inconsistency Issues [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #12396:
URL: https://github.com/apache/pinot/issues/12396#issuecomment-1937436991

   cc @klsince 




Re: [I] [partial-upsert] Partial Upsert Table Data Inconsistency Issues [pinot]

Posted by "deemoliu (via GitHub)" <gi...@apache.org>.
deemoliu commented on issue #12396:
URL: https://github.com/apache/pinot/issues/12396#issuecomment-1944497834

   +1. As @Jackie-Jiang mentioned, the current expected behavior is to update the record location when there is a tie, so that the newer record is kept. Realistically, however, message order is not guaranteed on the Kafka side.
   
   Another note on comparison columns: there are two cases. In case #1, comparison values happen to be the same by accident. In case #2, the comparison column value is deliberately the same for all messages (e.g. to fill empty fields and avoid missing out-of-order events). I feel case #1 cannot be avoided, and if it can be solved then case #2 can be solved as well. Meanwhile, for case #2 we should have a proper way to handle it instead of `configure the comparison column value to use a consistent value`.




Re: [I] [partial-upsert] Partial Upsert Table Data Inconsistency Issues [pinot]

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on issue #12396:
URL: https://github.com/apache/pinot/issues/12396#issuecomment-1949460955

   Thanks for reporting the issues. 
   
   > For the first 2 issues, we need to figure out how to handle partial-upsert with same comparison value. For partial upsert, the insert order is critical, and we need to design a way to tell which is the final valid record. 
   
   +1 to this idea.
   
   I think the key to consistency is to make Pinot's logic deterministic when deciding the final valid record for a PK (between the mutable segment and the immutable segment that is supposed to replace it), particularly when the comparison values are the same, or when the sorted column might change the processing order of docs in a segment.
   
   I didn't fully get the `Handling Table Rebalance` issue. But since it mentions the `allSegmentsLoaded` flag, I'd assume the issue is about during table rebalance, the consuming segment might get started even before the other immutable segments in the same table partition get fully loaded on the server (?). If so, this issue and the 2nd issue of `Bug in Replace Segment` might share a potential fix, which is to start a new consuming segment only after the immutable segments in the same table partition are fully loaded.
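
A minimal sketch of the gating idea suggested above: hold back the consuming segment until every immutable segment expected for the partition has been loaded. `PartitionLoadGate` and the segment names are hypothetical and only illustrate the control flow; this is not Pinot's actual API or how `allSegmentsLoaded` is implemented.

```python
from typing import Iterable, Set


class PartitionLoadGate:
    """Hypothetical gate for one table partition; not a Pinot class."""

    def __init__(self, expected_segments: Iterable[str]):
        # Immutable segments that must be loaded before consumption may start.
        self._pending: Set[str] = set(expected_segments)

    def mark_loaded(self, segment_name: str) -> None:
        # Called once a segment has been fully loaded on the server.
        self._pending.discard(segment_name)

    def can_start_consuming(self) -> bool:
        # True only when no expected immutable segment is still pending.
        return not self._pending


gate = PartitionLoadGate(["seg_0", "seg_1"])
gate.mark_loaded("seg_0")
started_early = gate.can_start_consuming()   # seg_1 is still pending
gate.mark_loaded("seg_1")
started_late = gate.can_start_consuming()    # everything is loaded
```

The open question in a real server would be where the list of expected segments comes from while assignments are changing, which this sketch does not address.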




Re: [I] [partial-upsert] Partial Upsert Table Data Inconsistency Issues [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #12396:
URL: https://github.com/apache/pinot/issues/12396#issuecomment-1942855800

   Thanks for summarizing the issues!
   
   For the first 2 issues, we need to figure out how to handle partial-upsert with same comparison value. For partial upsert, the insert order is critical, and we need to design a way to tell which is the final valid record.
   One way I can think of is to add an extra virtual column that tracks the version of the partial-upsert record among records with a tied comparison value. Within each segment, the version starts from 0, and whenever a new record has a tied comparison value, we increment it by one. When replacing the committed segment, we can use this version number to break the tie.
   Another similar approach is to record the original doc id for each record. The drawback is that it might take more storage because all values will be unique.
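
A rough sketch of the version-based tie-break, under one possible reading of the proposal: the version is counted per (primary key, comparison value) within a segment, in ingestion order. The `Record` type, the `version` field and both helper functions are hypothetical names for illustration, and the sketch only decides which doc is the final valid record; it does not model the partial-upsert merge itself.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List, Tuple


@dataclass(frozen=True)
class Record:
    pk: Hashable
    comparison_value: int
    payload: str
    version: int = 0  # hypothetical virtual column, assigned at ingestion


def assign_versions(ingested: List[Tuple[Hashable, int, str]]) -> List[Record]:
    """Assign a tie version per (pk, comparison value) in ingestion order."""
    next_version: Dict[Tuple[Hashable, int], int] = {}
    records = []
    for pk, comparison_value, payload in ingested:
        key = (pk, comparison_value)
        version = next_version.get(key, 0)   # first record for a value gets 0
        next_version[key] = version + 1      # later ties get 1, 2, ...
        records.append(Record(pk, comparison_value, payload, version))
    return records


def final_valid_records(docs: List[Record]) -> Dict[Hashable, Record]:
    """Pick the valid record per primary key, independent of doc order."""
    winners: Dict[Hashable, Record] = {}
    for doc in docs:
        current = winners.get(doc.pk)
        if current is None or (doc.comparison_value, doc.version) > (
            current.comparison_value,
            current.version,
        ):
            winners[doc.pk] = doc
    return winners


ingested = [("a", 10, "first"), ("a", 10, "second"), ("b", 7, "only")]
docs = assign_versions(ingested)
# A committed segment may store docs in a different order (e.g. sorted column).
reordered = list(reversed(docs))
```

Because the winner is chosen by the pair (comparison value, version) rather than by doc position, `final_valid_records(docs)` and `final_valid_records(reordered)` select the same record for each key.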




Re: [I] [partial-upsert] Partial Upsert Table Data Inconsistency Issues [pinot]

Posted by "ankitsultana (via GitHub)" <gi...@apache.org>.
ankitsultana commented on issue #12396:
URL: https://github.com/apache/pinot/issues/12396#issuecomment-1949474503

   > I'd assume the issue is about during table rebalance, the consuming segment might get started even before the other immutable segments in the same table partition get fully loaded on the server (?)
   
   The case of "consuming segment starting on a server before all immutable segments are fully loaded" is not something I am concerned about, because I think the allSegmentsLoaded boolean should be able to handle that. (correct me if I am wrong)
   
   The case I tried to highlight in #12400 is:
   
   1. Table rebalance is started. We have two replicas of a consuming segment, S0 and S1.
   2. As part of the rebalance, one of the immutable segments is removed from the server hosting S0.
   3. Soon after, say there's an event whose primary key was in that removed segment. S0 will think there was no record for this key, but S1 will merge it with the correct record.
   4. If a segment commit follows soon after, then S0 and S1 will diverge permanently, leading to data inconsistency/loss.
   
   Of course, the immutable segment can also get removed from both replica servers. In that case the replicas won't diverge, but the data computed by the Partial Upsert merger would still be wrong in both replicas.
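
The scenario above can be reproduced with a toy model. This is only a sketch: `Replica`, `merge_overwrite_non_null`, the column names and the sample values are all made up for illustration, and the merger is a simplified stand-in for a partial-upsert strategy rather than Pinot's implementation. It assumes the event's primary key previously lived in the immutable segment that the rebalance removed.

```python
from typing import Dict, Iterable, Optional

Row = Dict[str, Optional[int]]


def merge_overwrite_non_null(previous: Optional[Row], incoming: Row) -> Row:
    """Hypothetical merger: keep old fields unless the new row sets them."""
    if previous is None:
        return dict(incoming)
    merged = dict(previous)
    for column, value in incoming.items():
        if value is not None:
            merged[column] = value
    return merged


class Replica:
    """Toy server-side view: primary key -> latest merged row."""

    def __init__(self, immutable_rows: Dict[str, Row]):
        self.rows = {pk: dict(row) for pk, row in immutable_rows.items()}

    def remove_immutable_segment(self, pks: Iterable[str]) -> None:
        # Rebalance moves the segment away, so its keys vanish from this server.
        for pk in pks:
            self.rows.pop(pk, None)

    def ingest(self, pk: str, incoming: Row) -> Row:
        self.rows[pk] = merge_overwrite_non_null(self.rows.get(pk), incoming)
        return self.rows[pk]


immutable = {"order-1": {"price": 100, "quantity": 2}}
server_a = Replica(immutable)  # hosts replica S0
server_b = Replica(immutable)  # hosts replica S1

server_a.remove_immutable_segment(["order-1"])  # step 2: segment removed
event = {"price": None, "quantity": 5}          # step 3: partial update arrives
row_s0 = server_a.ingest("order-1", event)      # sees no previous record
row_s1 = server_b.ingest("order-1", event)      # merges with the old record
```

Here `row_s0` loses the old `price` while `row_s1` keeps it, so a commit at this point would persist two different versions of the same row.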

