You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "Jackie-Jiang (via GitHub)" <gi...@apache.org> on 2024/02/01 19:58:32 UTC

[PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Jackie-Jiang opened a new pull request, #12351:
URL: https://github.com/apache/pinot/pull/12351

   Helix might send 2 state transitions when a segment is dropped from a server. This behavior breaks the `onConsumingToDropped()` callback because it is only invoked when processing `CONSUMING -> DROPPED` state transition.
   This PR added a cache for segments that just went through `CONSUMING -> OFFLINE` state transition to work around this problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #12351:
URL: https://github.com/apache/pinot/pull/12351#discussion_r1475184165


##########
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/SegmentOnlineOfflineStateModelFactory.java:
##########
@@ -46,6 +50,13 @@
  * 3. Delete an existed segment.
  */
 public class SegmentOnlineOfflineStateModelFactory extends StateModelFactory<StateModel> {
+  // NOTE: Helix might process CONSUMING -> DROPPED transition as 2 separate transitions: CONSUMING -> OFFLINE followed
+  // by OFFLINE -> DROPPED. Use this cache to track the segments that just went through CONSUMING -> OFFLINE transition
+  // to detect CONSUMING -> DROPPED transition.
+  // TODO: Check how Helix handle CONSUMING -> DROPPED transition and remove this cache if it's not needed.
+  private final Cache<Pair<String, String>, Boolean> _recentlyOffloadedConsumingSegments =

Review Comment:
   How about move the cache into the tableDataMgr? Considering the key of the map is pair<tableName, segmentName> anyway. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #12351:
URL: https://github.com/apache/pinot/pull/12351#issuecomment-1922135433

   Related to #12343 which has more context.
   cc @gortiz 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #12351:
URL: https://github.com/apache/pinot/pull/12351


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #12351:
URL: https://github.com/apache/pinot/pull/12351#issuecomment-1922262814

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   Attention: `58 lines` in your changes are missing coverage. Please review.
   > Comparison is base [(`17db0fd`)](https://app.codecov.io/gh/apache/pinot/commit/17db0fd17b688fe609b82697a5c3b3117d93cdba?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) 61.69% compared to head [(`d5ab682`)](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) 61.74%.
   > Report is 3 commits behind head on master.
   
   | [Files](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Patch % | Lines |
   |---|---|---|
   | [...e/assignment/instance/InstanceTagPoolSelector.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL2Fzc2lnbm1lbnQvaW5zdGFuY2UvSW5zdGFuY2VUYWdQb29sU2VsZWN0b3IuamF2YQ==) | 32.35% | [22 Missing and 1 partial :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   | [...nstance/InstanceReplicaGroupPartitionSelector.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL2Fzc2lnbm1lbnQvaW5zdGFuY2UvSW5zdGFuY2VSZXBsaWNhR3JvdXBQYXJ0aXRpb25TZWxlY3Rvci5qYXZh) | 94.58% | [9 Missing and 6 partials :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   | [...r/helix/SegmentOnlineOfflineStateModelFactory.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zZXJ2ZXIvc3RhcnRlci9oZWxpeC9TZWdtZW50T25saW5lT2ZmbGluZVN0YXRlTW9kZWxGYWN0b3J5LmphdmE=) | 0.00% | [15 Missing :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   | [...elix/core/periodictask/ControllerPeriodicTask.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3BlcmlvZGljdGFzay9Db250cm9sbGVyUGVyaW9kaWNUYXNrLmphdmE=) | 83.33% | [1 Missing and 1 partial :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   | [...mmon/assignment/InstanceAssignmentConfigUtils.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZUFzc2lnbm1lbnRDb25maWdVdGlscy5qYXZh) | 0.00% | [1 Missing :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   | [...e/pinot/controller/helix/SegmentStatusChecker.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9TZWdtZW50U3RhdHVzQ2hlY2tlci5qYXZh) | 0.00% | [1 Missing :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   | [...assignment/instance/InstancePartitionSelector.java](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL2Fzc2lnbm1lbnQvaW5zdGFuY2UvSW5zdGFuY2VQYXJ0aXRpb25TZWxlY3Rvci5qYXZh) | 50.00% | [0 Missing and 1 partial :warning: ](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) |
   
   <details><summary>Additional details and impacted files</summary>
   
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #12351      +/-   ##
   ============================================
   + Coverage     61.69%   61.74%   +0.04%     
     Complexity      207      207              
   ============================================
     Files          2424     2424              
     Lines        132340   132512     +172     
     Branches      20436    20481      +45     
   ============================================
   + Hits          81651    81820     +169     
   - Misses        44693    44694       +1     
   - Partials       5996     5998       +2     
   ```
   
   | [Flag](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | [integration](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | [integration1](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | [integration2](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (ø)` | |
   | [java-11](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.70% <83.42%> (+26.67%)` | :arrow_up: |
   | [java-21](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.61% <83.42%> (+0.03%)` | :arrow_up: |
   | [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.73% <83.42%> (+0.04%)` | :arrow_up: |
   | [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.58% <83.42%> (+0.02%)` | :arrow_up: |
   | [temurin](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.74% <83.42%> (+0.04%)` | :arrow_up: |
   | [unittests](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `61.74% <83.42%> (+0.04%)` | :arrow_up: |
   | [unittests1](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.92% <75.00%> (+0.01%)` | :arrow_up: |
   | [unittests2](https://app.codecov.io/gh/apache/pinot/pull/12351/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.70% <82.57%> (+0.07%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   </details>
   
   [:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/pinot/pull/12351?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).   
   :loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #12351:
URL: https://github.com/apache/pinot/pull/12351#discussion_r1475219741


##########
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/SegmentOnlineOfflineStateModelFactory.java:
##########
@@ -46,6 +50,13 @@
  * 3. Delete an existed segment.
  */
 public class SegmentOnlineOfflineStateModelFactory extends StateModelFactory<StateModel> {
+  // NOTE: Helix might process CONSUMING -> DROPPED transition as 2 separate transitions: CONSUMING -> OFFLINE followed
+  // by OFFLINE -> DROPPED. Use this cache to track the segments that just went through CONSUMING -> OFFLINE transition
+  // to detect CONSUMING -> DROPPED transition.
+  // TODO: Check how Helix handle CONSUMING -> DROPPED transition and remove this cache if it's not needed.
+  private final Cache<Pair<String, String>, Boolean> _recentlyOffloadedConsumingSegments =

Review Comment:
   I prefer keeping it here since it is just a workaround for Helix state transition handling. Once Helix fixes the problem we should be able to remove it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #12351:
URL: https://github.com/apache/pinot/pull/12351#issuecomment-1922258767

   Submitted a Helix issue to verify the observation: https://github.com/apache/helix/issues/2751


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #12351:
URL: https://github.com/apache/pinot/pull/12351#discussion_r1475698239


##########
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/SegmentOnlineOfflineStateModelFactory.java:
##########
@@ -215,6 +240,15 @@ public void onBecomeDroppedFromOffline(Message message, NotificationContext cont
       String segmentName = message.getPartitionName();
       try {
         _instanceDataManager.deleteSegment(tableNameWithType, segmentName);
+
+        // Check if the segment is recently offloaded from CONSUMING to OFFLINE
+        if (TableNameBuilder.isRealtimeTableResource(tableNameWithType)) {
+          Pair<String, String> tableSegmentPair = Pair.of(tableNameWithType, segmentName);
+          if (_recentlyOffloadedConsumingSegments.getIfPresent(tableSegmentPair) != null) {
+            _recentlyOffloadedConsumingSegments.invalidate(tableSegmentPair);
+            onConsumingToDropped(tableNameWithType, segmentName);
+          }

Review Comment:
   This is not an actual problem, but it is a bit confusing. 
   
   `_recentlyOffloadedConsumingSegments` is only populated with realtime tables, why do we need to check `if  (TableNameBuilder.isRealtimeTableResource(tableNameWithType))`?
   
   Is the reason to do not create the pair?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Work around the problem of Helix sending 2 transitions for CONSUMING -> DROPPED [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #12351:
URL: https://github.com/apache/pinot/pull/12351#discussion_r1520636758


##########
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/SegmentOnlineOfflineStateModelFactory.java:
##########
@@ -215,6 +240,15 @@ public void onBecomeDroppedFromOffline(Message message, NotificationContext cont
       String segmentName = message.getPartitionName();
       try {
         _instanceDataManager.deleteSegment(tableNameWithType, segmentName);
+
+        // Check if the segment is recently offloaded from CONSUMING to OFFLINE
+        if (TableNameBuilder.isRealtimeTableResource(tableNameWithType)) {
+          Pair<String, String> tableSegmentPair = Pair.of(tableNameWithType, segmentName);
+          if (_recentlyOffloadedConsumingSegments.getIfPresent(tableSegmentPair) != null) {
+            _recentlyOffloadedConsumingSegments.invalidate(tableSegmentPair);
+            onConsumingToDropped(tableNameWithType, segmentName);
+          }

Review Comment:
   Yes, and to reduce a cache lookup. OFFLINE table doesn't really have CONSUMING state



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org