You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "sajjad-moradi (via GitHub)" <gi...@apache.org> on 2023/12/04 06:50:29 UTC

[PR] Add partition level Force Commit [pinot]

sajjad-moradi opened a new pull request, #12088:
URL: https://github.com/apache/pinot/pull/12088

   Currently, the force-commit operation is applied to all partitions within a table. However, in certain instances, problems may arise in only a specific subset of consuming segments. Addressing these issues requires force-committing the problematic partitions, but applying force-commit to the entire table affects all partitions. This becomes impractical for tables with a large number of partitions. This PR introduces partition-level force-commit functionality, expanding the endpoint to accept a comma-separated list of partitions or consuming segment names.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "mcvsubbu (via GitHub)" <gi...@apache.org>.
mcvsubbu commented on code in PR #12088:
URL: https://github.com/apache/pinot/pull/12088#discussion_r1414330997


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotRealtimeTableResource.java:
##########
@@ -136,15 +136,22 @@ public Response resumeConsumption(
       notes = "Force commit the current segments in consuming state and restart consumption. "
           + "This should be used after schema/table config changes. "
           + "Please note that this is an asynchronous operation, "
-          + "and 200 response does not mean it has actually been done already")
+          + "and 200 response does not mean it has actually been done already."
+          + "If specific partitions or consuming segments are provided, "
+          + "only those partitions or consuming segments will be force committed.")
   public Map<String, String> forceCommit(
-      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName) {
+      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName,
+      @ApiParam(value = "Comma separated list of partition group IDs to be committed") @QueryParam("partitions")
+      String partitionGroupIds,
+      @ApiParam(value = "Comma separated list of consuming segments to be committed") @QueryParam("segments")

Review Comment:
   let us modify this documentation as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #12088:
URL: https://github.com/apache/pinot/pull/12088#discussion_r1416183638


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -1559,12 +1560,54 @@ private boolean isTmpAndCanDelete(URI uri, Set<String> deepURIs, PinotFS pinotFS
 
   /**
    * Force commit the current segments in consuming state and restart consumption
+   * Commit all partitions unless either partitionsToCommit or segmentsToCommit are provided.
+   *
+   * @param tableNameWithType  table name with type
+   * @param partitionGroupIdsToCommit  comma separated list of partition group IDs to commit
+   * @param segmentsToCommit  comma separated list of consuming segments to commit
+   * @return the set of consuming segments for which commit was initiated
    */
-  public Set<String> forceCommit(String tableNameWithType) {
+  public Set<String> forceCommit(String tableNameWithType, @Nullable String partitionGroupIdsToCommit,
+      @Nullable String segmentsToCommit) {
     IdealState idealState = getIdealState(tableNameWithType);
-    Set<String> consumingSegments = findConsumingSegments(idealState);
-    sendForceCommitMessageToServers(tableNameWithType, consumingSegments);
-    return consumingSegments;
+    Set<String> allConsumingSegments = findConsumingSegments(idealState);
+    Set<String> targetConsumingSegments = filterSegmentsToCommit(allConsumingSegments, partitionGroupIdsToCommit,
+        segmentsToCommit);
+    sendForceCommitMessageToServers(tableNameWithType, targetConsumingSegments);
+    return targetConsumingSegments;
+  }
+
+  /**
+   * Among all consuming segments, filter the ones that are in the given partitions or segments.
+   */
+  private Set<String> filterSegmentsToCommit(Set<String> allConsumingSegments,
+      @Nullable String partitionGroupIdsToCommitStr, @Nullable String segmentsToCommitStr) {
+    if (partitionGroupIdsToCommitStr == null && segmentsToCommitStr == null) {
+      return allConsumingSegments;
+    }
+    Preconditions.checkState(partitionGroupIdsToCommitStr == null || segmentsToCommitStr == null,

Review Comment:
   (minor) Suggest moving this check to the first line of `forceCommit()` and change it to `checkArgument()`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "sajjad-moradi (via GitHub)" <gi...@apache.org>.
sajjad-moradi commented on code in PR #12088:
URL: https://github.com/apache/pinot/pull/12088#discussion_r1433369984


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotRealtimeTableResource.java:
##########
@@ -136,15 +136,22 @@ public Response resumeConsumption(
       notes = "Force commit the current segments in consuming state and restart consumption. "
           + "This should be used after schema/table config changes. "
           + "Please note that this is an asynchronous operation, "
-          + "and 200 response does not mean it has actually been done already")
+          + "and 200 response does not mean it has actually been done already."
+          + "If specific partitions or consuming segments are provided, "
+          + "only those partitions or consuming segments will be force committed.")
   public Map<String, String> forceCommit(
-      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName) {
+      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName,
+      @ApiParam(value = "Comma separated list of partition group IDs to be committed") @QueryParam("partitions")
+      String partitionGroupIds,
+      @ApiParam(value = "Comma separated list of consuming segments to be committed") @QueryParam("segments")

Review Comment:
   Having the option to provide either makes it easier for operators.
   Keep in mind that both params cannot be provided at the same time, we fail the request. It's either partitions or segments.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "sajjad-moradi (via GitHub)" <gi...@apache.org>.
sajjad-moradi merged PR #12088:
URL: https://github.com/apache/pinot/pull/12088


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "sajjad-moradi (via GitHub)" <gi...@apache.org>.
sajjad-moradi commented on code in PR #12088:
URL: https://github.com/apache/pinot/pull/12088#discussion_r1433259329


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotRealtimeTableResource.java:
##########
@@ -136,15 +136,22 @@ public Response resumeConsumption(
       notes = "Force commit the current segments in consuming state and restart consumption. "
           + "This should be used after schema/table config changes. "
           + "Please note that this is an asynchronous operation, "
-          + "and 200 response does not mean it has actually been done already")
+          + "and 200 response does not mean it has actually been done already."
+          + "If specific partitions or consuming segments are provided, "
+          + "only those partitions or consuming segments will be force committed.")
   public Map<String, String> forceCommit(
-      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName) {
+      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName,
+      @ApiParam(value = "Comma separated list of partition group IDs to be committed") @QueryParam("partitions")
+      String partitionGroupIds,
+      @ApiParam(value = "Comma separated list of consuming segments to be committed") @QueryParam("segments")

Review Comment:
   I changed the variableName in the method as well as the documentation, but for the actual API's query param, I believe the shorter forms - `partitions` and `segments` - are better as they make the endpoint shorter:
   ```
   POST /tables/{tableName}/forceCommit?segments=s1,s2
   ```
   instead of
   ```
   POST /tables/{tableName}/forceCommit?segmentsToCommit=s1,s2
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "mcvsubbu (via GitHub)" <gi...@apache.org>.
mcvsubbu commented on code in PR #12088:
URL: https://github.com/apache/pinot/pull/12088#discussion_r1414249251


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -1559,12 +1560,54 @@ private boolean isTmpAndCanDelete(URI uri, Set<String> deepURIs, PinotFS pinotFS
 
   /**
    * Force commit the current segments in consuming state and restart consumption
+   * Commit all partitions unless either partitionsToCommit or segmentsToCommit are provided.
+   *
+   * @param tableNameWithType  table name with type
+   * @param partitionsToCommit  comma separated list of partitions to commit
+   * @param segmentsToCommit  comma separated list of consuming segments to commit
+   * @return the set of consuming segments that were committed

Review Comment:
   We really don't wait for them to commit, so perhaps we should say something like:
   "set of segments for which commit was initiated"



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -1559,12 +1560,54 @@ private boolean isTmpAndCanDelete(URI uri, Set<String> deepURIs, PinotFS pinotFS
 
   /**
    * Force commit the current segments in consuming state and restart consumption
+   * Commit all partitions unless either partitionsToCommit or segmentsToCommit are provided.
+   *
+   * @param tableNameWithType  table name with type
+   * @param partitionsToCommit  comma separated list of partitions to commit

Review Comment:
   To be consistent, should it be "partitionGroupIdsToCommit" instead of "partitionsToCommit"?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #12088:
URL: https://github.com/apache/pinot/pull/12088#issuecomment-1837976665

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/12088?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   All modified and coverable lines are covered by tests :white_check_mark:
   > Comparison is base [(`d7c76b9`)](https://app.codecov.io/gh/apache/pinot/commit/d7c76b9fbb5a659da0da6276c6d9833ba1894e0d?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) 61.62% compared to head [(`dcf0149`)](https://app.codecov.io/gh/apache/pinot/pull/12088?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) 46.71%.
   > Report is 1 commits behind head on master.
   
   
   <details><summary>Additional details and impacted files</summary>
   
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #12088       +/-   ##
   =============================================
   - Coverage     61.62%   46.71%   -14.92%     
   + Complexity     1152      945      -207     
   =============================================
     Files          2389     1791      -598     
     Lines        129824    94130    -35694     
     Branches      20083    15211     -4872     
   =============================================
   - Hits          80009    43975    -36034     
   - Misses        43989    47025     +3036     
   + Partials       5826     3130     -2696     
   ```
   
   | [Flag](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [integration](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [integration1](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [integration2](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [java-11](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [java-21](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.71% <ø> (-14.78%)` | :arrow_down: |
   | [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.71% <ø> (-14.91%)` | :arrow_down: |
   | [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   | [temurin](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.71% <ø> (-14.92%)` | :arrow_down: |
   | [unittests](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.71% <ø> (-14.91%)` | :arrow_down: |
   | [unittests1](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.71% <ø> (-0.20%)` | :arrow_down: |
   | [unittests2](https://app.codecov.io/gh/apache/pinot/pull/12088/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   
   </details>
   
   [:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/pinot/pull/12088?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).   
   :loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


Re: [PR] Add partition level Force Commit [pinot]

Posted by "mcvsubbu (via GitHub)" <gi...@apache.org>.
mcvsubbu commented on code in PR #12088:
URL: https://github.com/apache/pinot/pull/12088#discussion_r1433307104


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotRealtimeTableResource.java:
##########
@@ -136,15 +136,22 @@ public Response resumeConsumption(
       notes = "Force commit the current segments in consuming state and restart consumption. "
           + "This should be used after schema/table config changes. "
           + "Please note that this is an asynchronous operation, "
-          + "and 200 response does not mean it has actually been done already")
+          + "and 200 response does not mean it has actually been done already."
+          + "If specific partitions or consuming segments are provided, "
+          + "only those partitions or consuming segments will be force committed.")
   public Map<String, String> forceCommit(
-      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName) {
+      @ApiParam(value = "Name of the table", required = true) @PathParam("tableName") String tableName,
+      @ApiParam(value = "Comma separated list of partition group IDs to be committed") @QueryParam("partitions")
+      String partitionGroupIds,
+      @ApiParam(value = "Comma separated list of consuming segments to be committed") @QueryParam("segments")

Review Comment:
   Wait. Why do we have both partitiongroupIds as well as consuming segments? There is only one consuming segment per partition, so it is just enough to provide the segment names right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org