You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "jtao15 (via GitHub)" <gi...@apache.org> on 2023/06/22 23:59:31 UTC

[GitHub] [pinot] jtao15 opened a new pull request, #10964: Allow custom segment grouping in MergeRollupTask based on lineage metadata

jtao15 opened a new pull request, #10964:
URL: https://github.com/apache/pinot/pull/10964

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask

Posted by "snleee (via GitHub)" <gi...@apache.org>.
snleee commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1247496454


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -475,6 +476,14 @@ public void registerTaskGenerator(PinotTaskGenerator taskGenerator) {
     _taskGeneratorRegistry.registerTaskGenerator(taskGenerator);
   }
 
+  /**
+   * Registers a task segment group manager.
+   * <p>This method can be used to plug in custom task segment group manager.
+   */
+  public void registerTaskSegmentGroupManager(PinotTaskSegmentGroupManager taskSegmentGroupManager) {

Review Comment:
   Instead of exposing this function for registering and setting `taskSegmentGroupManager` from the top level task manager and feed this all the way to the task generator, I can think of another approach which would not require interface changes:
   
   1. Pass the factory class name for `TaskSegmentGroupManager` as part of the task config for Merge&Rollup task. 
   2. initialize the `TaskSegmentGroupManager` object based on the factory class name
   
   
   This approach is what we use for Kafka client factory in the stream config.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask based on lineage metadata

Posted by "snleee (via GitHub)" <gi...@apache.org>.
snleee commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1239387645


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/lineage/DefaultLineageManager.java:
##########
@@ -105,6 +107,11 @@ public void updateLineageForRetention(TableConfig tableConfig, SegmentLineage li
     }
   }
 
+  @Override
+  public List<List<SegmentZKMetadata>> getSegmentGroupsForMergeRollupTask(TableConfig tableConfig,

Review Comment:
   I feel that this function doesn't belong to `LineageManager`? LineageManager should not know about `MergeRollupTask`. MergeRollupTask should be the user of the segment replace protocol.
   
   Can you add a bit more information on what it means by `SegmentGroup` here?
   
   If your intention is to inject the custom code to the MergeRollupTask, please create a dedicated interface for that.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/lineage/LineageManager.java:
##########
@@ -70,4 +71,14 @@ void updateLineageForRevertReplaceSegments(TableConfig tableConfig, String linea
    */
   void updateLineageForRetention(TableConfig tableConfig, SegmentLineage lineage, List<String> allSegments,
       List<String> segmentsToDelete, Set<String> consumingSegments);
+
+  /**
+   * Get segment zk metadata groups for MergeRollupTask
+   * @param tableConfig
+   * @param lineage
+   * @param segments
+   * @return segment zk metadata groups

Review Comment:
   Can you elaborate on the `segment zk metadata groups`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee merged pull request #10964: Allow custom segment grouping in MergeRollupTask

Posted by "snleee (via GitHub)" <gi...@apache.org>.
snleee merged PR #10964:
URL: https://github.com/apache/pinot/pull/10964


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jtao15 commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask

Posted by "jtao15 (via GitHub)" <gi...@apache.org>.
jtao15 commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1248298273


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java:
##########
@@ -475,6 +476,14 @@ public void registerTaskGenerator(PinotTaskGenerator taskGenerator) {
     _taskGeneratorRegistry.registerTaskGenerator(taskGenerator);
   }
 
+  /**
+   * Registers a task segment group manager.
+   * <p>This method can be used to plug in custom task segment group manager.
+   */
+  public void registerTaskSegmentGroupManager(PinotTaskSegmentGroupManager taskSegmentGroupManager) {

Review Comment:
   Good point, updated. PTAL.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jtao15 commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask

Posted by "jtao15 (via GitHub)" <gi...@apache.org>.
jtao15 commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1243289538


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/lineage/DefaultLineageManager.java:
##########
@@ -105,6 +107,11 @@ public void updateLineageForRetention(TableConfig tableConfig, SegmentLineage li
     }
   }
 
+  @Override
+  public List<List<SegmentZKMetadata>> getSegmentGroupsForMergeRollupTask(TableConfig tableConfig,

Review Comment:
   @snleee @sajjad-moradi  Thanks for the suggestion, removed from `LineageManager` and added `PinotTaskSegmentGroupManager` interface. Custom class can be registered via `PinotTaskManager.registerTaskSegmentGroupManager()`. Please take another look.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask based on lineage metadata

Posted by "sajjad-moradi (via GitHub)" <gi...@apache.org>.
sajjad-moradi commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1242797215


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/lineage/DefaultLineageManager.java:
##########
@@ -105,6 +107,11 @@ public void updateLineageForRetention(TableConfig tableConfig, SegmentLineage li
     }
   }
 
+  @Override
+  public List<List<SegmentZKMetadata>> getSegmentGroupsForMergeRollupTask(TableConfig tableConfig,

Review Comment:
   +1
   This logic belongs to MergeRollup, not LineageManager. Let's create an interface like MergeRollupSegmentGroupManager, MergeRollupSegmentCategorizer, or something like these, and inject the interface into MergeRollupTaskGenerator.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jtao15 commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask based on lineage metadata

Posted by "jtao15 (via GitHub)" <gi...@apache.org>.
jtao15 commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1243152915


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/lineage/LineageManager.java:
##########
@@ -70,4 +71,14 @@ void updateLineageForRevertReplaceSegments(TableConfig tableConfig, String linea
    */
   void updateLineageForRetention(TableConfig tableConfig, SegmentLineage lineage, List<String> allSegments,
       List<String> segmentsToDelete, Set<String> consumingSegments);
+
+  /**
+   * Get segment zk metadata groups for MergeRollupTask
+   * @param tableConfig
+   * @param lineage
+   * @param segments
+   * @return segment zk metadata groups

Review Comment:
   This basically returns multiple Group/List of SegmentZkMetadata. Each group/list have the segments that should be merged together, segments from different groups should not be merged.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #10964: Allow custom segment grouping in MergeRollupTask based on lineage metadata

Posted by "sajjad-moradi (via GitHub)" <gi...@apache.org>.
sajjad-moradi commented on code in PR #10964:
URL: https://github.com/apache/pinot/pull/10964#discussion_r1242797215


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/lineage/DefaultLineageManager.java:
##########
@@ -105,6 +107,11 @@ public void updateLineageForRetention(TableConfig tableConfig, SegmentLineage li
     }
   }
 
+  @Override
+  public List<List<SegmentZKMetadata>> getSegmentGroupsForMergeRollupTask(TableConfig tableConfig,

Review Comment:
   +1
   This logic belongs to MergeRollup, not LineageManager. Let's create an interface like MergeRollupSegmentGroupManager, MergeRollupSegmentCategorizer, or something like these, and inject the interface into MergeRollupTaskGenerator. The interface takes tableConfig and segmentList as the input, and returns segment groups as output.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10964: Allow custom segment grouping in MergeRollupTask based on lineage metadata

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10964:
URL: https://github.com/apache/pinot/pull/10964#issuecomment-1603487400

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10964?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#10964](https://app.codecov.io/gh/apache/pinot/pull/10964?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (9b4b9ec) into [master](https://app.codecov.io/gh/apache/pinot/commit/24c5d8fe8521b91e68dd07294a3f57b067c0d525?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (24c5d8f) will **decrease** coverage by `0.12%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #10964      +/-   ##
   ==========================================
   - Coverage    0.11%    0.00%   -0.12%     
   ==========================================
     Files        2189     1654     -535     
     Lines      117995    86684   -31311     
     Branches    17854    13664    -4190     
   ==========================================
   - Hits          137        0     -137     
   + Misses     117838    86684   -31154     
   + Partials       20        0      -20     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `?` | |
   | integration1temurin17 | `?` | |
   | integration1temurin20 | `?` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin11 | `?` | |
   | unittests1temurin17 | `0.00% <ø> (ø)` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `?` | |
   | unittests2temurin17 | `?` | |
   | unittests2temurin20 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   [see 537 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/10964/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org