You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "zhtaoxiang (via GitHub)" <gi...@apache.org> on 2023/02/14 09:19:48 UTC

[GitHub] [pinot] zhtaoxiang opened a new pull request, #10279: move segment existency check to preprocess

zhtaoxiang opened a new pull request, #10279:
URL: https://github.com/apache/pinot/pull/10279

   Similar to `preUploadSegments` and `postUploadSegments`, we put lineage related logic into `preprocess` so that child classes can override it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #10279: move segment existency check to preprocess

Posted by "zhtaoxiang (via GitHub)" <gi...@apache.org>.
zhtaoxiang commented on code in PR #10279:
URL: https://github.com/apache/pinot/pull/10279#discussion_r1106165042


##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java:
##########
@@ -107,14 +107,31 @@ protected abstract List<SegmentConversionResult> convert(PinotTaskConfig pinotTa
 
   /**
    * Pre processing operations to be done at the beginning of task execution
+   *
+   * The default implementation checks whether all segments to process exist in the table, if not, terminate early to
+   * avoid wasting computing resources.
    */
-  protected void preProcess(PinotTaskConfig pinotTaskConfig) {
+  protected void preProcess(PinotTaskConfig pinotTaskConfig) throws Exception {
+    Map<String, String> configs = pinotTaskConfig.getConfigs();
+    String tableNameWithType = configs.get(MinionConstants.TABLE_NAME_KEY);
+    String inputSegmentNames = configs.get(MinionConstants.SEGMENT_NAME_KEY);
+    String uploadURL = configs.get(MinionConstants.UPLOAD_URL_KEY);
+    AuthProvider authProvider = AuthProviderUtils.makeAuthProvider(configs.get(MinionConstants.AUTH_TOKEN));
+    Set<String> segmentNamesForTable = SegmentConversionUtils.getSegmentNamesForTable(tableNameWithType,
+        FileUploadDownloadClient.extractBaseURI(new URI(uploadURL)), authProvider);
+    Set<String> nonExistingSegmentNames =
+        new HashSet<>(Arrays.asList(inputSegmentNames.split(MinionConstants.SEGMENT_NAME_SEPARATOR)));
+    nonExistingSegmentNames.removeAll(segmentNamesForTable);
+    if (!CollectionUtils.isEmpty(nonExistingSegmentNames)) {
+      throw new RuntimeException(String.format("table: %s does have the following segments to process: %s",
+          tableNameWithType, nonExistingSegmentNames));
+    }
   }
 
   /**
    * Post processing operations to be done before exiting a successful task execution
    */
-  protected void postProcess(PinotTaskConfig pinotTaskConfig) {
+  protected void postProcess(PinotTaskConfig pinotTaskConfig) throws Exception {
   }

Review Comment:
   The purpose is to make it consistent with `preUploadSegmentsm`,  `postUploadSegments` and `preprocess`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee merged pull request #10279: move segment existency check to preprocess

Posted by "snleee (via GitHub)" <gi...@apache.org>.
snleee merged PR #10279:
URL: https://github.com/apache/pinot/pull/10279


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10279: move segment existency check to preprocess

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10279:
URL: https://github.com/apache/pinot/pull/10279#discussion_r1106160968


##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java:
##########
@@ -107,14 +107,31 @@ protected abstract List<SegmentConversionResult> convert(PinotTaskConfig pinotTa
 
   /**
    * Pre processing operations to be done at the beginning of task execution
+   *
+   * The default implementation checks whether all segments to process exist in the table, if not, terminate early to
+   * avoid wasting computing resources.
    */
-  protected void preProcess(PinotTaskConfig pinotTaskConfig) {
+  protected void preProcess(PinotTaskConfig pinotTaskConfig) throws Exception {
+    Map<String, String> configs = pinotTaskConfig.getConfigs();
+    String tableNameWithType = configs.get(MinionConstants.TABLE_NAME_KEY);
+    String inputSegmentNames = configs.get(MinionConstants.SEGMENT_NAME_KEY);
+    String uploadURL = configs.get(MinionConstants.UPLOAD_URL_KEY);
+    AuthProvider authProvider = AuthProviderUtils.makeAuthProvider(configs.get(MinionConstants.AUTH_TOKEN));
+    Set<String> segmentNamesForTable = SegmentConversionUtils.getSegmentNamesForTable(tableNameWithType,
+        FileUploadDownloadClient.extractBaseURI(new URI(uploadURL)), authProvider);
+    Set<String> nonExistingSegmentNames =
+        new HashSet<>(Arrays.asList(inputSegmentNames.split(MinionConstants.SEGMENT_NAME_SEPARATOR)));
+    nonExistingSegmentNames.removeAll(segmentNamesForTable);
+    if (!CollectionUtils.isEmpty(nonExistingSegmentNames)) {
+      throw new RuntimeException(String.format("table: %s does have the following segments to process: %s",
+          tableNameWithType, nonExistingSegmentNames));
+    }
   }
 
   /**
    * Post processing operations to be done before exiting a successful task execution
    */
-  protected void postProcess(PinotTaskConfig pinotTaskConfig) {
+  protected void postProcess(PinotTaskConfig pinotTaskConfig) throws Exception {
   }

Review Comment:
   is this change needed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10279: move segment existency check to preprocess

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10279:
URL: https://github.com/apache/pinot/pull/10279#issuecomment-1429463205

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#10279](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1b53302) into [master](https://codecov.io/gh/apache/pinot/commit/6caa8a038f328feed381511c382c8593e06138a0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6caa8a0) will **decrease** coverage by `32.14%`.
   > The diff coverage is `88.23%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #10279       +/-   ##
   =============================================
   - Coverage     64.23%   32.10%   -32.14%     
   + Complexity     5754      197     -5557     
   =============================================
     Files          1962     2015       +53     
     Lines        106865   109317     +2452     
     Branches      16324    16615      +291     
   =============================================
   - Hits          68641    35092    -33549     
   - Misses        33230    71091    +37861     
   + Partials       4994     3134     -1860     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `24.54% <88.23%> (?)` | |
   | unittests1 | `?` | |
   | unittests2 | `13.72% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../tasks/BaseMultipleSegmentsConversionExecutor.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcGx1Z2lucy9waW5vdC1taW5pb24tdGFza3MvcGlub3QtbWluaW9uLWJ1aWx0aW4tdGFza3Mvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3BsdWdpbi9taW5pb24vdGFza3MvQmFzZU11bHRpcGxlU2VnbWVudHNDb252ZXJzaW9uRXhlY3V0b3IuamF2YQ==) | `83.00% <88.23%> (+80.02%)` | :arrow_up: |
   | [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/table/FSTType.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL0ZTVFR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/user/RoleType.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3VzZXIvUm9sZVR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/data/MetricFieldSpec.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9NZXRyaWNGaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/stream/StreamMessage.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvc3RyZWFtL1N0cmVhbU1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/common/tier/TierFactory.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/pinot/spi/config/table/TableType.java](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL1RhYmxlVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1418 more](https://codecov.io/gh/apache/pinot/pull/10279?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org