You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/11/02 07:46:18 UTC

[GitHub] [pinot] wirybeaver opened a new pull request, #9710: download offline segments from peers when segment streaming download …

wirybeaver opened a new pull request, #9710:
URL: https://github.com/apache/pinot/pull/9710

   [Design doc](https://docs.google.com/document/d/1HtU8NsYz84NiAKGh16Yv8WKxyhdXHx-KNsINcWZ7MEI/edit?usp=sharing)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jadami10 commented on a diff in pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
jadami10 commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1024166478


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/SegmentFetcherFactory.java:
##########
@@ -196,4 +206,21 @@ private void fetchAndDecryptSegmentToLocalInternal(String uri, File dest, String
       crypter.decrypt(tempDownloadedFile, dest);
     }
   }
+
+  private void fetchAndDecryptSegmentToLocalInternal(@NonNull List<URI> uris, File dest, String crypterName)
+          throws Exception {
+    Preconditions.checkArgument(!uris.isEmpty(), "empty uris passed into the fetchAndDecryptSegmentToLocalInternal");
+    URI uri = uris.get(RANDOM.nextInt(uris.size()));

Review Comment:
   these are all the peer URIs, and you're randomly selecting one? If so, this feels like it should just live in the calling function to this. No need to duplicate the `fetch` functions here just to get a random selector in



##########
pinot-core/pom.xml:
##########
@@ -218,7 +218,7 @@
     </dependency>
     <dependency>
       <groupId>org.mockito</groupId>
-      <artifactId>mockito-core</artifactId>
+      <artifactId>mockito-inline</artifactId>

Review Comment:
   was there something you needed from this that core couldn't achieve?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat commented on a diff in pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1014529974


##########
pinot-spi/src/main/java/org/apache/pinot/spi/config/instance/InstanceDataManagerConfig.java:
##########
@@ -66,4 +66,6 @@ public interface InstanceDataManagerConfig {
   int getDeletedSegmentsCacheSize();
 
   int getDeletedSegmentsCacheTtlMinutes();
+
+  String getPeerDownloadScheme();

Review Comment:
   Nit: getSegmentPeerDownloadScheme()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1317854252

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/9710?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#9710](https://codecov.io/gh/apache/pinot/pull/9710?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7685439) into [master](https://codecov.io/gh/apache/pinot/commit/cd1a017debf95e088881b191319c521b2a3a4296?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cd1a017) will **decrease** coverage by `29.45%`.
   > The diff coverage is `13.04%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #9710       +/-   ##
   =============================================
   - Coverage     64.11%   34.66%   -29.46%     
   + Complexity     5385      190     -5195     
   =============================================
     Files          1903     1956       +53     
     Lines        102554   105019     +2465     
     Branches      15604    15897      +293     
   =============================================
   - Hits          65753    36402    -29351     
   - Misses        32017    65503    +33486     
   + Partials       4784     3114     -1670     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `25.25% <13.04%> (?)` | |
   | unittests1 | `?` | |
   | unittests2 | `15.68% <0.00%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ent/local/data/manager/TableDataManagerConfig.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9kYXRhL21hbmFnZXIvVGFibGVEYXRhTWFuYWdlckNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../pinot/core/data/manager/BaseTableDataManager.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvQmFzZVRhYmxlRGF0YU1hbmFnZXIuamF2YQ==) | `58.10% <7.69%> (-17.35%)` | :arrow_down: |
   | [...ot/common/utils/fetcher/SegmentFetcherFactory.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvZmV0Y2hlci9TZWdtZW50RmV0Y2hlckZhY3RvcnkuamF2YQ==) | `39.72% <8.33%> (-53.72%)` | :arrow_down: |
   | [...a/org/apache/pinot/common/metrics/ServerMeter.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9TZXJ2ZXJNZXRlci5qYXZh) | `100.00% <100.00%> (+1.51%)` | :arrow_up: |
   | [.../starter/helix/HelixInstanceDataManagerConfig.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zZXJ2ZXIvc3RhcnRlci9oZWxpeC9IZWxpeEluc3RhbmNlRGF0YU1hbmFnZXJDb25maWcuamF2YQ==) | `81.63% <100.00%> (ø)` | |
   | [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/table/FSTType.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL0ZTVFR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/user/RoleType.java](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3VzZXIvUm9sZVR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1297 more](https://codecov.io/gh/apache/pinot/pull/9710/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat commented on a diff in pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1014527773


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/SegmentFetcherFactory.java:
##########
@@ -20,16 +20,16 @@
 
 import java.io.File;
 import java.net.URI;
-import java.util.Collections;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
+import java.util.*;

Review Comment:
   Please revert this. It is probably due to your IDE setting.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on a diff in pull request #9710: Download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1028564167


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/SegmentFetcherFactory.java:
##########
@@ -196,4 +206,21 @@ private void fetchAndDecryptSegmentToLocalInternal(String uri, File dest, String
       crypter.decrypt(tempDownloadedFile, dest);
     }
   }
+
+  private void fetchAndDecryptSegmentToLocalInternal(@NonNull List<URI> uris, File dest, String crypterName)
+          throws Exception {
+    Preconditions.checkArgument(!uris.isEmpty(), "empty uris passed into the fetchAndDecryptSegmentToLocalInternal");
+    URI uri = uris.get(RANDOM.nextInt(uris.size()));

Review Comment:
   For this PR, I won't integrate the strategy pattern.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on a diff in pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1024598441


##########
pinot-core/pom.xml:
##########
@@ -218,7 +218,7 @@
     </dependency>
     <dependency>
       <groupId>org.mockito</groupId>
-      <artifactId>mockito-core</artifactId>
+      <artifactId>mockito-inline</artifactId>

Review Comment:
   Test static function. Some other module also replace core with inline. If I remember correctly, pinot-common has a detailed comment in the pom



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat commented on a diff in pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1014531058


##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -532,19 +535,44 @@ File downloadAndDecrypt(String segmentName, SegmentZKMetadata zkMetadata, File t
       SegmentFetcherFactory.fetchAndDecryptSegmentToLocal(uri, tarFile, zkMetadata.getCrypterName());
       LOGGER.info("Downloaded tarred segment: {} for table: {} from: {} to: {}, file length: {}", segmentName,
           _tableNameWithType, uri, tarFile, tarFile.length());
+      downloadSuccess = true;
       return tarFile;
     } catch (AttemptsExceededException e) {
       LOGGER.error("Attempts exceeded when downloading segment: {} for table: {} from: {} to: {}", segmentName,
           _tableNameWithType, uri, tarFile);
-      _serverMetrics.addMeteredTableValue(_tableNameWithType, ServerMeter.SEGMENT_DOWNLOAD_FAILURES, 1L);
-      throw e;
+      if (_tableDataManagerConfig.getTablePeerDownloadScheme() == null) {

Review Comment:
   Can you add a unit test to verify the new behavior?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1313208273

   rebase and solve conflict


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1308307171

   > this is awesome.
   > 
   > How are you thinking about observability here? If you didn't expect the deep store data to be missing, but servers are just silently downloading segments from one another, how would you know? `Attempts exceeded when downloading segment` seems to be the only indicator. Maybe we need 2 new metrics for "download failed from deep store" vs "download failed from peer" and keep the current "download failed" as is to be used if both fail for backwards compatibility?
   
   Done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on a diff in pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1024597451


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/SegmentFetcherFactory.java:
##########
@@ -196,4 +206,21 @@ private void fetchAndDecryptSegmentToLocalInternal(String uri, File dest, String
       crypter.decrypt(tempDownloadedFile, dest);
     }
   }
+
+  private void fetchAndDecryptSegmentToLocalInternal(@NonNull List<URI> uris, File dest, String crypterName)
+          throws Exception {
+    Preconditions.checkArgument(!uris.isEmpty(), "empty uris passed into the fetchAndDecryptSegmentToLocalInternal");
+    URI uri = uris.get(RANDOM.nextInt(uris.size()));

Review Comment:
   I am considering whether we should introduces the strategy pattern here: user can either choose to pick up a random or go over all peers. The trade of between responsiveness and data reliability can be determined by user. For the sake of time, I hide the random implementation inside this function. In the future, I think the function signature would be
   ```
   private void fetchAndDecryptSegmentToLocalInternal(@NonNull List<URI> uris, File dest, String crypterName, PeerDownloadderStrategy strategy);
   
   interface PeerDownloaderStrategy {
   Response download(List<URI> uris, File dest, Context);
   }
   
   class Context {
    String crypterName;
   }
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jadami10 commented on pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
jadami10 commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1317204756

   > A newbie question, the [guide](https://docs.pinot.apache.org/developers/developers-and-contributors/contribution-guidelines#creating-a-pull-request-pr) recommend to use git rebase. AFAIK, all commits will be squash to a single commit and thus can guarantee that make the commit history linear when the PR got merged. If so, shall we use git merge when pull the remote master change? Otherwise, I need to force overriding every time push after git rebase. @jadami10
   
   With github, I generally use `git merge` when updating PRs because it keeps the change history. When you rebase/force push, the reviewer loses the "see changes since your last review" button. It's not a big deal, just a quality of life change.
   
   If you've ever used something like phabricator, it does a better job of decoupling changes from their actual commits, so even if someone rebases/force pushes, it can show you just the changes that user made.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on pull request #9710: Download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1322719983

   @chenboat I don't integrate the instance level config into realtime table in this approach, which might bring the rule of determining the value of final peer download scheme obscure. Check the PR description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1317791484

   > > A newbie question, the [guide](https://docs.pinot.apache.org/developers/developers-and-contributors/contribution-guidelines#creating-a-pull-request-pr) recommend to use git rebase. AFAIK, all commits will be squash to a single commit and thus can guarantee that make the commit history linear when the PR got merged. If so, shall we use git merge when pull the remote master change? Otherwise, I need to force overriding every time push after git rebase. @jadami10
   > 
   > With github, I generally use `git merge` when updating PRs because it keeps the change history. When you rebase/force push, the reviewer loses the "see changes since your last review" button. It's not a big deal, just a quality of life change.
   > 
   > If you've ever used something like phabricator, it does a better job of decoupling changes from their actual commits, so even if someone rebases/force pushes, it can show you just the changes that user made.
   
   Thanks. Totally agree with you I use github before joining uber. I am meant to use `get merge` but the official Pinot OSS developer guide uses `git rebase` at the bottom when pushing to the remote branch, which made me confused. We need to update the instruction. https://github.com/apache/pinot/pull/9710


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on a diff in pull request #9710: Download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1028564167


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/SegmentFetcherFactory.java:
##########
@@ -196,4 +206,21 @@ private void fetchAndDecryptSegmentToLocalInternal(String uri, File dest, String
       crypter.decrypt(tempDownloadedFile, dest);
     }
   }
+
+  private void fetchAndDecryptSegmentToLocalInternal(@NonNull List<URI> uris, File dest, String crypterName)
+          throws Exception {
+    Preconditions.checkArgument(!uris.isEmpty(), "empty uris passed into the fetchAndDecryptSegmentToLocalInternal");
+    URI uri = uris.get(RANDOM.nextInt(uris.size()));

Review Comment:
   For this PR, I won't integrate the strategy pattern and directly use the random pick strategy by default. What do you think? @jadami10 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat merged pull request #9710: Download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat merged PR #9710:
URL: https://github.com/apache/pinot/pull/9710


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat commented on a diff in pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1024598026


##########
pinot-core/src/test/java/org/apache/pinot/core/data/manager/BaseTableDataManagerTest.java:
##########
@@ -614,6 +622,61 @@ public void testDownloadAndDecrypt()
     }
   }
 
+  // case 2: if the attempt to download from deep storage exceeds, invoke downloadFromPeers.
+  @Test
+  public void testDownloadAndDecryptCase2() throws Exception {
+    File tempInput = new File(TEMP_DIR, "tmp.txt");
+    FileUtils.write(tempInput, "this is from somewhere remote");
+
+    SegmentZKMetadata zkmd = mock(SegmentZKMetadata.class);
+    String backupCopyURI = "file://" + tempInput.getAbsolutePath();
+    when(zkmd.getDownloadUrl()).thenReturn(backupCopyURI);
+
+    TableDataManagerConfig config = createDefaultTableDataManagerConfig();
+    when(config.getTablePeerDownloadScheme()).thenReturn("http");
+    BaseTableDataManager tmgr = createSpyOfflineTableManager(config);
+    File tempRootDir = tmgr.getTmpSegmentDataDir("test-download-decrypt-peer");
+
+    // As the case 2 description says, we need to mock the static method fetchAndDecryptSegmentToLocal to
+    // throw the AttemptExceed exception; Due to the constrain that mockito static cannot do argument matching,

Review Comment:
   s/constrain/constraint/g



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat commented on a diff in pull request #9710: [WIP] download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1024601467


##########
pinot-core/src/test/java/org/apache/pinot/core/data/manager/BaseTableDataManagerTest.java:
##########
@@ -614,6 +622,61 @@ public void testDownloadAndDecrypt()
     }
   }
 
+  // case 2: if the attempt to download from deep storage exceeds, invoke downloadFromPeers.
+  @Test
+  public void testDownloadAndDecryptCase2() throws Exception {

Review Comment:
   NIT: instead of Case2, be more specific in the test name: it is about downloading from peers. 



##########
pinot-core/src/test/java/org/apache/pinot/core/data/manager/BaseTableDataManagerTest.java:
##########
@@ -614,6 +622,61 @@ public void testDownloadAndDecrypt()
     }
   }
 
+  // case 2: if the attempt to download from deep storage exceeds, invoke downloadFromPeers.
+  @Test
+  public void testDownloadAndDecryptCase2() throws Exception {
+    File tempInput = new File(TEMP_DIR, "tmp.txt");
+    FileUtils.write(tempInput, "this is from somewhere remote");
+
+    SegmentZKMetadata zkmd = mock(SegmentZKMetadata.class);
+    String backupCopyURI = "file://" + tempInput.getAbsolutePath();
+    when(zkmd.getDownloadUrl()).thenReturn(backupCopyURI);
+
+    TableDataManagerConfig config = createDefaultTableDataManagerConfig();
+    when(config.getTablePeerDownloadScheme()).thenReturn("http");
+    BaseTableDataManager tmgr = createSpyOfflineTableManager(config);
+    File tempRootDir = tmgr.getTmpSegmentDataDir("test-download-decrypt-peer");
+
+    // As the case 2 description says, we need to mock the static method fetchAndDecryptSegmentToLocal to
+    // throw the AttemptExceed exception; Due to the constrain that mockito static cannot do argument matching,
+    // e.g., any(), we have to pass exact argument value when mocking fetchAndDecryptSegmentToLocal.
+    // However, the first argument File is internally created, which cannot be mocked.
+    // Luckily, the File class's equal method only compares the path. Thus, we can create a file with identical path
+    // and use it to mock the fetchAndDecryptSegmentToLocal
+    File destFile = new File(tempRootDir, "seg01" + TarGzCompressionUtils.TAR_GZ_FILE_EXTENSION);
+    doNothing().when(tmgr).downloadFromPeersWithoutStreaming("seg01", zkmd, destFile);
+    try (MockedStatic<SegmentFetcherFactory> mockSegFactory = mockStatic(SegmentFetcherFactory.class)) {
+      mockSegFactory.when(() -> SegmentFetcherFactory.fetchAndDecryptSegmentToLocal(backupCopyURI, destFile, null))
+          .thenThrow(new AttemptsExceededException("fake attempt exceeds exception"));
+      tmgr.downloadAndDecrypt("seg01", zkmd, tempRootDir);
+    }
+    verify(tmgr, times(1)).downloadFromPeersWithoutStreaming("seg01", zkmd, destFile);
+  }
+
+  // happy case: down
+  @Test
+  public void testdownloadFromPeersWithoutStreaming() throws Exception {

Review Comment:
   Nit: testDownload...



##########
pinot-core/src/test/java/org/apache/pinot/core/data/manager/BaseTableDataManagerTest.java:
##########
@@ -614,6 +622,61 @@ public void testDownloadAndDecrypt()
     }
   }
 
+  // case 2: if the attempt to download from deep storage exceeds, invoke downloadFromPeers.
+  @Test
+  public void testDownloadAndDecryptCase2() throws Exception {
+    File tempInput = new File(TEMP_DIR, "tmp.txt");
+    FileUtils.write(tempInput, "this is from somewhere remote");
+
+    SegmentZKMetadata zkmd = mock(SegmentZKMetadata.class);
+    String backupCopyURI = "file://" + tempInput.getAbsolutePath();
+    when(zkmd.getDownloadUrl()).thenReturn(backupCopyURI);
+
+    TableDataManagerConfig config = createDefaultTableDataManagerConfig();
+    when(config.getTablePeerDownloadScheme()).thenReturn("http");
+    BaseTableDataManager tmgr = createSpyOfflineTableManager(config);
+    File tempRootDir = tmgr.getTmpSegmentDataDir("test-download-decrypt-peer");
+
+    // As the case 2 description says, we need to mock the static method fetchAndDecryptSegmentToLocal to
+    // throw the AttemptExceed exception; Due to the constrain that mockito static cannot do argument matching,
+    // e.g., any(), we have to pass exact argument value when mocking fetchAndDecryptSegmentToLocal.
+    // However, the first argument File is internally created, which cannot be mocked.
+    // Luckily, the File class's equal method only compares the path. Thus, we can create a file with identical path
+    // and use it to mock the fetchAndDecryptSegmentToLocal
+    File destFile = new File(tempRootDir, "seg01" + TarGzCompressionUtils.TAR_GZ_FILE_EXTENSION);
+    doNothing().when(tmgr).downloadFromPeersWithoutStreaming("seg01", zkmd, destFile);
+    try (MockedStatic<SegmentFetcherFactory> mockSegFactory = mockStatic(SegmentFetcherFactory.class)) {
+      mockSegFactory.when(() -> SegmentFetcherFactory.fetchAndDecryptSegmentToLocal(backupCopyURI, destFile, null))
+          .thenThrow(new AttemptsExceededException("fake attempt exceeds exception"));
+      tmgr.downloadAndDecrypt("seg01", zkmd, tempRootDir);
+    }
+    verify(tmgr, times(1)).downloadFromPeersWithoutStreaming("seg01", zkmd, destFile);
+  }
+
+  // happy case: down

Review Comment:
   download



##########
pinot-core/src/test/java/org/apache/pinot/core/data/manager/BaseTableDataManagerTest.java:
##########
@@ -614,6 +622,61 @@ public void testDownloadAndDecrypt()
     }
   }
 
+  // case 2: if the attempt to download from deep storage exceeds, invoke downloadFromPeers.
+  @Test
+  public void testDownloadAndDecryptCase2() throws Exception {
+    File tempInput = new File(TEMP_DIR, "tmp.txt");
+    FileUtils.write(tempInput, "this is from somewhere remote");
+
+    SegmentZKMetadata zkmd = mock(SegmentZKMetadata.class);
+    String backupCopyURI = "file://" + tempInput.getAbsolutePath();
+    when(zkmd.getDownloadUrl()).thenReturn(backupCopyURI);
+
+    TableDataManagerConfig config = createDefaultTableDataManagerConfig();
+    when(config.getTablePeerDownloadScheme()).thenReturn("http");
+    BaseTableDataManager tmgr = createSpyOfflineTableManager(config);
+    File tempRootDir = tmgr.getTmpSegmentDataDir("test-download-decrypt-peer");
+
+    // As the case 2 description says, we need to mock the static method fetchAndDecryptSegmentToLocal to
+    // throw the AttemptExceed exception; Due to the constrain that mockito static cannot do argument matching,
+    // e.g., any(), we have to pass exact argument value when mocking fetchAndDecryptSegmentToLocal.
+    // However, the first argument File is internally created, which cannot be mocked.
+    // Luckily, the File class's equal method only compares the path. Thus, we can create a file with identical path
+    // and use it to mock the fetchAndDecryptSegmentToLocal
+    File destFile = new File(tempRootDir, "seg01" + TarGzCompressionUtils.TAR_GZ_FILE_EXTENSION);
+    doNothing().when(tmgr).downloadFromPeersWithoutStreaming("seg01", zkmd, destFile);
+    try (MockedStatic<SegmentFetcherFactory> mockSegFactory = mockStatic(SegmentFetcherFactory.class)) {
+      mockSegFactory.when(() -> SegmentFetcherFactory.fetchAndDecryptSegmentToLocal(backupCopyURI, destFile, null))
+          .thenThrow(new AttemptsExceededException("fake attempt exceeds exception"));
+      tmgr.downloadAndDecrypt("seg01", zkmd, tempRootDir);
+    }
+    verify(tmgr, times(1)).downloadFromPeersWithoutStreaming("seg01", zkmd, destFile);
+  }
+
+  // happy case: down
+  @Test
+  public void testdownloadFromPeersWithoutStreaming() throws Exception {
+    File tempInput = new File(TEMP_DIR, "tmp.txt");

Review Comment:
   A lot of test codes are shared between the two new test methods. Can we refactor them? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jadami10 commented on pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
jadami10 commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1300655618

   this is awesome. 
   
   How are you thinking about observability here? If you didn't expect the deep store data to be missing, but servers are just silently downloading segments from one another, how would you know? `Attempts exceeded when downloading segment` seems to be the only indicator. Maybe we need 2 new metrics for "download failed from deep store" vs "download failed from peer" and keep the current "download failed" as is to be used if both fail for backwards compatibility?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] chenboat commented on a diff in pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
chenboat commented on code in PR #9710:
URL: https://github.com/apache/pinot/pull/9710#discussion_r1014528041


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/fetcher/SegmentFetcherFactory.java:
##########
@@ -182,6 +183,12 @@ public static void fetchAndDecryptSegmentToLocal(String uri, File dest, String c
     getInstance().fetchAndDecryptSegmentToLocalInternal(uri, dest, crypterName);
   }
 
+  // uris have equal status

Review Comment:
   The comment lacks details: do you mean each uri in the list has equal prob being selected for segment download?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] wirybeaver commented on pull request #9710: download offline segments from peers

Posted by GitBox <gi...@apache.org>.
wirybeaver commented on PR #9710:
URL: https://github.com/apache/pinot/pull/9710#issuecomment-1300786438

   > this is awesome.
   > 
   > How are you thinking about observability here? If you didn't expect the deep store data to be missing, but servers are just silently downloading segments from one another, how would you know? `Attempts exceeded when downloading segment` seems to be the only indicator. Maybe we need 2 new metrics for "download failed from deep store" vs "download failed from peer" and keep the current "download failed" as is to be used if both fail for backwards compatibility?
   
   Thanks for pointing it out. Will do.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org