You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "walterddr (via GitHub)" <gi...@apache.org> on 2023/05/12 17:23:53 UTC

[GitHub] [pinot] walterddr opened a new pull request, #10760: [multistage] follow up leaf scheduler

walterddr opened a new pull request, #10760:
URL: https://github.com/apache/pinot/pull/10760

   this is a follow up on comments on #10711


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10760: [multistage] follow up leaf scheduler

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10760:
URL: https://github.com/apache/pinot/pull/10760#discussion_r1193013446


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -128,21 +129,21 @@ protected TransferableBlock getNextBlock() {
     TransferableBlock transferableBlock;
     try {
       transferableBlock = _sourceOperator.nextBlock();
-      while (!transferableBlock.isNoOpBlock()) {
-        if (transferableBlock.isEndOfStreamBlock()) {
-          if (transferableBlock.isSuccessfulEndOfStreamBlock()) {
-            //Stats need to be populated here because the block is being sent to the mailbox
-            // and the receiving opChain will not be able to access the stats from the previous opChain
-            TransferableBlock eosBlockWithStats = TransferableBlockUtils.getEndOfStreamTransferableBlock(
-                OperatorUtils.getMetadataFromOperatorStats(_opChainStats.getOperatorStatsMap()));
-            _exchange.send(eosBlockWithStats);
-          } else {
-            _exchange.send(transferableBlock);
-          }
-          return transferableBlock;
+      if (transferableBlock.isNoOpBlock()) {
+        return transferableBlock;
+      } else if (transferableBlock.isEndOfStreamBlock()) {
+        if (transferableBlock.isSuccessfulEndOfStreamBlock()) {

Review Comment:
   yes you are correct. code-wise they are identical. here i just separate them to distinguish signal block vs. data block for readability.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #10760: [multistage] follow up leaf scheduler

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #10760:
URL: https://github.com/apache/pinot/pull/10760#discussion_r1193216333


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/executor/OpChainSchedulerService.java:
##########
@@ -48,20 +48,16 @@ public class OpChainSchedulerService extends AbstractExecutionThreadService {
    * Default cancel signal retention, this should be set to several times larger than
    * {@link org.apache.pinot.query.service.QueryConfig#DEFAULT_SCHEDULER_RELEASE_TIMEOUT_MS}.
    */
-  private static final long DEFAULT_SCHEDULER_CANCELLATION_SIGNAL_RETENTION_MS = 60_000L;
+  private static final long SCHEDULER_CANCELLATION_SIGNAL_RETENTION_MS = 60_000L;
 
   private final OpChainScheduler _scheduler;
   private final ExecutorService _workerPool;
-  private final Cache<Long, Long> _cancelledRequests;
+  private final Cache<Long, Long> _cancelledRequests = CacheBuilder.newBuilder()

Review Comment:
   Hmm, what's the behavior when you have a very long timeout query? Or we should limit it from both control/data flows



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10760: [multistage] follow up leaf scheduler

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10760:
URL: https://github.com/apache/pinot/pull/10760#issuecomment-1546148983

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#10760](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (51e9a06) into [master](https://app.codecov.io/gh/apache/pinot/commit/fe98bb0783213504c77be397b1750de3962000ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fe98bb0) will **decrease** coverage by `1.78%`.
   > The diff coverage is `100.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #10760      +/-   ##
   ============================================
   - Coverage     70.45%   68.68%   -1.78%     
   - Complexity     5646     6456     +810     
   ============================================
     Files          2153     2152       -1     
     Lines        115396   115376      -20     
     Branches      17368    17366       -2     
   ============================================
   - Hits          81305    79241    -2064     
   - Misses        28449    30553    +2104     
   + Partials       5642     5582      -60     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `24.21% <0.00%> (-0.04%)` | :arrow_down: |
   | integration2 | `?` | |
   | unittests1 | `68.06% <100.00%> (+<0.01%)` | :arrow_up: |
   | unittests2 | `13.74% <0.00%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...va/org/apache/pinot/query/runtime/QueryRunner.java](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9RdWVyeVJ1bm5lci5qYXZh) | `88.28% <100.00%> (+0.21%)` | :arrow_up: |
   | [...uery/runtime/executor/OpChainSchedulerService.java](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9leGVjdXRvci9PcENoYWluU2NoZWR1bGVyU2VydmljZS5qYXZh) | `98.36% <100.00%> (+3.19%)` | :arrow_up: |
   | [...ot/query/runtime/executor/RoundRobinScheduler.java](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9leGVjdXRvci9Sb3VuZFJvYmluU2NoZWR1bGVyLmphdmE=) | `89.02% <100.00%> (+2.43%)` | :arrow_up: |
   | [...ot/query/runtime/operator/MailboxSendOperator.java](https://app.codecov.io/gh/apache/pinot/pull/10760?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9NYWlsYm94U2VuZE9wZXJhdG9yLmphdmE=) | `87.93% <100.00%> (-0.21%)` | :arrow_down: |
   
   ... and [169 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/10760/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on a diff in pull request #10760: [multistage] follow up leaf scheduler

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on code in PR #10760:
URL: https://github.com/apache/pinot/pull/10760#discussion_r1192695138


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -128,21 +129,21 @@ protected TransferableBlock getNextBlock() {
     TransferableBlock transferableBlock;
     try {
       transferableBlock = _sourceOperator.nextBlock();
-      while (!transferableBlock.isNoOpBlock()) {
-        if (transferableBlock.isEndOfStreamBlock()) {
-          if (transferableBlock.isSuccessfulEndOfStreamBlock()) {
-            //Stats need to be populated here because the block is being sent to the mailbox
-            // and the receiving opChain will not be able to access the stats from the previous opChain
-            TransferableBlock eosBlockWithStats = TransferableBlockUtils.getEndOfStreamTransferableBlock(
-                OperatorUtils.getMetadataFromOperatorStats(_opChainStats.getOperatorStatsMap()));
-            _exchange.send(eosBlockWithStats);
-          } else {
-            _exchange.send(transferableBlock);
-          }
-          return transferableBlock;
+      if (transferableBlock.isNoOpBlock()) {
+        return transferableBlock;
+      } else if (transferableBlock.isEndOfStreamBlock()) {
+        if (transferableBlock.isSuccessfulEndOfStreamBlock()) {

Review Comment:
    seems like only `isEndOfStreamBlock` is special handling, other cases you always `return transferableBlock;`



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/executor/OpChainSchedulerService.java:
##########
@@ -48,20 +48,16 @@ public class OpChainSchedulerService extends AbstractExecutionThreadService {
    * Default cancel signal retention, this should be set to several times larger than
    * {@link org.apache.pinot.query.service.QueryConfig#DEFAULT_SCHEDULER_RELEASE_TIMEOUT_MS}.
    */
-  private static final long DEFAULT_SCHEDULER_CANCELLATION_SIGNAL_RETENTION_MS = 60_000L;
+  private static final long SCHEDULER_CANCELLATION_SIGNAL_RETENTION_MS = 60_000L;
 
   private final OpChainScheduler _scheduler;
   private final ExecutorService _workerPool;
-  private final Cache<Long, Long> _cancelledRequests;
+  private final Cache<Long, Long> _cancelledRequests = CacheBuilder.newBuilder()

Review Comment:
   Shall it still be tunable?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10760: [multistage] follow up leaf scheduler

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10760:
URL: https://github.com/apache/pinot/pull/10760#discussion_r1192630013


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -125,24 +126,25 @@ public String toExplainString() {
 
   @Override
   protected TransferableBlock getNextBlock() {
+    boolean canContinue = true;

Review Comment:
   ```suggestion
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10760: [multistage] follow up leaf scheduler

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10760:
URL: https://github.com/apache/pinot/pull/10760#discussion_r1193013316


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/executor/OpChainSchedulerService.java:
##########
@@ -48,20 +48,16 @@ public class OpChainSchedulerService extends AbstractExecutionThreadService {
    * Default cancel signal retention, this should be set to several times larger than
    * {@link org.apache.pinot.query.service.QueryConfig#DEFAULT_SCHEDULER_RELEASE_TIMEOUT_MS}.
    */
-  private static final long DEFAULT_SCHEDULER_CANCELLATION_SIGNAL_RETENTION_MS = 60_000L;
+  private static final long SCHEDULER_CANCELLATION_SIGNAL_RETENTION_MS = 60_000L;
 
   private final OpChainScheduler _scheduler;
   private final ExecutorService _workerPool;
-  private final Cache<Long, Long> _cancelledRequests;
+  private final Cache<Long, Long> _cancelledRequests = CacheBuilder.newBuilder()

Review Comment:
   shouldn't be a problem. cancellation signal are bounded by the control flow not the mailbox data flow, which has a much smaller timeout



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10760: [multistage] follow up leaf scheduler

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10760:
URL: https://github.com/apache/pinot/pull/10760#discussion_r1192630337


##########
pinot-query-runtime/src/test/java/org/apache/pinot/query/runtime/operator/MailboxSendOperatorTest.java:
##########
@@ -150,6 +150,12 @@ public void shouldSendDataBlock()
     MailboxSendOperator mailboxSendOperator = getMailboxSendOperator();
     TransferableBlock block = mailboxSendOperator.nextBlock();
 
+    // Then:
+    assertSame(block, dataBlock, "expected EOS block to propagate");

Review Comment:
   ```suggestion
       assertSame(block, dataBlock, "expected data block to first return");
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr merged pull request #10760: [multistage] follow up leaf scheduler

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr merged PR #10760:
URL: https://github.com/apache/pinot/pull/10760


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org