You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "KKcorps (via GitHub)" <gi...@apache.org> on 2023/03/24 19:31:21 UTC

[GitHub] [pinot] KKcorps opened a new pull request, #10473: WIP: Do not serialize metrics

KKcorps opened a new pull request, #10473:
URL: https://github.com/apache/pinot/pull/10473

   There's no need to serialize the metrics unless they are sent or received. We can simply use the OperatorStats objects for rest of the operators.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1152229000


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxReceiveOperator.java:
##########
@@ -215,8 +216,10 @@ protected TransferableBlock getNextBlock() {
                 return block;
               }
             } else {
-              if (!block.getResultMetadata().isEmpty()) {
-                _operatorStatsMap.putAll(block.getResultMetadata());
+              if (_opChainStats != null && !block.getResultMetadata().isEmpty()) {
+                for (Map.Entry<String, OperatorStats> entry : block.getResultMetadata().entrySet()) {
+                  _opChainStats.getOperatorStatsMap().compute(entry.getKey(), (_key, _value) -> entry.getValue());

Review Comment:
   Tracing thing is being taken care of in the next PR. Need to rebase that PR with the changes in this one.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1152226441


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -39,21 +35,18 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
 
   // TODO: Move to OperatorContext class.
   protected final OperatorStats _operatorStats;
-  protected final Map<String, OperatorStats> _operatorStatsMap;
-  private final String _operatorId;
+  protected final String _operatorId;
+  protected OpChainStats _opChainStats;

Review Comment:
   Ohh yes, making that change. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1153364284


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -39,21 +35,18 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
 
   // TODO: Move to OperatorContext class.
   protected final OperatorStats _operatorStats;

Review Comment:
   Yes, we still need it as OperatorStats contains stats only for a single operator.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps merged pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps merged PR #10473:
URL: https://github.com/apache/pinot/pull/10473


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1155060568


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -151,10 +152,20 @@ protected TransferableBlock getNextBlock() {
     try {
       transferableBlock = _dataTableBlockBaseOperator.nextBlock();
       while (!transferableBlock.isNoOpBlock()) {
-        _exchange.send(transferableBlock);
         if (transferableBlock.isEndOfStreamBlock()) {
-          return transferableBlock;
+          if (transferableBlock.isSuccessfulEndOfStreamBlock()) {
+            //Stats need to be populated here because the block is being sent to the mailbox
+            // and the receiving opChain will not be able to access the stats from the previous opChain
+            TransferableBlock eosBlockWithStats = TransferableBlockUtils.getEndOfStreamTransferableBlock(
+                OperatorUtils.getMetadataFromOperatorStats(_opChainStats.getOperatorStatsMap()));
+            _exchange.send(eosBlockWithStats);
+            return transferableBlock;

Review Comment:
   nit: return can be unified outside of the inner if-else loop



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1152227371


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -151,10 +152,18 @@ protected TransferableBlock getNextBlock() {
     try {
       transferableBlock = _dataTableBlockBaseOperator.nextBlock();
       while (!transferableBlock.isNoOpBlock()) {
-        _exchange.send(transferableBlock);
         if (transferableBlock.isEndOfStreamBlock()) {
-          return transferableBlock;
+          if (transferableBlock.isSuccessfulEndOfStreamBlock()) {
+            TransferableBlock eosBlockWithStats = TransferableBlockUtils.getEndOfStreamTransferableBlock(
+                OperatorUtils.getMetadataFromOperatorStats(_opChainStats.getOperatorStatsMap()));
+            _exchange.send(eosBlockWithStats);
+            return eosBlockWithStats;

Review Comment:
   So that it gets serialized and sent to next OpChain.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10473: WIP: Do not serialize metrics

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10473:
URL: https://github.com/apache/pinot/pull/10473#issuecomment-1483351517

   ## [Codecov](https://codecov.io/gh/apache/pinot/pull/10473?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#10473](https://codecov.io/gh/apache/pinot/pull/10473?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f6c882a) into [master](https://codecov.io/gh/apache/pinot/commit/3687b2b732be22dc59f6ba08f90bce9e8b4ee45b?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3687b2b) will **decrease** coverage by `49.26%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #10473       +/-   ##
   =============================================
   - Coverage     63.23%   13.98%   -49.26%     
   + Complexity     5905      237     -5668     
   =============================================
     Files          2036     2016       -20     
     Lines        110973   109582     -1391     
     Branches      16892    16777      -115     
   =============================================
   - Hits          70177    15321    -54856     
   - Misses        35606    93011    +57405     
   + Partials       5190     1250     -3940     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `13.98% <0.00%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/10473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...query/runtime/operator/MailboxReceiveOperator.java](https://codecov.io/gh/apache/pinot/pull/10473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9NYWlsYm94UmVjZWl2ZU9wZXJhdG9yLmphdmE=) | `0.00% <0.00%> (-92.60%)` | :arrow_down: |
   | [...ot/query/runtime/operator/MailboxSendOperator.java](https://codecov.io/gh/apache/pinot/pull/10473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9NYWlsYm94U2VuZE9wZXJhdG9yLmphdmE=) | `0.00% <0.00%> (-75.81%)` | :arrow_down: |
   | [...not/query/runtime/operator/MultiStageOperator.java](https://codecov.io/gh/apache/pinot/pull/10473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9NdWx0aVN0YWdlT3BlcmF0b3IuamF2YQ==) | `0.00% <0.00%> (-79.55%)` | :arrow_down: |
   
   ... and [1844 files with indirect coverage changes](https://codecov.io/gh/apache/pinot/pull/10473/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1153481107


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -151,10 +152,18 @@ protected TransferableBlock getNextBlock() {
     try {
       transferableBlock = _dataTableBlockBaseOperator.nextBlock();
       while (!transferableBlock.isNoOpBlock()) {
-        _exchange.send(transferableBlock);
         if (transferableBlock.isEndOfStreamBlock()) {
-          return transferableBlock;
+          if (transferableBlock.isSuccessfulEndOfStreamBlock()) {
+            TransferableBlock eosBlockWithStats = TransferableBlockUtils.getEndOfStreamTransferableBlock(
+                OperatorUtils.getMetadataFromOperatorStats(_opChainStats.getOperatorStatsMap()));
+            _exchange.send(eosBlockWithStats);
+            return eosBlockWithStats;

Review Comment:
   - `_exchange.send(eosBlockWithStats)` sends to the next opChain
   - `return eosBlockWithStats` is returning to the opChainScheduler. 
    
   do we process anything related to stats on opChainScheduler? if not returning the original transferableBlock seems more reasonable



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -39,21 +35,18 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
 
   // TODO: Move to OperatorContext class.
   protected final OperatorStats _operatorStats;

Review Comment:
   IIUC, there will be 0 or 1 OperatorStats object corresponding to this operator in the OpChainStats map. if so why do we keep strong reference to both objects in 2 classes? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] walterddr commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1152161485


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java:
##########
@@ -151,10 +152,18 @@ protected TransferableBlock getNextBlock() {
     try {
       transferableBlock = _dataTableBlockBaseOperator.nextBlock();
       while (!transferableBlock.isNoOpBlock()) {
-        _exchange.send(transferableBlock);
         if (transferableBlock.isEndOfStreamBlock()) {
-          return transferableBlock;
+          if (transferableBlock.isSuccessfulEndOfStreamBlock()) {
+            TransferableBlock eosBlockWithStats = TransferableBlockUtils.getEndOfStreamTransferableBlock(
+                OperatorUtils.getMetadataFromOperatorStats(_opChainStats.getOperatorStatsMap()));
+            _exchange.send(eosBlockWithStats);
+            return eosBlockWithStats;

Review Comment:
   why do we need to return this instead of just transferableBlock? we don't have any further processing of the stats right?



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -39,21 +35,18 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
 
   // TODO: Move to OperatorContext class.
   protected final OperatorStats _operatorStats;
-  protected final Map<String, OperatorStats> _operatorStatsMap;
-  private final String _operatorId;
+  protected final String _operatorId;
+  protected OpChainStats _opChainStats;

Review Comment:
   this should be final yes?



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxReceiveOperator.java:
##########
@@ -215,8 +216,10 @@ protected TransferableBlock getNextBlock() {
                 return block;
               }
             } else {
-              if (!block.getResultMetadata().isEmpty()) {
-                _operatorStatsMap.putAll(block.getResultMetadata());
+              if (_opChainStats != null && !block.getResultMetadata().isEmpty()) {
+                for (Map.Entry<String, OperatorStats> entry : block.getResultMetadata().entrySet()) {
+                  _opChainStats.getOperatorStatsMap().compute(entry.getKey(), (_key, _value) -> entry.getValue());

Review Comment:
   1. this is deserializing the metadata from received metadata from the previous stage and putting them into the current ones. if this is true. when there's no trace enabled, skip this step
   2. where is the opChain stats being encoded? or it is also in the map but when trace is enabled there will be more keys in the map?



##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -39,21 +35,18 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
 
   // TODO: Move to OperatorContext class.
   protected final OperatorStats _operatorStats;

Review Comment:
   do we still need this as final reference? can we also put this in OpChainStats and access via OpChainStats?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps commented on PR #10473:
URL: https://github.com/apache/pinot/pull/10473#issuecomment-1486894270

   @walterddr one query: Are you fine if I simply create OpChainStats inside OpChainExecutionContext and then use it everywhere instead of using `.attach` function introduced here.
   
   The only downside I see is context should be immutable while stats are mutable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] KKcorps commented on a diff in pull request #10473: Do not serialize metrics in each Operator

Posted by "KKcorps (via GitHub)" <gi...@apache.org>.
KKcorps commented on code in PR #10473:
URL: https://github.com/apache/pinot/pull/10473#discussion_r1154199486


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MultiStageOperator.java:
##########
@@ -39,21 +35,18 @@ public abstract class MultiStageOperator implements Operator<TransferableBlock>,
 
   // TODO: Move to OperatorContext class.
   protected final OperatorStats _operatorStats;

Review Comment:
   Cool. So now I have made the change to use directly via the map in OpChainStats rather than creating and storing a strong reference in  the base class.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org