You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "walterddr (via GitHub)" <gi...@apache.org> on 2023/02/01 00:56:18 UTC
[GitHub] [pinot] walterddr opened a new pull request, #10211: [multistage] add singleton instance stage
walterddr opened a new pull request, #10211:
URL: https://github.com/apache/pinot/pull/10211
some of the operators in v2 engine are actually singleton instance stages such as
- global agg ` SELECT COUNT(*) FROM tbl` or
- global sort `SELECT col FROM tbl ORDER BY col`
there's no reason to run them on multiple instances which we currently do. the current implementation works b/c
- all but one instances is actually processing data
- all remaining instances will process pass-through end-of-stream blocks and thus contribute nothing to the final results.
This PR allows these operator to run on a random singleton server
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] codecov-commenter commented on pull request #10211: [multistage] add singleton instance stage
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10211:
URL: https://github.com/apache/pinot/pull/10211#issuecomment-1411334620
# [Codecov](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#10211](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (96039a4) into [master](https://codecov.io/gh/apache/pinot/commit/36e741878715fffbd51ef918036642f46b16fd0c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (36e7418) will **decrease** coverage by `37.67%`.
> The diff coverage is `0.00%`.
```diff
@@ Coverage Diff @@
## master #10211 +/- ##
=============================================
- Coverage 70.41% 32.74% -37.67%
+ Complexity 5752 202 -5550
=============================================
Files 2017 2017
Lines 108978 108994 +16
Branches 16557 16564 +7
=============================================
- Hits 76738 35695 -41043
- Misses 26847 70158 +43311
+ Partials 5393 3141 -2252
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `?` | |
| integration2 | `24.45% <0.00%> (+0.13%)` | :arrow_up: |
| unittests1 | `?` | |
| unittests2 | `13.71% <0.00%> (+<0.01%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...a/org/apache/pinot/query/planner/PlannerUtils.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9QbGFubmVyVXRpbHMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/pinot/query/planner/StageMetadata.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9TdGFnZU1ldGFkYXRhLmphdmE=) | `0.00% <0.00%> (-95.00%)` | :arrow_down: |
| [.../org/apache/pinot/query/routing/WorkerManager.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcm91dGluZy9Xb3JrZXJNYW5hZ2VyLmphdmE=) | `0.00% <0.00%> (-85.51%)` | :arrow_down: |
| [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ava/org/apache/pinot/spi/config/table/FSTType.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL0ZTVFR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ava/org/apache/pinot/spi/config/user/RoleType.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3VzZXIvUm9sZVR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ava/org/apache/pinot/spi/data/MetricFieldSpec.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9NZXRyaWNGaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ava/org/apache/pinot/spi/stream/StreamMessage.java](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvc3RyZWFtL1N0cmVhbU1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [1249 more](https://codecov.io/gh/apache/pinot/pull/10211?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
:mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10211: [multistage] add singleton instance stage
Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10211:
URL: https://github.com/apache/pinot/pull/10211#discussion_r1093773561
##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinExchangeNodeInsertRule.java:
##########
@@ -65,7 +65,7 @@ public void onMatch(RelOptRuleCall call) {
if (joinInfo.leftKeys.isEmpty()) {
// when there's no JOIN key, use broadcast.
- leftExchange = LogicalExchange.create(leftInput, RelDistributions.SINGLETON);
+ leftExchange = LogicalExchange.create(leftInput, RelDistributions.RANDOM_DISTRIBUTED);
rightExchange = LogicalExchange.create(rightInput, RelDistributions.BROADCAST_DISTRIBUTED);
Review Comment:
Not sure I follow this change. If we are doing a BROADCAST_EXCHANGE on one side, why do we need to insert any EXCHANGE on the other side ? May be I am missing something.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10211: [multistage] add singleton instance stage
Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10211:
URL: https://github.com/apache/pinot/pull/10211#discussion_r1093893913
##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinExchangeNodeInsertRule.java:
##########
@@ -65,7 +65,7 @@ public void onMatch(RelOptRuleCall call) {
if (joinInfo.leftKeys.isEmpty()) {
// when there's no JOIN key, use broadcast.
- leftExchange = LogicalExchange.create(leftInput, RelDistributions.SINGLETON);
+ leftExchange = LogicalExchange.create(leftInput, RelDistributions.RANDOM_DISTRIBUTED);
rightExchange = LogicalExchange.create(rightInput, RelDistributions.BROADCAST_DISTRIBUTED);
Review Comment:
Discussed offline with @walterddr. This is sort of faking an exchange.
My question was from the perspective that left side does not need data movement
```
Leaf stage - scan -> filter -> project right side. NO-OP for left side
BROADCAST_EXCHANGE right side
Intermediary stage to run JOIN (can be run on same set of servers where no data movement needed for left side)
```
Since the intermediary stage needs to run the new operators and there is no mechanism to wait currently, we need to do it this way.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10211: [multistage] add singleton instance stage
Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10211:
URL: https://github.com/apache/pinot/pull/10211#discussion_r1093893913
##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinExchangeNodeInsertRule.java:
##########
@@ -65,7 +65,7 @@ public void onMatch(RelOptRuleCall call) {
if (joinInfo.leftKeys.isEmpty()) {
// when there's no JOIN key, use broadcast.
- leftExchange = LogicalExchange.create(leftInput, RelDistributions.SINGLETON);
+ leftExchange = LogicalExchange.create(leftInput, RelDistributions.RANDOM_DISTRIBUTED);
rightExchange = LogicalExchange.create(rightInput, RelDistributions.BROADCAST_DISTRIBUTED);
Review Comment:
Discussed offline with @walterddr. This is sort of faking an exchange.
My question was from the perspective that left side does not need data movement
```
Leaf stage - scan -> filter -> project right side
BROADCAST_EXCHANGE right side
Intermediary stage to run JOIN (can be run on same set of servers where no data movement needed for left side)
```
Since the intermediary stage needs to run the new operators and there is no mechanism to wait currently, we need to do it this way.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] siddharthteotia merged pull request #10211: [multistage] add singleton instance stage
Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia merged PR #10211:
URL: https://github.com/apache/pinot/pull/10211
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10211: [multistage] add singleton instance stage
Posted by "siddharthteotia (via GitHub)" <gi...@apache.org>.
siddharthteotia commented on code in PR #10211:
URL: https://github.com/apache/pinot/pull/10211#discussion_r1093893913
##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinExchangeNodeInsertRule.java:
##########
@@ -65,7 +65,7 @@ public void onMatch(RelOptRuleCall call) {
if (joinInfo.leftKeys.isEmpty()) {
// when there's no JOIN key, use broadcast.
- leftExchange = LogicalExchange.create(leftInput, RelDistributions.SINGLETON);
+ leftExchange = LogicalExchange.create(leftInput, RelDistributions.RANDOM_DISTRIBUTED);
rightExchange = LogicalExchange.create(rightInput, RelDistributions.BROADCAST_DISTRIBUTED);
Review Comment:
Discussed offline with @walterddr. This is sort of faking an exchange.
My question was from the perspective that left side does not need data movement
```
Leaf stage - scan -> filter -> project right side. NO-OP for left side
BROADCAST_EXCHANGE right side
Intermediary stage to run JOIN -- can be run on same set of servers where no
data movement needed for left side
```
Since the intermediary stage needs to run the new operators and there is no mechanism to wait currently, we need to do it this way.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] walterddr commented on a diff in pull request #10211: [multistage] add singleton instance stage
Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr commented on code in PR #10211:
URL: https://github.com/apache/pinot/pull/10211#discussion_r1093776745
##########
pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotJoinExchangeNodeInsertRule.java:
##########
@@ -65,7 +65,7 @@ public void onMatch(RelOptRuleCall call) {
if (joinInfo.leftKeys.isEmpty()) {
// when there's no JOIN key, use broadcast.
- leftExchange = LogicalExchange.create(leftInput, RelDistributions.SINGLETON);
+ leftExchange = LogicalExchange.create(leftInput, RelDistributions.RANDOM_DISTRIBUTED);
rightExchange = LogicalExchange.create(rightInput, RelDistributions.BROADCAST_DISTRIBUTED);
Review Comment:
it is the least restrictive exchange needed --> e.g. if right side has already broadcast the data; left side can be arbitrarily exchange its data to join stage.
- previously we can allow singleton here b/c we want to optimize local-to-local transfer without ser/de or grpc hopping;
- however with the new singleton stage, we might not be able to apply this optimization thus we relax the exchange to random
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org