Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2023/01/16 07:43:54 UTC

[GitHub] [doris] XieJiann opened a new pull request, #15965: [fix](Nereids): always propose hashDistributeSpec of OlapScan

XieJiann opened a new pull request, #15965:
URL: https://github.com/apache/doris/pull/15965

   Signed-off-by: xiejiann <ji...@gmail.com>
   
   # Proposed changes
   
   Always propose a hashDistributeSpec for OlapScan.
   
   ## Problem summary
   
   When calculating the DistributionSpec of an OlapScan, we don't need to consider colocate join, because that can be handled later.
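
The proposal above can be sketched as follows. This is a minimal illustration, not the code from the pull request: `SpecSketch` and the parameters of `propose` are hypothetical stand-ins (plain strings for column names, integers for expression ids) rather than Doris classes, and it assumes that schema columns and output slots are positionally aligned, as the loop in the diff quoted later in this thread suggests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the planner's distribution spec; not a Doris class.
final class SpecSketch {
    final boolean hash;              // false means "any" (no distribution guarantee)
    final List<Integer> hashExprIds; // ids of the output slots the data is hashed on

    private SpecSketch(boolean hash, List<Integer> hashExprIds) {
        this.hash = hash;
        this.hashExprIds = hashExprIds;
    }

    static SpecSketch any() {
        return new SpecSketch(false, new ArrayList<>());
    }

    // Proposes a hash spec whenever the table is hash distributed,
    // without looking at partition or colocate state.
    static SpecSketch propose(boolean hashDistributed,
                              List<String> schemaColumns,
                              List<Integer> outputExprIds,
                              List<String> distributionColumns) {
        if (!hashDistributed) {
            return any();
        }
        List<Integer> ids = new ArrayList<>();
        // Assumption: schemaColumns.get(i) corresponds to outputExprIds.get(i).
        for (int i = 0; i < schemaColumns.size(); i++) {
            if (distributionColumns.contains(schemaColumns.get(i))) {
                ids.add(outputExprIds.get(i));
            }
        }
        return new SpecSketch(true, ids);
    }
}
```

With schema columns `k1, k2, v1`, output ids `10, 11, 12`, and distribution column `k2`, the sketch proposes a hash spec on id `11`. No partition or colocate state is consulted.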
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
       - [ ] Yes
       - [ ] No
       - [ ] I don't know
   2. Has unit tests been added:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   3. Has document been added or modified:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [ ] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #15965: [fix](Nereids): propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15965:
URL: https://github.com/apache/doris/pull/15965#issuecomment-1385570327

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] github-actions[bot] commented on pull request #15965: [fix](Nereids): propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15965:
URL: https://github.com/apache/doris/pull/15965#issuecomment-1386961113

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] morrySnow merged pull request #15965: [fix](Nereids): propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
morrySnow merged PR #15965:
URL: https://github.com/apache/doris/pull/15965




[GitHub] [doris] hello-stephen commented on pull request #15965: [fix](Nereids): always propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #15965:
URL: https://github.com/apache/doris/pull/15965#issuecomment-1383800392

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 37.66 seconds
    load time: 504 seconds
    storage size: 17123068316 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230116101015_clickbench_pr_81066.html




[GitHub] [doris] github-actions[bot] commented on pull request #15965: [fix](Nereids): propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15965:
URL: https://github.com/apache/doris/pull/15965#issuecomment-1385570395

   PR approved by anyone and no changes requested.








[GitHub] [doris] morrySnow commented on a diff in pull request #15965: [fix](Nereids): always propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
morrySnow commented on code in PR #15965:
URL: https://github.com/apache/doris/pull/15965#discussion_r1071477134


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalOlapScanToPhysicalOlapScan.java:
##########
@@ -48,48 +45,40 @@ public class LogicalOlapScanToPhysicalOlapScan extends OneImplementationRuleFact
     @Override
     public Rule build() {
         return logicalOlapScan().then(olapScan ->
-            new PhysicalOlapScan(
-                    olapScan.getId(),
-                    olapScan.getTable(),
-                    olapScan.getQualifier(),
-                    olapScan.getSelectedIndexId(),
-                    olapScan.getSelectedTabletIds(),
-                    olapScan.getSelectedPartitionIds(),
-                    convertDistribution(olapScan),
-                    olapScan.getPreAggStatus(),
-                    Optional.empty(),
-                    olapScan.getLogicalProperties())
+                new PhysicalOlapScan(
+                        olapScan.getId(),
+                        olapScan.getTable(),
+                        olapScan.getQualifier(),
+                        olapScan.getSelectedIndexId(),
+                        olapScan.getSelectedTabletIds(),
+                        olapScan.getSelectedPartitionIds(),
+                        convertDistribution(olapScan),
+                        olapScan.getPreAggStatus(),
+                        Optional.empty(),
+                        olapScan.getLogicalProperties())
         ).toRule(RuleType.LOGICAL_OLAP_SCAN_TO_PHYSICAL_OLAP_SCAN_RULE);
     }
 
     private DistributionSpec convertDistribution(LogicalOlapScan olapScan) {
         OlapTable olapTable = olapScan.getTable();
         DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
-        ColocateTableIndex colocateTableIndex = Env.getCurrentColocateIndex();
-        if ((colocateTableIndex.isColocateTable(olapTable.getId())
-                && !colocateTableIndex.isGroupUnstable(colocateTableIndex.getGroup(olapTable.getId())))
-                || olapTable.getPartitionInfo().getType() == PartitionType.UNPARTITIONED
-                || olapTable.getPartitions().size() == 1) {
-            if (!(distributionInfo instanceof HashDistributionInfo)) {
-                return DistributionSpecAny.INSTANCE;
-            }
-            HashDistributionInfo hashDistributionInfo = (HashDistributionInfo) distributionInfo;
-            List<Slot> output = olapScan.getOutput();
-            List<ExprId> hashColumns = Lists.newArrayList();
-            List<Column> schemaColumns = olapScan.getTable().getFullSchema();
-            for (int i = 0; i < schemaColumns.size(); i++) {
-                for (Column column : hashDistributionInfo.getDistributionColumns()) {
-                    if (schemaColumns.get(i).equals(column)) {
-                        hashColumns.add(output.get(i).getExprId());
-                    }
+
+        if (!(distributionInfo instanceof HashDistributionInfo)) {
+            return DistributionSpecAny.INSTANCE;
+        }
+        HashDistributionInfo hashDistributionInfo = (HashDistributionInfo) distributionInfo;

Review Comment:
   We must consider the partition count and the colocate group status when we generate the DistributionSpec of an OlapScan.
   When the partition count is not 1 and the table is either not in a colocate group or in a colocate group that is not stable, the OlapScan's distribution should be random (in other words, any).
   
   Maybe we should change `olapTable.getPartitions().size() == 1` to `selectedPartition.size() == 1`.
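
The rule described in this comment can be sketched as a small predicate. This is only an illustration under stated assumptions: `ScanDistributionRule` and its boolean and integer inputs are hypothetical simplifications, not Doris APIs. The `unpartitioned` flag mirrors the `PartitionType.UNPARTITIONED` check in the removed lines of the diff, and counting selected partitions follows the reviewer's suggestion rather than any confirmed merged code.

```java
// Hypothetical, simplified model of the decision; none of these names are Doris APIs.
final class ScanDistributionRule {
    enum Kind { HASH, ANY }

    static Kind decide(boolean hashDistributed,
                       boolean inColocateGroup,
                       boolean colocateGroupStable,
                       boolean unpartitioned,
                       int selectedPartitionCount) {
        // A table without hash distribution can never offer a hash spec.
        if (!hashDistributed) {
            return Kind.ANY;
        }
        // A colocate group only helps while it is stable.
        boolean stableColocate = inColocateGroup && colocateGroupStable;
        // Assumption: the count is taken over selected partitions,
        // as suggested in the review comment.
        boolean singlePartition = selectedPartitionCount == 1;
        if (stableColocate || unpartitioned || singlePartition) {
            return Kind.HASH;
        }
        // Several partitions and no stable colocate group: treat as random.
        return Kind.ANY;
    }
}
```

For example, a hash-distributed table with three selected partitions and no colocate group yields `ANY`, while the same table in a stable colocate group yields `HASH`.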





[GitHub] [doris] XieJiann commented on a diff in pull request #15965: [fix](Nereids): always propose hashDistributeSpec of OlapScan

Posted by GitBox <gi...@apache.org>.
XieJiann commented on code in PR #15965:
URL: https://github.com/apache/doris/pull/15965#discussion_r1071692134


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalOlapScanToPhysicalOlapScan.java:
##########
@@ -48,48 +45,40 @@ public class LogicalOlapScanToPhysicalOlapScan extends OneImplementationRuleFact
     @Override
     public Rule build() {
         return logicalOlapScan().then(olapScan ->
-            new PhysicalOlapScan(
-                    olapScan.getId(),
-                    olapScan.getTable(),
-                    olapScan.getQualifier(),
-                    olapScan.getSelectedIndexId(),
-                    olapScan.getSelectedTabletIds(),
-                    olapScan.getSelectedPartitionIds(),
-                    convertDistribution(olapScan),
-                    olapScan.getPreAggStatus(),
-                    Optional.empty(),
-                    olapScan.getLogicalProperties())
+                new PhysicalOlapScan(
+                        olapScan.getId(),
+                        olapScan.getTable(),
+                        olapScan.getQualifier(),
+                        olapScan.getSelectedIndexId(),
+                        olapScan.getSelectedTabletIds(),
+                        olapScan.getSelectedPartitionIds(),
+                        convertDistribution(olapScan),
+                        olapScan.getPreAggStatus(),
+                        Optional.empty(),
+                        olapScan.getLogicalProperties())
         ).toRule(RuleType.LOGICAL_OLAP_SCAN_TO_PHYSICAL_OLAP_SCAN_RULE);
     }
 
     private DistributionSpec convertDistribution(LogicalOlapScan olapScan) {
         OlapTable olapTable = olapScan.getTable();
         DistributionInfo distributionInfo = olapTable.getDefaultDistributionInfo();
-        ColocateTableIndex colocateTableIndex = Env.getCurrentColocateIndex();
-        if ((colocateTableIndex.isColocateTable(olapTable.getId())
-                && !colocateTableIndex.isGroupUnstable(colocateTableIndex.getGroup(olapTable.getId())))
-                || olapTable.getPartitionInfo().getType() == PartitionType.UNPARTITIONED
-                || olapTable.getPartitions().size() == 1) {
-            if (!(distributionInfo instanceof HashDistributionInfo)) {
-                return DistributionSpecAny.INSTANCE;
-            }
-            HashDistributionInfo hashDistributionInfo = (HashDistributionInfo) distributionInfo;
-            List<Slot> output = olapScan.getOutput();
-            List<ExprId> hashColumns = Lists.newArrayList();
-            List<Column> schemaColumns = olapScan.getTable().getFullSchema();
-            for (int i = 0; i < schemaColumns.size(); i++) {
-                for (Column column : hashDistributionInfo.getDistributionColumns()) {
-                    if (schemaColumns.get(i).equals(column)) {
-                        hashColumns.add(output.get(i).getExprId());
-                    }
+
+        if (!(distributionInfo instanceof HashDistributionInfo)) {
+            return DistributionSpecAny.INSTANCE;
+        }
+        HashDistributionInfo hashDistributionInfo = (HashDistributionInfo) distributionInfo;

Review Comment:
   Got it.
   However, the test case is wrong, because bucket shuffle join is forbidden when both join tables are partitioned and do not belong to a stable colocate group.
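
The restriction mentioned here can be written down as a short check. This is a sketch of one reading of the sentence, not the planner's actual logic: `BucketShuffleCheck` and `Side` are hypothetical names, it assumes the condition applies to each table individually (partitioned and not in a stable colocate group), and it models only this one restriction, not any other requirement for bucket shuffle join.

```java
// Hypothetical helper; it models only the single restriction stated in the comment.
final class BucketShuffleCheck {
    // Hypothetical summary of one join input.
    static final class Side {
        final boolean partitioned;     // the table is partitioned
        final boolean stableColocate;  // the table belongs to a stable colocate group

        Side(boolean partitioned, boolean stableColocate) {
            this.partitioned = partitioned;
            this.stableColocate = stableColocate;
        }

        // True when this input alone matches the "forbidden" description.
        boolean blocks() {
            return partitioned && !stableColocate;
        }
    }

    // Forbidden only when both inputs match the description.
    static boolean allowed(Side left, Side right) {
        return !(left.blocks() && right.blocks());
    }
}
```

Under this reading, two partitioned tables outside any stable colocate group are rejected, while the join remains a candidate as soon as one side is unpartitioned or sits in a stable colocate group.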


