
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/24 08:41:30 UTC

[GitHub] [incubator-doris] hf200012 opened a new pull request, #9753: [doc]Add Doris join optimization documentation

hf200012 opened a new pull request, #9753:
URL: https://github.com/apache/incubator-doris/pull/9753

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

yangzhg commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r882257444


##########
docs/zh-CN/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,226 @@
+---
+{
+    "title": "Doris Join 优化原理",
+    "language": "zh-CN"
+}
+
+
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
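+# Doris Join optimization principle
+
+Doris supports two physical operators: **Hash Join** and **Nest Loop Join**.
+
+- Hash Join: builds a hash table on the right table from the equi-join columns, and the left table streams through that hash table to compute the join. Its limitation is that it applies only to equi-joins.
+- Nest Loop Join: uses two nested for loops, which is very intuitive. It applies to non-equi joins, such as greater-than or less-than comparisons, and to cases that need a Cartesian product. It is a general-purpose join operator, but it performs poorly.
+
+Because Doris is a distributed MPP database, data must be shuffled during a Join. The data has to be split and scheduled so that the final Join result is correct. As a simple example, assume relations S and R are joined; N is the number of nodes participating in the join computation, and T is the number of tuples in a relation.
+
+
+
+## Doris Shuffle methods
+
+Doris supports 4 Shuffle methods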
+
+1. BroadCast Join

Review Comment:
   ```suggestion
   1. Broadcast Join
   ```
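For readers of the quoted passage, the build-and-probe pattern it describes for Hash Join can be sketched in a few lines of Python. This is a textbook illustration under simplifying assumptions (rows are dictionaries held in memory on one node), not Doris code; `hash_join`, `left_key`, and `right_key` are hypothetical names.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key):
    """Equi-join two row lists: build on the right input, probe with the left."""
    # Build phase: index every right-side row by its join key.
    table = defaultdict(list)
    for right in right_rows:
        table[right[right_key]].append(right)

    # Probe phase: stream the left side through the hash table.
    for left in left_rows:
        for right in table.get(left[left_key], ()):
            yield (left, right)


if __name__ == "__main__":
    orders = [{"id": 1, "cust": 10}, {"id": 2, "cust": 20}, {"id": 3, "cust": 10}]
    customers = [{"cust": 10, "name": "a"}, {"cust": 30, "name": "c"}]
    for left, right in hash_join(orders, customers, "cust", "cust"):
        print(left["id"], right["name"])
```

The lookup is by key equality, which is why this pattern covers only equi-joins.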




[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

yangzhg commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r882257343


##########
docs/zh-CN/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,226 @@
+---
+{
+    "title": "Doris Join 优化原理",
+    "language": "zh-CN"
+}
+
+
+---
+
+
+# Doris Join optimization principle
+
+Doris supports two physical operators: **Hash Join** and **Nest Loop Join**.
+

Review Comment:
   Doris doesn't really support `Nest Loop Join`; it just uses a Nest-Loop-Join-like algorithm for non-equi joins, so I think this will cause some insurmountable problems for join optimization.
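For reference, the generic nested-loop pattern that the comment alludes to can be sketched as follows. This is a textbook illustration in Python, not a description of how Doris executes non-equi joins; `nested_loop_join` and `predicate` are hypothetical names, and both inputs are assumed to be in-memory lists.

```python
def nested_loop_join(left_rows, right_rows, predicate):
    """Join two row lists with an arbitrary predicate using two nested loops."""
    for left in left_rows:
        # right_rows is re-scanned for every left row, so it must be re-iterable.
        for right in right_rows:
            if predicate(left, right):
                yield (left, right)


if __name__ == "__main__":
    left = [1, 5]
    right = [3, 4, 6]
    # Non-equi join: keep the pairs where the left value is smaller.
    print(list(nested_loop_join(left, right, lambda l, r: l < r)))
    # Cartesian product: a predicate that accepts every pair.
    print(len(list(nested_loop_join(left, right, lambda l, r: True))))
```

Every left row is compared with every right row, so the work grows with the product of the two input sizes, which is the source of the poor performance the document mentions.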




[GitHub] [incubator-doris] dataalive commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

dataalive commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r880512484


##########
docs/en/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,222 @@
+---
+{
+    "title": "Doris Join optimization principle",
+    "language": "en"
+}
+
+
+---
+
+
+# Doris Join optimization principle
+
+Doris supports two physical join operators: **Hash Join** and **Nest Loop Join**.
+
+- Hash Join: builds a hash table on the right table from the equi-join columns, and the left table streams through that hash table to compute the join. Its limitation is that it applies only to equi-joins.
+- Nest Loop Join: uses two nested for loops, which is very intuitive. It applies to non-equi joins, such as greater-than or less-than comparisons, and to cases that need a Cartesian product. It is a general-purpose join operator, but it performs poorly.
+
+Because Doris is a distributed MPP database, data must be shuffled during a Join. The data has to be split and scheduled so that the final Join result is correct. As a simple example, assume relations S and R are joined; N is the number of nodes participating in the join computation, and T is the number of tuples in a relation.
+
+
+
+## Doris Shuffle methods
+
+Doris supports 4 Shuffle methods:
+
+1. BroadCast Join
+
+   It requires the full data of the right table to be sent to the left table, so every node participating in the Join holds the full right table, that is, T(R).
+
+   It applies to fairly general scenarios and supports both Hash Join and Nest Loop Join. Its network overhead is N * T(R).
+
+   ![image-20220523152004731](/images/join/image-20220523152004731.png)
+
+   The left table's data does not move; the right table's data is sent to the nodes that scan the left table.
+
+2. Shuffle Join
+
+   When a Hash Join is performed, a hash value can be computed from the Join columns and used for hash bucketing.
+
+   Its network overhead is T(S) + T(R), but it supports only Hash Join, because the buckets are computed from the Join condition.
+
+   ![image-20220523151902368](/images/join/image-20220523151902368.png)
+
+   The data of the left and right tables is sent to different partition nodes according to the computed partitioning result.
+
+3. Bucket Shuffle Join
+
+   Doris table data is itself bucketed by hash, so the bucketing columns of the table can be used to shuffle the Join data. If two tables are joined and the Join column is the bucketing column of the left table, the left table's data does not need to move; the right table's data is sent according to the left table's bucketing, and the Join can then be computed.
+
+   Its network overhead is T(R), which amounts to shuffling only the right table's data.
+
+   ![image-20220523151653562](/images/join/image-20220523151653562.png)
+
+   The left table's data does not move; the right table's data is sent to the nodes that scan the left table according to the computed partitioning result.
+
+4. Colocation
+
+   It is similar to Bucket Shuffle Join: the data was already shuffled at import time according to the preset Join columns. At query time the Join can therefore be computed directly, with no data shuffle to consider.
+
+   ![image-20220523151619754](/images/join/image-20220523151619754.png)
+
+   The data is already pre-partitioned, and the Join is computed locally.
+
+### Comparison of the four Shuffle methods
+
+| Shuffle Mode | Network Overhead | Physical Operators | Applicable Scenarios |
+| -------------- | ---------------- | -------------------------- | -------------------- |
+| BroadCast | N * T(R) | Hash Join / Nest Loop Join | General |
+| Shuffle | T(S) + T(R) | Hash Join | General |
+| Bucket Shuffle | T(R) | Hash Join | The join condition includes the left table's distribution column, and the left table is executed as a single partition |
+| Colocate | 0 | Hash Join | The join condition includes the left table's distribution column, and the left and right tables belong to the same Colocate Group |
+
+N: the number of Instances participating in the Join computation
+
+T(relation): the number of tuples in the relation
+
+From top to bottom, the four methods become less flexible and place stricter requirements on data distribution, but Join performance improves.
+
+## Runtime Filter Join optimization
+
+Doris builds a hash table on the right table when it performs a Hash Join, and the left table streams through that hash table to produce the join result. The Runtime Filter makes full use of the right table's hash table: while the hash table is being built, a filter condition based on its data is generated and pushed down to the data scanning node of the left table. In this way, Doris can filter data at runtime.
+
+If the left table is large and the right table is small, the filter condition generated from the right table lets most of the data that would otherwise be filtered in the Join layer be filtered early, when the data is read, which improves the performance of join queries.
+
+Currently Doris supports three types of Runtime Filter:
+
+- The first is IN, which is easy to understand: a hash set is pushed down to the data scanning node.
+- The second is BloomFilter: the data in the hash table is used to construct a BloomFilter, which is then pushed down to the scanning node that reads the data.
+- The last is MinMax, which is a range: once the range is determined from the right table's data, it is pushed down to the data scanning node.
+
+A Runtime Filter is suitable when two requirements are met:
+
+- The first is that the left table is large and the right table is small, because building a Runtime Filter has a computational cost, including some memory overhead.
+- The second is that the join of the left and right tables produces few results, which indicates that the join can filter out most of the data in the left table.
+
+When both conditions are met, enabling the Runtime Filter gives better results.
+
+When the Join column is a Key column of the left table, the Runtime Filter is pushed down to the storage engine. Doris itself supports delayed materialization.
+
+Delayed materialization works roughly as follows: suppose three columns A, B, and C need to be scanned, there is a filter condition A = 2 on column A, and 100 rows are to be scanned. Doris can first scan the 100 rows of column A and apply the filter A = 2, then read columns B and C only for the rows that pass, which greatly reduces read IO. Therefore, when a Runtime Filter is generated on a Key column, Doris's delayed materialization further improves query performance.
+
+### Runtime Filter Type
+
+- Doris provides three different Runtime Filter types:
+  - The advantage of **IN** is that it filters effectively and quickly. Its shortcomings are: first, it applies only to BroadCast; second, it fails once the right table exceeds a certain size. The current Doris setting is 1024, that is, if the right table is larger than 1024, the IN Runtime Filter fails outright.
+  - The advantage of **MinMax** is its relatively small overhead. Its disadvantage is that it works fairly well on numeric columns but has basically no effect on non-numeric columns.
+  - The **Bloom Filter** is general-purpose: it suits all kinds of types and filters fairly well. Its disadvantages are that its configuration is more complicated and its computational cost is higher.
+
+## Join Reorder
+
+Once a query involves a multi-table Join, the Join order has a great impact on the performance of the whole query. Suppose three tables are joined, as in the picture below. On the left, table a is joined with table b first, producing an intermediate result of 2000 rows, which is then joined with table c.
+
+On the right, the Join order is adjusted: table a is joined with table c first, the intermediate result has only 100 rows, and that result is finally joined with table b. The final join result is the same, but the intermediate results differ by 20x, which leads to a large performance difference.
+
+![image-20220523152639123](/images/join/image-20220523152639123.png)
+
+- Doris currently supports a rule-based Join Reorder algorithm. Its logic is:
+  - Join large tables with small tables wherever possible, so that the intermediate results are as small as possible.
+  - Move tables that have filter conditions forward, so that they are filtered as early as possible.
+  - Hash Join has higher priority than Nest Loop Join, because Hash Join is much faster than Nest Loop Join.
+
+## Doris Join optimization method
+
+Doris Join tuning steps:
+
+- Use the Profile provided by Doris to locate the bottleneck of the query. The Profile records detailed information about the whole query and is first-hand material for performance tuning.
+- Understand Doris's Join mechanism, which the sections above describe. Only by understanding how it works can you analyze why a Join is slow.
+- Use Session variables to change some Join behaviors and thereby tune the Join.
+- Check the Query Plan to analyze whether the tuning took effect.
+
+These 4 steps make up a standard Join tuning process; after that, run the query to verify the actual effect.
+
+If chaining the 4 steps above still does not help, it may be necessary to rewrite the Join statement or adjust the data distribution: recheck whether the overall data distribution is reasonable, review the Join statement itself, and possibly make some manual adjustments. This approach has a relatively high mental cost, so such further analysis is needed only when the previous steps do not work.
+
+## Optimization case practice
+
+### Case one

Review Comment:
   ```suggestion
   ### Case one -> Case 1 
   ```
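As an aside on the comparison table in the quoted document, the four network-overhead formulas can be evaluated side by side with a small Python sketch. It only restates the table's tuple-count formulas; the function name `shuffle_network_cost`, the mode strings, and the sample sizes are assumptions made for illustration, and real cost also depends on row width and data skew.

```python
def shuffle_network_cost(mode, t_s, t_r, n):
    """Return the tuple-count network overhead listed in the comparison table.

    t_s: tuples in the left relation S, t_r: tuples in the right relation R,
    n: number of instances participating in the join.
    """
    costs = {
        "broadcast": n * t_r,      # right table copied to every instance
        "shuffle": t_s + t_r,      # both sides repartitioned by join key
        "bucket_shuffle": t_r,     # only the right side moves
        "colocate": 0,             # data is already placed for the join
    }
    return costs[mode]


if __name__ == "__main__":
    # Hypothetical sizes: a large left table, a small right table, 8 instances.
    for mode in ("broadcast", "shuffle", "bucket_shuffle", "colocate"):
        print(mode, shuffle_network_cost(mode, t_s=1_000_000, t_r=10_000, n=8))
```

With these sample sizes BroadCast moves 80,000 tuples and Shuffle moves 1,010,000, which illustrates why broadcasting a small right table can be cheaper than repartitioning both sides.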




[GitHub] [incubator-doris] dataalive commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

dataalive commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r880512484


##########
docs/en/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,222 @@
+## Optimization case practice
+
+### Case one

Review Comment:
   ```suggestion
   ### Case 1
   ```
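The three Runtime Filter kinds described in the document under review (IN, MinMax, Bloom Filter) can be illustrated with a minimal Python sketch. It is a conceptual model only, not Doris's implementation: `BloomFilter`, `build_runtime_filter`, the bit-array size, and the hash scheme are all assumptions, and the right-side key list is assumed to be non-empty.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_runtime_filter(right_keys, kind):
    """Build a predicate from the right side's join keys for the left-side scan."""
    keys = list(right_keys)
    if kind == "in":
        allowed = set(keys)
        return lambda value: value in allowed
    if kind == "minmax":
        low, high = min(keys), max(keys)
        return lambda value: low <= value <= high
    if kind == "bloom":
        bloom = BloomFilter()
        for key in keys:
            bloom.add(key)
        return bloom.might_contain
    raise ValueError(f"unknown filter kind: {kind}")


if __name__ == "__main__":
    left_scan = range(1, 11)
    for kind in ("in", "minmax", "bloom"):
        keep = build_runtime_filter([3, 7, 9], kind)
        print(kind, [value for value in left_scan if keep(value)])
```

The IN filter is exact, MinMax keeps everything inside the key range, and the Bloom filter keeps every matching key plus possibly a few non-matching ones; in all three cases the join itself still does the final matching.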




[GitHub] [incubator-doris] dataalive commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

dataalive commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r880515874


##########
docs/zh-CN/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,226 @@
+---
+{
+    "title": "Doris Join 优化原理",
+    "language": "zh-CN"
+}
+
+
+---
+
+
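+# Doris Join Optimization Principles
+
+Doris supports two physical join operators: **Hash Join** and **Nest Loop Join**.
+
+- Hash Join: builds a hash table on the right table from the equi-join columns, then streams the left table through that hash table to compute the Join. Its limitation is that it only applies to equi-joins.
+- Nest Loop Join: implemented as two nested for loops, which is intuitive. It applies to non-equi Joins, such as greater-than or less-than conditions, and to cases that require a Cartesian product. It is a general-purpose Join operator, but its performance is poor.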
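
The two operators described above can be sketched in a few lines of Python. This is only an illustration of the general techniques, not Doris's implementation; the function names, the row format (plain dicts), and the sample tables `orders` and `users` are all assumptions made for the example.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key):
    """Equi-join: build a hash table on the right side, stream the left side."""
    table = defaultdict(list)
    for r in right_rows:  # build phase over the right table
        table[r[right_key]].append(r)
    for l in left_rows:  # probe phase: left rows stream through the table
        for r in table.get(l[left_key], []):
            yield {**l, **r}


def nested_loop_join(left_rows, right_rows, predicate):
    """General join: evaluate an arbitrary predicate on every pair of rows."""
    for l in left_rows:
        for r in right_rows:
            if predicate(l, r):
                yield {**l, **r}


# Hypothetical sample data.
orders = [
    {"order_id": 1, "user_id": 10},
    {"order_id": 2, "user_id": 20},
    {"order_id": 3, "user_id": 10},
]
users = [{"uid": 10, "name": "a"}, {"uid": 30, "name": "c"}]

# Equi-join on user_id = uid.
equi = list(hash_join(orders, users, "user_id", "uid"))
# Non-equi join on user_id < uid, which a hash table cannot serve.
non_equi = list(nested_loop_join(orders, users, lambda l, r: l["user_id"] < r["uid"]))
```

The hash join touches each row once per phase, while the nested loop compares every pair of rows, which is why the latter is general but slow.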
+
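+As a distributed MPP database, Doris must shuffle data during a Join: the data has to be split and scheduled across nodes to ensure that the final Join result is correct. As a simple example, suppose relations S and R are joined; N denotes the number of nodes participating in the Join computation, and T denotes the number of tuples in a relation.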
+
+
+
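+## Doris Shuffle Methods
+
+Doris supports 4 shuffle methods:
+
+1. BroadCast Join
+
+   It sends the full data of the right table to the left table, so every node participating in the Join holds the full right table, that is, T(R).
+
+   It applies broadly and supports both Hash Join and Nest Loop Join. Its network overhead is N * T(R).
+
+   ![image-20220523152004731](/images/join/image-20220523152004731.png)
+
+   The left table data does not move; the right table data is sent to the scan nodes of the left table.
+
+2. Shuffle Join
+
+   When performing a Hash Join, the hash value of the Join columns can be computed and used for hash bucketing.
+
+   Its network overhead is T(S) + T(R), but it only supports Hash Join, because the buckets are computed from the Join condition.
+
+   ![image-20220523151902368](/images/join/image-20220523151902368.png)
+
+   The left and right table data are sent to different partition nodes according to the result of the partition computation.
+
+3. Bucket Shuffle Join
+
+   Doris table data is itself bucketed by hash, so the bucketing columns of the table can be used to shuffle the Join data. If two tables are joined and the Join column is the bucketing column of the left table, the left table data does not need to move; the right table sends its data according to the left table's buckets, and the Join can then be computed.
+
+   Its network overhead is T(R), which means only the right table data is shuffled.
+
+   ![image-20220523151653562](/images/join/image-20220523151653562.png)
+
+   The left table data does not move; the right table data is sent to the left table's scan nodes according to the result of the partition computation.
+
+4. Colocation
+
+   It is similar to Bucket Shuffle Join, except that the data has already been shuffled at load time according to the preset Join columns. At query time the Join can be computed directly, with no data shuffle to consider.
+
+   ![image-20220523151619754](/images/join/image-20220523151619754.png)
+
+   The data has been pre-partitioned, and the Join is computed locally.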
+
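+### Comparison of the Four Shuffle Methods
+
+| Shuffle method | Network overhead | Physical operator          | Applicable scenario                                          |
+| -------------- | ---------------- | -------------------------- | ------------------------------------------------------------ |
+| BroadCast      | N * T(R)         | Hash Join / Nest Loop Join | General                                                      |
+| Shuffle        | T(S) + T(R)      | Hash Join                  | General                                                      |
+| Bucket Shuffle | T(R)             | Hash Join                  | The Join condition contains the left table's distribution column, and the left table is a single partition at execution time |
+| Colocate       | 0                | Hash Join                  | The Join condition contains the left table's distribution column, and the left and right tables belong to the same Colocate Group |
+
+N: the number of instances participating in the Join computation
+
+T(relation): the number of tuples in the relation
+
+The flexibility of these 4 methods decreases from top to bottom: each places stricter requirements on data distribution than the one before it, but Join performance improves accordingly.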
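
To make the cost column of the table concrete, the formulas can be evaluated for a given pair of tables. The sketch below is only arithmetic over the table's formulas; the function name `shuffle_costs` and the sample sizes are assumptions for illustration and say nothing about how the Doris planner actually chooses a method.

```python
def shuffle_costs(n_instances, t_left, t_right):
    """Network overhead in tuples for each shuffle method, per the table above.

    n_instances: N, the number of instances participating in the Join
    t_left:      T(S), tuple count of the left relation
    t_right:     T(R), tuple count of the right relation
    """
    return {
        "BroadCast": n_instances * t_right,
        "Shuffle": t_left + t_right,
        "Bucket Shuffle": t_right,
        "Colocate": 0,
    }


# Hypothetical sizes: 10 instances, a 1,000,000-tuple left table,
# and a 50,000-tuple right table.
costs = shuffle_costs(n_instances=10, t_left=1_000_000, t_right=50_000)
for method, tuples in costs.items():
    print(f"{method}: {tuples} tuples sent over the network")
```

With these sample numbers BroadCast moves fewer tuples than Shuffle because the right table is small; as T(R) grows, N * T(R) overtakes T(S) + T(R).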
+
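+## Runtime Filter Join Optimization
+
+When computing a Hash Join, Doris builds a hash table on the right table, and the left table streams through it to produce the Join result. RuntimeFilter takes advantage of this hash table: while the right table's hash table is being built, a filter condition based on the hash table data is generated at the same time and pushed down to the data scan nodes of the left table. In this way, Doris can filter data at runtime.
+
+If the left table is large and the right table is small, the filter condition generated from the right table can filter out, at data read time, most of the data that would otherwise be filtered at the Join layer, which greatly improves Join query performance.
+
+Doris currently supports three types of RuntimeFilter:
+
+- The first is IN— IN, which is easy to understand: a hashset is pushed down to the data scan node.
+- The second is BloomFilter: a BloomFilter is built from the hash table data and pushed down to the scan nodes that read the data.
+- The last is MinMax, which is a range: once the range is determined from the right table data, it is pushed down to the data scan nodes.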
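
The three filter kinds listed above can be illustrated with a toy example that builds each filter from the right table's join keys and applies it to the left table's keys. This is a sketch of the general techniques only: `TinyBloomFilter`, `build_filters`, the bit-array size, and the sample keys are assumptions for the example and do not reflect Doris internals.

```python
import hashlib


class TinyBloomFilter:
    """A deliberately small Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        return all(self.bits[pos] for pos in self._positions(value))


def build_filters(right_keys):
    """Derive the three filter kinds from the right (build side) join keys."""
    bloom = TinyBloomFilter()
    for key in right_keys:
        bloom.add(key)
    return {
        "in": set(right_keys),                          # exact hashset
        "bloom": bloom,                                 # probabilistic membership
        "minmax": (min(right_keys), max(right_keys)),   # value range
    }


right_keys = [3, 7, 42]   # hypothetical small right table
left_keys = range(100)    # hypothetical larger left table
filters = build_filters(right_keys)

kept_by_in = [k for k in left_keys if k in filters["in"]]
lo, hi = filters["minmax"]
kept_by_minmax = [k for k in left_keys if lo <= k <= hi]
kept_by_bloom = [k for k in left_keys if filters["bloom"].might_contain(k)]
```

The IN filter keeps exactly the matching keys, MinMax keeps everything inside the range (here 3 through 42), and the Bloom filter keeps every matching key plus possibly a few false positives that the Join itself discards later.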
+
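+Runtime Filter applies to scenarios that meet two requirements:
+
+- First, the left table is large and the right table is small, because building a Runtime Filter carries a computation cost, including some memory overhead.
+- Second, the Join of the left and right tables produces few results, which means the Join can filter out most of the left table data.
+
+When both conditions are met, enabling Runtime Filter yields good results.
+
+When the Join column is a Key column of the left table, the RuntimeFilter is pushed down to the storage engine. Doris itself supports late materialization.
+
+Late materialization works roughly as follows: suppose columns A, B, and C need to be scanned, there is a filter condition A = 2 on column A, and 100 rows are to be scanned. The 100 rows of column A are read first and filtered by A = 2. Columns B and C are then read only for the rows that pass the filter, which greatly reduces read IO. So when a Runtime Filter is generated on a Key column, it also benefits from Doris's late materialization, which further improves query performance.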
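
The A, B, C example above can be traced with a small column-store sketch. It assumes an in-memory dict of column lists and counts "values read" as a stand-in for IO; the function name and the sample data are hypothetical and only illustrate the idea of late materialization.

```python
def scan_with_late_materialization(columns, filter_col, predicate, other_cols):
    """Read the filter column first, then fetch other columns only for matches."""
    # Step 1: read the whole filter column and record the matching row indices.
    matching = [i for i, v in enumerate(columns[filter_col]) if predicate(v)]

    # Step 2: read the remaining columns only at the matching positions.
    rows = []
    for i in matching:
        row = {filter_col: columns[filter_col][i]}
        for col in other_cols:
            row[col] = columns[col][i]
        rows.append(row)

    # Values touched: the full filter column plus the other columns for matches.
    values_read = len(columns[filter_col]) + len(matching) * len(other_cols)
    return rows, values_read


# Hypothetical 100-row table with columns A, B, C; A = 2 matches rows 2 and 52.
columns = {
    "A": [i % 50 for i in range(100)],
    "B": [f"b{i}" for i in range(100)],
    "C": [i * 1.5 for i in range(100)],
}
rows, values_read = scan_with_late_materialization(
    columns, "A", lambda v: v == 2, ["B", "C"]
)
eager_values_read = 3 * 100  # reading A, B, and C in full before filtering
```

In this toy case the late-materialized scan touches 104 values instead of 300, and the saving grows as the filter becomes more selective.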
+
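+### Runtime Filter Types
+
+Doris provides three different Runtime Filter types:
+
+- **IN**: its advantage is that filtering is effective and fast. Its drawbacks are, first, that it only applies to BroadCast, and second, that it stops working once the right table exceeds a certain data volume. Doris currently sets this threshold to 1024, so if the right table is larger than 1024, the IN Runtime Filter is disabled.
+- **MinMax**: its advantage is low overhead. Its drawback is that it works well on numeric columns but has little effect on non-numeric columns.
+- **Bloom Filter**: it is general-purpose, applies to all types, and filters well. Its drawbacks are that it is more complex to configure and has a higher computation cost.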
+
+
+
+## Join Reader

Review Comment:
   ```suggestion
   ## Join Reorder
   ```




[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

yangzhg commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r882258297


##########
docs/zh-CN/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,226 @@
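+- The first is IN— IN, which is easy to understand: a hashset is pushed down to the data scan node.

Review Comment:
   ```suggestion
   - The first is IN, which is easy to understand: a hashset is pushed down to the data scan node.
   ```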




[GitHub] [incubator-doris] hf200012 merged pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

hf200012 merged PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753



[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#issuecomment-1137213841

   PR approved by anyone and no changes requested.



[GitHub] [incubator-doris] dataalive commented on a diff in pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

dataalive commented on code in PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#discussion_r880514364


##########
docs/zh-CN/advanced/join-optimization/doris-join-optimization.md:
##########
@@ -0,0 +1,226 @@
+4. Colocation 

Review Comment:
   ```suggestion
   4. Colocate
   ```




[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9753: [doc]Add Doris join optimization documentation

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on PR #9753:
URL: https://github.com/apache/incubator-doris/pull/9753#issuecomment-1137213800

   PR approved by at least one committer and no changes requested.

