Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/03/02 12:09:29 UTC

[GitHub] [incubator-doris] EmmyMiao87 edited a comment on issue #7901: [Feature] [Vectorized] Some Join opt in vec exec engine

EmmyMiao87 edited a comment on issue #7901:
URL: https://github.com/apache/incubator-doris/issues/7901#issuecomment-1056815743


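   # Join performance optimization
   
   ## Reducing unnecessary memory copies
   
   The output schema of a Join Node differs from its input schema. However, the current Doris Join Node operator builds each result row by directly concatenating the tuples of its left and right children.
   In practice, the columns of a result row may be only a subset of the columns in the input rows, so this concatenation causes a lot of useless memory copying.
   
   For example: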
   
   ```
   select a.k1 from a, b where a.k1=b.k1;
   ```
   Input schema: a.k1, b.k1
   Output schema: a.k1, b.k1
   Output schema after optimization: a.k1
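   
   The difference between the two output schemas can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not Doris code: `Row`, `concat_rows`, `project_rows`, and `output_slots` are hypothetical names, and a row is modeled as a flat vector of integers.
   
   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <vector>
   
   // Hypothetical row type: a flat list of column values.
   using Row = std::vector<int>;
   
   // Behavior described above: every column of both children is copied,
   // whether or not the parent needs it.
   Row concat_rows(const Row& left, const Row& right) {
       Row out;
       out.reserve(left.size() + right.size());
       out.insert(out.end(), left.begin(), left.end());
       out.insert(out.end(), right.begin(), right.end());
       return out;
   }
   
   // Optimized idea: copy only the columns listed in output_slots.
   // A slot index below left.size() refers to the left child,
   // any other index refers to the right child.
   Row project_rows(const Row& left, const Row& right,
                    const std::vector<std::size_t>& output_slots) {
       Row out;
       out.reserve(output_slots.size());
       for (std::size_t slot : output_slots) {
           assert(slot < left.size() + right.size());
           out.push_back(slot < left.size() ? left[slot]
                                            : right[slot - left.size()]);
       }
       return out;
   }
   ```
   
   For the query above, `output_slots` would contain only slot 0 (`a.k1`), so one value is copied per result row instead of two.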
   
   ```
   MySQL [ssb]> select count(d_datekey) from lineorder inner join date on lo_orderdate = d_datekey;
   +--------------------+
   | count(`d_datekey`) |
   +--------------------+
   |          600037902 |
   +--------------------+
   1 row in set (10.555 sec)
   ```
   Profiling with perf shows that the time is mainly spent in the following functions:
   <img width="1436" alt="image" src="https://user-images.githubusercontent.com/25147274/156358842-1d987caf-bea7-4a5a-9d63-0dca844d25f4.png">
   1. replicate, the function that fills in the result values for non-join columns. **It accounts for about 10% of the time.**
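   
   To show why a replicate-style step costs memory bandwidth, here is a minimal sketch. It assumes each probe row carries a match count; `replicate_column` and `counts` are hypothetical names, and this is not the actual Doris function.
   
   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <vector>
   
   // Hypothetical stand-in for a column's replicate step.
   // counts[i] is the number of result rows produced by input row i,
   // so column[i] is written counts[i] times into the output column.
   std::vector<int> replicate_column(const std::vector<int>& column,
                                     const std::vector<std::size_t>& counts) {
       assert(column.size() == counts.size());
       std::size_t total = 0;
       for (std::size_t c : counts) {
           total += c;
       }
       std::vector<int> out;
       out.reserve(total);
       for (std::size_t i = 0; i < column.size(); ++i) {
           // Append counts[i] copies of column[i].
           out.insert(out.end(), counts[i], column[i]);
       }
       return out;
   }
   ```
   
   Every non-join column that is replicated this way is written once per result row, so skipping columns that the output schema does not need removes that work entirely.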
   
   Demo test
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


