You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/11/24 09:50:50 UTC

[GitHub] [incubator-doris] 924060929 opened a new issue #7212: [Feature] TPC-H Q17、Q18优化调研:semi-join改inner-join,物化公共子查询

924060929 opened a new issue #7212:
URL: https://github.com/apache/incubator-doris/issues/7212


   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   # 1. TPC-H测试环境
   
   配置
   |指标|数量|
   |  ------  | ------- |
   |scale factor|10G|
   |# fe| 1|
   |# be| 1|
   |# buckets| 1|
   |# replica | 1|
   
   
   查询设置
   ``` sql
   set exec_mem_limit=2162147488*25; -- 调高内存
   set runtime_filter_mode=off; -- 关闭runtime filter
   set disable_join_reorder=false; -- 使用reorder
   set enable_cost_based_join_reorder=false; -- 使用老版reorder
   ```
   
   # 2. Query 17
   ## 2.1 Q17 优化前
   原始sql耗时20秒左右:
   ```sql
   select
           sum(l_extendedprice) / 7.0 as avg_yearly
   from
           lineitem,
           part
   where
           p_partkey = l_partkey
           and p_brand = 'Brand#55'
           and p_container = 'JUMBO CAN'
           and l_quantity < (              -- 关联子查询, left semi join
                   select
                           0.2 * avg(l_quantity)
                   from
                           lineitem
                   where
                           l_partkey = p_partkey
           );
   ```
   
   explain  graph: (有semi join)
   ```sql
   +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Explain String                                                                                                                                                     |
   +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   |                              ┌───────────────┐                                                                                   |
   |                              │[6: ResultSink]│                                                                                                                 |
   |                              │[Fragment: 0]  │                                                                                                                 |
   |                              │RESULT SINK    │                                                                                                                 |
   |                              └───────────────┘                                                                                   |
   |                                      └┐                                                                                                                        |
   |                                       │                                                                                                                          |
   |                      ┌────────────────────────────────┐                                        |
   |                      │[6: AGGREGATE (update finalize)]│                                                                                                        |
   |                      │[Fragment: 0]                   │                                                                                                        |
   |                      └────────────────────────────────┘                                        |
   |                                      ┌┘                                                                                                                        |
   |                                      │                                                                                                                           |
   |                    ┌───────────────────────────────────┐                                 |
   |                    │[5: HASH JOIN]                     │                                                                                                       |
   |                    │[Fragment: 0]                      │                                                                                                       |
   |                    │join op: LEFT SEMI JOIN (BROADCAST)│                                                                                                       |
   |                    └───────────────────────────────────┘                                 |
   |                     ┌────────────────┴────────────────────┐                          |
   |                     │                                     │                                                                                                    |
   |     ┌───────────────────────────────┐              ┌─────────────┐ |
   |     │[2: HASH JOIN]                 │              │[8: EXCHANGE]│                                                                                         |
   |     │[Fragment: 0]                  │              │[Fragment: 0]│                                                                                         |
   |     │join op: INNER JOIN (BROADCAST)│              └─────────────┘                                                               |
   |     └───────────────────────────────┘                     │                                    |
   |          ┌──────────┴─────────┐                           │                                                          |
   |          │                    │                 ┌───────────────────┐                                                |
   | ┌─────────────────┐    ┌─────────────┐          │[8: DataStreamSink]│                      |
   | │[0: OlapScanNode]│    │[7: EXCHANGE]│          │[Fragment: 2]      │                                                                                  |
   | │[Fragment: 0]    │    │[Fragment: 0]│          │STREAM DATA SINK   │                                                                                  |
   | │TABLE: lineitem  │    └─────────────┘          │  EXCHANGE ID: 08  │                                                        |
   | └─────────────────┘           │                 │  UNPARTITIONED    │                                                  |
   |                               │                 └───────────────────┘                                                  |
   |                     ┌───────────────────┐                 └┐                                                         |
   |                     │[7: DataStreamSink]│                  │                                                                                                 |
   |                     │[Fragment: 1]      │ ┌────────────────────────────────┐               |
   |                     │STREAM DATA SINK   │ │[4: AGGREGATE (update finalize)]│                                                                               |
   |                     │  EXCHANGE ID: 07  │ │[Fragment: 2]                   │                                                                               |
   |                     │  UNPARTITIONED    │ └────────────────────────────────┘               |
   |                     └───────────────────┘                 ┌┘                                                         |
   |                               │                           │                                                                                                    |
   |                               │                  ┌─────────────────┐                                                       |
   |                      ┌─────────────────┐         │[3: OlapScanNode]│                                                     |
   |                      │[1: OlapScanNode]│         │[Fragment: 2]    │                                                                                       |
   |                      │[Fragment: 1]    │         │TABLE: lineitem  │                                                                                       |
   |                      │TABLE: part      │         └─────────────────┘                                                     |
   |                      └─────────────────┘                                                                                     |
   +--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   ```
   
   ## 2.2 Q17 优化方案1
   耗时15秒
   ```sql
   with a     -- 公共子查询部分,选择率较高,但目前doris没有物化临时表功能
   as
   (
     select
             l_partkey, l_quantity, l_extendedprice
     from
             lineitem,
             part
     where
             p_partkey = l_partkey
             and p_brand = 'Brand#55'
             and p_container = 'JUMBO CAN'
   )
   select sum(l_extendedprice) / 7.0 as avg_yearly
   from
   (
     select b.l_extendedprice, b.l_quantity, c.avg_quantity
     from
     (
       select l_partkey, l_quantity, l_extendedprice
       from a
     ) b
     join
     (
       select l_partkey, 0.2 * avg(l_quantity) avg_quantity
       from a
       group by l_partkey          -- 注意这里要group,达到semi join的半连接效果
     ) c
     on b.l_partkey = c.l_partkey
   ) d
   where l_quantity < avg_quantity
   ```
   
   explain graph: (semi join 变成了 inner join)
   ```sql
   +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Explain String                                                                                                                                                            |
   +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   |                                  ┌───────────────┐                                                                                      |
   |                                  │[8: ResultSink]│                                                                                                                    |
   |                                  │[Fragment: 0]  │                                                                                                                    |
   |                                  │RESULT SINK    │                                                                                                                    |
   |                                  └───────────────┘                                                                                      |
   |                                          └┐                                                                                                                           |
   |                                           │                                                                                                                             |
   |                          ┌────────────────────────────────┐                                           |
   |                          │[8: AGGREGATE (update finalize)]│                                                                                                           |
   |                          │[Fragment: 0]                   │                                                                                                           |
   |                          └────────────────────────────────┘                                           |
   |                                          ┌┘                                                                                                                           |
   |                                          │                                                                                                                              |
   |                          ┌───────────────────────────────┐                                              |
   |                          │[7: HASH JOIN]                 │                                                                                                            |
   |                          │[Fragment: 0]                  │                                                                                                            |
   |                          │join op: INNER JOIN (BROADCAST)│                                                                                                            |
   |                          └───────────────────────────────┘                                              |
   |                     ┌────────────────────┴─────────────────────┐                  |
   |                     │                                          │                                                                                                      |
   |     ┌───────────────────────────────┐                  ┌──────────────┐ |
   |     │[2: HASH JOIN]                 │                  │[11: EXCHANGE]│                                                                                           |
   |     │[Fragment: 0]                  │                  │[Fragment: 0] │                                                                                           |
   |     │join op: INNER JOIN (BROADCAST)│                  └──────────────┘                                                               |
   |     └───────────────────────────────┘                          │                                      |
   |          ┌──────────┴─────────┐                                │                                                            |
   |          │                    │                     ┌────────────────────┐                                                |
   | ┌─────────────────┐    ┌─────────────┐              │[11: DataStreamSink]│                        |
   | │[0: OlapScanNode]│    │[9: EXCHANGE]│              │[Fragment: 2]       │                                                                                    |
   | │[Fragment: 0]    │    │[Fragment: 0]│              │STREAM DATA SINK    │                                                                                    |
   | │TABLE: lineitem  │    └─────────────┘              │  EXCHANGE ID: 11   │                                                          |
   | └─────────────────┘           │                     │  UNPARTITIONED     │                                                    |
   |                               │                     └────────────────────┘                                                  |
   |                     ┌───────────────────┐                      │                                                              |
   |                     │[9: DataStreamSink]│                      │                                                                                                    |
   |                     │[Fragment: 1]      │     ┌────────────────────────────────┐                  |
   |                     │STREAM DATA SINK   │     │[6: AGGREGATE (update finalize)]│                                                                                  |
   |                     │  EXCHANGE ID: 09  │     │[Fragment: 2]                   │                                                                                  |
   |                     │  UNPARTITIONED    │     └────────────────────────────────┘                  |
   |                     └───────────────────┘                     ┌┘                                                            |
   |                               │                               │                                                                                                       |
   |                               │               ┌───────────────────────────────┐                       |
   |                      ┌─────────────────┐      │[5: HASH JOIN]                 │                                                 |
   |                      │[1: OlapScanNode]│      │[Fragment: 2]                  │                                                                                   |
   |                      │[Fragment: 1]    │      │join op: INNER JOIN (BROADCAST)│                                                                                   |
   |                      │TABLE: part      │      └───────────────────────────────┘                     |
   |                      └─────────────────┘           ┌──────────┴──────────┐            |
   |                                                    │                     │                                                                                            |
   |                                           ┌─────────────────┐    ┌──────────────┐                   |
   |                                           │[3: OlapScanNode]│    │[10: EXCHANGE]│                                                                                 |
   |                                           │[Fragment: 2]    │    │[Fragment: 2] │                                                                                 |
   |                                           │TABLE: lineitem  │    └──────────────┘                                                     |
   |                                           └─────────────────┘            │                                                        |
   |                                                                          │                                                                                              |
   |                                                               ┌────────────────────┐                                          |
   |                                                               │[10: DataStreamSink]│                                                                                  |
   |                                                               │[Fragment: 3]       │                                                                                  |
   |                                                               │STREAM DATA SINK    │                                                                                  |
   |                                                               │  EXCHANGE ID: 10   │                                                                                  |
   |                                                               │  UNPARTITIONED     │                                                                                  |
   |                                                               └────────────────────┘                                          |
   |                                                                         ┌┘                                                                                            |
   |                                                                         │                                                                                               |
   |                                                                ┌─────────────────┐                                                  |
   |                                                                │[4: OlapScanNode]│                                                                                    |
   |                                                                │[Fragment: 3]    │                                                                                    |
   |                                                                │TABLE: part      │                                                                                    |
   |                                                                └─────────────────┘                                                  |
   +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   ```
   
   优化效果: 20秒 -> 15秒
   
   ## 2.3 Q17 优化方案2
   
   先创建临时表a(假设不算耗时)
   ```sql
   CREATE TABLE `a` (
     `l_partkey` int(11) NULL COMMENT "",
     `l_quantity` decimal(15, 2) NULL COMMENT "",
     `l_extendedprice` decimal(15, 2) NULL COMMENT ""
   ) ENGINE=OLAP
   DUPLICATE KEY(`l_partkey`, `l_quantity`, `l_extendedprice`)
   COMMENT "OLAP"
   DISTRIBUTED BY HASH(`l_partkey`) BUCKETS 10
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2"
   );
   ```
   
   把公共子查询物化下来,耗时9.2秒
   ```sql
   insert into a
   select
             l_partkey, l_quantity, l_extendedprice
     from
             lineitem,
             part
     where
             p_partkey = l_partkey
             and p_brand = 'Brand#55'
             and p_container = 'JUMBO CAN';
   ```
   
   然后查临时表,耗时0.03秒
   ```sql
   select sum(l_extendedprice) / 7.0 as avg_yearly
   from
   (
     select b.l_extendedprice, b.l_quantity, c.avg_quantity
     from
     (
       select l_partkey, l_quantity, l_extendedprice
       from a
     ) b
     join
     (
       select l_partkey, 0.2 * avg(l_quantity) avg_quantity
       from a
       group by l_partkey
     ) c
     on b.l_partkey = c.l_partkey
   ) d
   where l_quantity < avg_quantity
   ```
   
   优化效果: 20秒 -> 9.2秒
   
   
   # 3. Query 18
   ## 3.1 Q18 优化前
   耗时29秒左右
   ```sql
   select
           c_name,
           c_custkey,
           o_orderkey,
           o_orderdate,
           o_totalprice,
           sum(l_quantity)
   from
           customer,
           orders,
           lineitem
   where
           o_orderkey in (             -- 非关联子查询
                   select
                           l_orderkey
                   from
                           lineitem
                   group by
                           l_orderkey having
                                   sum(l_quantity) > 312
           )
           and c_custkey = o_custkey
           and o_orderkey = l_orderkey
   group by
           c_name,
           c_custkey,
           o_orderkey,
           o_orderdate,
           o_totalprice
   order by
           o_totalprice desc,
           o_orderdate
   ```
   
   explain graph: (有semi join)
   ```
   +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Explain String                                                                                                                                                                                                       |
   +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   |                                          ┌────────────────┐                                                                                                                      |
   |                                          │[13: ResultSink]│                                                                                                                                                      |
   |                                          │[Fragment: 4]   │                                                                                                                                                      |
   |                                          │RESULT SINK     │                                                                                                                                                      |
   |                                          └────────────────┘                                                                                                                      |
   |                                                   │                                                                                                                                                                |
   |                                                   │                                                                                                                                                                |
   |                                       ┌──────────────────────┐                                                                                                       |
   |                                       │[13: MERGING-EXCHANGE]│                                                                                                                                                   |
   |                                       │[Fragment: 4]         │                                                                                                                                                   |
   |                                       └──────────────────────┘                                                                                                       |
   |                                                   │                                                                                                                                                                |
   |                                                   │                                                                                                                                                                |
   |                                        ┌────────────────────┐                                                                                                            |
   |                                        │[13: DataStreamSink]│                                                                                                                                                    |
   |                                        │[Fragment: 0]       │                                                                                                                                                    |
   |                                        │STREAM DATA SINK    │                                                                                                                                                    |
   |                                        │  EXCHANGE ID: 13   │                                                                                                                                                    |
   |                                        │  UNPARTITIONED     │                                                                                                                                                    |
   |                                        └────────────────────┘                                                                                                            |
   |                                                  ┌┘                                                                                                                                                              |
   |                                                  │                                                                                                                                                                 |
   |                                           ┌─────────────┐                                                                                                                              |
   |                                           │[9: TOP-N]   │                                                                                                                                                        |
   |                                           │[Fragment: 0]│                                                                                                                                                        |
   |                                           └─────────────┘                                                                                                                              |
   |                                                  └┐                                                                                                                                                              |
   |                                                   │                                                                                                                                                                |
   |                                  ┌────────────────────────────────┐                                                                              |
   |                                  │[8: AGGREGATE (update finalize)]│                                                                                                                                              |
   |                                  │[Fragment: 0]                   │                                                                                                                                              |
   |                                  └────────────────────────────────┘                                                                              |
   |                                                  ┌┘                                                                                                                                                              |
   |                                                  │                                                                                                                                                                 |
   |                                ┌───────────────────────────────────┐                                                                       |
   |                                │[7: HASH JOIN]                     │                                                                                                                                             |
   |                                │[Fragment: 0]                      │                                                                                                                                             |
   |                                │join op: LEFT SEMI JOIN (BROADCAST)│                                                                                                                                             |
   |                                └───────────────────────────────────┘                                                                       |
   |                                 ┌────────────────┴─────────────────────────────────┐                         |
   |                                 │                                                  │                                                                                                                             |
   |                 ┌───────────────────────────────┐                          ┌──────────────┐                        |
   |                 │[4: HASH JOIN]                 │                          │[12: EXCHANGE]│                                                                                                                  |
   |                 │[Fragment: 0]                  │                          │[Fragment: 0] │                                                                                                                  |
   |                 │join op: INNER JOIN (BROADCAST)│                          └──────────────┘                                                                                      |
   |                 └───────────────────────────────┘                                  │                                                             |
   |                     ┌───────────┴─────────────────────┐                            │                                                         |
   |                     │                                 │                 ┌────────────────────┐                                                                       |
   |     ┌───────────────────────────────┐         ┌──────────────┐          │[12: DataStreamSink]│                 |
   |     │[2: HASH JOIN]                 │         │[11: EXCHANGE]│          │[Fragment: 3]       │                                                                                                           |
   |     │[Fragment: 0]                  │         │[Fragment: 0] │          │STREAM DATA SINK    │                                                                                                           |
   |     │join op: INNER JOIN (BROADCAST)│         └──────────────┘          │  EXCHANGE ID: 12   │                                                                               |
   |     └───────────────────────────────┘                 │                 │  UNPARTITIONED     │                                               |
   |          ┌──────────┴──────────┐                      │                 └────────────────────┘                           |
   |          │                     │           ┌────────────────────┐                  │                                                                               |
   | ┌─────────────────┐    ┌──────────────┐    │[11: DataStreamSink]│                  │                                                     |
   | │[0: OlapScanNode]│    │[10: EXCHANGE]│    │[Fragment: 2]       │ ┌────────────────────────────────┐                                 |
   | │[Fragment: 0]    │    │[Fragment: 0] │    │STREAM DATA SINK    │ │[6: AGGREGATE (update finalize)]│                                                                                                 |
   | │TABLE: customer  │    └──────────────┘    │  EXCHANGE ID: 11   │ │[Fragment: 3]                   │                                                                     |
   | └─────────────────┘            │           │  UNPARTITIONED     │ └────────────────────────────────┘ |
   |                                │           └────────────────────┘                 ┌┘                                                                               |
   |                     ┌────────────────────┐           ┌┘                           │                                                                                |
   |                     │[10: DataStreamSink]│           │                   ┌─────────────────┐                                                                             |
   |                     │[Fragment: 1]       │  ┌─────────────────┐          │[5: OlapScanNode]│                                                                           |
   |                     │STREAM DATA SINK    │  │[3: OlapScanNode]│          │[Fragment: 3]    │                                                                                                             |
   |                     │  EXCHANGE ID: 10   │  │[Fragment: 2]    │          │TABLE: lineitem  │                                                                                                             |
   |                     │  UNPARTITIONED     │  │TABLE: lineitem  │          └─────────────────┘                                                                           |
   |                     └────────────────────┘  └─────────────────┘                                                                    |
   |                               ┌┘                                                                                                                                                                                 |
   |                               │                                                                                                                                                                                    |
   |                      ┌─────────────────┐                                                                                                                                       |
   |                      │[1: OlapScanNode]│                                                                                                                                                                         |
   |                      │[Fragment: 1]    │                                                                                                                                                                         |
   |                      │TABLE: orders    │                                                                                                                                                                         |
   |                      └─────────────────┘                                                                                                                                       |
   +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   ```
   
   3.2 Q18 优化后
   耗时15秒
   ```sql
   select
           c_name,
           c_custkey,
           o_orderkey,
           o_orderdate,
           o_totalprice,
           sum(l_quantity)
   from
           customer,
           (
              select o_orderkey, o_orderdate, o_totalprice, o_custkey
              from orders
              join (
                   select
                           l_orderkey
                   from
                           lineitem
                   group by l_orderkey
                   having sum(l_quantity) > 312
           	) a
             on orders.o_orderkey = a.l_orderkey
           ) o,
           lineitem
   where
           
           c_custkey = o_custkey
           and o_orderkey = l_orderkey
   group by
           c_name,
           c_custkey,
           o_orderkey,
           o_orderdate,
           o_totalprice
   order by
           o_totalprice desc,
           o_orderdate
   ```
   
   explain graph: (semi join变成了inner join)
   ```sql
   +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Explain String                                                                                                                                                                                   |
   +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   |                                        ┌────────────────┐                                                                                                    |
   |                                        │[13: ResultSink]│                                                                                                                                    |
   |                                        │[Fragment: 4]   │                                                                                                                                    |
   |                                        │RESULT SINK     │                                                                                                                                    |
   |                                        └────────────────┘                                                                                                    |
   |                                                 │                                                                                                                                              |
   |                                                 │                                                                                                                                              |
   |                                     ┌──────────────────────┐                                                                                     |
   |                                     │[13: MERGING-EXCHANGE]│                                                                                                                                 |
   |                                     │[Fragment: 4]         │                                                                                                                                 |
   |                                     └──────────────────────┘                                                                                     |
   |                                                 │                                                                                                                                              |
   |                                                 │                                                                                                                                              |
   |                                      ┌────────────────────┐                                                                                          |
   |                                      │[13: DataStreamSink]│                                                                                                                                  |
   |                                      │[Fragment: 0]       │                                                                                                                                  |
   |                                      │STREAM DATA SINK    │                                                                                                                                  |
   |                                      │  EXCHANGE ID: 13   │                                                                                                                                  |
   |                                      │  UNPARTITIONED     │                                                                                                                                  |
   |                                      └────────────────────┘                                                                                          |
   |                                                 │                                                                                                                                              |
   |                                                 │                                                                                                                                              |
   |                                          ┌─────────────┐                                                                                                           |
   |                                          │[9: TOP-N]   │                                                                                                                                     |
   |                                          │[Fragment: 0]│                                                                                                                                     |
   |                                          └─────────────┘                                                                                                           |
   |                                                 │                                                                                                                                              |
   |                                                 │                                                                                                                                              |
   |                                ┌────────────────────────────────┐                                                            |
   |                                │[8: AGGREGATE (update finalize)]│                                                                                                                            |
   |                                │[Fragment: 0]                   │                                                                                                                            |
   |                                └────────────────────────────────┘                                                            |
   |                                                 │                                                                                                                                              |
   |                                                 │                                                                                                                                              |
   |                                 ┌───────────────────────────────┐                                                              |
   |                                 │[7: HASH JOIN]                 │                                                                                                                            |
   |                                 │[Fragment: 0]                  │                                                                                                                            |
   |                                 │join op: INNER JOIN (BROADCAST)│                                                                                                                            |
   |                                 └───────────────────────────────┘                                                              |
   |                                     ┌───────────┴─────────────────────────────────────┐    |
   |                                     │                                                 │                                                                                                      |
   |                     ┌───────────────────────────────┐                         ┌──────────────┐ |
   |                     │[5: HASH JOIN]                 │                         │[12: EXCHANGE]│                                                                                           |
   |                     │[Fragment: 0]                  │                         │[Fragment: 0] │                                                                                           |
   |                     │join op: INNER JOIN (BROADCAST)│                         └──────────────┘                                                               |
   |                     └───────────────────────────────┘                                 │                                      |
   |          ┌──────────────────────────┴──────────┐                                      │                          |
   |          │                                     │                           ┌────────────────────┐                                                |
   | ┌─────────────────┐                    ┌──────────────┐                    │[12: DataStreamSink]│                      |
   | │[0: OlapScanNode]│                    │[11: EXCHANGE]│                    │[Fragment: 3]       │                                                                                    |
   | │[Fragment: 0]    │                    │[Fragment: 0] │                    │STREAM DATA SINK    │                                                                                    |
   | │TABLE: customer  │                    └──────────────┘                    │  EXCHANGE ID: 12   │                                                        |
   | └─────────────────┘                            │                           │  UNPARTITIONED     │                                                    |
   |                                                │                           └────────────────────┘                                                  |
   |                                     ┌────────────────────┐                           ┌┘                                                          |
   |                                     │[11: DataStreamSink]│                           │                                                                                                     |
   |                                     │[Fragment: 1]       │                  ┌─────────────────┐                                                        |
   |                                     │STREAM DATA SINK    │                  │[6: OlapScanNode]│                                                                                          |
   |                                     │  EXCHANGE ID: 11   │                  │[Fragment: 3]    │                                                                                          |
   |                                     │  UNPARTITIONED     │                  │TABLE: lineitem  │                                                                                          |
   |                                     └────────────────────┘                  └─────────────────┘                |
   |                                               ┌┘                                                                                                                                             |
   |                                               │                                                                                                                                                |
   |                               ┌───────────────────────────────┐                                                                |
   |                               │[4: HASH JOIN]                 │                                                                                                                              |
   |                               │[Fragment: 1]                  │                                                                                                                              |
   |                               │join op: INNER JOIN (BROADCAST)│                                                                                                                              |
   |                               └───────────────────────────────┘                                                                |
   |                              ┌────────────────┴──────────┐                                                                             |
   |                              │                           │                                                                                                                                   |
   |                     ┌─────────────────┐          ┌──────────────┐                                                          |
   |                     │[1: OlapScanNode]│          │[10: EXCHANGE]│                                                                                                                        |
   |                     │[Fragment: 1]    │          │[Fragment: 1] │                                                                                                                        |
   |                     │TABLE: orders    │          └──────────────┘                                                                                            |
   |                     └─────────────────┘                  │                                                                                               |
   |                                                          │                                                                                                                                     |
   |                                               ┌────────────────────┐                                                                                 |
   |                                               │[10: DataStreamSink]│                                                                                                                         |
   |                                               │[Fragment: 2]       │                                                                                                                         |
   |                                               │STREAM DATA SINK    │                                                                                                                         |
   |                                               │  EXCHANGE ID: 10   │                                                                                                                         |
   |                                               │  UNPARTITIONED     │                                                                                                                         |
   |                                               └────────────────────┘                                                                                 |
   |                                                          │                                                                                                                                     |
   |                                                          │                                                                                                                                     |
   |                                         ┌────────────────────────────────┐                                                   |
   |                                         │[3: AGGREGATE (update finalize)]│                                                                                                                   |
   |                                         │[Fragment: 2]                   │                                                                                                                   |
   |                                         └────────────────────────────────┘                                                   |
   |                                                         ┌┘                                                                                                                                   |
   |                                                         │                                                                                                                                      |
   |                                                ┌─────────────────┐                                                                                         |
   |                                                │[2: OlapScanNode]│                                                                                                                           |
   |                                                │[Fragment: 2]    │                                                                                                                           |
   |                                                │TABLE: lineitem  │                                                                                                                           |
   |                                                └─────────────────┘                                                                                         |
   +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   ```
   
   优化效果: 29秒 -> 15秒
   
   另外Q18的涉及4表join(customer, orders, lineitem, lineitem),最优的join顺序是 `(customer join (orders join lineitem)) join lineitem`,原因是orders join lineitem可以把orders表的大量数据过滤掉,减少join中间结果。不同join顺序大概会有2秒到5秒的差距(15秒到20秒)。
   
   
   # 3. 总结
   1. semi join转inner join可能有大幅度的优化
   2. semi join转inner join后可以结合join reorder对join顺序进行进一步优化
   3. 公共子查询**可能**可以通过物化来提速,取决于子查询的复杂程度,如果子查询选择率较高,或计算逻辑较多,物化是值得的;反之物化简单的子查询可能会让后续的查询block住。可能需要cbo来判断
   
   ### Use case
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] EmmyMiao87 commented on issue #7212: [Observation] TPC-H Q17、Q18优化调研:关联子查询和非关联子查询优化

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on issue #7212:
URL: https://github.com/apache/incubator-doris/issues/7212#issuecomment-981237887


   这个查询规划如果顶层 join 改为shuffle 后,不是会传输更多的数据?为何从 broadcast 改为shuffle 后反而效率变高了?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] EmmyMiao87 commented on issue #7212: [Observation] TPC-H Q17、Q18优化调研:semi-join改inner-join,物化公共子查询

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on issue #7212:
URL: https://github.com/apache/incubator-doris/issues/7212#issuecomment-977737394


   可以看看orca 在这个 query 上的优化是咋做的嘛?他如果可以的话,主要是在优化器的哪个节点进行的调整呢?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] 924060929 commented on issue #7212: [Observation] TPC-H Q17、Q18优化调研:关联子查询和非关联子查询优化

Posted by GitBox <gi...@apache.org>.
924060929 commented on issue #7212:
URL: https://github.com/apache/incubator-doris/issues/7212#issuecomment-982365991


   > 这个查询规划如果顶层 join 改为shuffle 后,不是会传输更多的数据?为何从 broadcast 改为shuffle 后反而效率变高了?
   
   新版runtime filter默认type是in,在shuffle join中in会隐式转成bloom filter,提高了性能。但broadcast join中runtime filter还是保持了in,所以相比性能较低。
   
   后续尝试增加一个in_or_bloomfilter的类型,让shuffle/broadcast join的runtime filter行为保持一致,并且去掉在in类型下隐式转换bloom filter的行为


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org