Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/12 07:13:32 UTC

[GitHub] [arrow-datafusion] pjmore removed a comment on issue #1293: sql planner creates cross join instead of inner join from select predicates

pjmore removed a comment on issue #1293:
URL: https://github.com/apache/arrow-datafusion/issues/1293#issuecomment-986080519


   I believe this is the same issue that affects TPC-H query 9: the planner does not recognize an inner join when the two tables are connected only through a third table. In #77 I sketched out a solution that I think should work; it gives the following plan:
   ```
   ❯ create table part as select 1 as p_partkey;
   0 rows in set. Query took 0.031 seconds.
   ❯ create table lineitem as select 1 as l_partkey, 2 as l_suppkey;
   0 rows in set. Query took 0.001 seconds.
   ❯ create table supplier as select 1 as s_suppkey;
   0 rows in set. Query took 0.001 seconds.
   ❯ explain select * from part, supplier, lineitem where p_partkey = l_partkey and s_suppkey = l_suppkey;
   +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                                                                                                               |
   +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Projection: #part.p_partkey, #lineitem.l_partkey, #lineitem.l_suppkey, #supplier.s_suppkey                                                         |
   |               |   Join: #lineitem.l_suppkey = #supplier.s_suppkey                                                                                                  |
   |               |     Join: #part.p_partkey = #lineitem.l_partkey                                                                                                    |
   |               |       TableScan: part projection=Some([0])                                                                                                         |
   |               |       TableScan: lineitem projection=Some([0, 1])                                                                                                  |
   |               |     TableScan: supplier projection=Some([0])                                                                                                       |
   | physical_plan | ProjectionExec: expr=[p_partkey@0 as p_partkey, l_partkey@1 as l_partkey, l_suppkey@2 as l_suppkey, s_suppkey@3 as s_suppkey]                      |
   |               |   CoalesceBatchesExec: target_batch_size=4096                                                                                                      |
   |               |     HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: "l_suppkey", index: 2 }, Column { name: "s_suppkey", index: 0 })]         |
   |               |       CoalesceBatchesExec: target_batch_size=4096                                                                                                  |
   |               |         RepartitionExec: partitioning=Hash([Column { name: "l_suppkey", index: 2 }], 12)                                                           |
   |               |           CoalesceBatchesExec: target_batch_size=4096                                                                                              |
   |               |             HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column { name: "p_partkey", index: 0 }, Column { name: "l_partkey", index: 0 })] |
   |               |               CoalesceBatchesExec: target_batch_size=4096                                                                                          |
   |               |                 RepartitionExec: partitioning=Hash([Column { name: "p_partkey", index: 0 }], 12)                                                   |
   |               |                   RepartitionExec: partitioning=RoundRobinBatch(12)                                                                                |
   |               |                     MemoryExec: partitions=1, partition_sizes=[1]                                                                                  |
   |               |               CoalesceBatchesExec: target_batch_size=4096                                                                                          |
   |               |                 RepartitionExec: partitioning=Hash([Column { name: "l_partkey", index: 0 }], 12)                                                   |
   |               |                   RepartitionExec: partitioning=RoundRobinBatch(12)                                                                                |
   |               |                     MemoryExec: partitions=1, partition_sizes=[1]                                                                                  |
   |               |       CoalesceBatchesExec: target_batch_size=4096                                                                                                  |
   |               |         RepartitionExec: partitioning=Hash([Column { name: "s_suppkey", index: 0 }], 12)                                                           |
   |               |           RepartitionExec: partitioning=RoundRobinBatch(12)                                                                                        |
   |               |             MemoryExec: partitions=1, partition_sizes=[1]                                                                                          |
   |               |                                                                                                                                                    |
   +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
   ```
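
To make "inner joins through a table" concrete, here is a minimal Python sketch of one way a planner could order comma-separated tables so that equality predicates become join keys. It is not DataFusion's code and not the change sketched in #77; the names `Scan`, `Join`, `connecting_keys`, `plan_joins` and `describe` are hypothetical, and the greedy strategy is an assumption made for illustration. Predicates that do not connect two different tables are simply ignored here; a real planner would keep them as filters.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple, Union

Column = Tuple[str, str]                # (table name, column name)
EquiPredicate = Tuple[Column, Column]   # left_col = right_col


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Join:
    left: "Plan"
    right: "Plan"
    on: Tuple[EquiPredicate, ...]       # empty tuple means a cross join


Plan = Union[Scan, Join]


def connecting_keys(table: str, joined: Set[str],
                    predicates: Sequence[EquiPredicate]) -> List[EquiPredicate]:
    """Return the predicates linking `table` to any already-joined table,
    oriented as (joined side, new table side)."""
    keys = []
    for a, b in predicates:
        if a[0] in joined and b[0] == table:
            keys.append((a, b))
        elif b[0] in joined and a[0] == table:
            keys.append((b, a))
    return keys


def plan_joins(tables: Sequence[str],
               predicates: Sequence[EquiPredicate]) -> Plan:
    """Greedily build a left-deep join tree, preferring tables that share an
    equality predicate with what has been joined so far (hypothetical helper)."""
    if not tables:
        raise ValueError("at least one table is required")
    plan: Plan = Scan(tables[0])
    joined = {tables[0]}
    remaining = list(tables[1:])
    while remaining:
        # Pick the first remaining table that connects to the joined set.
        choice = None
        for candidate in remaining:
            keys = connecting_keys(candidate, joined, predicates)
            if keys:
                choice = (candidate, keys)
                break
        if choice is None:
            # Nothing connects: fall back to a cross join with the next table.
            choice = (remaining[0], [])
        table, keys = choice
        remaining.remove(table)
        joined.add(table)
        plan = Join(plan, Scan(table), tuple(keys))
    return plan


def describe(plan: Plan) -> str:
    """Render the plan as a nested, SQL-like string."""
    if isinstance(plan, Scan):
        return plan.table
    left, right = describe(plan.left), describe(plan.right)
    if not plan.on:
        return f"({left} CROSS JOIN {right})"
    cond = " AND ".join(f"{l[0]}.{l[1]} = {r[0]}.{r[1]}" for l, r in plan.on)
    return f"({left} JOIN {right} ON {cond})"


# The query from the session above: part, supplier, lineitem.
example = plan_joins(
    ["part", "supplier", "lineitem"],
    [
        (("part", "p_partkey"), ("lineitem", "l_partkey")),
        (("supplier", "s_suppkey"), ("lineitem", "l_suppkey")),
    ],
)
print(describe(example))
```

In this sketch `supplier` is skipped on the first pass because nothing links it to `part` yet. `lineitem` is joined first, and `supplier` can then be attached with `l_suppkey = s_suppkey`, which is the same join shape as the logical plan shown above.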

