Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/14 22:02:08 UTC

[GitHub] [arrow-datafusion] Dandandan commented on issue #2230: Panic while running inner join with predicate only on single relation

Dandandan commented on issue #2230:
URL: https://github.com/apache/arrow-datafusion/issues/2230#issuecomment-1099665076

   Reproducing:
   
   ```
   ❯ create table y as select 1 c;
   +---+
   | c |
   +---+
   | 1 |
   +---+
   1 row in set. Query took 0.005 seconds.
   ❯ select * from y t1 join y t2 on t1.c = 1 and  t2.c = 1;
   thread 'tokio-runtime-worker' panicked at 'index out of bounds: the len is 0 but the index is 0', /home/danielheres/Code/gdd/arrow-datafusion/datafusion/core/src/physical_plan/repartition.rs:349:39
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   ArrowError(ExternalError(ArrowError(ExternalError(Execution("Join Error: panic")))))
   ❯ explain select * from y t1 join y t2 on t1.c = 1 and  t2.c = 1;
   +---------------+-----------------------------------------------------------------+
   | plan_type     | plan                                                            |
   +---------------+-----------------------------------------------------------------+
   | logical_plan  | Projection: #t1.c, #t2.c                                        |
   |               |   Inner Join:                                                   |
   |               |     Filter: #t1.c = Int64(1)                                    |
   |               |       TableScan: t1 projection=Some([0])                        |
   |               |     Filter: #t2.c = Int64(1)                                    |
   |               |       TableScan: t2 projection=Some([0])                        |
   | physical_plan | ProjectionExec: expr=[c@0 as c, c@1 as c]                       |
   |               |   CoalesceBatchesExec: target_batch_size=4096                   |
   |               |     HashJoinExec: mode=Partitioned, join_type=Inner, on=[]      |
   |               |       CoalesceBatchesExec: target_batch_size=4096               |
   |               |         RepartitionExec: partitioning=Hash([], 16)              |
   |               |           CoalesceBatchesExec: target_batch_size=4096           |
   |               |             FilterExec: c@0 = 1                                 |
   |               |               RepartitionExec: partitioning=RoundRobinBatch(16) |
   |               |                 MemoryExec: partitions=1, partition_sizes=[1]   |
   |               |       CoalesceBatchesExec: target_batch_size=4096               |
   |               |         RepartitionExec: partitioning=Hash([], 16)              |
   |               |           CoalesceBatchesExec: target_batch_size=4096           |
   |               |             FilterExec: c@0 = 1                                 |
   |               |               RepartitionExec: partitioning=RoundRobinBatch(16) |
   |               |                 MemoryExec: partitions=1, partition_sizes=[1]   |
   |               |                                                                 |
   +---------------+-----------------------------------------------------------------+
   2 rows in set. Query took 0.007 seconds.
   ```
   
   I think the reason is that we are creating a hash join based on an empty list of keys (note `on=[]` and `Hash([], 16)` in the physical plan above).
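
A minimal standalone sketch of how this kind of panic can arise. It assumes (not verified against `repartition.rs`) that the hash partitioning code reads the row count from the first evaluated key column. `KeyColumns` and `partition_rows` are hypothetical names for illustration and are not DataFusion APIs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for evaluated key columns: one Vec<i64> per key.
/// All columns are assumed to have the same number of rows.
type KeyColumns = Vec<Vec<i64>>;

/// Assigns each row to a partition by hashing its key values.
/// Returns None when there are no key columns (or no partitions),
/// instead of indexing `keys[0]` unconditionally.
fn partition_rows(keys: &KeyColumns, num_partitions: usize) -> Option<Vec<usize>> {
    // With an empty key list, `keys[0].len()` would panic with
    // "index out of bounds: the len is 0 but the index is 0".
    let num_rows = keys.first()?.len();
    if num_partitions == 0 {
        return None;
    }
    let mut assignments = Vec::with_capacity(num_rows);
    for row in 0..num_rows {
        let mut hasher = DefaultHasher::new();
        for column in keys {
            column[row].hash(&mut hasher);
        }
        assignments.push((hasher.finish() % num_partitions as u64) as usize);
    }
    Some(assignments)
}
```

Calling `partition_rows(&vec![], 16)` returns `None` rather than panicking. In the planner, the analogous fix would presumably be to avoid a partitioned hash join when there are no equi-join keys, but that is a guess at this point.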

