Posted to dev@drill.apache.org by "Abhishek Ravi (JIRA)" <ji...@apache.org> on 2019/01/08 00:36:00 UTC

[jira] [Created] (DRILL-6949) Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled

Abhishek Ravi created DRILL-6949:
------------------------------------

             Summary: Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled
                 Key: DRILL-6949
                 URL: https://issues.apache.org/jira/browse/DRILL-6949
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.15.0
            Reporter: Abhishek Ravi


The following query fails with *Error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates)* on TPC-H SF100 data.

{code:sql}

set `exec.hashjoin.enable.runtime_filter` = true;
set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
set `planner.enable_broadcast_join` = false;


select
  count(*)
from
  lineitem l1
where
  l1.l_discount IN (
    select
      distinct(cast(l2.l_discount as double))
    from
      lineitem l2);

reset `exec.hashjoin.enable.runtime_filter`;
reset `exec.hashjoin.runtime_filter.max.waiting.time`;
reset `planner.enable_broadcast_join`;

{code}

The subquery uses the *distinct* keyword, so the inner (build) side of the join should not contain duplicate join-key values.
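
One way to check this claim is to count the keys the subquery produces. This is only a sketch against the same lineitem table; I have not included its output here. In standard TPC-H data l_discount takes a small number of distinct values, so the result should be tiny compared to the table size.

{code:sql}
-- Number of distinct join keys the IN subquery would produce.
select
  count(distinct cast(l2.l_discount as double)) as distinct_keys
from
  lineitem l2;
{code}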

I suspect that the failure is caused by semijoin because the query succeeds when semijoin is disabled explicitly.
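
A sketch of that check, assuming the semi-join planner option is `planner.enable_semijoin` (the option name is not stated above, so treat it as an assumption and verify it in sys.options first):

{code:sql}
-- Assumption: semi-join planning is controlled by `planner.enable_semijoin`.
select * from sys.options where name like '%semijoin%';

-- Disable semi-join for this session, then rerun the failing query.
set `planner.enable_semijoin` = false;

-- Optionally compare plans with the option on and off.
explain plan for
select
  count(*)
from
  lineitem l1
where
  l1.l_discount IN (
    select
      distinct(cast(l2.l_discount as double))
    from
      lineitem l2);

reset `planner.enable_semijoin`;
{code}

Comparing the two plans should show whether the hash join is planned as a semi-join when the failure occurs.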
 

 


