You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Arina Ielchiieva (Jira)" <ji...@apache.org> on 2019/11/04 12:55:00 UTC

[jira] [Updated] (DRILL-6949) Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled

     [ https://issues.apache.org/jira/browse/DRILL-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-6949:
------------------------------------
    Fix Version/s:     (was: 1.17.0)

> Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6949
>                 URL: https://issues.apache.org/jira/browse/DRILL-6949
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning &amp; Optimization
>    Affects Versions: 1.15.0
>            Reporter: Abhishek Ravi
>            Assignee: Boaz Ben-Zvi
>            Priority: Major
>         Attachments: 23cc1240-74ff-a0c0-8cd5-938fc136e4e2.sys.drill, 23cc1369-0812-63ce-1861-872636571437.sys.drill
>
>
> Following query fails when with *Error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates)* on TPC-H SF100 data.
> {code:sql}
> set `exec.hashjoin.enable.runtime_filter` = true;
> set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
> set `planner.enable_broadcast_join` = false;
> select
>  count(*)
> from
>  lineitem l1
> where
>  l1.l_discount IN (
>  select
>  distinct(cast(l2.l_discount as double))
>  from
>  lineitem l2);
> reset `exec.hashjoin.enable.runtime_filter`;
> reset `exec.hashjoin.runtime_filter.max.waiting.time`;
> reset `planner.enable_broadcast_join`;
> {code}
> The subquery contains *distinct* keyword and hence there should not be duplicate values. 
> I suspect that the failure is caused by semijoin because the query succeeds when semijoin is disabled explicitly.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)