Posted to dev@drill.apache.org by "Abhishek Ravi (JIRA)" <ji...@apache.org> on 2019/01/08 00:36:00 UTC
[jira] [Created] (DRILL-6949) Query fails with
"UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data
any further" when Semi join is enabled
Abhishek Ravi created DRILL-6949:
------------------------------------
Summary: Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled
Key: DRILL-6949
URL: https://issues.apache.org/jira/browse/DRILL-6949
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.15.0
Reporter: Abhishek Ravi
The following query fails with *Error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates)* on TPC-H SF100 data.
{code:sql}
set `exec.hashjoin.enable.runtime_filter` = true;
set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
set `planner.enable_broadcast_join` = false;
select
  count(*)
from
  lineitem l1
where
  l1.l_discount IN (
    select
      distinct(cast(l2.l_discount as double))
    from
      lineitem l2);
reset `exec.hashjoin.enable.runtime_filter`;
reset `exec.hashjoin.runtime_filter.max.waiting.time`;
reset `planner.enable_broadcast_join`;
{code}
The subquery contains the *distinct* keyword, so its output should not contain duplicate values.
I suspect that the failure is caused by the semi join, because the query succeeds when semi join is explicitly disabled.
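For reference, a minimal sketch of the comparison run with semi join disabled. It assumes the planner option is named `planner.enable_semijoin`, which is not stated above, and keeps the other settings from the failing run:
{code:sql}
-- Assumption: `planner.enable_semijoin` is the option that controls semi join planning.
set `planner.enable_semijoin` = false;
set `exec.hashjoin.enable.runtime_filter` = true;
set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
set `planner.enable_broadcast_join` = false;

-- Same query as in the failing run.
select
  count(*)
from
  lineitem l1
where
  l1.l_discount IN (
    select
      distinct(cast(l2.l_discount as double))
    from
      lineitem l2);

-- Restore the defaults afterwards.
reset `planner.enable_semijoin`;
reset `exec.hashjoin.enable.runtime_filter`;
reset `exec.hashjoin.runtime_filter.max.waiting.time`;
reset `planner.enable_broadcast_join`;
{code}
Comparing the plans of the two runs (for example with *explain plan for*) should show whether the hash join is planned as a semi join in the failing case.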