You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Alexander Behm (JIRA)" <ji...@apache.org> on 2018/01/13 01:24:00 UTC
[jira] [Resolved] (IMPALA-6329) Wrong results for complex query
with CTE, limit, group by and left join
[ https://issues.apache.org/jira/browse/IMPALA-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Behm resolved IMPALA-6329.
------------------------------------
Resolution: Not A Bug
Query contains a limit without order by and so produces non-deterministic results.
> Wrong results for complex query with CTE, limit, group by and left join
> -----------------------------------------------------------------------
>
> Key: IMPALA-6329
> URL: https://issues.apache.org/jira/browse/IMPALA-6329
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.10.0
> Reporter: Alex
> Assignee: Alexander Behm
> Priority: Blocker
> Labels: correctness
>
> Impala may generate an incorrect plan for complex query.
> (see NULL in id and a_id columns)
> Can get correct result with commented lines (1, 2, 3, 4, 5)
> Example query and incorrect plan:
> {code}
> with test as (
> select id, b as b from(
> select 1 as id , 10 as b union all
> select 2 as id , 20 as b union all
> select 3 as id , 30 as b union all
> select 4 as id , 40 as b union all
> select 5 as id , 50 as b
> ) t
> group by id, b --1
> limit 3 --2
> ),
> test2 as (
> select 1 as id, 10 as a_id union all
> select 2, 10 union all
> select 3, 20 union all
> select 4, 20 union all
> select 5, 30 union all
> select 6, 40
> )
> select * from test
> left --3
> join
> (select id , a_id
> from (select id, a_id
> from test2
> where id in (select id from test) --4
> group by id, a_id) t
> group by id, a_id --5
> ) e on test.id = e.id;
> {code}
> Result:
> {code}
> +----+----+------+------+
> | id | b | id | a_id |
> +----+----+------+------+
> | 2 | 20 | NULL | NULL |
> | 3 | 30 | 3 | 20 |
> | 5 | 50 | 5 | 30 |
> +----+----+------+------+
> {code}
> Plan:
> {code}
> +--------------------------------------------------+
> | Explain String |
> +--------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=9.69MB |
> | Per-Host Resource Estimates: Memory=41.94MB |
> | Codegen disabled by planner |
> | |
> | PLAN-ROOT SINK |
> | | |
> | 08:HASH JOIN [LEFT OUTER JOIN, BROADCAST] |
> | | hash predicates: id = id |
> | | |
> | |--10:EXCHANGE [UNPARTITIONED] |
> | | | |
> | | 07:AGGREGATE [FINALIZE] |
> | | | group by: id, a_id |
> | | | |
> | | 06:AGGREGATE [FINALIZE] |
> | | | group by: id, a_id |
> | | | |
> | | 05:HASH JOIN [LEFT SEMI JOIN, BROADCAST] |
> | | | hash predicates: id = id |
> | | | |
> | | |--09:EXCHANGE [UNPARTITIONED] |
> | | | | |
> | | | 04:AGGREGATE [FINALIZE] |
> | | | | group by: id, b |
> | | | | limit: 3 |
> | | | | |
> | | | 03:UNION |
> | | | constant-operands=5 |
> | | | |
> | | 02:UNION |
> | | constant-operands=6 |
> | | |
> | 01:AGGREGATE [FINALIZE] |
> | | group by: id, b |
> | | limit: 3 |
> | | |
> | 00:UNION |
> | constant-operands=5 |
> +--------------------------------------------------+
> {code}
> Correct result:
> {code}
> with test as (
> select id, b as b from(
> select 1 as id , 10 as b union all
> select 2 as id , 20 as b union all
> select 3 as id , 30 as b union all
> select 4 as id , 40 as b union all
> select 5 as id , 50 as b
> ) t
> --group by id, b --1
> limit 3 --2
> ),
> test2 as (
> select 1 as id, 10 as a_id union all
> select 2, 10 union all
> select 3, 20 union all
> select 4, 20 union all
> select 5, 30 union all
> select 6, 40
> )
> select * from test left join
> (select id , a_id
> from (select id, a_id
> from test2
> where id in (select id from test) --3
> group by id, a_id) t
> group by id, a_id --4
> ) e on test.id = e.id;
> {code}
> Result:
> {code:java}
> +----+----+----+------+
> | id | b | id | a_id |
> +----+----+----+------+
> | 1 | 10 | 1 | 10 |
> | 2 | 20 | 2 | 10 |
> | 3 | 30 | 3 | 20 |
> +----+----+----+------+
> {code}
> {code}
> +--------------------------------------------------+
> | Explain String |
> +--------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=5.81MB |
> | Per-Host Resource Estimates: Memory=21.94MB |
> | Codegen disabled by planner |
> | |
> | PLAN-ROOT SINK |
> | | |
> | 06:HASH JOIN [LEFT OUTER JOIN, BROADCAST] |
> | | hash predicates: id = id |
> | | |
> | |--08:EXCHANGE [UNPARTITIONED] |
> | | | |
> | | 05:AGGREGATE [FINALIZE] |
> | | | group by: id, a_id |
> | | | |
> | | 04:AGGREGATE [FINALIZE] |
> | | | group by: id, a_id |
> | | | |
> | | 03:HASH JOIN [LEFT SEMI JOIN, BROADCAST] |
> | | | hash predicates: id = id |
> | | | |
> | | |--07:EXCHANGE [UNPARTITIONED] |
> | | | | |
> | | | 02:UNION |
> | | | constant-operands=5 |
> | | | limit: 3 |
> | | | |
> | | 01:UNION |
> | | constant-operands=6 |
> | | |
> | 00:UNION |
> | constant-operands=5 |
> | limit: 3 |
> +--------------------------------------------------+
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)