You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Alexander Behm (JIRA)" <ji...@apache.org> on 2018/01/13 01:24:00 UTC

[jira] [Resolved] (IMPALA-6329) Wrong results for complex query with CTE, limit, group by and left join

     [ https://issues.apache.org/jira/browse/IMPALA-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Behm resolved IMPALA-6329.
------------------------------------
    Resolution: Not A Bug

Query contains a limit without order by and so produces non-deterministic results.

> Wrong results for complex query with CTE, limit, group by and left join
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-6329
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6329
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.10.0
>            Reporter: Alex
>            Assignee: Alexander Behm
>            Priority: Blocker
>              Labels: correctness
>
> Impala may generate an incorrect plan for complex query.
> (see NULL in id and a_id columns)
> Can get correct result with commented lines (1, 2, 3, 4, 5)
> Example query and incorrect plan:
> {code}
> with test as (
>   select id, b as b from(
>    select 1 as id , 10 as b union all
>    select 2 as id , 20 as b union all
>    select 3 as id , 30 as b union all
>    select 4 as id , 40 as b union all
>    select 5 as id , 50 as b
>   ) t 
>   group by id, b --1
>   limit 3 --2
> ), 
> test2 as (
>   select 1 as id, 10 as a_id union all
>   select 2, 10 union all
>   select 3, 20 union all
>   select 4, 20 union all
>   select 5, 30 union all
>   select 6, 40
> )
> select * from test 
> left --3
> join
> (select id , a_id
>    from (select id, a_id
>            from test2
>           where id in (select id from test) --4
>           group by id, a_id) t
>   group by id, a_id --5
> ) e on test.id = e.id;
> {code}
> Result:
> {code}
> +----+----+------+------+
> | id | b  | id   | a_id |
> +----+----+------+------+
> | 2  | 20 | NULL | NULL |
> | 3  | 30 | 3    | 20   |
> | 5  | 50 | 5    | 30   |
> +----+----+------+------+
> {code}
> Plan:
> {code}
> +--------------------------------------------------+
> | Explain String                                   |
> +--------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=9.69MB |
> | Per-Host Resource Estimates: Memory=41.94MB      |
> | Codegen disabled by planner                      |
> |                                                  |
> | PLAN-ROOT SINK                                   |
> | |                                                |
> | 08:HASH JOIN [LEFT OUTER JOIN, BROADCAST]        |
> | |  hash predicates: id = id                      |
> | |                                                |
> | |--10:EXCHANGE [UNPARTITIONED]                   |
> | |  |                                             |
> | |  07:AGGREGATE [FINALIZE]                       |
> | |  |  group by: id, a_id                         |
> | |  |                                             |
> | |  06:AGGREGATE [FINALIZE]                       |
> | |  |  group by: id, a_id                         |
> | |  |                                             |
> | |  05:HASH JOIN [LEFT SEMI JOIN, BROADCAST]      |
> | |  |  hash predicates: id = id                   |
> | |  |                                             |
> | |  |--09:EXCHANGE [UNPARTITIONED]                |
> | |  |  |                                          |
> | |  |  04:AGGREGATE [FINALIZE]                    |
> | |  |  |  group by: id, b                         |
> | |  |  |  limit: 3                                |
> | |  |  |                                          |
> | |  |  03:UNION                                   |
> | |  |     constant-operands=5                     |
> | |  |                                             |
> | |  02:UNION                                      |
> | |     constant-operands=6                        |
> | |                                                |
> | 01:AGGREGATE [FINALIZE]                          |
> | |  group by: id, b                               |
> | |  limit: 3                                      |
> | |                                                |
> | 00:UNION                                         |
> |    constant-operands=5                           |
> +--------------------------------------------------+
> {code}
> Correct result:
> {code}
> with test as (
>   select id, b as b from(
>    select 1 as id , 10 as b union all
>    select 2 as id , 20 as b union all
>    select 3 as id , 30 as b union all
>    select 4 as id , 40 as b union all
>    select 5 as id , 50 as b
>   ) t 
>   --group by id, b --1
>   limit 3 --2
> ), 
> test2 as (
>   select 1 as id, 10 as a_id union all
>   select 2, 10 union all
>   select 3, 20 union all
>   select 4, 20 union all
>   select 5, 30 union all
>   select 6, 40
> )
> select * from test left join
> (select id , a_id
>    from (select id, a_id
>            from test2
>           where id in (select id from test) --3
>           group by id, a_id) t
>   group by id, a_id --4
> ) e on test.id = e.id;
> {code}
> Result:
> {code:java}
> +----+----+----+------+
> | id | b  | id | a_id |
> +----+----+----+------+
> | 1  | 10 | 1  | 10   |
> | 2  | 20 | 2  | 10   |
> | 3  | 30 | 3  | 20   |
> +----+----+----+------+
> {code}
> {code}
> +--------------------------------------------------+
> | Explain String                                   |
> +--------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=5.81MB |
> | Per-Host Resource Estimates: Memory=21.94MB      |
> | Codegen disabled by planner                      |
> |                                                  |
> | PLAN-ROOT SINK                                   |
> | |                                                |
> | 06:HASH JOIN [LEFT OUTER JOIN, BROADCAST]        |
> | |  hash predicates: id = id                      |
> | |                                                |
> | |--08:EXCHANGE [UNPARTITIONED]                   |
> | |  |                                             |
> | |  05:AGGREGATE [FINALIZE]                       |
> | |  |  group by: id, a_id                         |
> | |  |                                             |
> | |  04:AGGREGATE [FINALIZE]                       |
> | |  |  group by: id, a_id                         |
> | |  |                                             |
> | |  03:HASH JOIN [LEFT SEMI JOIN, BROADCAST]      |
> | |  |  hash predicates: id = id                   |
> | |  |                                             |
> | |  |--07:EXCHANGE [UNPARTITIONED]                |
> | |  |  |                                          |
> | |  |  02:UNION                                   |
> | |  |     constant-operands=5                     |
> | |  |     limit: 3                                |
> | |  |                                             |
> | |  01:UNION                                      |
> | |     constant-operands=6                        |
> | |                                                |
> | 00:UNION                                         |
> |    constant-operands=5                           |
> |    limit: 3                                      |
> +--------------------------------------------------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)