Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/04/28 02:18:06 UTC
[jira] [Closed] (DRILL-2203) DISTINCT over UNION ALL subquery with fully qualified column names returns wrong result

     [ https://issues.apache.org/jira/browse/DRILL-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victoria Markman closed DRILL-2203.
-----------------------------------

> DISTINCT over UNION ALL subquery with fully qualified column names returns wrong result
> ---------------------------------------------------------------------------------------
>
>                 Key: DRILL-2203
>                 URL: https://issues.apache.org/jira/browse/DRILL-2203
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.8.0
>            Reporter: Victoria Markman
>            Assignee: Sean Hsuan-Yi Chu
>            Priority: Critical
>             Fix For: 0.8.0
>
>         Attachments: t1.parquet, t2.parquet, t3.parquet, t4.parquet
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select a1, b1, c1 from t1 union all select a2, b2, c2 from t2;
> +------------+------------+------------+
> |     a1     |     b1     |     c1     |
> +------------+------------+------------+
> | 1          | aaaaa      | 2015-01-01 |
> | 2          | bbbbb      | 2015-01-02 |
> | 3          | ccccc      | 2015-01-03 |
> | 4          | null       | 2015-01-04 |
> | 5          | eeeee      | 2015-01-05 |
> | 6          | fffff      | 2015-01-06 |
> | 7          | ggggg      | 2015-01-07 |
> | null       | hhhhh      | 2015-01-08 |
> | 9          | iiiii      | null       |
> | 10         | jjjjj      | 2015-01-10 |
> | 0          | zzz        | 2014-12-31 |
> | 1          | aaaaa      | 2015-01-01 |
> | 2          | bbbbb      | 2015-01-02 |
> | 2          | bbbbb      | 2015-01-02 |
> | 2          | bbbbb      | 2015-01-02 |
> | 3          | ccccc      | 2015-01-03 |
> | 4          | ddddd      | 2015-01-04 |
> | 5          | eeeee      | 2015-01-05 |
> | 6          | fffff      | 2015-01-06 |
> | 7          | ggggg      | 2015-01-07 |
> | 7          | ggggg      | 2015-01-07 |
> | 8          | hhhhh      | 2015-01-08 |
> | 9          | iiiii      | 2015-01-09 |
> +------------+------------+------------+
> {code}
> Wrong result:
> {code}
> 0: jdbc:drill:schema=dfs> select distinct sq.x1, sq.x2, sq.x3 from ( select a1, b1, c1 from t1 union all select a2, b2, c2 from t2 ) as sq(x1,x2,x3);
> +------------+------------+------------+
> |     x1     |     x2     |     x3     |
> +------------+------------+------------+
> | null       | null       | null       |
> +------------+------------+------------+
> 1 row selected (0.127 seconds)
> {code}
> Query plan:
> {code}
> 00-01      Project(x1=[$0], x2=[$1], x3=[$2])
> 00-02        HashAgg(group=[{0, 1, 2}])
> 00-03          Project(x1=[$0], x2=[$1], x3=[$2])
> 00-04            UnionAll(all=[true])
> 00-06              Project(a1=[$2], b1=[$1], c1=[$0])
> 00-08                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/sanity/t1]], selectionRoot=/aggregation/sanity/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
> 00-05              Project(a2=[$1], b2=[$0], c2=[$2])
> 00-07                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/sanity/t2]], selectionRoot=/aggregation/sanity/t2, numFiles=1, columns=[`a2`, `b2`, `c2`]]])
> {code}
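
For comparison, here is a minimal sketch of what the DISTINCT query should return, using Python's built-in sqlite3 module rather than Drill. The rows are copied from the UNION ALL output above. Two things are assumptions and not part of the report: the split of those rows between t1 and t2 (first 10 rows vs. the remaining 13), and the use of a CTE column list in place of the derived-table alias sq(x1,x2,x3), which SQLite does not accept. The helper name distinct_over_union_all is hypothetical.

```python
import sqlite3

# Rows copied from the UNION ALL output in the report.
# ASSUMPTION: the first 10 rows belong to t1 and the remaining 13 to t2.
# The DISTINCT result does not depend on how the rows are split.
T1_ROWS = [
    (1, "aaaaa", "2015-01-01"),
    (2, "bbbbb", "2015-01-02"),
    (3, "ccccc", "2015-01-03"),
    (4, None, "2015-01-04"),
    (5, "eeeee", "2015-01-05"),
    (6, "fffff", "2015-01-06"),
    (7, "ggggg", "2015-01-07"),
    (None, "hhhhh", "2015-01-08"),
    (9, "iiiii", None),
    (10, "jjjjj", "2015-01-10"),
]
T2_ROWS = [
    (0, "zzz", "2014-12-31"),
    (1, "aaaaa", "2015-01-01"),
    (2, "bbbbb", "2015-01-02"),
    (2, "bbbbb", "2015-01-02"),
    (2, "bbbbb", "2015-01-02"),
    (3, "ccccc", "2015-01-03"),
    (4, "ddddd", "2015-01-04"),
    (5, "eeeee", "2015-01-05"),
    (6, "fffff", "2015-01-06"),
    (7, "ggggg", "2015-01-07"),
    (7, "ggggg", "2015-01-07"),
    (8, "hhhhh", "2015-01-08"),
    (9, "iiiii", "2015-01-09"),
]


def distinct_over_union_all():
    """Run the report's DISTINCT-over-UNION-ALL query in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    # Dates are stored as text; that is enough for equality comparison.
    conn.execute("CREATE TABLE t1 (a1 INTEGER, b1 TEXT, c1 TEXT)")
    conn.execute("CREATE TABLE t2 (a2 INTEGER, b2 TEXT, c2 TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", T1_ROWS)
    conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2_ROWS)
    # The CTE column list stands in for the alias sq(x1,x2,x3) used in the report.
    rows = conn.execute(
        """
        WITH sq(x1, x2, x3) AS (
            SELECT a1, b1, c1 FROM t1
            UNION ALL
            SELECT a2, b2, c2 FROM t2
        )
        SELECT DISTINCT sq.x1, sq.x2, sq.x3 FROM sq
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in distinct_over_union_all():
        print(row)
```

With the rows shown above, this yields 14 distinct rows and no all-null row, in contrast to the single (null, null, null) row returned by Drill. The count is derived here from the listed data; the report itself does not state an expected result.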


