Posted to issues@drill.apache.org by "Chris Westin (JIRA)" <ji...@apache.org> on 2015/04/01 17:17:53 UTC
[jira] [Commented] (DRILL-2264) Incorrect data when we use aggregate functions with flatten

    [ https://issues.apache.org/jira/browse/DRILL-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390774#comment-14390774 ] 

Chris Westin commented on DRILL-2264:
-------------------------------------

This might be related to DRILL-2232, which uses flatten in an ORDER BY clause.

> Incorrect data when we use aggregate functions with flatten
> -----------------------------------------------------------
>
>                 Key: DRILL-2264
>                 URL: https://issues.apache.org/jira/browse/DRILL-2264
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>            Reporter: Rahul Challapalli
>            Assignee: Jason Altekruse
>            Priority: Critical
>             Fix For: 0.9.0
>
>
> git.commit.id.abbrev=6676f2d
> Data set:
> {code}
> {
>   "uid":1,
>   "lst_lst" : [[1,2],[3,4]]
> }
> {
>   "uid":2,
>   "lst_lst" : [[1,2],[3,4]]
> }
> {code}
> The query below returns incorrect results:
> {code}
> select uid,MAX( flatten(lst_lst[1]) + flatten(lst_lst[0])) from `temp.json` group by uid, flatten(lst_lst[1]), flatten(lst_lst[0]);
> +------------+------------+
> |    uid     |   EXPR$1   |
> +------------+------------+
> | 1          | 6          |
> | 1          | 6          |
> | 1          | 6          |
> | 1          | 6          |
> | 2          | 6          |
> | 2          | 6          |
> | 2          | 6          |
> | 2          | 6          |
> +------------+------------+
> {code}
> However, if we use a subquery, Drill returns the correct data:
> {code}
> select uid, MAX(l1+l2) from (select uid,flatten(lst_lst[1]) l1, flatten(lst_lst[0]) l2 from `temp.json`) sub group by uid, l1, l2;
> +------------+------------+
> |    uid     |   EXPR$1   |
> +------------+------------+
> | 1          | 4          |
> | 1          | 5          |
> | 1          | 5          |
> | 1          | 6          |
> | 2          | 4          |
> | 2          | 5          |
> | 2          | 5          |
> | 2          | 6          |
> +------------+------------+
> {code}
> Also, using a single flatten yields correct results:
> {code}
> select uid,MAX(flatten(lst_lst[0])) from `temp.json` group by uid, flatten(lst_lst[0]);
> +------------+------------+
> |    uid     |   EXPR$1   |
> +------------+------------+
> | 1          | 1          |
> | 1          | 2          |
> | 2          | 1          |
> | 2          | 2          |
> +------------+------------+
> {code}
> Marked as critical since we return incorrect data. Let me know if you have any other questions.
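
For reference, the expected result can be modeled outside Drill. The following is a minimal Python sketch, not Drill code. It assumes that two flatten calls over the same row produce every (l1, l2) pair, which is what the subquery output in the report suggests. The names `records` and `expected_rows` are hypothetical and exist only for this illustration.

```python
from itertools import product

# The two records from the data set in the report.
records = [
    {"uid": 1, "lst_lst": [[1, 2], [3, 4]]},
    {"uid": 2, "lst_lst": [[1, 2], [3, 4]]},
]


def expected_rows(records):
    """Model: flatten both sublists, group by (uid, l1, l2), take MAX(l1 + l2)."""
    groups = {}
    for rec in records:
        # Assumption: flattening lst_lst[1] and lst_lst[0] in the same
        # projection yields the cross product of their elements.
        for l1, l2 in product(rec["lst_lst"][1], rec["lst_lst"][0]):
            key = (rec["uid"], l1, l2)
            total = l1 + l2
            groups[key] = max(groups.get(key, total), total)
    # One output row per group, sorted for a stable comparison.
    return sorted((uid, total) for (uid, _, _), total in groups.items())


if __name__ == "__main__":
    for uid, total in expected_rows(records):
        print(uid, total)
```

Under that assumption, each uid yields the sums 4, 5, 5 and 6, which matches the subquery result and not the rows of 6 returned by the direct query.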



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)