You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Deneche A. Hakim (JIRA)" <ji...@apache.org> on 2016/03/10 07:22:40 UTC
[jira] [Assigned] (DRILL-4494) Window sum over integer column returns incorrect results.

     [ https://issues.apache.org/jira/browse/DRILL-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim reassigned DRILL-4494:
---------------------------------------

    Assignee: Deneche A. Hakim

> Window sum over integer column returns incorrect results.
> ---------------------------------------------------------
>
>                 Key: DRILL-4494
>                 URL: https://issues.apache.org/jira/browse/DRILL-4494
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.6.0
>         Environment: 4 node cluster CentOS
>            Reporter: Khurram Faraaz
>            Assignee: Deneche A. Hakim
>            Priority: Critical
>              Labels: window_function
>             Fix For: 1.7.0
>
>         Attachments: 0_0_0.parquet, t_alltype.csv
>
>
> Window sum over integer column returns incorrect results.
> Drill 1.6.0, git commit ID : git.commit.id=64ab0a8e
> Note that Drill returns the same value for SUM(c2) in the below query.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select SUM(c2) OVER w as w_sum from ( SELECT * FROM t_alltype ORDER BY c1,c2 DESC NULLS LAST ) subquery WINDOW w AS (PARTITION BY c8 ORDER BY c2 DESC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
> +---------------+
> |     w_sum     |
> +---------------+
> | -16323034011  |
> | -16323034011  |
> | -16323034011  |
> ...
> | -16323034011  |
> | -16323034011  |
> | -16323034011  |
> | -16323034011  |
> | -16323034011  |
> +---------------+
> 145 rows selected (0.683 seconds)
> {noformat}
> Postgres 9.3
> {noformat}
> postgres=#   select SUM(c2) OVER w as w_sum from ( SELECT * FROM t_alltype ORDER BY c1,c2 DESC NULLS LAST ) subquery WINDOW w AS (PARTITION BY c8 ORDER BY c2 DESC NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
>     w_sum
> -------------
>  -8995559793
>  -8995559793
>  -8995559793
>  -8995559793
>  -8995559793
>  -8995559793
>  -8995559793
>  ...
>  -9169127412
>  -9169127412
>  -9169127412
>  -9169127412
>  -9169127412
>  -9169127412
>  -9169127412
>  -9169127412
>  ...
>   1841653194
>   1841653194
>   1841653194
>   1841653194
>   1841653194
> (145 rows)
> {noformat}
> Both in Drill and on Postgres there are 142 distinct rows and three nulls of the 145 total rows, in column c2.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select count(distinct c2) from t_alltype;
> +---------+
> | EXPR$0  |
> +---------+
> | 142     |
> +---------+
> 1 row selected (0.682 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select c2 from t_alltype where c2 is null;
> +-------+
> |  c2   |
> +-------+
> | null  |
> | null  |
> | null  |
> +-------+
> 3 rows selected (0.389 seconds)
> {noformat}
> {noformat}
> postgres=#  select count(distinct c2) from t_alltype;
>  count
> -------
>    142
> (1 row)
> postgres=# select c2 from t_alltype where c2 is null;
>  c2
> ----
> (3 rows)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)