Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/12/12 01:22:46 UTC
[jira] [Commented] (DRILL-4191) Last value function returns incorrect results.

    [ https://issues.apache.org/jira/browse/DRILL-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053840#comment-15053840 ] 

Khurram Faraaz commented on DRILL-4191:
---------------------------------------

Also, when we order by the same column that is used in the PARTITION BY clause (c1 in the query below), last_value returns the correct result.

{code}
0: jdbc:drill:schema=dfs.tmp> select c1, last_value(c2) over(partition by c1 order by c1) lstval from md_627;
+-----+-------------+
| c1  |   lstval    |
+-----+-------------+
| 1   | 2015-12-12  |
| 1   | 2015-12-12  |
| 1   | 2015-12-12  |
+-----+-------------+
3 rows selected (0.458 seconds)
{code}
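
For comparison, below is a minimal sketch that replays both ORDER BY variants outside Drill. It rests on assumptions: SQLite (via Python's sqlite3 module, SQLite 3.25 or later) stands in for Drill, dates are stored as text, and the helper name last_values is hypothetical. The plan in the description shows the frame "range between UNBOUNDED PRECEDING and CURRENT ROW", so the sketch runs the ORDER BY c2 query once with the default frame and once with an explicit frame that extends to UNBOUNDED FOLLOWING. It does not settle whether the Drill behavior is a bug; it only shows how the frame may influence last_value.

```python
import sqlite3

# Reproduction sketch in SQLite (not Drill); table and column names mirror the report.
conn = sqlite3.connect(":memory:")
conn.execute("create table md_627 (c1 integer, c2 text)")
conn.executemany(
    "insert into md_627 values (?, ?)",
    [(1, "2015-01-01"), (1, "2015-01-02"), (1, "2015-12-12")],
)


def last_values(order_col, frame=""):
    """Return the lstval column for a given ORDER BY column and optional frame clause."""
    sql = (
        "select last_value(c2) over (partition by c1 order by "
        f"{order_col} {frame}) as lstval from md_627 order by c2"
    )
    return [row[0] for row in conn.execute(sql)]


# Default frame: range between unbounded preceding and current row.
default_frame = last_values("c2")

# Explicit frame covering the whole partition.
explicit_frame = last_values(
    "c2", "range between unbounded preceding and unbounded following"
)

print(default_frame)
print(explicit_frame)
```

In SQLite the default frame yields a running value per row (matching the output in the description), while the explicit frame yields 2015-12-12 for every row. When ordering by c1, all three rows are peers, so the default frame already spans the whole partition.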

> Last value function returns incorrect results.
> ----------------------------------------------
>
>                 Key: DRILL-4191
>                 URL: https://issues.apache.org/jira/browse/DRILL-4191
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.4.0
>         Environment: 4 node cluster on CentOS
>            Reporter: Khurram Faraaz
>
> Last value function returns incorrect results.
> {noformat}
> [root@centos-01 ~]# cat MD627.csv
> 1,2015-01-01
> 1,2015-01-02
> 1,2015-12-12
> git commit id : b9068117
> create table md_627 as select cast(columns[0] as int) c1, cast(columns[1] as date) c2 from `MD627.csv`;
> [root@centos-01 parquet-tools]# ./parquet-schema ../md627/0_0_0.parquet
> message root {
>   optional int32 c1;
>   optional int32 c2 (DATE);
> }
> 0: jdbc:drill:schema=dfs.tmp> select * from md_627;
> +-----+-------------+
> | c1  |     c2      |
> +-----+-------------+
> | 1   | 2015-01-01  |
> | 1   | 2015-01-02  |
> | 1   | 2015-12-12  |
> +-----+-------------+
> 3 rows selected (0.265 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select c1, last_value(c2) over(partition by c1 order by c2) lstval from md_627;
> +-----+-------------+
> | c1  |   lstval    |
> +-----+-------------+
> | 1   | 2015-01-01  |
> | 1   | 2015-01-02  |
> | 1   | 2015-12-12  |
> +-----+-------------+
> 3 rows selected (0.405 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, last_value(c2) over(partition by c1 order by c2) lstval from md_627;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(c1=[$0], lstval=[$1])
> 00-02        Project(c1=[$0], lstval=[$1])
> 00-03          Project(c1=[$0], $1=[$2])
> 00-04            Window(window#0=[window(partition {0} order by [1] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [LAST_VALUE($1)])])
> 00-05              SelectionVectorRemover
> 00-06                Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
> 00-07                  Project(c1=[$1], c2=[$0])
> 00-08                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/md_627]], selectionRoot=maprfs:/tmp/md_627, numFiles=1, usedMetadataFile=false, columns=[`c1`, `c2`]]])
> {noformat}


