You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2016/02/29 11:39:18 UTC
[jira] [Created] (DRILL-4453) Difference in results over char data,
window function query
Khurram Faraaz created DRILL-4453:
-------------------------------------
Summary: Difference in results over char data, window function query
Key: DRILL-4453
URL: https://issues.apache.org/jira/browse/DRILL-4453
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Affects Versions: 1.6.0
Environment: 4 node cluster
Reporter: Khurram Faraaz
Window function query with frame clause returns results that are different from those returned by same query on Postgres 9.3 of same data.
The difference is that there are two extra nulls returned by Drill where as Postgres does not. Note that the two tables have same number of nulls in both Drill and Postgres.
{noformat}
postgres=# \d t_alltype
Table "public.t_alltype"
Column | Type | Modifiers
--------+-----------------------------+-----------
c1 | integer |
c2 | integer |
c3 | bigint |
c4 | character(256) |
c5 | character varying(256) |
c6 | timestamp without time zone |
c7 | date |
c8 | boolean |
c9 | double precision |
postgres=# select c4 from t_alltype where c4 is null;
c4
----
(3 rows)
{noformat}
postgres=# SELECT MIN(c4) OVER(PARTITION BY c8 ORDER BY c1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM t_alltype;
min
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
gwfrW
ZAFOcferhjkcl
ZAFOcferhjkcl
ZAFOcferhjkcl
ZAFOcferhjkcl
...
...
ApKK
ApKK
(145 rows)
{noformat}
Parquet schema details
{noformat}
[root@centos-01 parquet-tools]# ./parquet-schema ./Datasources/window_functions/t_alltype.parquet
message root {
optional int32 c1;
optional int32 c2;
optional int64 c3;
optional binary c4 (UTF8);
optional binary c5 (UTF8);
optional int64 c6 (TIMESTAMP_MILLIS);
optional int32 c7 (DATE);
optional boolean c8;
optional double c9;
}
{noformat}
On Drill 1.6.0
{noformat}
0: jdbc:drill:schema=dfs.tmp> SELECT MIN(c4) OVER(PARTITION BY c8 ORDER BY c1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM dfs.tmp.`t_alltype`;
+--------------------------------+
| EXPR$0 |
+--------------------------------+
| gwfrW |
| ZAFOcferhjkcl |
| ZAFOcferhjkcl |
| ZAFOcferhjkcl |
| ZAFOcferhjkcl |
...
...
| ApKK |
| ApKK |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| null |
| null |
| |
| |
| |
+------------------------------+
145 rows selected (0.409 seconds)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)