Posted to issues@drill.apache.org by "Krystal (JIRA)" <ji...@apache.org> on 2017/03/24 18:50:41 UTC
[jira] [Created] (DRILL-5381) convert_from(col, 'TIMESTAMP_IMPALA')
returns incorrect timestamp if there are multiple nulls
Krystal created DRILL-5381:
------------------------------
Summary: convert_from(col, 'TIMESTAMP_IMPALA') returns incorrect timestamp if there are multiple nulls
Key: DRILL-5381
URL: https://issues.apache.org/jira/browse/DRILL-5381
Project: Apache Drill
Issue Type: Bug
Components: Storage - Parquet
Affects Versions: 1.9.0, 1.8.0, 1.10.0
Reporter: Krystal
In drill-1.10, setting `store.parquet.reader.int96_as_timestamp`=true returns expected data:
select voter_id,create_timestamp from dfs.`/user/hive/warehouse/voter_hive_parquet` limit 15;
+-----------+------------------------+
| voter_id  | create_timestamp       |
+-----------+------------------------+
| 1         | 2016-10-23 20:03:58.0  |
| 2         | null                   |
| 3         | 2016-09-09 12:01:18.0  |
| 4         | 2017-03-06 20:35:55.0  |
| 5         | 2017-01-20 22:32:43.0  |
| 6         | 2016-10-22 05:46:12.0  |
| 7         | 2016-09-19 10:21:29.0  |
| 8         | null                   |
| 9         | 2016-07-23 13:39:02.0  |
| 10        | 2017-01-28 17:27:19.0  |
| 11        | 2016-10-23 10:55:44.0  |
| 12        | 2016-06-07 22:44:03.0  |
| 13        | 2016-05-04 13:59:20.0  |
| 14        | 2016-11-08 17:20:14.0  |
| 15        | 2016-05-14 11:23:53.0  |
+-----------+------------------------+
However, setting `store.parquet.reader.int96_as_timestamp`=false returns incorrect timestamps once the query encounters the second null value:
select voter_id,convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from dfs.`/user/hive/warehouse/voter_hive_parquet` limit 15;
+-----------+------------------------+
| voter_id  | EXPR$1                 |
+-----------+------------------------+
| 1         | 2016-10-23 20:03:58.0  |
| 2         | null                   |
| 3         | 2016-09-09 12:01:18.0  |
| 4         | 2017-03-06 20:35:55.0  |
| 5         | 2017-01-20 22:32:43.0  |
| 6         | 2016-10-22 05:46:12.0  |
| 7         | 2016-09-19 10:21:29.0  |
| 8         | 2016-07-23 13:39:02.0  |
| 9         | 2016-10-23 10:55:44.0  |
| 10        | 2016-06-07 22:44:03.0  |
| 11        | 2016-05-04 13:59:20.0  |
| 12        | 2016-11-08 17:20:14.0  |
| 13        | 2016-05-14 11:23:53.0  |
| 14        | 2016-06-20 16:18:51.0  |
| 15        | 2016-09-09 10:02:28.0  |
+-----------+------------------------+
Notice that the timestamp for voter_id=9 shifts to voter_id=8, which is supposed to be null. All timestamps after voter_id=7 are incorrect. This issue is also reproducible on drill-1.8.0 and drill-1.9.0.
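The two runs above can be sketched as one script. This is a minimal sketch that assumes the option is toggled at session level with ALTER SESSION; the report does not say whether the session or system scope was used.

```sql
-- Assumption: session-level option change; the report does not state the scope used.

-- Run 1: let the Parquet reader return INT96 values as timestamps (expected data).
ALTER SESSION SET `store.parquet.reader.int96_as_timestamp` = true;
SELECT voter_id, create_timestamp
FROM dfs.`/user/hive/warehouse/voter_hive_parquet`
LIMIT 15;

-- Run 2: read INT96 as raw bytes and decode with convert_from (shifted timestamps).
ALTER SESSION SET `store.parquet.reader.int96_as_timestamp` = false;
SELECT voter_id, convert_from(create_timestamp, 'TIMESTAMP_IMPALA')
FROM dfs.`/user/hive/warehouse/voter_hive_parquet`
LIMIT 15;
```

Comparing the two result sets row by row on voter_id should show the first divergence at voter_id=8.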