Posted to dev@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2016/02/03 01:16:40 UTC
[jira] [Created] (DRILL-4345) Hive Native Reader reporting wrong
results for timestamp column in hive generated parquet file
Rahul Challapalli created DRILL-4345:
----------------------------------------
Summary: Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file
Key: DRILL-4345
URL: https://issues.apache.org/jira/browse/DRILL-4345
Project: Apache Drill
Issue Type: Bug
Components: Storage - Hive, Storage - Parquet
Reporter: Rahul Challapalli
Priority: Critical
git.commit.id.abbrev=1b96174
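The Hive plugin and the native Parquet reader return different results for the timestamp column of the same table. The first query below uses the Hive plugin; the second runs after enabling `store.hive.optimize_scan_with_native_readers`.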
{code}
0: jdbc:drill:zk=10.10.100.190:5181> use hive;
+-------+-----------------------------------+
| ok | summary |
+-------+-----------------------------------+
| true | Default schema changed to [hive] |
+-------+-----------------------------------+
1 row selected (0.415 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
+----------+------------------------+
| int_col | timestamp_col |
+----------+------------------------+
| 1 | null |
| null | 1997-01-02 00:00:00.0 |
| 3 | null |
| 4 | null |
| 5 | 1997-02-10 17:32:00.0 |
| 6 | 1997-02-11 17:32:01.0 |
| 7 | 1997-02-12 17:32:01.0 |
| 8 | 1997-02-13 17:32:01.0 |
| 9 | null |
| 10 | 1997-02-15 17:32:01.0 |
| null | 1997-02-16 17:32:01.0 |
| 12 | 1897-02-18 17:32:01.0 |
| 13 | 2002-02-14 17:32:01.0 |
| 14 | 1991-02-10 17:32:01.0 |
| 15 | 1900-02-16 17:32:01.0 |
| 16 | null |
| null | 1897-02-16 17:32:01.0 |
| 18 | 1997-02-16 17:32:01.0 |
| null | null |
| 20 | 1996-02-28 17:32:01.0 |
| null | null |
+----------+------------------------+
21 rows selected (0.368 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set `store.hive.optimize_scan_with_native_readers` = true;
+-------+--------------------------------------------------------+
| ok | summary |
+-------+--------------------------------------------------------+
| true | store.hive.optimize_scan_with_native_readers updated. |
+-------+--------------------------------------------------------+
1 row selected (0.213 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
+----------+------------------------+
| int_col | timestamp_col |
+----------+------------------------+
| 1 | null |
| null | 1997-01-02 00:00:00.0 |
| 3 | 1997-02-10 17:32:00.0 |
| 4 | null |
| 5 | 1997-02-11 17:32:01.0 |
| 6 | 1997-02-12 17:32:01.0 |
| 7 | 1997-02-13 17:32:01.0 |
| 8 | 1997-02-15 17:32:01.0 |
| 9 | 1997-02-16 17:32:01.0 |
| 10 | 1900-02-16 17:32:01.0 |
| null | 1897-02-16 17:32:01.0 |
| 12 | 1997-02-16 17:32:01.0 |
| 13 | 1996-02-28 17:32:01.0 |
| 14 | 1997-01-02 00:00:00.0 |
| 15 | 1997-01-02 00:00:00.0 |
| 16 | 1997-01-02 00:00:00.0 |
| null | 1997-01-02 00:00:00.0 |
| 18 | 1997-01-02 00:00:00.0 |
| null | 1997-01-02 00:00:00.0 |
| 20 | 1997-01-02 00:00:00.0 |
| null | 1997-01-02 00:00:00.0 |
+----------+------------------------+
21 rows selected (0.352 seconds)
{code}
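To make the discrepancy easier to see, here is a minimal Python sketch that compares the two timestamp_col result sets row by row. The values are transcribed from the two query outputs above; the helper name `diff_columns` is hypothetical and not part of Drill. It only reports where the outputs disagree and makes no claim about the root cause.

```python
# Values of timestamp_col transcribed from the two outputs above (None = null).
hive_plugin = [
    None, "1997-01-02 00:00:00.0", None, None,
    "1997-02-10 17:32:00.0", "1997-02-11 17:32:01.0",
    "1997-02-12 17:32:01.0", "1997-02-13 17:32:01.0", None,
    "1997-02-15 17:32:01.0", "1997-02-16 17:32:01.0",
    "1897-02-18 17:32:01.0", "2002-02-14 17:32:01.0",
    "1991-02-10 17:32:01.0", "1900-02-16 17:32:01.0", None,
    "1897-02-16 17:32:01.0", "1997-02-16 17:32:01.0", None,
    "1996-02-28 17:32:01.0", None,
]

native_reader = [
    None, "1997-01-02 00:00:00.0", "1997-02-10 17:32:00.0", None,
    "1997-02-11 17:32:01.0", "1997-02-12 17:32:01.0",
    "1997-02-13 17:32:01.0", "1997-02-15 17:32:01.0",
    "1997-02-16 17:32:01.0", "1900-02-16 17:32:01.0",
    "1897-02-16 17:32:01.0", "1997-02-16 17:32:01.0",
    "1996-02-28 17:32:01.0",
] + ["1997-01-02 00:00:00.0"] * 8


def diff_columns(expected, actual):
    """Return (row_number, expected, actual) for every row that differs.

    Row numbers are 1-based positions in the printed output, not int_col values.
    """
    if len(expected) != len(actual):
        raise ValueError("result sets have different row counts")
    return [
        (i, e, a)
        for i, (e, a) in enumerate(zip(expected, actual), start=1)
        if e != a
    ]


mismatches = diff_columns(hive_plugin, native_reader)

if __name__ == "__main__":
    for row, e, a in mismatches:
        print(f"row {row:2d}: hive plugin={e!s:<23} native reader={a}")
    print(f"{len(mismatches)} of {len(hive_plugin)} rows differ")
    print(f"nulls: hive plugin={hive_plugin.count(None)}, "
          f"native reader={native_reader.count(None)}")
```

Comparing null counts alongside the per-row differences shows that the native reader output contains fewer nulls than the Hive plugin output, which may help narrow down where the two readers diverge.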
DDL for the Hive table:
{code}
create external table hive1_fewtypes_null_parquet (
int_col int,
bigint_col bigint,
date_col string,
time_col string,
timestamp_col timestamp,
interval_col string,
varchar_col string,
float_col float,
double_col double,
bool_col boolean
)
stored as parquet
location '/drill/testdata/hive_storage/hive1_fewtypes_null';
{code}
The underlying Parquet file is attached.