Posted to dev@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2016/02/03 01:16:40 UTC
[jira] [Created] (DRILL-4345) Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file

Rahul Challapalli created DRILL-4345:
----------------------------------------

             Summary: Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file
                 Key: DRILL-4345
                 URL: https://issues.apache.org/jira/browse/DRILL-4345
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Hive, Storage - Parquet
            Reporter: Rahul Challapalli
            Priority: Critical


git.commit.id.abbrev=1b96174

The Hive plugin and the native reader return different results for the same table, as shown below.

{code}
0: jdbc:drill:zk=10.10.100.190:5181> use hive;
+-------+-----------------------------------+
|  ok   |              summary              |
+-------+-----------------------------------+
| true  | Default schema changed to [hive]  |
+-------+-----------------------------------+
1 row selected (0.415 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
+----------+------------------------+
| int_col  |     timestamp_col      |
+----------+------------------------+
| 1        | null                   |
| null     | 1997-01-02 00:00:00.0  |
| 3        | null                   |
| 4        | null                   |
| 5        | 1997-02-10 17:32:00.0  |
| 6        | 1997-02-11 17:32:01.0  |
| 7        | 1997-02-12 17:32:01.0  |
| 8        | 1997-02-13 17:32:01.0  |
| 9        | null                   |
| 10       | 1997-02-15 17:32:01.0  |
| null     | 1997-02-16 17:32:01.0  |
| 12       | 1897-02-18 17:32:01.0  |
| 13       | 2002-02-14 17:32:01.0  |
| 14       | 1991-02-10 17:32:01.0  |
| 15       | 1900-02-16 17:32:01.0  |
| 16       | null                   |
| null     | 1897-02-16 17:32:01.0  |
| 18       | 1997-02-16 17:32:01.0  |
| null     | null                   |
| 20       | 1996-02-28 17:32:01.0  |
| null     | null                   |
+----------+------------------------+
21 rows selected (0.368 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set `store.hive.optimize_scan_with_native_readers` = true;
+-------+--------------------------------------------------------+
|  ok   |                        summary                         |
+-------+--------------------------------------------------------+
| true  | store.hive.optimize_scan_with_native_readers updated.  |
+-------+--------------------------------------------------------+
1 row selected (0.213 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from hive1_fewtypes_null_parquet;
+----------+------------------------+
| int_col  |     timestamp_col      |
+----------+------------------------+
| 1        | null                   |
| null     | 1997-01-02 00:00:00.0  |
| 3        | 1997-02-10 17:32:00.0  |
| 4        | null                   |
| 5        | 1997-02-11 17:32:01.0  |
| 6        | 1997-02-12 17:32:01.0  |
| 7        | 1997-02-13 17:32:01.0  |
| 8        | 1997-02-15 17:32:01.0  |
| 9        | 1997-02-16 17:32:01.0  |
| 10       | 1900-02-16 17:32:01.0  |
| null     | 1897-02-16 17:32:01.0  |
| 12       | 1997-02-16 17:32:01.0  |
| 13       | 1996-02-28 17:32:01.0  |
| 14       | 1997-01-02 00:00:00.0  |
| 15       | 1997-01-02 00:00:00.0  |
| 16       | 1997-01-02 00:00:00.0  |
| null     | 1997-01-02 00:00:00.0  |
| 18       | 1997-01-02 00:00:00.0  |
| null     | 1997-01-02 00:00:00.0  |
| 20       | 1997-01-02 00:00:00.0  |
| null     | 1997-01-02 00:00:00.0  |
+----------+------------------------+
21 rows selected (0.352 seconds)
{code}
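
One way to quantify the discrepancy is to compare null counts under both settings. Going by the listings above, the Hive plugin output has 7 null timestamp_col values out of 21 rows, while the native reader output has only 2, so the non-null counts would be expected to differ. The queries below are only a sketch of such a check and were not run as part of this report; they use the table and session option shown above and assume the hive schema is still selected.

{code}
-- Hive plugin path: turn the native reader optimization off
alter session set `store.hive.optimize_scan_with_native_readers` = false;
-- count(timestamp_col) counts only the non-null values
select count(*) as total_rows, count(timestamp_col) as non_null_ts from hive1_fewtypes_null_parquet;

-- Native parquet reader path: turn the optimization back on
alter session set `store.hive.optimize_scan_with_native_readers` = true;
select count(*) as total_rows, count(timestamp_col) as non_null_ts from hive1_fewtypes_null_parquet;
{code}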

DDL for the Hive table:
{code}
create external table hive1_fewtypes_null_parquet (
      int_col int,
      bigint_col bigint,
      date_col string,
      time_col string,
      timestamp_col timestamp,
      interval_col string,
      varchar_col string,
      float_col float,
      double_col double,
      bool_col boolean
    )
stored as parquet
location '/drill/testdata/hive_storage/hive1_fewtypes_null';
{code}

The underlying parquet file is attached.


