Posted to issues@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2016/11/11 23:22:58 UTC

[jira] [Closed] (DRILL-4342) Drill fails to read a date column from hive generated parquet

     [ https://issues.apache.org/jira/browse/DRILL-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Challapalli closed DRILL-4342.
------------------------------------

> Drill fails to read a date column from hive generated parquet
> -------------------------------------------------------------
>
>                 Key: DRILL-4342
>                 URL: https://issues.apache.org/jira/browse/DRILL-4342
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive, Storage - Parquet
>            Reporter: Rahul Challapalli
>         Attachments: fewtypes_null.parquet
>
>
> git.commit.id.abbrev=576271d
> Below is the Hive DDL (using Hive 1.2, which supports the date type in Parquet)
> {code}
> create external table hive1dot2_fewtypes_null (
>       int_col int,
>       bigint_col bigint,
>       date_col date,
>       time_col string,
>       timestamp_col timestamp,
>       interval_col string,
>       varchar_col string,
>       float_col float,
>       double_col double,
>       bool_col boolean
>     )
> stored as parquet
> location '/drill/testdata/hive_storage/hive1dot2_fewtypes_null';
> {code}
> Query using the Hive storage plugin (dates are read correctly)
> {code}
> select date_col from hive.hive1dot2_fewtypes_null;
> +-------------+
> |  date_col   |
> +-------------+
> | null        |
> | null        |
> | null        |
> | 1996-01-29  |
> | 1996-03-01  |
> | 1996-03-02  |
> | 1997-02-28  |
> | null        |
> | 1997-03-01  |
> | 1997-03-02  |
> | 2000-04-01  |
> | 2000-04-03  |
> | 2038-04-08  |
> | 2039-04-09  |
> | 2040-04-10  |
> | null        |
> | 1999-02-08  |
> | 1999-03-08  |
> | 1999-01-18  |
> | 2003-01-02  |
> | null        |
> +-------------+
> {code}
> Below is the output when reading the same data through the dfs Parquet reader (dates are wrong)
> {code}
> 0: jdbc:drill:zk=10.10.10.41:5181> select date_col from dfs.`/drill/testdata/hive_storage/hive1dot2_fewtypes_null`;
> +-------------+
> |  date_col   |
> +-------------+
> | null        |
> | null        |
> | null        |
> | 369-02-09  |
> | 369-03-12  |
> | 369-03-13  |
> | 368-03-11  |
> | null        |
> | 368-03-12  |
> | 368-03-13  |
> | 365-04-12  |
> | 365-04-14  |
> | 327-04-19  |
> | 326-04-20  |
> | 325-04-21  |
> | null        |
> | 366-02-19  |
> | 366-03-19  |
> | 366-01-29  |
> | 362-01-13  |
> | null        |
> +-------------+
> {code}
> I attached the Parquet file generated from Hive. Let me know if anything else is needed to reproduce this issue.
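
The two result sets can be compared row by row to check whether the wrong dates follow a pattern. The following is only a rough sketch, not part of the original report: it assumes both queries returned rows in the same order, and the helper names (`split_ymd`, `summarize`) are made up for illustration. The day shift is measured in the Hive row's year, which is an arbitrary choice.

```python
from datetime import date

# Non-null date_col values copied from the two result sets above, in row order.
# Assumption: both queries returned the rows in the same order.
HIVE_ROWS = [
    "1996-01-29", "1996-03-01", "1996-03-02", "1997-02-28", "1997-03-01",
    "1997-03-02", "2000-04-01", "2000-04-03", "2038-04-08", "2039-04-09",
    "2040-04-10", "1999-02-08", "1999-03-08", "1999-01-18", "2003-01-02",
]
DFS_ROWS = [
    "369-02-09", "369-03-12", "369-03-13", "368-03-11", "368-03-12",
    "368-03-13", "365-04-12", "365-04-14", "327-04-19", "326-04-20",
    "325-04-21", "366-02-19", "366-03-19", "366-01-29", "362-01-13",
]


def split_ymd(text):
    """Split 'Y-MM-DD' into integers; the year may have fewer than 4 digits."""
    year, month, day = (int(part) for part in text.split("-"))
    return year, month, day


def summarize(expected_rows, observed_rows):
    """Return the distinct year sums and month/day shifts across row pairs."""
    year_sums, day_shifts = set(), set()
    for expected, observed in zip(expected_rows, observed_rows):
        ey, em, ed = split_ymd(expected)
        oy, om, od = split_ymd(observed)
        year_sums.add(ey + oy)
        # Compare month/day inside the expected row's year (arbitrary reference).
        day_shifts.add((date(ey, om, od) - date(ey, em, ed)).days)
    return year_sums, day_shifts


if __name__ == "__main__":
    print(summarize(HIVE_ROWS, DFS_ROWS))
```

For the rows listed here, every pair gives the same year sum (2365) and the same month/day shift (11 days), so the wrong values look systematic rather than random. This does not identify the root cause; it only suggests the dfs reader applies some fixed transformation to the stored date values.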


