Posted to issues@drill.apache.org by "Krystal (JIRA)" <ji...@apache.org> on 2015/04/17 20:46:58 UTC

[jira] [Closed] (DRILL-1524) Data from hive parquet table is displayed as "null" when select all columns

     [ https://issues.apache.org/jira/browse/DRILL-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krystal closed DRILL-1524.
--------------------------

> Data from hive parquet table is displayed as "null" when select all columns 
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-1524
>                 URL: https://issues.apache.org/jira/browse/DRILL-1524
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Krystal
>            Assignee: Venki Korukanti
>             Fix For: 0.7.0
>
>         Attachments: 0003-DRILL-1524-Fix-Hive-Parquet-SerDe-reading-issue-when.patch
>
>
> git.commit.id.abbrev=42f0a7e
> From hive-13, I created a parquet table:
> hive> create table voter_parquet(voter_id int, name string, age tinyint, registration string, contributions float, voterzone smallint, create_time string) stored as parquet;
> hive> insert overwrite table voter_parquet select * from voter;
> I can select against this table from hive:
> hive> select * from voter_parquet limit 5;
> OK
> 1	nick miller	68	green	717.12	13809	2014-05-25 03:41:54
> 2	ulysses white	48	green	840.06	19451	2014-07-30 08:03:11
> 3	holly garcia	18	democrat	128.2	8750	2014-09-15 02:33:11
> 4	victor thompson	61	independent	721.6	20462	2014-06-17 13:04:09
> 5	luke allen	39	socialist	800.22	25151	2015-02-01 02:02:37
> I ran the same select from sqlline and got all nulls:
> 0: jdbc:drill:schema=hive> select * from voter_parquet limit 5;
> +------------+------------+------------+--------------+---------------+------------+-------------+
> |  voter_id  |    name    |    age     | registration | contributions | voterzone  | create_time |
> +------------+------------+------------+--------------+---------------+------------+-------------+
> | null       | null       | null       | null         | null          | null       | null        |
> | null       | null       | null       | null         | null          | null       | null        |
> | null       | null       | null       | null         | null          | null       | null        |
> | null       | null       | null       | null         | null          | null       | null        |
> | null       | null       | null       | null         | null          | null       | null        |
> +------------+------------+------------+--------------+---------------+------------+-------------+
> The same happens if I explicitly specify all the columns:
> 0: jdbc:drill:schema=hive> select voter_id, name, age, registration, contributions, voterzone, create_time from voter_parquet limit 2;
> +------------+------------+------------+--------------+---------------+------------+-------------+
> |  voter_id  |    name    |    age     | registration | contributions | voterzone  | create_time |
> +------------+------------+------------+--------------+---------------+------------+-------------+
> | null       | null       | null       | null         | null          | null       | null        |
> | null       | null       | null       | null         | null          | null       | null        |
> +------------+------------+------------+--------------+---------------+------------+-------------+
> However, if I select a few columns, then the data displays correctly:
> 0: jdbc:drill:schema=hive> select voter_id, name, age, registration from voter_parquet limit 5;
> +------------+-----------------+------------+--------------+
> |  voter_id  |      name       |    age     | registration |
> +------------+-----------------+------------+--------------+
> | 1          | nick miller     | 68         | green        |
> | 2          | ulysses white   | 48         | green        |
> | 3          | holly garcia    | 18         | democrat     |
> | 4          | victor thompson | 61         | independent  |
> | 5          | luke allen      | 39         | socialist    |
> +------------+-----------------+------------+--------------+
> 0: jdbc:drill:schema=hive> describe voter_parquet;
> +---------------+------------+-------------+
> |  COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE |
> +---------------+------------+-------------+
> | voter_id      | INTEGER    | YES         |
> | name          | VARCHAR    | YES         |
> | age           | TINYINT    | YES         |
> | registration  | VARCHAR    | YES         |
> | contributions | FLOAT      | YES         |
> | voterzone     | SMALLINT   | YES         |
> | create_time   | VARCHAR    | YES         |
> +---------------+------------+-------------+
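
The report shows that selecting the first four columns returns data while selecting all seven returns nulls, but it does not say where between those two the results change. One way to narrow that down is to run the query with progressively longer column lists and note the first one that comes back all null. The sketch below is only an illustration of that idea, not part of the original report or the attached patch: prefix_queries, all_null, first_all_null_prefix and the run_query callable are hypothetical names, and run_query stands in for whatever client you use to send SQL to Drill and get rows back as tuples. It also assumes the failure depends on the column list, which the report suggests but does not confirm.

```python
from typing import Callable, Iterator, Optional, Sequence, Tuple

# Column order taken from the "describe voter_parquet" output in the report.
COLUMNS = [
    "voter_id", "name", "age", "registration",
    "contributions", "voterzone", "create_time",
]


def prefix_queries(table: str, columns: Sequence[str],
                   limit: int = 5) -> Iterator[Tuple[int, str]]:
    """Yield (n, sql) pairs, where sql selects the first n columns of table."""
    for n in range(1, len(columns) + 1):
        cols = ", ".join(columns[:n])
        yield n, "select {} from {} limit {}".format(cols, table, limit)


def all_null(rows: Sequence[Sequence[object]]) -> bool:
    """True if there is at least one row and every value in every row is None."""
    return bool(rows) and all(value is None for row in rows for value in row)


def first_all_null_prefix(
    table: str,
    columns: Sequence[str],
    run_query: Callable[[str], Sequence[Sequence[object]]],
) -> Optional[int]:
    """Return the smallest column count whose result is all null, else None.

    run_query is a hypothetical callable supplied by the caller; it is
    assumed to execute the SQL and return the rows as sequences of values.
    """
    for n, sql in prefix_queries(table, columns):
        if all_null(run_query(sql)):
            return n
    return None
```

Calling first_all_null_prefix("voter_parquet", COLUMNS, run_query) with a real run_query would report the shortest column list that reproduces the problem, which could then be compared against the column types in the describe output.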


