Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/10/20 17:17:27 UTC

[jira] [Commented] (DRILL-3938) Hive: Failure reading from a partition when a new column is added to the table after the partition creation

    [ https://issues.apache.org/jira/browse/DRILL-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965248#comment-14965248 ] 

ASF GitHub Bot commented on DRILL-3938:
---------------------------------------

GitHub user vkorukanti opened a pull request:

    https://github.com/apache/drill/pull/211

    DRILL-3938: Support reading from Hive tables that have schema altered after the creation

    When the table schema is altered after a partition is created, Hive creates a converter from the partition schema to the table schema. The mapping from partition schema to table schema behaves as follows:
     - If a column doesn't exist in the partition schema, its value is treated as null.
     - If a column's type doesn't match the type required by the table schema, the value is converted using the conversion methods available in Hive.
    
    Currently we have to rely on the Hive converters, because Drill doesn't have the same conversion methods that Hive has [1].
    
    [1] https://github.com/apache/hive/blob/branch-1.0/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
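
The mapping rules described above can be illustrated with a minimal, self-contained Java sketch. It is not the code in this pull request and it does not use Hive's ObjectInspector converters. The class `PartitionToTableSchemaSketch`, the `ColType` enum, and the methods `convertRow` and `convertValue` are hypothetical names introduced only for illustration, and returning null for a value that cannot be parsed is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; this is neither Drill nor Hive code. */
class PartitionToTableSchemaSketch {

    /** Hypothetical, simplified set of column types. */
    enum ColType { INT, STRING }

    /**
     * Maps one partition row onto the (possibly newer) table schema.
     * partitionRow is keyed by column name and holds values as stored in the partition.
     */
    static List<Object> convertRow(Map<String, Object> partitionRow,
                                   Map<String, ColType> tableSchema) {
        List<Object> out = new ArrayList<>();
        for (Map.Entry<String, ColType> col : tableSchema.entrySet()) {
            // Rule 1: a column missing from the partition yields null.
            Object value = partitionRow.get(col.getKey());
            // Rule 2: a present value is converted to the table's column type.
            out.add(value == null ? null : convertValue(value, col.getValue()));
        }
        return out;
    }

    /** Converts a single value to the target type. */
    static Object convertValue(Object value, ColType target) {
        switch (target) {
            case INT:
                if (value instanceof Integer) {
                    return value;
                }
                try {
                    return Integer.valueOf(value.toString().trim());
                } catch (NumberFormatException e) {
                    return null; // assumption of this sketch: unparseable -> null
                }
            case STRING:
                return value.toString();
            default:
                throw new IllegalArgumentException("unsupported type: " + target);
        }
    }

    public static void main(String[] args) {
        // Partition created before "newcol" was added to the table.
        Map<String, Object> partitionRow = new LinkedHashMap<>();
        partitionRow.put("key", 238);
        partitionRow.put("value", "val_238");

        // Table schema after ALTER TABLE ... ADD COLUMNS (newcol STRING).
        Map<String, ColType> tableSchema = new LinkedHashMap<>();
        tableSchema.put("key", ColType.INT);
        tableSchema.put("value", ColType.STRING);
        tableSchema.put("newcol", ColType.STRING);

        System.out.println(convertRow(partitionRow, tableSchema)); // prints [238, val_238, null]
    }
}
```

Under these rules, a partition written before `newcol` was added yields null for that column instead of a column-not-found error.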
    
    Also:
    + Remove "redoRecord" logic which is not needed after "automatic reallocation" (DRILL-1960) changes.
    + Remove HiveTestRecordReader. Its implementation is incomplete and it is not used anywhere. It is currently just
      a burden to maintain whenever its superclass HiveRecordReader changes.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vkorukanti/drill DRILL-3938

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/211.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #211
    
----
commit 21c4325abcb25ce22846217e61eee23816f73571
Author: vkorukanti <ve...@gmail.com>
Date:   2015-10-19T18:35:09Z

    DRILL-3938: Support reading from Hive tables that have schema altered after the creation
    
    Also:
    + Remove "redoRecord" logic which is not needed after "automatic reallocation" (DRILL-1960) changes.
    + Remove HiveTestRecordReader. This is incomplete in implementation and not used anywhere. It is currently just
      a burden to maintain with changes in its superclass HiveRecordReader

----


> Hive: Failure reading from a partition when a new column is added to the table after the partition creation
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3938
>                 URL: https://issues.apache.org/jira/browse/DRILL-3938
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 0.4.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 1.3.0
>
>
> Repro:
> From Hive:
> {code}
> CREATE TABLE kv(key INT, value STRING);
> LOAD DATA LOCAL INPATH '/Users/hadoop/apache-repos/hive-install/apache-hive-1.0.0-bin/examples/files/kv1.txt' INTO TABLE kv;
> CREATE TABLE kv_p(key INT, value STRING, part1 STRING);
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions=10000;
> set hive.exec.max.dynamic.partitions.pernode=10000;
> INSERT INTO TABLE kv_p PARTITION (part1) SELECT key, value, value as s FROM kv;
> ALTER TABLE kv_p ADD COLUMNS (newcol STRING);
> {code}
> From Drill:
> {code}
> USE hive;
> DESCRIBE kv_p;
> SELECT newcol FROM kv_p;
> {code}
> The last query throws a "column 'newcol' not found" error in HiveRecordReader while selecting only the projected columns.


