Posted to issues@hive.apache.org by "Prasanth Jayachandran (JIRA)" <ji...@apache.org> on 2015/06/18 00:19:01 UTC

[jira] [Commented] (HIVE-11035) PPD: Orc Split elimination fails because filterColumns=[-1]

    [ https://issues.apache.org/jira/browse/HIVE-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590728#comment-14590728 ] 

Prasanth Jayachandran commented on HIVE-11035:
----------------------------------------------

This is a regression introduced by HIVE-7052. Because of it, the first column id is not pushed down properly, even though the column name is present. For the example in the description, the value of hive.io.file.readcolumn.ids in the conf is empty and the value of hive.io.file.readcolumn.names is ",x" (a comma is prepended). Making the CSV joiner more robust fixes the issue: the column ids are pushed down properly, and SARG then uses them to populate the filterColumns array.
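
To illustrate the kind of joiner problem described above, here is a minimal sketch. It is not the code from HIVE-11035.patch; the class and method names (ReadColumnListSketch, naiveAppend, safeAppend) are hypothetical. It only shows how appending to an empty comma-separated value can yield ",x", and one way to avoid that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual Hive code or the attached patch.
class ReadColumnListSketch {

    // Naive append: always inserts a comma, so an empty existing value
    // produces a leading comma such as ",x".
    static String naiveAppend(String existing, String added) {
        return existing + "," + added;
    }

    // Safer append: null or empty parts are skipped, so the result never
    // starts or ends with a stray comma.
    static String safeAppend(String existing, String added) {
        List<String> parts = new ArrayList<>();
        if (existing != null && !existing.isEmpty()) {
            parts.add(existing);
        }
        if (added != null && !added.isEmpty()) {
            parts.add(added);
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        System.out.println(naiveAppend("", "x")); // leading comma
        System.out.println(safeAppend("", "x"));  // no leading comma
        System.out.println(safeAppend("a", "x")); // normal two-element list
    }
}
```

With the safe variant, a reader that splits the value on commas never sees an empty first element, which is the property the column id list needs here.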

> PPD: Orc Split elimination fails because filterColumns=[-1]
> -----------------------------------------------------------
>
>                 Key: HIVE-11035
>                 URL: https://issues.apache.org/jira/browse/HIVE-11035
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.3.0, 2.0.0
>            Reporter: Gopal V
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-11035.patch
>
>
> {code}
> create temporary table xx (x int) stored as orc ;
> insert into xx values (20),(200);
> set hive.fetch.task.conversion=none;
> select * from xx where x is null;
> {code}
> This should generate zero tasks after optional split elimination in the app master, instead of generating the 1 task which for sure hits the row-index filters and removes all rows anyway.
> Right now, this runs 1 task for the stripe containing (min=20, max=200, has_null=false), which is broken.
> Instead of eliminating the split, it returns YES_NO_NULL from the following default case:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L976
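
The behaviour described in the quoted report can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic in OrcInputFormat: the types SplitEliminationSketch, ColumnStats and TruthValue (reduced here to three values) and the method names are hypothetical. It shows why an unresolved filter column (-1) forces the conservative answer, while a resolved column with has_null=false lets an "is null" predicate rule the stripe out.

```java
// Hypothetical sketch of stats-based elimination for "x IS NULL".
class SplitEliminationSketch {

    // Reduced set of outcomes, for illustration only.
    enum TruthValue { NO, YES_NO, YES_NO_NULL }

    // Per-column stripe statistics, as in (min=20, max=200, has_null=false).
    static final class ColumnStats {
        final long min;
        final long max;
        final boolean hasNull;

        ColumnStats(long min, long max, boolean hasNull) {
            this.min = min;
            this.max = max;
            this.hasNull = hasNull;
        }
    }

    // Evaluates "column IS NULL" against the stripe statistics.
    static TruthValue evaluateIsNull(ColumnStats[] stripeStats, int filterColumn) {
        if (filterColumn < 0 || filterColumn >= stripeStats.length) {
            // Column could not be resolved: nothing can be ruled out.
            return TruthValue.YES_NO_NULL;
        }
        // No nulls recorded means no row can satisfy IS NULL.
        return stripeStats[filterColumn].hasNull ? TruthValue.YES_NO : TruthValue.NO;
    }

    // A stripe (and hence a split) can be skipped only on a definite NO.
    static boolean canSkip(TruthValue result) {
        return result == TruthValue.NO;
    }

    public static void main(String[] args) {
        ColumnStats[] stats = { new ColumnStats(20, 200, false) };
        System.out.println(canSkip(evaluateIsNull(stats, 0)));  // resolved column
        System.out.println(canSkip(evaluateIsNull(stats, -1))); // filterColumns=[-1]
    }
}
```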


