Posted to dev@hive.apache.org by "Vihang Karajgaonkar (JIRA)" <ji...@apache.org> on 2018/01/10 03:10:00 UTC

[jira] [Created] (HIVE-18422) Vectorized input format should not be used when input format is excluded and row.serde is enabled

Vihang Karajgaonkar created HIVE-18422:
------------------------------------------

             Summary: Vectorized input format should not be used when input format is excluded and row.serde is enabled
                 Key: HIVE-18422
                 URL: https://issues.apache.org/jira/browse/HIVE-18422
             Project: Hive
          Issue Type: Bug
          Components: Vectorization
    Affects Versions: 3.0.0, 2.4.0
            Reporter: Vihang Karajgaonkar
            Assignee: Vihang Karajgaonkar
            Priority: Minor


HIVE-17534 introduced a configuration option that excludes specific input formats from vectorized execution without affecting other input formats. If an input format is excluded and row.serde is enabled at the same time, the vectorizer still sets {{useVectorizedInputFormat}} to true, which causes vectorized readers to be used in row.serde mode.
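
The expected decision can be illustrated with a small sketch. This is not Hive's Vectorizer code: the {{ReadMode}} enum and the {{choose_read_mode}} helper are hypothetical names, and the sketch assumes that row.serde is the intended fallback for an excluded input format. It also ignores {{hive.vectorized.row.serde.inputformat.excludes}}, which is empty in the reproduction below.

```python
from enum import Enum


class ReadMode(Enum):
    """Hypothetical labels for how a map-side input could be read."""
    VECTORIZED_INPUT_FORMAT = "vectorized input format"
    ROW_SERDE = "row serde deserialize"
    NONE = "not vectorized"


def choose_read_mode(input_format, excludes,
                     use_vectorized_input_format, use_row_serde):
    """Illustrative only: pick a read mode for one input format.

    input_format: fully qualified input format class name
    excludes: class names listed in hive.vectorized.input.format.excludes
    use_vectorized_input_format: hive.vectorized.use.vectorized.input.format
    use_row_serde: hive.vectorized.use.row.serde.deserialize
    """
    excluded = input_format in excludes
    # An excluded input format must not take the vectorized-reader path,
    # even when the vectorized input format setting is on.
    if use_vectorized_input_format and not excluded:
        return ReadMode.VECTORIZED_INPUT_FORMAT
    # Assumption: row.serde is the fallback when it is enabled.
    if use_row_serde:
        return ReadMode.ROW_SERDE
    return ReadMode.NONE


orc = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"
# Excluded format with row.serde enabled: the vectorized reader is skipped.
print(choose_read_mode(orc, {orc}, True, True).name)
```

With the settings from this report (ORC excluded, row.serde enabled), the sketch selects the row.serde path rather than the vectorized reader, which is the behavior the summary asks for.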

To reproduce:
{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.use.row.serde.deserialize=true;
set hive.vectorized.use.vector.serde.deserialize=true;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.vectorized.row.serde.inputformat.excludes=;

-- SORT_QUERY_RESULTS

-- exclude MapredParquetInputFormat and OrcInputFormat from vectorization; this should cause mapwork vectorization to be disabled
set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
set hive.vectorized.use.vectorized.input.format=true;


create table orcTbl (t1 tinyint, t2 tinyint)
stored as orc;

insert into orcTbl values (54, 9), (-104, 25), (-112, 24);
explain vectorization select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
{noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)