Posted to issues@hive.apache.org by "Vihang Karajgaonkar (JIRA)" <ji...@apache.org> on 2017/10/23 00:52:00 UTC
[jira] [Commented] (HIVE-17876) row.serde.deserialize broken for non-vectorized file inputformats

    [ https://issues.apache.org/jira/browse/HIVE-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214500#comment-16214500 ] 

Vihang Karajgaonkar commented on HIVE-17876:
--------------------------------------------

CC: [~mmccline]

> row.serde.deserialize broken for non-vectorized file inputformats
> -----------------------------------------------------------------
>
>                 Key: HIVE-17876
>                 URL: https://issues.apache.org/jira/browse/HIVE-17876
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Vihang Karajgaonkar
>
> Vectorization using {{hive.vectorized.use.row.serde.deserialize}} errors out for both the ORC and Parquet input formats.
> Steps to reproduce:
> {noformat}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.use.row.serde.deserialize=true;
> set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
> set hive.vectorized.execution.enabled=true;
> explain vectorization select * from alltypesorc where cint = 528534767 limit 10;
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> | PLAN VECTORIZATION:                                |
> |   enabled: true                                    |
> |   enabledConditionsMet: [hive.vectorized.execution.enabled IS true] |
> |                                                    |
> | STAGE DEPENDENCIES:                                |
> |   Stage-1 is a root stage                          |
> |   Stage-0 depends on stages: Stage-1               |
> |                                                    |
> | STAGE PLANS:                                       |
> |   Stage: Stage-1                                   |
> |     Map Reduce                                     |
> |       Map Operator Tree:                           |
> |           TableScan                                |
> |             alias: alltypesorc                     |
> |             Statistics: Num rows: 12288 Data size: 2641964 Basic stats: COMPLETE Column stats: NONE |
> |             Filter Operator                        |
> |               predicate: (cint = 528534767) (type: boolean) |
> |               Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
> |               Select Operator                      |
> |                 expressions: ctinyint (type: tinyint), csmallint (type: smallint), 528534767 (type: int), cbigint (type: bigint), cfloat (type: float), cdouble (type: double), cstring1 (type: string), cstring2 (type: string), ctimestamp1 (type: timestamp), ctimestamp2 (type: timestamp), cboolean1 (type: boolean), cboolean2 (type: boolean) |
> |                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11 |
> |                 Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
> |                 Limit                              |
> |                   Number of rows: 10               |
> |                   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
> |                   File Output Operator             |
> |                     compressed: false              |
> |                     Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
> |                     table:                         |
> |                         input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
> |                         output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
> |                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
> |       Execution mode: vectorized                   |
> |       Map Vectorization:                           |
> |           enabled: true                            |
> |           enabledConditionsMet: hive.vectorized.use.row.serde.deserialize IS true |
> |           groupByVectorOutput: true                |
> |           inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat |
> |           allNative: false                         |
> |           usesVectorUDFAdaptor: false              |
> |           vectorized: true                         |
> |                                                    |
> |   Stage: Stage-0                                   |
> |     Fetch Operator                                 |
> |       limit: 10                                    |
> |       Processor Tree:                              |
> |         ListSink                                   |
> |                                                    |
> +----------------------------------------------------+
> 48 rows selected (0.742 seconds)
> 0: jdbc:hive2://localhost:10000/default>
> 0: jdbc:hive2://localhost:10000/default> select * from alltypesorc where cint = 528534767 limit 10;
> Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)