Posted to dev@hive.apache.org by "Vihang Karajgaonkar (JIRA)" <ji...@apache.org> on 2017/10/23 00:51:00 UTC

[jira] [Created] (HIVE-17876) row.serde.deserialize broken for non-vectorized file inputformats

Vihang Karajgaonkar created HIVE-17876:
------------------------------------------

             Summary: row.serde.deserialize broken for non-vectorized file inputformats
                 Key: HIVE-17876
                 URL: https://issues.apache.org/jira/browse/HIVE-17876
             Project: Hive
          Issue Type: Bug
    Affects Versions: 3.0.0, 2.4.0
            Reporter: Vihang Karajgaonkar


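With {{hive.vectorized.use.row.serde.deserialize}} enabled, vectorized execution errors out for both the ORC and Parquet input formats. The plan reports the map work as vectorized, but the query itself fails at execution time.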

Steps to reproduce:

{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.use.row.serde.deserialize=true;
set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
set hive.vectorized.execution.enabled=true;

explain vectorization select * from alltypesorc where cint = 528534767 limit 10;
+----------------------------------------------------+
|                      Explain                       |
+----------------------------------------------------+
| PLAN VECTORIZATION:                                |
|   enabled: true                                    |
|   enabledConditionsMet: [hive.vectorized.execution.enabled IS true] |
|                                                    |
| STAGE DEPENDENCIES:                                |
|   Stage-1 is a root stage                          |
|   Stage-0 depends on stages: Stage-1               |
|                                                    |
| STAGE PLANS:                                       |
|   Stage: Stage-1                                   |
|     Map Reduce                                     |
|       Map Operator Tree:                           |
|           TableScan                                |
|             alias: alltypesorc                     |
|             Statistics: Num rows: 12288 Data size: 2641964 Basic stats: COMPLETE Column stats: NONE |
|             Filter Operator                        |
|               predicate: (cint = 528534767) (type: boolean) |
|               Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
|               Select Operator                      |
|                 expressions: ctinyint (type: tinyint), csmallint (type: smallint), 528534767 (type: int), cbigint (type: bigint), cfloat (type: float), cdouble (type: double), cstring1 (type: string), cstring2 (type: string), ctimestamp1 (type: timestamp), ctimestamp2 (type: timestamp), cboolean1 (type: boolean), cboolean2 (type: boolean) |
|                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11 |
|                 Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
|                 Limit                              |
|                   Number of rows: 10               |
|                   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
|                   File Output Operator             |
|                     compressed: false              |
|                     Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
|                     table:                         |
|                         input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
|                         output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
|                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
|       Execution mode: vectorized                   |
|       Map Vectorization:                           |
|           enabled: true                            |
|           enabledConditionsMet: hive.vectorized.use.row.serde.deserialize IS true |
|           groupByVectorOutput: true                |
|           inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat |
|           allNative: false                         |
|           usesVectorUDFAdaptor: false              |
|           vectorized: true                         |
|                                                    |
|   Stage: Stage-0                                   |
|     Fetch Operator                                 |
|       limit: 10                                    |
|       Processor Tree:                              |
|         ListSink                                   |
|                                                    |
+----------------------------------------------------+
48 rows selected (0.742 seconds)
0: jdbc:hive2://localhost:10000/default>

0: jdbc:hive2://localhost:10000/default> select * from alltypesorc where cint = 528534767 limit 10;
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
{noformat}
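
The Parquet case is not shown above. A minimal sketch of the equivalent reproduction, assuming a Parquet copy of the test table: the table name {{alltypesparquet}} and the CTAS step are illustrative and were not part of the original run, and no output was captured for this variant.

{noformat}
-- hypothetical Parquet copy of the test table (not part of the original run)
create table alltypesparquet stored as parquet as select * from alltypesorc;

set hive.fetch.task.conversion=none;
set hive.vectorized.use.row.serde.deserialize=true;
-- exclude the Parquet input format so that row serde deserialization is used instead
set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat;
set hive.vectorized.execution.enabled=true;

explain vectorization select * from alltypesparquet where cint = 528534767 limit 10;
select * from alltypesparquet where cint = 528534767 limit 10;
{noformat}

If the behavior matches the ORC case, the explain output should list {{MapredParquetInputFormat}} under {{inputFileFormats}} with {{vectorized: true}}, and the select should then fail in the map task.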


