Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/06/10 03:01:00 UTC

[jira] [Commented] (IMPALA-11344) Selecting only the missing fields of ORC files should return NULLs

    [ https://issues.apache.org/jira/browse/IMPALA-11344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552513#comment-17552513 ] 

Quanlong Huang commented on IMPALA-11344:
-----------------------------------------

[~tangzhi] Do you want to take this? As with your fix for IMPALA-11296, we just need to fix the code in OrcStructReader::TopLevelReadValueBatch().
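
To make the expected behavior concrete, here is a minimal, self-contained C++ sketch of the semantics. It is not Impala code: ToyOrcFile, Row, ScanColumns, and the use of std::optional<int> to model NULL are all hypothetical names invented for illustration, and the real fix would have to work with Impala's own batch and tuple types. The idea it tries to show is that the row count comes from the file, so a scan that selects only columns absent from the file can still yield one NULL per row instead of failing.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an ORC file: a row count plus the columns
// that were actually written to the file. Each column vector is assumed
// to hold exactly num_rows values.
struct ToyOrcFile {
  std::size_t num_rows;
  std::map<std::string, std::vector<int>> columns;
};

// One output row; std::nullopt models SQL NULL.
using Row = std::vector<std::optional<int>>;

// Returns one output row per file row. Columns missing from the file are
// left as NULL, even when every selected column is missing.
std::vector<Row> ScanColumns(const ToyOrcFile& file,
                             const std::vector<std::string>& selected) {
  // Start with all-NULL rows sized by the file's row count, not by the
  // number of columns found in the file.
  std::vector<Row> rows(file.num_rows, Row(selected.size(), std::nullopt));
  for (std::size_t c = 0; c < selected.size(); ++c) {
    auto it = file.columns.find(selected[c]);
    if (it == file.columns.end()) continue;  // missing field: keep NULLs
    for (std::size_t r = 0; r < file.num_rows; ++r) {
      rows[r][c] = it->second[r];
    }
  }
  return rows;
}
```

For the table in the description below (one row, only f0 present in the file), ScanColumns(file, {"f1"}) in this toy model yields a single row holding NULL, which matches the Hive result.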

> Selecting only the missing fields of ORC files should return NULLs
> ------------------------------------------------------------------
>
>                 Key: IMPALA-11344
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11344
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Priority: Critical
>              Labels: newbie, ramp-up
>
> While looking into IMPALA-11296, I found another bug in the same scenario (scanning only the missing columns of ORC files) on the current master branch.
> Create an ORC table whose underlying files are missing a field:
> {code:sql}
> hive> create external table missing_field_orc (f0 int) stored as orc;
> hive> insert into table missing_field_orc select 1;
> hive> alter table missing_field_orc add columns (f1 int);
> hive> select f1 from missing_field_orc;
> +-------+
> |  f1   |
> +-------+
> | NULL  |
> +-------+
> hive> select f0, f1 from missing_field_orc;
> +-----+-------+
> | f0  |  f1   |
> +-----+-------+
> | 1   | NULL  |
> +-----+-------+
> {code}
> Run the same queries in Impala:
> {code:sql}
> impala> VERSION;
> Shell version: impala shell build version not available
> Server version: impalad version 4.2.0-SNAPSHOT DEBUG (build 7273cfdfb901b9ef564c2737cf00c7a8abb57f07)
> impala> invalidate metadata missing_field_orc;
> impala> select f1 from missing_field_orc;
> ERROR: Parse error in possibly corrupt ORC file: 'hdfs://localhost:20500/test-warehouse/missing_field_orc/000000_0'. No columns found for this scan.
> impala> select f0, f1 from missing_field_orc;
> +----+------+
> | f0 | f1   |
> +----+------+
> | 1  | NULL |
> +----+------+
> {code}
> When selecting only the column 'f1', the query fails with an error. It should return NULL.


