Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/06/09 01:54:00 UTC
[jira] [Created] (IMPALA-11344) Selecting only the missing fields of ORC files should return NULLs
Quanlong Huang created IMPALA-11344:
---------------------------------------
Summary: Selecting only the missing fields of ORC files should return NULLs
Key: IMPALA-11344
URL: https://issues.apache.org/jira/browse/IMPALA-11344
Project: IMPALA
Issue Type: Bug
Reporter: Quanlong Huang
While investigating IMPALA-11296, I found another bug in the same scenario (scanning only the missing columns of ORC files) on the current master branch.
To reproduce, create an ORC table whose underlying file lacks a column that is added to the table schema afterwards:
{code:sql}
hive> create external table missing_field_orc (f0 int) stored as orc;
hive> insert into table missing_field_orc select 1;
hive> alter table missing_field_orc add columns (f1 int);
hive> select f1 from missing_field_orc;
+-------+
| f1    |
+-------+
| NULL  |
+-------+
hive> select f0, f1 from missing_field_orc;
+-----+-------+
| f0  | f1    |
+-----+-------+
| 1   | NULL  |
+-----+-------+
{code}
Run the same queries in Impala:
{code:sql}
impala> invalidate metadata missing_field_orc;
impala> select f1 from missing_field_orc;
ERROR: Parse error in possibly corrupt ORC file: 'hdfs://localhost:20500/test-warehouse/missing_field_orc/000000_0'. No columns found for this scan.
impala> select f0, f1 from missing_field_orc;
+----+------+
| f0 | f1   |
+----+------+
| 1  | NULL |
+----+------+
{code}
When only the column 'f1' is selected, the query fails with an error. It should return NULL instead, as Hive does.
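The expected semantics can be illustrated with a small toy model. This is only a sketch of the desired result, not Impala's ORC scanner code: the function name `scan` and the dict-based row representation are assumptions made for illustration. The point is that the number of output rows comes from the file, even when none of the selected columns exist in it.

```python
from typing import Any, Dict, List, Optional


def scan(file_rows: List[Dict[str, Any]],
         selected: List[str]) -> List[Dict[str, Optional[Any]]]:
    """Toy model (hypothetical, not Impala code): emit one output row per
    file row; a selected column that is absent from the file reads as NULL
    (None in Python)."""
    return [{col: row.get(col) for col in selected} for row in file_rows]


# The file was written before f1 was added, so its single row only has f0.
file_rows = [{"f0": 1}]

print(scan(file_rows, ["f1"]))        # only the missing column is selected
print(scan(file_rows, ["f0", "f1"]))  # an existing and a missing column
```

In this model, selecting only 'f1' yields a single row holding NULL, which matches the Hive output above.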