Posted to issues-all@impala.apache.org by "Miklos Szurap (Jira)" <ji...@apache.org> on 2022/03/30 09:58:00 UTC
[jira] [Created] (IMPALA-11209) Inconsistent results querying tables with subdirectories
Miklos Szurap created IMPALA-11209:
--------------------------------------
Summary: Inconsistent results querying tables with subdirectories
Key: IMPALA-11209
URL: https://issues.apache.org/jira/browse/IMPALA-11209
Project: IMPALA
Issue Type: Bug
Components: Catalog, Frontend
Reporter: Miklos Szurap
IMPALA-8454 introduced recursive listing of table and partition directories. The new behavior does not appear to be handled correctly when it is intentionally disabled through the impala.disable.recursive.listing=true table property: within the same session, each REFRESH statement on the table flips the behavior, so files in subdirectories are alternately ignored and included. See the reproduction steps below.
{code}
CREATE EXTERNAL TABLE subdirtest (col1 STRING) PARTITIONED BY (p1 STRING) TBLPROPERTIES ('impala.disable.recursive.listing'='true');
ALTER TABLE subdirtest ADD PARTITION (p1='A');
{code}
Then ingest a file into a subdirectory of the partition:
{code}
hdfs dfs -mkdir /warehouse/tablespace/external/hive/subdirtest/p1=A/00
hdfs dfs -put testdata.parq /warehouse/tablespace/external/hive/subdirtest/p1=A/00/
{code}
The file "testdata.parq" matches the table schema and contains two rows.
{code}
[coordinator.example.com:21050] default> refresh subdirtest;
...
[coordinator.example.com:21050] default> select count(*) from subdirtest;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
[coordinator.example.com:21050] default> refresh subdirtest;
...
[coordinator.example.com:21050] default> select count(*) from subdirtest;
+----------+
| count(*) |
+----------+
| 2        |
+----------+
[coordinator.example.com:21050] default> refresh subdirtest;
...
[coordinator.example.com:21050] default> select count(*) from subdirtest;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
[coordinator.example.com:21050] default> refresh subdirtest;
...
[coordinator.example.com:21050] default> select count(*) from subdirtest;
+----------+
| count(*) |
+----------+
| 2        |
+----------+
{code}
This can be reproduced within a single impala-shell session (without any other coordinators or load balancing).
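To make the flapping easier to spot over many iterations, the refresh/count loop above could be scripted. The following is only a sketch under assumptions: check_counts and collect_counts are hypothetical helper names (not part of Impala), the coordinator address is the placeholder from the session above, and the numeric filter assumes that impala-shell with -B prints the count as a bare number on its own line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Impala): reads one row count per line on
# stdin and reports whether all counts are identical.
check_counts() {
  local first="" count n=0
  while IFS= read -r count; do
    # Skip blank lines.
    [ -z "$count" ] && continue
    n=$((n + 1))
    if [ -z "$first" ]; then
      first="$count"
    elif [ "$count" != "$first" ]; then
      echo "INCONSISTENT: run $n returned $count, run 1 returned $first"
      return 1
    fi
  done
  echo "CONSISTENT: $n runs returned ${first:-no rows}"
  return 0
}

# Hypothetical driver: issues REFRESH followed by COUNT(*) in one impala-shell
# invocation, repeated $1 times (default 4). Assumes impala-shell is on PATH.
# Only lines consisting solely of digits are kept (assumed to be the counts).
collect_counts() {
  local runs="${1:-4}" i=0
  while [ "$i" -lt "$runs" ]; do
    impala-shell -i coordinator.example.com:21050 -B --quiet \
      -q "refresh subdirtest; select count(*) from subdirtest;" \
      | grep -E '^[0-9]+$'
    i=$((i + 1))
  done
}
```

With both functions defined, running `collect_counts 4 | check_counts` against the table from this report would be expected to print an INCONSISTENT line if the counts alternate as shown above. Note that each loop iteration opens a new impala-shell session, so this differs slightly from the single interactive session used in the original reproduction.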