Posted to issues-all@impala.apache.org by "Li Penglin (Jira)" <ji...@apache.org> on 2023/01/16 11:15:00 UTC
[jira] [Updated] (IMPALA-11844) The Iceberg Position-Delete Table will not work if 'file_path' in DeleteFile is not fully qualified
[ https://issues.apache.org/jira/browse/IMPALA-11844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Li Penglin updated IMPALA-11844:
--------------------------------
Description:
{code:java}
Given:
Table 'test_tbl' has two files, a datafile and a deletefile.
data_file-00001.parquet:
| input__file__name | file__position | col_int | col_long | col_string |
| hdfs://localhost:20500/data/data_file-00001.parquet | 0 | 1 | 10 | "1-10-a" |
| hdfs://localhost:20500/data/data_file-00001.parquet | 1 | 2 | 20 | "2-20-a" |
| hdfs://localhost:20500/data/data_file-00001.parquet | 2 | 3 | 30 | "3-30-a" |
delete_file-00001.parquet:
| file_path | pos |
| hdfs://localhost:20500/data/data_file-00001.parquet | 0 |
| /data/data_file-00001.parquet | 1 |
Expected:
select * from test_tbl;
| col_int | col_long | col_string |
| 3 | 30 | "3-30-a" |
Actual:
| col_int | col_long | col_string |
| 2 | 20 | "2-20-a" |
| 3 | 30 | "3-30-a" |
{code}
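A position delete should also be applied correctly when the 'file_path' in the DeleteFile is not fully qualified. In the example above, the second delete row gives the path as '/data/data_file-00001.parquet', without the 'hdfs://localhost:20500' scheme and authority; it appears not to match the data file's name, so the row at position 1 is still returned. We should check how other engines, and the native Iceberg API, handle this case.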
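To make the expected behaviour concrete, here is a minimal Python sketch of one possible approach: qualify every path against a default filesystem before comparing it. This is an illustration only, not Impala's or Iceberg's implementation. The names DEFAULT_FS, qualify and apply_position_deletes are hypothetical, and the default filesystem value is simply taken from the example paths above.

```python
from urllib.parse import urlsplit

# Assumption: the default filesystem is the one used by the example paths.
DEFAULT_FS = "hdfs://localhost:20500"


def qualify(path, default_fs=DEFAULT_FS):
    """Return path unchanged if it has a scheme, else prefix it with default_fs."""
    if urlsplit(path).scheme:
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


def apply_position_deletes(data_rows, delete_rows):
    """Drop data rows whose (qualified file path, position) has a delete record.

    data_rows:   iterable of (file_path, position, row_values)
    delete_rows: iterable of (file_path, pos)
    """
    deleted = {(qualify(path), pos) for path, pos in delete_rows}
    return [
        values
        for path, pos, values in data_rows
        if (qualify(path), pos) not in deleted
    ]


data_file = "hdfs://localhost:20500/data/data_file-00001.parquet"
data_rows = [
    (data_file, 0, (1, 10, "1-10-a")),
    (data_file, 1, (2, 20, "2-20-a")),
    (data_file, 2, (3, 30, "3-30-a")),
]
delete_rows = [
    (data_file, 0),
    ("/data/data_file-00001.parquet", 1),  # not fully qualified
]

result = apply_position_deletes(data_rows, delete_rows)
print(result)
```

With both sides qualified the same way, the two delete records match positions 0 and 1, leaving only the row (3, 30, "3-30-a"), which is the expected result described in the issue. The sketch does not handle paths that carry a scheme but no authority (for example 'hdfs:///data/...'); a real fix would need to decide how to treat those as well.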
> The Iceberg Position-Delete Table will not work if 'file_path' in DeleteFile is not fully qualified
> ---------------------------------------------------------------------------------------------------
>
> Key: IMPALA-11844
> URL: https://issues.apache.org/jira/browse/IMPALA-11844
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Li Penglin
> Priority: Major
> Labels: impala-iceberg