Posted to issues-all@impala.apache.org by "Noemi Pap-Takacs (Jira)" <ji...@apache.org> on 2024/01/22 15:01:00 UTC

[jira] [Comment Edited] (IMPALA-12742) DELETE/UPDATE Iceberg table partitioned by DATE gives incorrect value

    [ https://issues.apache.org/jira/browse/IMPALA-12742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809492#comment-17809492 ] 

Noemi Pap-Takacs edited comment on IMPALA-12742 at 1/22/24 3:00 PM:
--------------------------------------------------------------------

Steps to reproduce the issue:
1. CREATE V2 TABLE based on 'functional_parquet.iceberg_alltypes_part'
{code:java}
create table ice_alltypes_part_v2 
(i INT NULL, p_bool BOOLEAN NULL, p_int INT NULL, p_bigint BIGINT NULL, p_float FLOAT NULL, p_double DOUBLE NULL, p_decimal DECIMAL(6,3) NULL, p_date DATE NULL, p_string STRING NULL) 
PARTITIONED BY SPEC(p_bool, p_int, p_bigint, p_float, p_double, p_decimal, p_date, p_string) STORED AS ICEBERG 
TBLPROPERTIES ('engine.hive.enabled'='true', 'iceberg.catalog'='hadoop.catalog', 'iceberg.catalog_location'='/test-warehouse/iceberg_test/hadoop_catalog', 'format-version'='2'){code}

 

2. INSERT DATA
{code:java}
insert into ice_alltypes_part_v2 select * from functional_parquet.iceberg_alltypes_part;{code}
select * from ice_alltypes_part_v2;

|i|p_bool|p_int|p_bigint|p_float|p_double|p_decimal|p_date|p_string|
|1|true|1|11|1.10000002384|2.222|123.321|2022-02-22|impala|
|2|true|1|11|1.10000002384|2.222|123.321|2022-02-22|impala|


3. DELETE or UPDATE
{code:java}
delete from ice_alltypes_part_v2 where i=1;{code}
or
{code:java}
update ice_alltypes_part_v2 set p_bigint=100 where p_int=1;
{code}
 

 

Either statement fails with:

ERROR: DateTimeParseException: Text '19045' could not be parsed at index 0

impalad logs:
{code:java}
client-request-state.cc:1537] Updating metastore with 2 altered partitions (p_bool=true/p_int=1/p_bigint=100/p_float=1.100000023841858/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala, p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=19045/p_string=impala)
client-request-state.cc:1564] Executing FinalizeDml() using CatalogService
client-request-state.cc:1574] ERROR Finalizing DML: DateTimeParseException: Text '19045' could not be parsed at index 0
{code}
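
The two partition names in the impalad log differ in more than one key, which is easy to miss in a single long line. The following is a minimal sketch for comparing them. It assumes plain "key=value" segments separated by "/", with no escaping. The class and method names are hypothetical and are not part of Impala.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading the log line above; not Impala code.
class PartitionPathDiff {
    // Splits "k1=v1/k2=v2/..." into an ordered key -> value map.
    // Assumption: segments are not escaped and contain a single '=' separator.
    static Map<String, String> parse(String path) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq < 0) {
                continue; // skip segments that are not key=value pairs
            }
            out.put(segment.substring(0, eq), segment.substring(eq + 1));
        }
        return out;
    }

    // Returns the keys of 'a' whose values differ from those in 'b', in path order.
    static List<String> differingKeys(String a, String b) {
        Map<String, String> mb = parse(b);
        List<String> diff = new ArrayList<>();
        for (Map.Entry<String, String> e : parse(a).entrySet()) {
            if (!e.getValue().equals(mb.get(e.getKey()))) {
                diff.add(e.getKey());
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // Both partition names are copied from the impalad log.
        String first = "p_bool=true/p_int=1/p_bigint=100/p_float=1.100000023841858"
            + "/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala";
        String second = "p_bool=true/p_int=1/p_bigint=11/p_float=1.1"
            + "/p_double=2.222/p_decimal=123.321/p_date=19045/p_string=impala";
        System.out.println(differingKeys(first, second)); // [p_bigint, p_float, p_date]
    }
}
```

p_bigint differs because of the UPDATE itself. p_float and p_date are rendered differently in the two names, and only the p_date rendering is the one the Catalog fails to parse.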
catalogd logs:
{code:java}
java.time.format.DateTimeParseException: Text '19045' could not be parsed at index 0
    at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
    at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)
    at java.time.LocalDate.parse(LocalDate.java:400)
    at org.apache.iceberg.expressions.Literals$StringLiteral.to(Literals.java:495)
    at org.apache.iceberg.types.Conversions.fromPartitionString(Conversions.java:70)
    at org.apache.impala.util.IcebergUtil.getPartitionValue(IcebergUtil.java:780)
    at org.apache.impala.util.IcebergUtil.partitionDataFromDataFile(IcebergUtil.java:758)
    at org.apache.impala.service.IcebergCatalogOpExecutor.createDeleteFile(IcebergCatalogOpExecutor.java:443)
{code}
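
The failing call at the top of the stack trace can be reproduced with plain java.time, without Impala or Iceberg. The sketch below also shows that 19045, read as a count of days since 1970-01-01, is exactly 2022-02-22. That matches the p_date value in the table, but it is only an observation from the arithmetic, not a confirmed root cause. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical demo class; it only uses the JDK, not Impala or Iceberg.
class PartitionDateDemo {
    // Returns true if the text parses as an ISO-8601 local date (yyyy-MM-dd),
    // which is the format LocalDate.parse(CharSequence) expects.
    static boolean parsesAsIsoDate(String text) {
        try {
            LocalDate.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Interprets the text as a number of days since 1970-01-01.
    // Assumption: this is how the value '19045' was produced.
    static LocalDate fromEpochDays(String text) {
        return LocalDate.ofEpochDay(Long.parseLong(text));
    }

    public static void main(String[] args) {
        System.out.println(parsesAsIsoDate("2022-02-22")); // true
        System.out.println(parsesAsIsoDate("19045"));      // false
        System.out.println(fromEpochDays("19045"));        // 2022-02-22
    }
}
```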



> DELETE/UPDATE Iceberg table partitioned by DATE gives incorrect value
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-12742
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12742
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend, Catalog
>            Reporter: Noemi Pap-Takacs
>            Priority: Major
>              Labels: impala-iceberg
>
> Iceberg tables can be identity-partitioned by columns of any type, e.g. INT, DATE and even FLOAT.
> If a table is partitioned, the data file path contains the partition values in human-readable form. When an UPDATE or DELETE statement is executed, the delete file contains the path of the referenced data file. It seems that DATE values are converted to this form incorrectly: the partition value appears as '19045' instead of '2022-02-22', so the Catalog cannot parse it and throws a DateTimeParseException.


