Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2023/06/30 05:38:00 UTC

[jira] [Created] (IMPALA-12256) Stale DROP_PARTITION events might not be skipped correctly

Quanlong Huang created IMPALA-12256:
---------------------------------------

             Summary: Stale DROP_PARTITION events might not be skipped correctly
                 Key: IMPALA-12256
                 URL: https://issues.apache.org/jira/browse/IMPALA-12256
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang
             Fix For: Impala 4.1.2, Impala 4.1.1, Impala 4.2.0, Impala 4.1.0


Since IMPALA-10502, we track the create event id of each db/table/partition when it is created. It is used to skip stale DROP events, i.e. events that were generated before the object was created.
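
To make the check concrete, here is a minimal sketch of the idea, not Impala's actual code. The class and method names (StaleDropEventSketch, processDropEvent) and the -1 sentinel for an unknown create event id are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these are not Impala's actual classes.
class StaleDropEventSketch {
  // Assumed sentinel meaning "create event id unknown".
  static final long UNKNOWN_EVENT_ID = -1L;

  // Partition name -> id of the HMS event that created the partition.
  private final Map<String, Long> createEventIds = new HashMap<>();

  void addPartition(String name, long createEventId) {
    createEventIds.put(name, createEventId);
  }

  boolean hasPartition(String name) {
    return createEventIds.containsKey(name);
  }

  // Applies a DROP_PARTITION event. Returns true if the partition was
  // removed, false if the event was skipped.
  boolean processDropEvent(String name, long dropEventId) {
    Long createEventId = createEventIds.get(name);
    if (createEventId == null) return false;        // partition does not exist
    if (createEventId > dropEventId) return false;  // stale: created later
    createEventIds.remove(name);
    return true;
  }

  public static void main(String[] args) {
    StaleDropEventSketch catalog = new StaleDropEventSketch();
    // Partition recreated by event 105; the DROP event 100 is older.
    catalog.addPartition("p=0", 105L);
    System.out.println(catalog.processDropEvent("p=0", 100L));
    // If the create event id is unknown, the same DROP event is applied.
    catalog.addPartition("p=1", UNKNOWN_EVENT_ID);
    System.out.println(catalog.processDropEvent("p=1", 100L));
  }
}
```

In this sketch, a partition whose create event id has been reset to the unknown sentinel can no longer be protected from an older DROP event, which is the failure mode this issue describes.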

However, some DDLs like COMPUTE INCREMENTAL STATS lose the create event id when reloading partitions. As a result, stale DROP_PARTITION events are not skipped correctly.

This can be reproduced with a higher value of "hms_event_polling_interval_s", so that the DROP_PARTITION event is processed after COMPUTE INCREMENTAL STATS finishes.
{code:bash}
bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=10 {code}
Create a non-transactional partitioned table with one partition:
{code:sql}
create table my_part (id int) partitioned by (p int) stored as textfile;
insert into my_part partition(p=0) values (0);{code}
Put the below commands in a file and run them at once:
{code:sql}
alter table my_part drop if exists partition (p=0);
insert into my_part partition(p=0) values (0),(1),(2),(3);
compute incremental stats my_part partition(p=0);
{code}
In the catalogd logs, we can see the partition being dropped by the DROP_PARTITION event:
{code:java}
I0630 13:27:11.840737 17106 CatalogOpExecutor.java:4484] EventId: 8316831 Skipping removal of 0/1 partitions since they don't exist or were created later in table default.my_part.
I0630 13:27:11.841095 17106 MetastoreEvents.java:628] EventId: 8316831 EventType: DROP_PARTITION 1 partitions dropped from table default.my_part
{code}
This event should be skipped since the partition is recreated after it. Although a follow-up ADD_PARTITION event (generated by the recreation statement) will add the partition back, there is a period between the two events during which the metadata is incorrect (the partition exists but is missing from the catalog).

The cause is that the create_event_id of the recreated partition is lost when the partition is reloaded for COMPUTE INCREMENTAL STATS. Other DDLs could cause the same issue, e.g. ALTER TABLE DROP STATS.
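
The following sketch illustrates the cause under stated assumptions. It does not show Impala's actual reload path or the actual fix; PartitionReloadSketch, its Partition class, and both reload methods are hypothetical names. It contrasts a reload that builds a fresh partition object without the create event id against one that carries the id over from the old object.

```java
// Hypothetical sketch; names are illustrative, not Impala's.
class PartitionReloadSketch {
  // Assumed sentinel meaning "create event id unknown".
  static final long UNKNOWN_EVENT_ID = -1L;

  static final class Partition {
    final String name;
    final long numRows;
    final long createEventId;

    Partition(String name, long numRows, long createEventId) {
      this.name = name;
      this.numRows = numRows;
      this.createEventId = createEventId;
    }
  }

  // Problematic reload: the new object forgets the create event id.
  static Partition reloadLosingEventId(Partition old, long newNumRows) {
    return new Partition(old.name, newNumRows, UNKNOWN_EVENT_ID);
  }

  // One possible remedy: copy the create event id from the old object.
  static Partition reloadKeepingEventId(Partition old, long newNumRows) {
    return new Partition(old.name, newNumRows, old.createEventId);
  }

  // A DROP event is stale if the partition was created by a later event.
  static boolean isStaleDrop(Partition p, long dropEventId) {
    return p.createEventId > dropEventId;
  }

  public static void main(String[] args) {
    long dropEventId = 100L;
    Partition recreated = new Partition("p=0", 4L, 105L);
    Partition lost = reloadLosingEventId(recreated, 4L);
    Partition kept = reloadKeepingEventId(recreated, 4L);
    System.out.println("skipped after lossy reload: " + isStaleDrop(lost, dropEventId));
    System.out.println("skipped after preserving reload: " + isStaleDrop(kept, dropEventId));
  }
}
```

After the lossy reload, the older DROP_PARTITION event is no longer recognized as stale, so the partition is removed until the later ADD_PARTITION event restores it.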


