Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/07/14 01:50:00 UTC

[jira] [Commented] (IMPALA-10502) delayed 'Invalidated objects in cache' cause 'Table already exists'

    [ https://issues.apache.org/jira/browse/IMPALA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380245#comment-17380245 ] 

ASF subversion and git services commented on IMPALA-10502:
----------------------------------------------------------

Commit 7f7a631e92c69a6dafc1f25ceb407f7b79db10e9 in impala's branch refs/heads/master from Vihang Karajgaonkar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7f7a631 ]

IMPALA-10502: Handle CREATE/DROP events correctly

The current way of detecting self-events for CREATE/DROP events on
databases, tables and partitions is unreliable when the same object is
created and dropped repeatedly in quick succession. For example,
suppose we run the following sequence of DDLs in Impala:
1. create table foo; --> catalogd creates table foo
2. drop table foo; --> catalogd drops table foo
...
The events processor then receives the CREATE_TABLE event pertaining to
(1) above, after (2) has already dropped the table. It cannot determine
whether the table needs to be created or not. Similarly, if we
interchange the order of the DROP and CREATE statements above, the
DROP_TABLE event received by the events processor will unnecessarily
remove the table when it should not.
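
The two sequences described above can be replayed against a toy model.
This is only a sketch under stated assumptions: NaiveCatalog and the
two scenario functions are hypothetical names, not Impala classes, and
the model applies every event unconditionally, with no self-event
check at all.

```python
class NaiveCatalog:
    """Hypothetical toy catalog; not part of the Impala code base."""

    def __init__(self, tables=()):
        self.tables = set(tables)

    def create_table(self, name):
        # Models a CREATE TABLE DDL executed by catalogd.
        if name in self.tables:
            raise ValueError(f"Table already exists: {name}")
        self.tables.add(name)

    def drop_table(self, name):
        # Models a DROP TABLE DDL executed by catalogd.
        if name not in self.tables:
            raise ValueError(f"Table does not exist: {name}")
        self.tables.remove(name)

    def process_event(self, event_type, name):
        # Assumption: events are applied blindly when they arrive.
        if event_type == "CREATE_TABLE":
            self.tables.add(name)
        elif event_type == "DROP_TABLE":
            self.tables.discard(name)


def stale_create_scenario():
    """create, drop, then the delayed CREATE_TABLE event for the create."""
    catalog = NaiveCatalog()
    catalog.create_table("foo")
    catalog.drop_table("foo")
    catalog.process_event("CREATE_TABLE", "foo")  # re-adds the dropped table
    try:
        catalog.create_table("foo")
    except ValueError as err:
        return str(err)
    return None


def stale_drop_scenario():
    """drop, create, then the delayed DROP_TABLE event for the drop."""
    catalog = NaiveCatalog(tables=["foo"])
    catalog.drop_table("foo")
    catalog.create_table("foo")
    catalog.process_event("DROP_TABLE", "foo")  # removes the new table
    try:
        catalog.drop_table("foo")
    except ValueError as err:
        return str(err)
    return None
```

In this model the first scenario ends with a "Table already exists"
error and the second with a "Table does not exist" error, because each
delayed event undoes a DDL that ran after the one that produced it.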

This can cause problems for queries which expect the table to exist or
not exist. For example, a CREATE TABLE query fails with a "table
already exists" error, or a DROP TABLE query fails with a "table does
not exist" error.

To fix this issue, catalogd now keeps track of dropped objects in a
deleteLog, whose entries are garbage collected as the events come in.
Every time a database, table or partition is dropped, the deleteLog is
populated with the drop event id generated by the drop operation. The
deleteLog is looked up when an event is received to determine whether
the event can be ignored. Additionally, catalogd keeps track of the
create event id at the Database, Table or Partition level during the
create DDL execution so that the event can be ignored later by the
events processor.
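
The bookkeeping described above can be sketched for tables only. This
is a minimal illustration, not the actual catalogd implementation: the
Catalog class, its method names and the exact comparison rules are
assumptions, and event ids are assumed to increase monotonically, with
each DDL receiving the id of the event it generates.

```python
class Catalog:
    """Hypothetical toy catalog with a delete log; not Impala code."""

    def __init__(self):
        self.tables = {}      # table name -> id of the event that created it
        self.delete_log = {}  # table name -> id of the event that dropped it

    def create_table(self, name, event_id):
        # DDL path: remember the create event id on the table itself.
        if name in self.tables:
            raise ValueError(f"Table already exists: {name}")
        self.tables[name] = event_id

    def drop_table(self, name, event_id):
        # DDL path: record the drop event id in the delete log.
        if name not in self.tables:
            raise ValueError(f"Table does not exist: {name}")
        del self.tables[name]
        self.delete_log[name] = event_id

    def process_event(self, event_id, event_type, name):
        """Return True if the event changed the catalog, False if ignored."""
        applied = False
        if event_type == "CREATE_TABLE":
            # Ignore if the table is already present, or if the delete log
            # shows it was dropped after this create event was generated.
            dropped_later = self.delete_log.get(name, -1) > event_id
            if name not in self.tables and not dropped_later:
                self.tables[name] = event_id
                applied = True
        elif event_type == "DROP_TABLE":
            # Ignore if the table is absent, or if it was (re)created by an
            # event that is not older than this drop event.
            created_at = self.tables.get(name)
            if created_at is not None and created_at < event_id:
                del self.tables[name]
                applied = True
        # Garbage collect delete-log entries that no later event can need.
        self.delete_log = {
            n: i for n, i in self.delete_log.items() if i > event_id
        }
        return applied
```

For example, after create_table("foo", 1) and drop_table("foo", 2),
the delayed call process_event(1, "CREATE_TABLE", "foo") is ignored
because the delete log holds a later drop id for foo, so a subsequent
create_table("foo", 3) succeeds. An event for a table created outside
this catalog has no matching entries and is applied normally.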

Testing:
1. Added the test_create_drop_events and
test_local_catalog_create_drop_events tests, which loop to generate
create/drop events for databases, tables and partitions.
2. Added new metrics, which the tests verify to ensure that events
don't create or drop the object.

Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
Reviewed-on: http://gerrit.cloudera.org:8080/17308
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vi...@cloudera.com>


> delayed 'Invalidated objects in cache' cause 'Table already exists'
> -------------------------------------------------------------------
>
>                 Key: IMPALA-10502
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10502
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Clients, Frontend
>    Affects Versions: Impala 3.4.0
>            Reporter: Adriano
>            Assignee: Vihang Karajgaonkar
>            Priority: Critical
>             Fix For: Impala 4.1
>
>
> In a fast-paced environment where the interval between steps 1 and 2 is < 100ms, a simplified pipeline looks like:
> 0- catalog 'on demand' in use and disableHmsSync (enabled or disabled: no difference)
> 1- open session to coord A -> DROP TABLE X -> close session
> 2- open session to coord A -> CREATE TABLE X -> close session
> Result: step 2 can fail with 'Table already exists'.
> During the internal investigation it was discovered that IMPALA-9913 mitigates the issue in almost all scenarios.
> However, since the investigation is still ongoing internally, it is good to have the issue tracked here as well.
> Once we are sure that IMPALA-9913 fixes these failures, we can close this as a duplicate; otherwise we carry on the investigation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
