Posted to issues@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2021/06/21 08:49:00 UTC

[jira] [Created] (IMPALA-10757) ACID table locking for DML statements is faulty

Zoltán Borók-Nagy created IMPALA-10757:
------------------------------------------

             Summary: ACID table locking for DML statements is faulty
                 Key: IMPALA-10757
                 URL: https://issues.apache.org/jira/browse/IMPALA-10757
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
            Reporter: Zoltán Borók-Nagy


Plain SELECT queries don't take ACID locks. They use the latest snapshot of the table that is loaded by CatalogD.

However, DML statements lock all the tables they reference, not just the target table.

E.g.:
{noformat}
INSERT INTO target_table SELECT * FROM source_table;
{noformat}
acquires locks for both target_table and source_table. However, after acquiring the locks, Impala doesn't reload the tables.

Therefore the following situation is possible:
{noformat}
INSERT OVERWRITE foo SELECT ...; (takes an exclusive lock for foo)
{noformat}
while the following statement also tries to take a SHARED_LOCK for foo:
{noformat}
INSERT INTO bar SELECT * FROM foo;
{noformat}
This means the INSERT INTO statement might wait for the INSERT OVERWRITE statement to complete, but because it doesn't reload foo afterwards, it still uses the old snapshot of foo. Waiting for the lock therefore brings no benefit.
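
The ordering problem can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the Catalog class, its methods, and the version counter are hypothetical stand-ins, not Impala or CatalogD APIs, and waiting for the lock is modeled as a callback that lets the concurrent INSERT OVERWRITE commit.

```python
# Toy model of the race described above. None of these names are Impala APIs.

class Catalog:
    """Hypothetical stand-in for CatalogD: tracks the latest committed version per table."""

    def __init__(self):
        self.versions = {"foo": 1}

    def load(self, table):
        # Return the snapshot (version number) that is current right now.
        return self.versions[table]

    def commit_overwrite(self, table):
        # An INSERT OVERWRITE commits and produces a new table version.
        self.versions[table] += 1


def insert_select_without_reload(catalog, wait_for_lock):
    # 1. Planning: the snapshot of the source table is loaded before any lock is taken.
    snapshot = catalog.load("foo")
    # 2. Lock acquisition: blocks until the concurrent INSERT OVERWRITE has committed.
    wait_for_lock()
    # 3. Execution: the snapshot from step 1 is used as-is, with no reload.
    return snapshot


catalog = Catalog()
used = insert_select_without_reload(catalog, lambda: catalog.commit_overwrite("foo"))
latest = catalog.load("foo")
print(f"snapshot used: v{used}, latest committed: v{latest}")
```

In this model the statement waits for the overwrite to finish and still reads the version it loaded before waiting.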

Possible solutions:
 # Reload tables after the lock is acquired.
 # Only take a lock on the target table. This would be better than the current behavior, and it would be consistent with plain SELECT queries.
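
To compare the two options with the current behavior, here is a small threaded sketch. It rests on loud assumptions: a single mutex replaces the real shared/exclusive lock modes, and Table, overwrite, insert_select, and the strategy names are all hypothetical, chosen only for illustration.

```python
import threading


class Table:
    """Hypothetical source table: a version counter guarded by one mutex."""

    def __init__(self):
        self.version = 1
        self.lock = threading.Lock()  # simplification of shared/exclusive ACID locks


def overwrite(table, has_lock, reader_planned):
    # Models INSERT OVERWRITE foo: hold the lock, then commit a new version.
    with table.lock:
        has_lock.set()
        reader_planned.wait()  # commit only after the reader has loaded its snapshot
        table.version += 1


def insert_select(table, strategy, reader_planned):
    # Models INSERT INTO bar SELECT * FROM foo under three strategies.
    snapshot = table.version  # loaded at planning time, before locking
    reader_planned.set()
    if strategy == "lock_target_only":
        return snapshot  # no lock on the source table, so no waiting
    with table.lock:  # waits until the overwrite has committed
        if strategy == "reload_after_lock":
            snapshot = table.version  # reload while holding the lock
    return snapshot


def run(strategy):
    table = Table()
    has_lock, reader_planned = threading.Event(), threading.Event()
    writer = threading.Thread(target=overwrite, args=(table, has_lock, reader_planned))
    writer.start()
    has_lock.wait()  # make sure the overwrite holds the lock first
    used = insert_select(table, strategy, reader_planned)
    writer.join()
    return used


for strategy in ("current", "reload_after_lock", "lock_target_only"):
    print(strategy, "-> reads version", run(strategy))
```

In this model, only the reload strategy gets anything in return for waiting. The target-only strategy reads the same old version as the current behavior, but without blocking, which matches how plain SELECT queries behave.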

I think reloading should be favored, since Impala should run every statement that involves ACID tables in a transaction and take proper locks (see IMPALA-8788).


