You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org> on 2021/07/26 17:45:33 UTC

[Impala-ASF-CR] IMPALA-10801: Check the latest compaction Id before serving ACID table

Yu-Wen Lai has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17697 )

Change subject: IMPALA-10801: Check the latest compaction Id before serving ACID table
......................................................................

IMPALA-10801: Check the latest compaction Id before serving ACID table

Since compactions don't advance write id, we don't know if a
table/partition is compacted by comparing writeIdList. A possible
issue is that CatalogD provides obsolete file metadata and causes a
runtime error.

In order to fix this issue, we introduced a HMS API that can get the
latest compaction record for a table/partition (HIVE-24828). In
CatalogD, we cache compaction id while loading partitions and compare
the cached id with the latest compaction id before serving. If there
is a newer compaction happened, it would refresh the file metadata.

Besides, this patch also change how to replace the existing table
after a table full reloading. The current way is to replace the table
if the catalog version is not changed. For transactional tables,
things get additional complexity given that file metadata refreshing
and full table reloading can happen together. We can actually use
writeIdList to determine whether we should replace the table for
transactional tables. As long as the updated table has more recent
writeIdList than the existing one, we are safe to replace the table.
For Non-transactional tables, we still keep original behavior.

Testing:
- Add a test in PartialCatalogInfoWriteIdTest

Change-Id: I86a112a77980fef7f6238978bc9668a65262101e
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java
M testdata/bin/create-load-data.sh
R testdata/cluster/ranger/setup/policy_5_revised.json
9 files changed, 367 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/17697/6
-- 
To view, visit http://gerrit.cloudera.org:8080/17697
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86a112a77980fef7f6238978bc9668a65262101e
Gerrit-Change-Number: 17697
Gerrit-PatchSet: 6
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>