Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/06/07 20:05:00 UTC
[jira] [Commented] (IMPALA-12189) updateCatalog not releasing the catalog lock if createTblTransaction() throws exceptions
[ https://issues.apache.org/jira/browse/IMPALA-12189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730267#comment-17730267 ]
ASF subversion and git services commented on IMPALA-12189:
----------------------------------------------------------
Commit 58590376ed3a1b2cac2bbc3b6a59d5cd34c53672 in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=58590376e ]
IMPALA-12189: updateCatalog should handle failures in createTblTransaction
updateCatalog() invokes createTblTransaction() for transactional tables.
It's called after acquiring the table lock. The write lock of catalog's
versionLock will also be acquired by the current thread. Whenever we hit
an exception, we should release those locks. This patch moves the code
calling createTblTransaction() into the exception handling scope.
Tests:
- Add a debug action to abort the transaction in updateCatalog() so
createTblTransaction() will fail.
- Add e2e test for the error handling.
Change-Id: I3a64764d0568fc1e6c6f4c52f9e220df3130bd84
Reviewed-on: http://gerrit.cloudera.org:8080/20020
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
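
The change described in the commit message can be illustrated with a small standalone sketch. This is not the Impala code: the class name, createTxn(), and unlockIfStillHeld() are hypothetical stand-ins for updateCatalog(), createTblTransaction(), and the cleanup in the finally-clause, and a plain ReentrantReadWriteLock is assumed in place of the catalog's versionLock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the locking pattern in updateCatalog(); not Impala code. */
class UpdateCatalogLockSketch {

  /** Stand-in for createTblTransaction(); fails on demand, similar to a debug action. */
  static long createTxn(boolean fail) {
    if (fail) throw new IllegalStateException("simulated transaction failure");
    return 1L;
  }

  /** Shape before the patch: the call that can throw sits outside the try block. */
  static void updateBuggy(ReentrantReadWriteLock versionLock, boolean fail) {
    versionLock.writeLock().lock();
    createTxn(fail);                      // an exception here skips the finally below
    try {
      versionLock.writeLock().unlock();   // normal-path release
    } finally {
      unlockIfStillHeld(versionLock);
    }
  }

  /** Shape after the patch: the call is inside the try block, so finally always runs. */
  static void updateFixed(ReentrantReadWriteLock versionLock, boolean fail) {
    versionLock.writeLock().lock();
    try {
      createTxn(fail);
      versionLock.writeLock().unlock();   // normal-path release
    } finally {
      unlockIfStillHeld(versionLock);
    }
  }

  /** Releases the write lock only if the current thread still holds it. */
  static void unlockIfStillHeld(ReentrantReadWriteLock versionLock) {
    if (versionLock.writeLock().isHeldByCurrentThread()) {
      versionLock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    ReentrantReadWriteLock fixedLock = new ReentrantReadWriteLock();
    try { updateFixed(fixedLock, true); } catch (IllegalStateException e) { /* expected */ }
    System.out.println("fixed shape, lock still held: " + fixedLock.isWriteLocked());

    ReentrantReadWriteLock buggyLock = new ReentrantReadWriteLock();
    try { updateBuggy(buggyLock, true); } catch (IllegalStateException e) { /* expected */ }
    System.out.println("buggy shape, lock still held: " + buggyLock.isWriteLocked());
  }
}
```

In this sketch the fixed shape leaves the lock free after a failure, while the buggy shape leaves it held. The actual patch may differ in detail; only the idea of moving the throwing call into the guarded scope is taken from the commit message.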
> updateCatalog not releasing the catalog lock if createTblTransaction() throws exceptions
> ----------------------------------------------------------------------------------------
>
> Key: IMPALA-12189
> URL: https://issues.apache.org/jira/browse/IMPALA-12189
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> We saw an issue where catalogd can't finish RPC requests after this error:
> {code:java}
> I0605 21:04:49.356642 6145 jni-util.cc:288] org.apache.impala.common.TransactionException: Internal error processing allocate_table_write_ids
> at org.apache.impala.catalog.Hive3MetastoreShimBase.allocateTableWriteId(Hive3MetastoreShimBase.java:763)
> at org.apache.impala.catalog.Hive3MetastoreShimBase.createTblTransaction(Hive3MetastoreShimBase.java:129)
> at org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:6394)
> at org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:507)
> I0605 21:04:49.356665 6145 status.cc:129] TransactionException: Internal error processing allocate_table_write_ids
> {code}
> Code snippet from the downstream branch:
> {code:java}
> 6370   public TUpdateCatalogResponse updateCatalog(TUpdateCatalogRequest update)
> 6371       throws ImpalaException {
> 6372     TUpdateCatalogResponse response = new TUpdateCatalogResponse();
> 6373     // Only update metastore for Hdfs tables.
> 6374     Table table = getExistingTable(update.getDb_name(), update.getTarget_table(),
> 6375         "Load for INSERT");
> 6376     if (!(table instanceof FeFsTable)) {
> 6377       throw new InternalException("Unexpected table type: " +
> 6378           update.getTarget_table());
> 6379     }
> 6380
> 6381     tryWriteLock(table, "updating the catalog");
> 6382     final Timer.Context context
> 6383         = table.getMetrics().getTimer(HdfsTable.CATALOG_UPDATE_DURATION_METRIC).time();
> 6384
> 6385     long transactionId = -1;
> 6386     TblTransaction tblTxn = null;
> 6387     if (update.isSetTransaction_id()) {
> 6388       transactionId = update.getTransaction_id();
> 6389       Preconditions.checkState(transactionId > 0);
> 6390       try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) {
> 6391         // Setup transactional parameters needed to do alter table/partitions later.
> 6392         // TODO: Could be optimized to possibly save some RPCs, as these parameters are
> 6393         // not always needed + the writeId of the INSERT could be probably reused.
> 6394         tblTxn = MetastoreShim.createTblTransaction(
> 6395             msClient.getHiveClient(), table.getMetaStoreTable(), transactionId);
> 6396       }
> 6397     }
> 6398
> 6399     try {
> 6400       // Get new catalog version for table in insert.
> 6401       long newCatalogVersion = catalog_.incrementAndGetCatalogVersion();
> 6402       catalog_.getLock().writeLock().unlock();
> ...
> 6617     } finally {
> 6618       context.stop();
> 6619       UnlockWriteLockIfErronouslyLocked();
> 6620       table.releaseWriteLock();
> 6621     }
> {code}
> The catalog lock (versionLock) is acquired at line 6381 if the current thread gets the table lock. In a normal workload, it is released at line 6402. However, if MetastoreShim.createTblTransaction() throws an exception, nothing releases the lock. Note that the finally-clause at line 6619 can release the lock, but it does not guard the code that calls createTblTransaction().
> If the write lock of versionLock is not released, other threads can't proceed with their catalog operations, including table loading and the event-processor.
> I'm able to reproduce the issue by modifying the code to explicitly throw an exception at
> [https://github.com/apache/impala/blob/4cf0bfa83f9641eb95d83c76af7962e6a3f1e064/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L6636]
> CC [~csringhofer] [~gfurnstahl]
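
The effect on other threads described above can be reproduced in isolation. The following is a minimal sketch, assuming a plain ReentrantReadWriteLock stands in for versionLock. The class and method names are hypothetical, and the read-lock attempt only illustrates a second catalog operation; it does not claim which lock mode table loading or the event-processor actually use.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical demo: a write lock leaked by one thread blocks every later acquirer. */
class LeakedWriteLockSketch {

  /**
   * Runs an "updater" thread that takes the write lock and, if leak is true,
   * exits without releasing it. Returns whether the calling thread could then
   * take the read lock within 200 ms.
   */
  static boolean otherThreadCanProceed(boolean leak) {
    ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock(true);
    Thread updater = new Thread(() -> {
      versionLock.writeLock().lock();
      if (!leak) versionLock.writeLock().unlock();
      // With leak == true the thread ends here while still owning the write lock.
    });
    updater.start();
    try {
      updater.join();
      boolean acquired = versionLock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
      if (acquired) versionLock.readLock().unlock();
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("released lock, other thread proceeds: " + otherThreadCanProceed(false));
    System.out.println("leaked lock, other thread proceeds: " + otherThreadCanProceed(true));
  }
}
```

The lock stays held even after the owning thread has finished, which is consistent with the report that catalogd can't finish RPC requests after the error.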