Posted to issues-all@impala.apache.org by "liuyao (Jira)" <ji...@apache.org> on 2021/07/06 02:25:00 UTC
[jira] [Work stopped] (IMPALA-5476) Catalogd restart bring about
metadata is out of sync
[ https://issues.apache.org/jira/browse/IMPALA-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on IMPALA-5476 stopped by liuyao.
--------------------------------------
> Catalogd restart bring about metadata is out of sync
> -----------------------------------------------------
>
> Key: IMPALA-5476
> URL: https://issues.apache.org/jira/browse/IMPALA-5476
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Frontend
> Affects Versions: Impala 2.8.0
> Reporter: fengYu
> Assignee: liuyao
> Priority: Major
> Labels: consistency, fault-tolerance
>
> I have been using Impala in our environment. Here is our cluster deployment:
> 20+ impalad backends.
> 4 of the impalads act as coordinators.
> One catalogd and one statestored.
> I encountered a problem where one impalad's metadata is out of sync after a catalogd restart. I found that a DML operation was executing while catalogd was restarting.
> After analyzing the Impala source code, I reproduced the problem. These are my steps and analysis:
> 1. Start the impala cluster.
> 2. The cluster runs for a long time with many metadata operations, so the current catalogVersion_ is large (for example, greater than 10000).
> 3. Submit a DML query (such as 'insert into xx partition() select xxx') to one impalad; the query runs for about 1 minute.
> 4. While the query is running, stop catalogd, then start it again just before the query executes QueryExecState->UpdateCatalog().
> 5. UpdateCatalog() sends an update request to catalogd; catalogd updates the table's metadata and responds with the table's newest metadata.
> 6. After catalogd responds, UpdateCatalog() updates the metadata cached in the impalad (by calling updateCatalogCache()), which runs the following code:
> if (!catalogServiceId_.equals(req.getCatalog_service_id())) {
>   boolean firstRun = catalogServiceId_.equals(INITIAL_CATALOG_SERVICE_ID);
>   catalogServiceId_ = req.getCatalog_service_id();
>   if (!firstRun) {
>     // Throw an exception which will trigger a full topic update request.
>     throw new CatalogException("Detected catalog service ID change. Aborting " +
>         "updateCatalog()");
>   }
> }
> The serviceId in the response belongs to the newly started catalogd and does not equal the impalad's catalogServiceId_, so the function throws CatalogException and the query ends in the EXCEPTION state. What is more, the impalad's catalogServiceId_ is now set to the new one.
> 7. Once catalogd has started successfully, it publishes all metadata to statestored, which pushes it to the impalad. Because of step 6, the impalad's catalogServiceId_ already equals the catalogd's serviceId, so no exception is thrown.
> 8. In the normal flow, step 7 would throw the CatalogException and set from_version to 0, and statestored would send the full metadata to the impalad in the next UpdateState().
> 9. After all steps finish, the impalad is out of sync. All new metadata operations are lost because CatalogObjectCache.add() requires that a 'new item will only be added if it has a larger catalog version'.
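The interaction between steps 6 and 7 can be sketched in isolation. The following is a minimal model, not Impala code: the class and method names (ServiceIdTracker, observe) are hypothetical, and a boolean return value stands in for the CatalogException thrown by the quoted snippet. It shows that once the DML path has adopted the new service ID, the later topic update no longer sees a change.

```java
import java.util.UUID;

// Hypothetical stand-in for the impalad-side check quoted in step 6.
class ServiceIdTracker {
    static final UUID INITIAL_CATALOG_SERVICE_ID = new UUID(0L, 0L);
    private UUID catalogServiceId = INITIAL_CATALOG_SERVICE_ID;

    /**
     * Mirrors the quoted check. Returns true where the original code would
     * throw CatalogException (and so trigger a full topic update request).
     */
    boolean observe(UUID incoming) {
        if (!catalogServiceId.equals(incoming)) {
            boolean firstRun = catalogServiceId.equals(INITIAL_CATALOG_SERVICE_ID);
            // The new ID is adopted immediately, before any resync has happened.
            catalogServiceId = incoming;
            return !firstRun;
        }
        return false;
    }
}

class ServiceIdRaceDemo {
    public static void main(String[] args) {
        UUID oldCatalogd = new UUID(1L, 1L);
        UUID newCatalogd = new UUID(2L, 2L);
        ServiceIdTracker impalad = new ServiceIdTracker();

        impalad.observe(oldCatalogd);                            // initial startup
        boolean abortedInDml = impalad.observe(newCatalogd);     // step 6: DML response
        boolean abortedInTopic = impalad.observe(newCatalogd);   // step 7: topic update

        System.out.println("DML path aborted: " + abortedInDml);
        System.out.println("Topic update aborted: " + abortedInTopic);
    }
}
```

In this model the abort fires only in the DML path, so the topic update path never gets the chance to request a full update, which matches the behaviour described in steps 7 and 8.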
> By "new metadata will be lost" I mean that subsequent metadata operations on an existing table are lost until the table's version catches up with the older version. I think no operation can recover from this, because the impalad updates its locally cached metadata by comparing the new version with the older one.
> I tried some operations that trigger a reload of the table's metadata and generate a new version, such as refresh and alter table. But the new metadata is always lost until the catalog_version is larger than the older one. For newly created catalog objects (such as create table or create database), the metadata is up to date.
> I think this is a bug. We need to keep the older catalogServiceId_ until a full (non-delta) metadata update pushed by statestored has been applied, even though all metadata operations will end in EXCEPTION during this time gap. Perhaps there are better solutions.
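The version rule behind these observations can be illustrated with a small model. This is a sketch under assumptions, not the real CatalogObjectCache: the class VersionedCache and its methods are hypothetical, and it assumes (as the report implies) that the restarted catalogd hands out versions starting from a low value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a cache that only accepts strictly newer versions.
class VersionedCache {
    private final Map<String, Long> versions = new HashMap<>();

    /** Adds or replaces an entry only if the incoming version is strictly larger. */
    boolean add(String name, long catalogVersion) {
        Long existing = versions.get(name);
        if (existing != null && existing >= catalogVersion) {
            return false; // update silently dropped
        }
        versions.put(name, catalogVersion);
        return true;
    }

    /** Returns the cached version, or -1 if the object is unknown. */
    long versionOf(String name) {
        return versions.getOrDefault(name, -1L);
    }
}

class StaleCacheDemo {
    public static void main(String[] args) {
        VersionedCache impalad = new VersionedCache();
        impalad.add("t1", 10042L); // cached from the old catalogd

        // After the restart, versions are assumed to start low again.
        System.out.println("refresh t1 @57 applied: " + impalad.add("t1", 57L));
        System.out.println("create t2 @58 applied: " + impalad.add("t2", 58L));
        System.out.println("t1 cached version: " + impalad.versionOf("t1"));
    }
}
```

Under these assumptions the refresh of the existing table is dropped while the newly created table is accepted, which is the pattern reported above.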
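One way to read the proposal is sketched below. It is only an illustration of the idea, not a patch: DeferredServiceIdTracker, accept, and the isFullUpdate flag are hypothetical names, and a real fix would also have to reset or rebuild the cached objects when the full update is applied, which this sketch does not model.

```java
import java.util.UUID;

// Hypothetical variant of the service ID check that defers adopting a new ID.
class DeferredServiceIdTracker {
    static final UUID INITIAL_CATALOG_SERVICE_ID = new UUID(0L, 0L);
    private UUID catalogServiceId = INITIAL_CATALOG_SERVICE_ID;

    /**
     * Returns true if the update may be applied. Returns false where the
     * caller would raise CatalogException and request a full topic update.
     * isFullUpdate is an assumed flag marking a non-delta update.
     */
    boolean accept(UUID incoming, boolean isFullUpdate) {
        if (catalogServiceId.equals(incoming)) {
            return true;
        }
        boolean firstRun = catalogServiceId.equals(INITIAL_CATALOG_SERVICE_ID);
        if (firstRun || isFullUpdate) {
            // Only a first run or a full update is allowed to switch IDs.
            catalogServiceId = incoming;
            return true;
        }
        // Keep the old ID so that every later delta is rejected as well.
        return false;
    }
}

class DeferredAdoptionDemo {
    public static void main(String[] args) {
        UUID oldCatalogd = new UUID(1L, 1L);
        UUID newCatalogd = new UUID(2L, 2L);
        DeferredServiceIdTracker impalad = new DeferredServiceIdTracker();

        impalad.accept(oldCatalogd, false);
        System.out.println("DML response accepted: " + impalad.accept(newCatalogd, false));
        System.out.println("delta update accepted: " + impalad.accept(newCatalogd, false));
        System.out.println("full update accepted: " + impalad.accept(newCatalogd, true));
    }
}
```

The trade-off is the one named in the report: between the restart and the full update, every operation that carries the new service ID is rejected.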
--
This message was sent by Atlassian Jira
(v8.3.4#803005)