Posted to issues-all@impala.apache.org by "Tamas Mate (JIRA)" <ji...@apache.org> on 2019/08/12 15:08:00 UTC
[jira] [Commented] (IMPALA-2426) COMPUTE INCREMENTAL STATS doesn't compute stats for newly discovered partitions
[ https://issues.apache.org/jira/browse/IMPALA-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905295#comment-16905295 ]
Tamas Mate commented on IMPALA-2426:
------------------------------------
An alter table {{UPDATE_STATS}} operation is called to persist the stats. At the moment it leaves {{reloadMetadata}} set to true, which later causes a table reload from HMS. Because the alter table {{UPDATE_STATS}} is called after the stats collection queries have executed, a newly discovered partition shows up only after the stats were collected, so Impala does not have stats for it.
These are the related code parts from [CatalogOpExecutor|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L685]:
{code:java}
case UPDATE_STATS:
  Preconditions.checkState(params.isSetUpdate_stats_params());
  Reference<Long> numUpdatedColumns = new Reference<>(0L);
  alterTableUpdateStats(tbl, params.getUpdate_stats_params(),
      numUpdatedPartitions, numUpdatedColumns);
  reloadTableSchema = true;
  addSummary(response, "Updated " + numUpdatedPartitions.getRef() +
      " partition(s) and " + numUpdatedColumns.getRef() + " column(s).");
  break;
{code}
{code:java}
if (reloadMetadata) {
  loadTableMetadata(tbl, newCatalogVersion, reloadFileMetadata,
      reloadTableSchema, null, "ALTER TABLE " + params.getAlter_type().name());
  addTableToCatalogUpdate(tbl, response.result);
}
{code}
We discussed this Jira with [~balazsj_impala_220b], and this unexpected side effect should possibly be removed. Having compute stats refresh the metadata could cause trouble, for example during a Hive ingestion.
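To make the ordering easier to follow, here is a minimal toy model of the sequence described above. It is not Impala code: the class {{StaleStatsSketch}} and all of its fields and methods are hypothetical, and it assumes that the catalog cache only learns about a partition added in Hive when the table is reloaded from HMS.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the ordering problem. Hypothetical names, not Impala's classes. */
class StaleStatsSketch {
    // Partitions registered in the metastore (stand-in for HMS).
    final Set<String> metastorePartitions = new LinkedHashSet<>();
    // Partitions the catalog cache currently knows about (assumed stale until reload).
    final Set<String> cachedPartitions = new LinkedHashSet<>();
    // Partitions for which stats have been persisted.
    final Set<String> partitionsWithStats = new LinkedHashSet<>();

    /** Stand-in for the table reload triggered when reloadMetadata is true. */
    void reloadFromMetastore() {
        cachedPartitions.clear();
        cachedPartitions.addAll(metastorePartitions);
    }

    /** Models the order of operations: collect, persist, then reload. */
    void computeIncrementalStats() {
        // 1. Stats collection only sees cached partitions that lack stats.
        Set<String> targets = new LinkedHashSet<>(cachedPartitions);
        targets.removeAll(partitionsWithStats);
        // 2. Persist the collected stats (the UPDATE_STATS step).
        partitionsWithStats.addAll(targets);
        // 3. Side effect: the reload runs last and discovers new partitions too late.
        reloadFromMetastore();
    }

    /** Maps each cached partition to whether it has stats. */
    Map<String, Boolean> showTableStats() {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String partition : cachedPartitions) {
            result.put(partition, partitionsWithStats.contains(partition));
        }
        return result;
    }

    /** Replays the reproduction steps from the issue description. */
    static Map<String, Boolean> reproduce() {
        StaleStatsSketch table = new StaleStatsSketch();
        // In Impala: insert into partition y=42 (known to both sides).
        table.metastorePartitions.add("y=42");
        table.cachedPartitions.add("y=42");
        // In Hive: add partition y=333 (only the metastore knows about it).
        table.metastorePartitions.add("y=333");
        table.computeIncrementalStats();
        return table.showTableStats();
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

In this model {{y=333}} is listed after the call but has no stats, because the reload that discovers it happens after collection. Moving the reload ahead of step 1, or dropping it and requiring an explicit refresh, are the two obvious ways to change the outcome in the sketch; which of these fits Impala is a separate design question.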
> COMPUTE INCREMENTAL STATS doesn't compute stats for newly discovered partitions
> -------------------------------------------------------------------------------
>
> Key: IMPALA-2426
> URL: https://issues.apache.org/jira/browse/IMPALA-2426
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.2.4, Impala 2.3.0
> Reporter: Jim Apple
> Assignee: Tamas Mate
> Priority: Minor
> Labels: catalog-server, ramp-up
>
> In the following sequence, I expect the stats for partition 333 to be computed, but they are not:
> # In Impala: create table T (x int) partitioned by (y int)
> # In Impala: insert into table T partition (y=42) values (2)
> # In Hive: alter table T add partition (y=333)
> # In Impala: compute incremental stats T
> # In Impala: show table stats T