Posted to issues@impala.apache.org by "Michael Brown (JIRA)" <ji...@apache.org> on 2018/11/29 18:27:00 UTC

[jira] [Created] (IMPALA-7910) COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore

Michael Brown created IMPALA-7910:
-------------------------------------

             Summary: COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
                 Key: IMPALA-7910
                 URL: https://issues.apache.org/jira/browse/IMPALA-7910
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
    Affects Versions: Impala 2.12.0, Impala 2.11.0, Impala 2.9.0
            Reporter: Michael Brown
            Assignee: Tianyi Wang


COMPUTE STATS, and possibly other DDL operations, do the equivalent of a REFRESH after writing to the Hive Metastore. This extra reload is unnecessary and can be very expensive, so it should be avoided.

The behavior can be confirmed from the catalogd logs:
{code}
compute stats functional_parquet.alltypes;
+-------------------------------------------+
| summary                                   |
+-------------------------------------------+
| Updated 24 partition(s) and 11 column(s). |
+-------------------------------------------+

Relevant catalogd.INFO snippet:
I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table metadata for: functional_parquet.alltypes
I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.277053 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=11: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.282152 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=12: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.285684 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=2: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.288921 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=3: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.292757 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=4: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.303673 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=5: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.308387 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=6: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.311506 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=7: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.314600 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=8: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.317709 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=9: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.317873 27295 HdfsTable.java:1273] Incrementally loaded table metadata for: functional_parquet.alltypes
{code}
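
As a rough way to check this on another table, the per-partition refresh lines can be counted from the catalogd log. The following is only a sketch: the class and method names are hypothetical, and the pattern is inferred from the log lines quoted above, so it may not match other Impala versions.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Impala.
final class RefreshLogCounter {
  // Assumed line format, taken from the catalogd.INFO snippet in this report:
  //   "... Refreshed file metadata for <db.table> Path: <partition path>: ..."
  private static final Pattern REFRESHED =
      Pattern.compile("Refreshed file metadata for (\\S+) Path:");

  /** Counts the per-partition refresh lines logged for the given table. */
  static int countRefreshedPartitions(List<String> logLines, String tableName) {
    int count = 0;
    for (String line : logLines) {
      Matcher m = REFRESHED.matcher(line);
      if (m.find() && m.group(1).equals(tableName)) {
        count++;
      }
    }
    return count;
  }
}
```

Applied to the lines quoted above with table name functional_parquet.alltypes, this counts 24 refresh lines, one for each partition reported in the COMPUTE STATS summary.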

The relevant code starts from CatalogOpExecutor:
{code}
boolean reloadMetadata = true;
catalog_.getLock().writeLock().unlock();

if (tbl instanceof KuduTable && altersKuduTable(params.getAlter_type())) {
  alterKuduTable(params, response, (KuduTable) tbl, newCatalogVersion);
  return;
}
switch (params.getAlter_type()) {
...
  case UPDATE_STATS:
    Preconditions.checkState(params.isSetUpdate_stats_params());
    Reference<Long> numUpdatedColumns = new Reference<>(0L);
    alterTableUpdateStats(tbl, params.getUpdate_stats_params(),
        numUpdatedPartitions, numUpdatedColumns);
    reloadTableSchema = true;
    addSummary(response, "Updated " + numUpdatedPartitions.getRef() +
        " partition(s) and " + numUpdatedColumns.getRef() + " column(s).");
    break;
....
}

if (reloadMetadata) {  // <-- the REFRESH happens here
  loadTableMetadata(tbl, newCatalogVersion, reloadFileMetadata,
      reloadTableSchema, null);
  addTableToCatalogUpdate(tbl, response.result);
}
{code}
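
The control flow in that excerpt can be traced with a small stand-alone model. This is a simplified sketch and not the CatalogOpExecutor code: the class, the Outcome type and the OTHER enum value are hypothetical stand-ins, and only the flag handling visible in the excerpt is reproduced.

```java
// Simplified, hypothetical model of the flag handling in the excerpt above.
final class ReloadFlowSketch {
  // OTHER is a placeholder for the alter types elided from the excerpt.
  enum AlterType { UPDATE_STATS, OTHER }

  static final class Outcome {
    boolean reloaded;          // true if the reload branch was taken
    boolean reloadTableSchema; // flag value passed to the reload
  }

  static Outcome alterTable(AlterType type) {
    boolean reloadMetadata = true;   // defaults to true, as in the excerpt
    boolean reloadTableSchema = false;

    switch (type) {
      case UPDATE_STATS:
        // The stats update sets reloadTableSchema but never clears
        // reloadMetadata, so the reload below still runs.
        reloadTableSchema = true;
        break;
      default:
        break;
    }

    Outcome outcome = new Outcome();
    if (reloadMetadata) {
      outcome.reloaded = true;
      outcome.reloadTableSchema = reloadTableSchema;
    }
    return outcome;
  }
}
```

In this model, alterTable(AlterType.UPDATE_STATS) always reaches the reload branch, which corresponds to the table reload seen in the catalogd log after COMPUTE STATS.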


