Posted to commits@impala.apache.org by ta...@apache.org on 2019/08/07 03:27:17 UTC

[impala] branch master updated (4fb8e8e -> 411189a)

This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.


    from 4fb8e8e  IMPALA-8816: reduce custom cluster test runtime in core
     new 39613c8  IMPALA-8627: Enable catalog-v2 in tests
     new 411189a  IMPALA-8376: directory limits for scratch usage

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/runtime/tmp-file-mgr-internal.h             |  15 +-
 be/src/runtime/tmp-file-mgr-test.cc                | 203 +++++++++++++++++++++
 be/src/runtime/tmp-file-mgr.cc                     | 128 ++++++++++---
 be/src/runtime/tmp-file-mgr.h                      |  33 +++-
 be/src/service/query-options-test.cc               |   8 +-
 be/src/util/parse-util-test.cc                     |  11 ++
 be/src/util/parse-util.cc                          |   6 +
 common/thrift/generate_error_codes.py              |   3 +-
 common/thrift/metrics.json                         |  10 +
 docker/catalogd/Dockerfile                         |   2 +-
 docker/impalad_coord_exec/Dockerfile               |   2 +-
 docker/impalad_coordinator/Dockerfile              |   2 +-
 docs/topics/impala_mem_limit.xml                   |   6 +-
 .../org/apache/impala/catalog/local/LocalDb.java   |   5 +-
 .../java/org/apache/impala/service/JdbcTest.java   |   7 +-
 .../java/org/apache/impala/testutil/TestUtils.java |  36 ++++
 .../queries/QueryTest/describe-db.test             |  21 ---
 .../queries/QueryTest/describe-hive-db.test        |  30 +++
 tests/common/environ.py                            |  15 +-
 tests/common/skip.py                               |   6 +
 tests/hs2/hs2_test_suite.py                        |  67 +++++--
 tests/hs2/test_hs2.py                              |  19 +-
 tests/metadata/test_hms_integration.py             |   5 +-
 tests/metadata/test_metadata_query_statements.py   |  13 +-
 tests/metadata/test_refresh_partition.py           |   2 +-
 tests/query_test/test_observability.py             |  44 ++++-
 26 files changed, 590 insertions(+), 109 deletions(-)
 create mode 100644 testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test


[impala] 02/02: IMPALA-8376: directory limits for scratch usage

Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 411189a8d733a66c363c72f8c404123d68640a3e
Author: Tim Armstrong <ta...@cloudera.com>
AuthorDate: Fri Aug 2 16:37:02 2019 -0700

    IMPALA-8376: directory limits for scratch usage
    
    This extends the --scratch_dirs syntax to support specifying a max
    capacity per directory, similarly to the --data_cache configuration.
    The capacity is delimited from the directory name with ":" and
    uses the usual syntax for specifying memory. The following are
    valid arguments (a parsing sketch follows the list):
    * --scratch_dirs=/dir1,/dir2 (no limits)
    * --scratch_dirs=/dir1,/dir2:25G (only a limit on /dir2)
    * --scratch_dirs=/dir1:5MB,/dir2 (only a limit on /dir1)
    * --scratch_dirs=/dir1:-1,/dir2:0 (alternative ways of
      expressing no limit)
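
    As a rough illustration of how one of these specifiers decomposes into a
    path and a byte limit, here is a minimal standalone sketch. DirSpec and
    ParseScratchDirSpec are invented names used only for illustration; the
    committed parsing lives in TmpFileMgr::InitCustom() further down in this
    patch and delegates size-suffix handling to ParseUtil::ParseMemSpec().

        // Sketch: split one "<path>[:<limit>]" scratch directory specifier.
        #include <cstdint>
        #include <iostream>
        #include <limits>
        #include <string>

        struct DirSpec {
          std::string path;
          int64_t bytes_limit;  // max int64 means "no limit"
        };

        bool ParseScratchDirSpec(const std::string& spec, DirSpec* out) {
          size_t colon = spec.find(':');
          // Reject specifiers with more than one ':'.
          if (colon != std::string::npos &&
              spec.find(':', colon + 1) != std::string::npos) {
            return false;
          }
          out->path = spec.substr(0, colon);
          if (colon == std::string::npos || colon + 1 == spec.size()) {
            out->bytes_limit = std::numeric_limits<int64_t>::max();  // no limit given
            return true;
          }
          // The real code accepts suffixes like "25G" or "5MB" via ParseMemSpec();
          // this sketch only takes a plain byte count and treats -1 or 0 as
          // "no limit", as described above.
          int64_t limit = std::stoll(spec.substr(colon + 1));
          out->bytes_limit = limit <= 0 ? std::numeric_limits<int64_t>::max() : limit;
          return true;
        }

        int main() {
          DirSpec d;
          if (ParseScratchDirSpec("/dir2:1048576", &d)) {
            std::cout << d.path << " limit=" << d.bytes_limit << " bytes\n";
          }
          return 0;
        }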
    
    The usage is tracked with a metric per directory. Allocations
    from that directory start to fail when the limit is exceeded.
    These metrics are exposed as
    tmp-file-mgr.scratch-space-bytes-used.dir-0,
    tmp-file-mgr.scratch-space-bytes-used.dir-1, etc.
    
    Also add support for parsing terabyte specifiers to a utility
    function that is used for parsing many configurations.
    
    Testing:
    Added a unit test to exercise TmpFileMgr.
    
    Manually ran a spilling query on an impalad with multiple scratch dirs
    configured with different limits. Confirmed via metrics that the
    capacities were enforced.
    
    Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
    Reviewed-on: http://gerrit.cloudera.org:8080/13986
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/runtime/tmp-file-mgr-internal.h |  15 ++-
 be/src/runtime/tmp-file-mgr-test.cc    | 203 +++++++++++++++++++++++++++++++++
 be/src/runtime/tmp-file-mgr.cc         | 128 +++++++++++++++++----
 be/src/runtime/tmp-file-mgr.h          |  33 +++++-
 be/src/service/query-options-test.cc   |   8 +-
 be/src/util/parse-util-test.cc         |  11 ++
 be/src/util/parse-util.cc              |   6 +
 common/thrift/generate_error_codes.py  |   3 +-
 common/thrift/metrics.json             |  10 ++
 docs/topics/impala_mem_limit.xml       |   6 +-
 10 files changed, 382 insertions(+), 41 deletions(-)

diff --git a/be/src/runtime/tmp-file-mgr-internal.h b/be/src/runtime/tmp-file-mgr-internal.h
index 5d11c4d..090a019 100644
--- a/be/src/runtime/tmp-file-mgr-internal.h
+++ b/be/src/runtime/tmp-file-mgr-internal.h
@@ -36,11 +36,13 @@ class TmpFileMgr::File {
  public:
   File(FileGroup* file_group, DeviceId device_id, const std::string& path);
 
-  /// Allocates 'num_bytes' bytes in this file for a new block of data.
-  /// The file size is increased by a call to truncate() if necessary.
-  /// Sets 'offset' to the file offset of the first byte in the allocated
+  /// Allocates 'num_bytes' bytes in this file for a new block of data if there is
+  /// free capacity in this temporary directory. If there is insufficient capacity,
+  /// return false. Otherwise, update state and return true.
+  /// This function does not actually perform any file operations.
+  /// On success, sets 'offset' to the file offset of the first byte in the allocated
   /// range on success.
-  void AllocateSpace(int64_t num_bytes, int64_t* offset);
+  bool AllocateSpace(int64_t num_bytes, int64_t* offset);
 
   /// Called when an IO error is encountered for this file. Logs the error and blacklists
   /// the file.
@@ -85,7 +87,10 @@ class TmpFileMgr::File {
 
   /// Set to true to indicate that we shouldn't allocate any more space in this file.
   bool blacklisted_;
+
+  /// Helper to get the TmpDir that this file is associated with.
+  TmpDir* GetDir();
 };
-}
+} // namespace impala
 
 #endif
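
The switch to a bool-returning AllocateSpace() above pairs with the enforcement
logic added in tmp-file-mgr.cc further down: the directory's bytes-used gauge is
incremented optimistically and rolled back if the new total exceeds the limit.
Below is a minimal standalone sketch of that pattern, using invented names
(DirUsage, TryAllocate) and a plain std::atomic in place of the IntGauge metric.

    #include <atomic>
    #include <cstdint>
    #include <iostream>

    struct DirUsage {
      explicit DirUsage(int64_t limit) : bytes_limit(limit) {}
      const int64_t bytes_limit;
      std::atomic<int64_t> bytes_used{0};
    };

    // Accounts for 'num_bytes' and returns true if the directory stays within
    // its limit; otherwise undoes the increment and returns false. A single
    // atomic add keeps this safe across concurrent file groups.
    bool TryAllocate(DirUsage* dir, int64_t num_bytes) {
      int64_t new_used = dir->bytes_used.fetch_add(num_bytes) + num_bytes;
      if (new_used > dir->bytes_limit) {
        dir->bytes_used.fetch_sub(num_bytes);  // roll back
        return false;
      }
      return true;
    }

    int main() {
      DirUsage dir(1024);                           // 1 KB limit for demonstration
      std::cout << TryAllocate(&dir, 512) << "\n";  // 1: fits
      std::cout << TryAllocate(&dir, 512) << "\n";  // 1: exactly at the limit
      std::cout << TryAllocate(&dir, 1) << "\n";    // 0: would exceed the limit
      return 0;
    }
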
diff --git a/be/src/runtime/tmp-file-mgr-test.cc b/be/src/runtime/tmp-file-mgr-test.cc
index 5bed3b4..9b852a6 100644
--- a/be/src/runtime/tmp-file-mgr-test.cc
+++ b/be/src/runtime/tmp-file-mgr-test.cc
@@ -17,6 +17,7 @@
 
 #include <cstdio>
 #include <cstdlib>
+#include <limits>
 #include <numeric>
 
 #include <boost/filesystem.hpp>
@@ -52,6 +53,11 @@ namespace impala {
 
 using namespace io;
 
+static const int64_t KILOBYTE = 1024L;
+static const int64_t MEGABYTE = 1024L * KILOBYTE;
+static const int64_t GIGABYTE = 1024L * MEGABYTE;
+static const int64_t TERABYTE = 1024L * GIGABYTE;
+
 class TmpFileMgrTest : public ::testing::Test {
  public:
   virtual void SetUp() {
@@ -76,6 +82,18 @@ class TmpFileMgrTest : public ::testing::Test {
 
   DiskIoMgr* io_mgr() { return test_env_->exec_env()->disk_io_mgr(); }
 
+  /// Helper to create a TmpFileMgr and initialise it with InitCustom(). Adds the mgr to
+  /// 'obj_pool_' for automatic cleanup at the end of each test. Fails the test if
+  /// InitCustom() fails.
+  TmpFileMgr* CreateTmpFileMgr(const string& tmp_dirs_spec) {
+    // Allocate a new metrics group for each TmpFileMgr so they don't get confused by
+    // the pre-existing metrics (TmpFileMgr assumes it's a singleton in product code).
+    MetricGroup* metrics = obj_pool_.Add(new MetricGroup(""));
+    TmpFileMgr* mgr = obj_pool_.Add(new TmpFileMgr());
+    EXPECT_OK(mgr->InitCustom(tmp_dirs_spec, false, metrics));
+    return mgr;
+  }
+
   /// Check that metric values are consistent with TmpFileMgr state.
   void CheckMetrics(TmpFileMgr* tmp_file_mgr) {
     vector<TmpFileMgr::DeviceId> active = tmp_file_mgr->ActiveTmpDevices();
@@ -122,6 +140,11 @@ class TmpFileMgrTest : public ::testing::Test {
     return Status::OK();
   }
 
+  /// Helper to get the private tmp_dirs_ member.
+  static const vector<TmpFileMgr::TmpDir>& GetTmpDirs(TmpFileMgr* mgr) {
+    return mgr->tmp_dirs_;
+  }
+
   /// Helper to call the private TmpFileMgr::NewFile() method.
   static void NewFile(TmpFileMgr* mgr, TmpFileMgr::FileGroup* group,
       TmpFileMgr::DeviceId device_id, unique_ptr<TmpFileMgr::File>* new_file) {
@@ -644,4 +667,184 @@ TEST_F(TmpFileMgrTest, TestHWMMetric) {
   file_group_2.Close();
   checkHWMMetrics(0, 2 * LIMIT);
 }
+
+// Test that usage per directory is tracked correctly and per-directory limits are
+// enforced. Sets up several scratch directories, some with limits, and checks
+// that the allocations occur in the right directories.
+TEST_F(TmpFileMgrTest, TestDirectoryLimits) {
+  vector<string> tmp_dirs({"/tmp/tmp-file-mgr-test.1", "/tmp/tmp-file-mgr-test.2",
+      "/tmp/tmp-file-mgr-test.3"});
+  vector<string> tmp_dir_specs({"/tmp/tmp-file-mgr-test.1:512",
+      "/tmp/tmp-file-mgr-test.2:1k", "/tmp/tmp-file-mgr-test.3"});
+  RemoveAndCreateDirs(tmp_dirs);
+  TmpFileMgr tmp_file_mgr;
+  ASSERT_OK(tmp_file_mgr.InitCustom(tmp_dir_specs, false, metrics_.get()));
+
+  TmpFileMgr::FileGroup file_group_1(
+      &tmp_file_mgr, io_mgr(), RuntimeProfile::Create(&obj_pool_, "p1"), TUniqueId());
+  TmpFileMgr::FileGroup file_group_2(
+      &tmp_file_mgr, io_mgr(), RuntimeProfile::Create(&obj_pool_, "p2"), TUniqueId());
+
+  vector<TmpFileMgr::File*> files;
+  ASSERT_OK(CreateFiles(&file_group_1, &files));
+  ASSERT_OK(CreateFiles(&file_group_2, &files));
+
+  IntGauge* dir1_usage = metrics_->FindMetricForTesting<IntGauge>(
+      "tmp-file-mgr.scratch-space-bytes-used.dir-0");
+  IntGauge* dir2_usage = metrics_->FindMetricForTesting<IntGauge>(
+      "tmp-file-mgr.scratch-space-bytes-used.dir-1");
+  IntGauge* dir3_usage = metrics_->FindMetricForTesting<IntGauge>(
+      "tmp-file-mgr.scratch-space-bytes-used.dir-2");
+
+  // A power-of-two so that FileGroup allocates exactly this amount of scratch space.
+  const int64_t ALLOC_SIZE = 512;
+  int64_t offset;
+  TmpFileMgr::File* alloc_file;
+
+  // Allocate three times - once per directory. We expect these allocations to go through
+  // so we should have one allocation in each directory.
+  SetNextAllocationIndex(&file_group_1, 0);
+  for (int i = 0; i < tmp_dir_specs.size(); ++i) {
+    ASSERT_OK(GroupAllocateSpace(&file_group_1, ALLOC_SIZE, &alloc_file, &offset));
+  }
+  EXPECT_EQ(ALLOC_SIZE, dir1_usage->GetValue());
+  EXPECT_EQ(ALLOC_SIZE, dir2_usage->GetValue());
+  EXPECT_EQ(ALLOC_SIZE, dir3_usage->GetValue());
+
+  // This time we should hit the limit on the first directory. Do this from a
+  // different file group to show that limits are enforced across file groups.
+  for (int i = 0; i < 2; ++i) {
+    ASSERT_OK(GroupAllocateSpace(&file_group_2, ALLOC_SIZE, &alloc_file, &offset));
+  }
+  EXPECT_EQ(ALLOC_SIZE, dir1_usage->GetValue());
+  EXPECT_EQ(2 * ALLOC_SIZE, dir2_usage->GetValue());
+  EXPECT_EQ(2 * ALLOC_SIZE, dir3_usage->GetValue());
+
+  // Now we're at the limits on two directories, so all allocations should go to the
+  // last directory without a limit.
+  for (int i = 0; i < 100; ++i) {
+    ASSERT_OK(GroupAllocateSpace(&file_group_2, ALLOC_SIZE, &alloc_file, &offset));
+  }
+  EXPECT_EQ(ALLOC_SIZE, dir1_usage->GetValue());
+  EXPECT_EQ(2 * ALLOC_SIZE, dir2_usage->GetValue());
+  EXPECT_EQ(102 * ALLOC_SIZE, dir3_usage->GetValue());
+
+  file_group_2.Close();
+  // Metrics should be decremented when the file groups delete the underlying files.
+  EXPECT_EQ(ALLOC_SIZE, dir1_usage->GetValue());
+  EXPECT_EQ(ALLOC_SIZE, dir2_usage->GetValue());
+  EXPECT_EQ(ALLOC_SIZE, dir3_usage->GetValue());
+
+  // We should be able to reuse the space freed up.
+  ASSERT_OK(GroupAllocateSpace(&file_group_1, ALLOC_SIZE, &alloc_file, &offset));
+
+  EXPECT_EQ(2 * ALLOC_SIZE, dir2_usage->GetValue());
+  file_group_1.Close();
+  EXPECT_EQ(0, dir1_usage->GetValue());
+  EXPECT_EQ(0, dir2_usage->GetValue());
+  EXPECT_EQ(0, dir3_usage->GetValue());
+}
+
+// Test the case when all per-directory limits are hit. We expect to return a status
+// and fail gracefully.
+TEST_F(TmpFileMgrTest, TestDirectoryLimitsExhausted) {
+  vector<string> tmp_dirs({"/tmp/tmp-file-mgr-test.1", "/tmp/tmp-file-mgr-test.2"});
+  vector<string> tmp_dir_specs(
+      {"/tmp/tmp-file-mgr-test.1:256kb", "/tmp/tmp-file-mgr-test.2:1mb"});
+  const int64_t DIR1_LIMIT = 256L * 1024L;
+  const int64_t DIR2_LIMIT = 1024L * 1024L;
+  RemoveAndCreateDirs(tmp_dirs);
+  TmpFileMgr tmp_file_mgr;
+  ASSERT_OK(tmp_file_mgr.InitCustom(tmp_dir_specs, false, metrics_.get()));
+
+  TmpFileMgr::FileGroup file_group_1(
+      &tmp_file_mgr, io_mgr(), RuntimeProfile::Create(&obj_pool_, "p1"), TUniqueId());
+  TmpFileMgr::FileGroup file_group_2(
+      &tmp_file_mgr, io_mgr(), RuntimeProfile::Create(&obj_pool_, "p2"), TUniqueId());
+
+  vector<TmpFileMgr::File*> files;
+  ASSERT_OK(CreateFiles(&file_group_1, &files));
+  ASSERT_OK(CreateFiles(&file_group_2, &files));
+
+  IntGauge* dir1_usage = metrics_->FindMetricForTesting<IntGauge>(
+      "tmp-file-mgr.scratch-space-bytes-used.dir-0");
+  IntGauge* dir2_usage = metrics_->FindMetricForTesting<IntGauge>(
+      "tmp-file-mgr.scratch-space-bytes-used.dir-1");
+
+  // A power-of-two so that FileGroup allocates exactly this amount of scratch space.
+  const int64_t ALLOC_SIZE = 512;
+  const int64_t MAX_ALLOCATIONS = (DIR1_LIMIT + DIR2_LIMIT) / ALLOC_SIZE;
+  int64_t offset;
+  TmpFileMgr::File* alloc_file;
+
+  // Allocate exactly the maximum total capacity of the directories.
+  SetNextAllocationIndex(&file_group_1, 0);
+  for (int i = 0; i < MAX_ALLOCATIONS; ++i) {
+    ASSERT_OK(GroupAllocateSpace(&file_group_1, ALLOC_SIZE, &alloc_file, &offset));
+  }
+  EXPECT_EQ(DIR1_LIMIT, dir1_usage->GetValue());
+  EXPECT_EQ(DIR2_LIMIT, dir2_usage->GetValue());
+  // The directories are at capacity, so allocations should fail.
+  Status err1 = GroupAllocateSpace(&file_group_1, ALLOC_SIZE, &alloc_file, &offset);
+  ASSERT_EQ(err1.code(), TErrorCode::SCRATCH_ALLOCATION_FAILED);
+  Status err2 = GroupAllocateSpace(&file_group_2, ALLOC_SIZE, &alloc_file, &offset);
+  ASSERT_EQ(err2.code(), TErrorCode::SCRATCH_ALLOCATION_FAILED);
+
+  // A FileGroup should recover once allocations are released, i.e. it does not
+  // permanently block allocating files from the group.
+  file_group_1.Close();
+  ASSERT_OK(GroupAllocateSpace(&file_group_2, ALLOC_SIZE, &alloc_file, &offset));
+  file_group_2.Close();
+}
+
+// Test the directory parsing logic, including the various error cases.
+TEST_F(TmpFileMgrTest, TestDirectoryLimitParsing) {
+  RemoveAndCreateDirs({"/tmp/tmp-file-mgr-test1", "/tmp/tmp-file-mgr-test2",
+      "/tmp/tmp-file-mgr-test3", "/tmp/tmp-file-mgr-test4", "/tmp/tmp-file-mgr-test5",
+      "/tmp/tmp-file-mgr-test6", "/tmp/tmp-file-mgr-test7"});
+  // Configure various directories with valid formats.
+  auto& dirs = GetTmpDirs(
+      CreateTmpFileMgr("/tmp/tmp-file-mgr-test1:5g,/tmp/tmp-file-mgr-test2,"
+                       "/tmp/tmp-file-mgr-test3:1234,/tmp/tmp-file-mgr-test4:99999999,"
+                       "/tmp/tmp-file-mgr-test5:200tb,/tmp/tmp-file-mgr-test6:100MB"));
+  EXPECT_EQ(6, dirs.size());
+  EXPECT_EQ(5 * GIGABYTE, dirs[0].bytes_limit);
+  EXPECT_EQ(numeric_limits<int64_t>::max(), dirs[1].bytes_limit);
+  EXPECT_EQ(1234, dirs[2].bytes_limit);
+  EXPECT_EQ(99999999, dirs[3].bytes_limit);
+  EXPECT_EQ(200 * TERABYTE, dirs[4].bytes_limit);
+  EXPECT_EQ(100 * MEGABYTE, dirs[5].bytes_limit);
+
+  // Various invalid limit formats result in the directory getting skipped.
+  // Include a valid dir on the end to ensure that we don't short-circuit all
+  // directories.
+  auto& dirs2 = GetTmpDirs(
+      CreateTmpFileMgr("/tmp/tmp-file-mgr-test1:foo,/tmp/tmp-file-mgr-test2:?,"
+                       "/tmp/tmp-file-mgr-test3:1.2.3.4,/tmp/tmp-file-mgr-test4: ,"
+                       "/tmp/tmp-file-mgr-test5:5pb,/tmp/tmp-file-mgr-test6:10%,"
+                       "/tmp/tmp-file-mgr-test1:100"));
+  EXPECT_EQ(1, dirs2.size());
+  EXPECT_EQ("/tmp/tmp-file-mgr-test1/impala-scratch", dirs2[0].path);
+  EXPECT_EQ(100, dirs2[0].bytes_limit);
+
+  // Various valid ways of specifying "unlimited".
+  auto& dirs3 =
+      GetTmpDirs(CreateTmpFileMgr("/tmp/tmp-file-mgr-test1:,/tmp/tmp-file-mgr-test2:-1,"
+                                  "/tmp/tmp-file-mgr-test3,/tmp/tmp-file-mgr-test4:0"));
+  EXPECT_EQ(4, dirs3.size());
+  for (const auto& dir : dirs3) {
+    EXPECT_EQ(numeric_limits<int64_t>::max(), dir.bytes_limit);
+  }
+
+  // Extra colons
+  auto& dirs4 = GetTmpDirs(
+      CreateTmpFileMgr("/tmp/tmp-file-mgr-test1:1:,/tmp/tmp-file-mgr-test2:10mb::"));
+  EXPECT_EQ(0, dirs4.size());
+
+  // Empty strings.
+  auto& nodirs = GetTmpDirs(CreateTmpFileMgr(""));
+  EXPECT_EQ(0, nodirs.size());
+  auto& empty_paths = GetTmpDirs(CreateTmpFileMgr(","));
+  EXPECT_EQ(2, empty_paths.size());
 }
+} // namespace impala
diff --git a/be/src/runtime/tmp-file-mgr.cc b/be/src/runtime/tmp-file-mgr.cc
index 400d4d6..7954a62 100644
--- a/be/src/runtime/tmp-file-mgr.cc
+++ b/be/src/runtime/tmp-file-mgr.cc
@@ -17,6 +17,8 @@
 
 #include "runtime/tmp-file-mgr.h"
 
+#include <limits>
+
 #include <boost/algorithm/string.hpp>
 #include <boost/filesystem.hpp>
 #include <boost/lexical_cast.hpp>
@@ -35,6 +37,7 @@
 #include "util/debug-util.h"
 #include "util/disk-info.h"
 #include "util/filesystem-util.h"
+#include "util/parse-util.h"
 #include "util/pretty-printer.h"
 #include "util/runtime-profile-counters.h"
 
@@ -43,7 +46,14 @@
 DEFINE_bool(disk_spill_encryption, true,
     "Set this to encrypt and perform an integrity "
     "check on all data spilled to disk during a query");
-DEFINE_string(scratch_dirs, "/tmp", "Writable scratch directories");
+DEFINE_string(scratch_dirs, "/tmp",
+    "Writable scratch directories. "
+    "This is a comma-separated list of directories. Each directory is "
+    "specified as the directory path and an optional limit on the bytes that will "
+    "be allocated in that directory. If the optional limit is provided, the path and "
+    "the limit are separated by a colon. E.g. '/dir1:10G,/dir2:5GB,/dir3' will allow "
+    "allocating up to 10GB of scratch in /dir1, 5GB of scratch in /dir2 and an "
+    "unlimited amount in /dir3.");
 DEFINE_bool(allow_multiple_scratch_dirs_per_device, true,
     "If false and --scratch_dirs contains multiple directories on the same device, "
     "then only the first writable directory is used");
@@ -71,6 +81,8 @@ const string TMP_FILE_MGR_SCRATCH_SPACE_BYTES_USED_HIGH_WATER_MARK =
     "tmp-file-mgr.scratch-space-bytes-used-high-water-mark";
 const string TMP_FILE_MGR_SCRATCH_SPACE_BYTES_USED =
     "tmp-file-mgr.scratch-space-bytes-used";
+const string SCRATCH_DIR_BYTES_USED_FORMAT =
+    "tmp-file-mgr.scratch-space-bytes-used.dir-$0";
 
 TmpFileMgr::TmpFileMgr()
   : initialized_(false),
@@ -79,27 +91,61 @@ TmpFileMgr::TmpFileMgr()
     scratch_bytes_used_metric_(nullptr) {}
 
 Status TmpFileMgr::Init(MetricGroup* metrics) {
-  string tmp_dirs_spec = FLAGS_scratch_dirs;
+  return InitCustom(
+      FLAGS_scratch_dirs, !FLAGS_allow_multiple_scratch_dirs_per_device, metrics);
+}
+
+Status TmpFileMgr::InitCustom(
+    const string& tmp_dirs_spec, bool one_dir_per_device, MetricGroup* metrics) {
   vector<string> all_tmp_dirs;
   // Empty string should be interpreted as no scratch
   if (!tmp_dirs_spec.empty()) {
     split(all_tmp_dirs, tmp_dirs_spec, is_any_of(","), token_compress_on);
   }
-  return InitCustom(all_tmp_dirs, !FLAGS_allow_multiple_scratch_dirs_per_device, metrics);
+  return InitCustom(all_tmp_dirs, one_dir_per_device, metrics);
 }
 
-Status TmpFileMgr::InitCustom(const vector<string>& tmp_dirs, bool one_dir_per_device,
-      MetricGroup* metrics) {
+Status TmpFileMgr::InitCustom(const vector<string>& tmp_dir_specifiers,
+    bool one_dir_per_device, MetricGroup* metrics) {
   DCHECK(!initialized_);
-  if (tmp_dirs.empty()) {
+  if (tmp_dir_specifiers.empty()) {
     LOG(WARNING) << "Running without spill to disk: no scratch directories provided.";
   }
+  vector<TmpDir> tmp_dirs;
+  // Parse the directory specifiers. Don't return an error on parse errors, just log a
+  // warning - we don't want to abort process startup because of misconfigured scratch,
+  // since queries will generally still be runnable.
+  for (const string& tmp_dir_spec : tmp_dir_specifiers) {
+    vector<string> toks;
+    split(toks, tmp_dir_spec, is_any_of(":"), token_compress_on);
+    if (toks.size() > 2) {
+      LOG(ERROR) << "Could not parse temporary dir specifier, too many colons: '"
+                 << tmp_dir_spec << "'";
+      continue;
+    }
+    int64_t bytes_limit = numeric_limits<int64_t>::max();
+    if (toks.size() == 2) {
+      bool is_percent;
+      bytes_limit = ParseUtil::ParseMemSpec(toks[1], &is_percent, 0);
+      if (bytes_limit < 0 || is_percent) {
+        LOG(ERROR) << "Malformed data cache capacity configuration '" << tmp_dir_spec
+                   << "'";
+        continue;
+      } else if (bytes_limit == 0) {
+        // Interpret -1, 0 or empty string as no limit.
+        bytes_limit = numeric_limits<int64_t>::max();
+      }
+    }
+    IntGauge* bytes_used_metric = metrics->AddGauge(
+        SCRATCH_DIR_BYTES_USED_FORMAT, 0, Substitute("$0", tmp_dirs.size()));
+    tmp_dirs.emplace_back(toks[0], bytes_limit, bytes_used_metric);
+  }
 
   vector<bool> is_tmp_dir_on_disk(DiskInfo::num_disks(), false);
   // For each tmp directory, find the disk it is on,
   // so additional tmp directories on the same disk can be skipped.
   for (int i = 0; i < tmp_dirs.size(); ++i) {
-    path tmp_path(trim_right_copy_if(tmp_dirs[i], is_any_of("/")));
+    path tmp_path(trim_right_copy_if(tmp_dirs[i].path, is_any_of("/")));
     tmp_path = absolute(tmp_path);
     path scratch_subdir_path(tmp_path / TMP_SUB_DIR_NAME);
     // tmp_path must be a writable directory.
@@ -127,8 +173,10 @@ Status TmpFileMgr::InitCustom(const vector<string>& tmp_dirs, bool one_dir_per_d
       if (status.ok()) {
         if (disk_id >= 0) is_tmp_dir_on_disk[disk_id] = true;
         LOG(INFO) << "Using scratch directory " << scratch_subdir_path.string() << " on "
-                  << "disk " << disk_id;
-        tmp_dirs_.push_back(scratch_subdir_path.string());
+                  << "disk " << disk_id
+                  << " limit: " << PrettyPrinter::PrintBytes(tmp_dirs[i].bytes_limit);
+        tmp_dirs_.emplace_back(scratch_subdir_path.string(), tmp_dirs[i].bytes_limit,
+            tmp_dirs[i].bytes_used_metric);
       } else {
         LOG(WARNING) << "Could not remove and recreate directory "
                      << scratch_subdir_path.string() << ": cannot use it for scratch. "
@@ -144,7 +192,7 @@ Status TmpFileMgr::InitCustom(const vector<string>& tmp_dirs, bool one_dir_per_d
       metrics, TMP_FILE_MGR_ACTIVE_SCRATCH_DIRS_LIST, set<string>());
   num_active_scratch_dirs_metric_->SetValue(tmp_dirs_.size());
   for (int i = 0; i < tmp_dirs_.size(); ++i) {
-    active_scratch_dirs_metric_->Add(tmp_dirs_[i]);
+    active_scratch_dirs_metric_->Add(tmp_dirs_[i].path);
   }
   scratch_bytes_used_metric_ =
       metrics->AddHWMGauge(TMP_FILE_MGR_SCRATCH_SPACE_BYTES_USED_HIGH_WATER_MARK,
@@ -154,7 +202,7 @@ Status TmpFileMgr::InitCustom(const vector<string>& tmp_dirs, bool one_dir_per_d
 
   if (tmp_dirs_.empty() && !tmp_dirs.empty()) {
     LOG(ERROR) << "Running without spill to disk: could not use any scratch "
-               << "directories in list: " << join(tmp_dirs, ",")
+               << "directories in list: " << join(tmp_dir_specifiers, ",")
                << ". See previous warnings for information on causes.";
   }
   return Status::OK();
@@ -170,7 +218,7 @@ void TmpFileMgr::NewFile(
   string unique_name = lexical_cast<string>(random_generator()());
   stringstream file_name;
   file_name << PrintId(file_group->unique_id()) << "_" << unique_name;
-  path new_file_path(tmp_dirs_[device_id]);
+  path new_file_path(tmp_dirs_[device_id].path);
   new_file_path /= file_name.str();
 
   new_file->reset(new File(file_group, device_id, new_file_path.string()));
@@ -180,7 +228,7 @@ string TmpFileMgr::GetTmpDirPath(DeviceId device_id) const {
   DCHECK(initialized_);
   DCHECK_GE(device_id, 0);
   DCHECK_LT(device_id, tmp_dirs_.size());
-  return tmp_dirs_[device_id];
+  return tmp_dirs_[device_id].path;
 }
 
 int TmpFileMgr::NumActiveTmpDevices() {
@@ -206,10 +254,17 @@ TmpFileMgr::File::File(FileGroup* file_group, DeviceId device_id, const string&
   DCHECK(file_group != nullptr);
 }
 
-void TmpFileMgr::File::AllocateSpace(int64_t num_bytes, int64_t* offset) {
+bool TmpFileMgr::File::AllocateSpace(int64_t num_bytes, int64_t* offset) {
   DCHECK_GT(num_bytes, 0);
+  TmpDir* dir = GetDir();
+  // Increment optimistically and roll back if the limit is exceeded.
+  if (dir->bytes_used_metric->Increment(num_bytes) > dir->bytes_limit) {
+    dir->bytes_used_metric->Increment(-num_bytes);
+    return false;
+  }
   *offset = bytes_allocated_;
   bytes_allocated_ += num_bytes;
+  return true;
 }
 
 int TmpFileMgr::File::AssignDiskQueue() const {
@@ -223,7 +278,13 @@ void TmpFileMgr::File::Blacklist(const ErrorMsg& msg) {
 
 Status TmpFileMgr::File::Remove() {
   // Remove the file if present (it may not be present if no writes completed).
-  return FileSystemUtil::RemovePaths({path_});
+  Status status = FileSystemUtil::RemovePaths({path_});
+  GetDir()->bytes_used_metric->Increment(-bytes_allocated_);
+  return status;
+}
+
+TmpFileMgr::TmpDir* TmpFileMgr::File::GetDir() {
+  return &file_group_->tmp_file_mgr_->tmp_dirs_[device_id_];
 }
 
 string TmpFileMgr::File::DebugString() {
@@ -272,7 +333,7 @@ Status TmpFileMgr::FileGroup::CreateFiles() {
     ++files_allocated;
   }
   DCHECK_EQ(tmp_files_.size(), files_allocated);
-  if (tmp_files_.size() == 0) return ScratchAllocationFailedStatus();
+  if (tmp_files_.size() == 0) return ScratchAllocationFailedStatus({});
   // Start allocating on a random device to avoid overloading the first device.
   next_allocation_index_ = rand() % tmp_files_.size();
   return Status::OK();
@@ -315,18 +376,27 @@ Status TmpFileMgr::FileGroup::AllocateSpace(
   // Lazily create the files on the first write.
   if (tmp_files_.empty()) RETURN_IF_ERROR(CreateFiles());
 
+  // Track the indices of any directories where we failed due to capacity. This is
+  // required for error reporting if we are totally out of capacity so that it's clear
+  // that some disks were at capacity.
+  vector<int> at_capacity_dirs;
+
   // Find the next physical file in round-robin order and allocate a range from it.
   for (int attempt = 0; attempt < tmp_files_.size(); ++attempt) {
-    *tmp_file = tmp_files_[next_allocation_index_].get();
+    int idx = next_allocation_index_;
     next_allocation_index_ = (next_allocation_index_ + 1) % tmp_files_.size();
+    *tmp_file = tmp_files_[idx].get();
     if ((*tmp_file)->is_blacklisted()) continue;
-    (*tmp_file)->AllocateSpace(scratch_range_bytes, file_offset);
+    if (!(*tmp_file)->AllocateSpace(scratch_range_bytes, file_offset)) {
+      at_capacity_dirs.push_back(idx);
+      continue;
+    }
     scratch_space_bytes_used_counter_->Add(scratch_range_bytes);
     tmp_file_mgr_->scratch_bytes_used_metric_->Increment(scratch_range_bytes);
     current_bytes_allocated_ += num_bytes;
     return Status::OK();
   }
-  return ScratchAllocationFailedStatus();
+  return ScratchAllocationFailedStatus(at_capacity_dirs);
 }
 
 void TmpFileMgr::FileGroup::RecycleFileRange(unique_ptr<WriteHandle> handle) {
@@ -484,11 +554,21 @@ Status TmpFileMgr::FileGroup::RecoverWriteError(
   return handle->RetryWrite(io_ctx_.get(), tmp_file, file_offset);
 }
 
-Status TmpFileMgr::FileGroup::ScratchAllocationFailedStatus() {
-  Status status(TErrorCode::SCRATCH_ALLOCATION_FAILED,
-        join(tmp_file_mgr_->tmp_dirs_, ","), GetBackendString(),
-        PrettyPrinter::PrintBytes(scratch_space_bytes_used_counter_->value()),
-        PrettyPrinter::PrintBytes(current_bytes_allocated_));
+Status TmpFileMgr::FileGroup::ScratchAllocationFailedStatus(
+    const vector<int>& at_capacity_dirs) {
+  vector<string> tmp_dir_paths;
+  for (TmpDir& tmp_dir : tmp_file_mgr_->tmp_dirs_) {
+    tmp_dir_paths.push_back(tmp_dir.path);
+  }
+  vector<string> at_capacity_dir_paths;
+  for (int dir_idx : at_capacity_dirs) {
+    at_capacity_dir_paths.push_back(tmp_file_mgr_->tmp_dirs_[dir_idx].path);
+  }
+  Status status(TErrorCode::SCRATCH_ALLOCATION_FAILED, join(tmp_dir_paths, ","),
+      GetBackendString(),
+      PrettyPrinter::PrintBytes(scratch_space_bytes_used_counter_->value()),
+      PrettyPrinter::PrintBytes(current_bytes_allocated_),
+      join(at_capacity_dir_paths, ","));
   // Include all previous errors that may have caused the failure.
   for (Status& err : scratch_errors_) status.MergeStatus(err);
   return status;
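
With the spec parsing and per-directory accounting above in place, an operator
can cap scratch usage directly on the impalad command line; for example (the
paths and sizes here are purely illustrative):

    impalad --scratch_dirs=/data/0:100GB,/data/1:100GB,/ssd0/scratch

Each configured directory then reports its own usage under
tmp-file-mgr.scratch-space-bytes-used.dir-0, .dir-1 and so on, as described
above.
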
diff --git a/be/src/runtime/tmp-file-mgr.h b/be/src/runtime/tmp-file-mgr.h
index 9a38e4f..70ec1d4 100644
--- a/be/src/runtime/tmp-file-mgr.h
+++ b/be/src/runtime/tmp-file-mgr.h
@@ -92,6 +92,22 @@ class TmpFileMgr {
   /// Same typedef as io::WriteRange::WriteDoneCallback.
   typedef std::function<void(const Status&)> WriteDoneCallback;
 
+  /// A configured temporary directory that TmpFileMgr allocates files in.
+  struct TmpDir {
+    TmpDir(const std::string& path, int64_t bytes_limit, IntGauge* bytes_used_metric)
+      : path(path), bytes_limit(bytes_limit), bytes_used_metric(bytes_used_metric) {}
+
+    /// Path to the temporary directory.
+    const std::string path;
+
+    /// Limit on bytes that should be written to this path. Set to maximum value
+    /// of int64_t if there is no limit.
+    int64_t const bytes_limit;
+
+    /// The current bytes of scratch used for this temporary directory.
+    IntGauge* const bytes_used_metric;
+  };
+
   /// Represents a group of temporary files - one per disk with a scratch directory. The
   /// total allocated bytes of the group can be bound by setting the space allocation
   /// limit. The owner of the FileGroup object is responsible for calling the Close()
@@ -202,8 +218,11 @@ class TmpFileMgr {
 
     /// Return a SCRATCH_ALLOCATION_FAILED error with the appropriate information,
     /// including scratch directories, the amount of scratch allocated and previous
-    /// errors that caused this failure. 'lock_' must be held by caller.
-    Status ScratchAllocationFailedStatus();
+    /// errors that caused this failure. If some directories were at capacity,
+    /// but had not encountered an error, the indices of these directories in
+    /// tmp_file_mgr_->tmp_dirs_ should be included in 'at_capacity_dirs'.
+    /// 'lock_' must be held by caller.
+    Status ScratchAllocationFailedStatus(const std::vector<int>& at_capacity_dirs);
 
     /// The TmpFileMgr it is associated with.
     TmpFileMgr* const tmp_file_mgr_;
@@ -390,9 +409,13 @@ class TmpFileMgr {
 
   /// Custom initialization - initializes with the provided list of directories.
   /// If one_dir_per_device is true, only use one temporary directory per device.
-  /// This interface is intended for testing purposes.
-  Status InitCustom(const std::vector<std::string>& tmp_dirs, bool one_dir_per_device,
+  /// This interface is intended for testing purposes. 'tmp_dir_specifiers'
+  /// use the command-line syntax, i.e. <path>[:<limit>]. The first variant takes
+  /// a comma-separated list, the second takes a vector.
+  Status InitCustom(const std::string& tmp_dirs_spec, bool one_dir_per_device,
       MetricGroup* metrics) WARN_UNUSED_RESULT;
+  Status InitCustom(const std::vector<std::string>& tmp_dir_specifiers,
+      bool one_dir_per_device, MetricGroup* metrics) WARN_UNUSED_RESULT;
 
   /// Return the scratch directory path for the device.
   std::string GetTmpDirPath(DeviceId device_id) const;
@@ -419,7 +442,7 @@ class TmpFileMgr {
   bool initialized_;
 
   /// The paths of the created tmp directories.
-  std::vector<std::string> tmp_dirs_;
+  std::vector<TmpDir> tmp_dirs_;
 
   /// Metrics to track active scratch directories.
   IntGauge* num_active_scratch_dirs_metric_;
diff --git a/be/src/service/query-options-test.cc b/be/src/service/query-options-test.cc
index 8c02417..79f50cc 100644
--- a/be/src/service/query-options-test.cc
+++ b/be/src/service/query-options-test.cc
@@ -82,7 +82,8 @@ auto MakeTestOkFn(TQueryOptions& options, OptionDef<T> option_def) {
 template<typename T>
 auto MakeTestErrFn(TQueryOptions& options, OptionDef<T> option_def) {
   return [&options, option_def](const char* str) {
-    EXPECT_FALSE(SetQueryOption(option_def.option_name, str, &options, nullptr).ok());
+    EXPECT_FALSE(SetQueryOption(option_def.option_name, str, &options, nullptr).ok())
+      << option_def.option_name << " " << str;
   };
 }
 
@@ -110,7 +111,7 @@ void TestByteCaseSet(TQueryOptions& options,
     }
     TestError(to_string(range.lower_bound - 1).c_str());
     TestError(to_string(static_cast<uint64_t>(range.upper_bound) + 1).c_str());
-    TestError("1tb");
+    TestError("1pb");
     TestError("1%");
     TestError("1%B");
     TestError("1B%");
@@ -120,6 +121,7 @@ void TestByteCaseSet(TQueryOptions& options,
         {"1 B", 1},
         {"0Kb", 0},
         {"4G",  4ll * 1024 * 1024 * 1024},
+        {"4tb",  4ll * 1024 * 1024 * 1024 * 1024},
         {"-1M", -1024 * 1024}
     };
     for (const auto& value_def : common_values) {
@@ -307,7 +309,7 @@ TEST(QueryOptions, SetSpecialOptions) {
     TestOk("0", 0);
     TestOk("4GB", 4ll * 1024 * 1024 * 1024);
     TestError("-1MB");
-    TestError("1tb");
+    TestError("1pb");
     TestError("1%");
     TestError("1%B");
     TestError("1B%");
diff --git a/be/src/util/parse-util-test.cc b/be/src/util/parse-util-test.cc
index 6b8e7a5..c8fb4a0 100644
--- a/be/src/util/parse-util-test.cc
+++ b/be/src/util/parse-util-test.cc
@@ -35,6 +35,7 @@ TEST(ParseMemSpecs, Basic) {
   int64_t kilobytes = 1024;
   int64_t megabytes = 1024 * kilobytes;
   int64_t gigabytes = 1024 * megabytes;
+  int64_t terabytes = 1024 * gigabytes;
 
   bytes = ParseUtil::ParseMemSpec("1", &is_percent, MemInfo::physical_mem());
   ASSERT_EQ(1, bytes);
@@ -72,6 +73,14 @@ TEST(ParseMemSpecs, Basic) {
   ASSERT_EQ(12 * gigabytes, bytes);
   ASSERT_FALSE(is_percent);
 
+  bytes = ParseUtil::ParseMemSpec("8T", &is_percent, MemInfo::physical_mem());
+  ASSERT_EQ(8 * terabytes, bytes);
+  ASSERT_FALSE(is_percent);
+
+  bytes = ParseUtil::ParseMemSpec("12tb", &is_percent, MemInfo::physical_mem());
+  ASSERT_EQ(12 * terabytes, bytes);
+  ASSERT_FALSE(is_percent);
+
   bytes = ParseUtil::ParseMemSpec("13%", &is_percent, MemInfo::physical_mem());
   ASSERT_GT(bytes, 0);
   ASSERT_TRUE(is_percent);
@@ -91,6 +100,8 @@ TEST(ParseMemSpecs, Basic) {
   bad_values.push_back("1Bb");
   bad_values.push_back("1%%");
   bad_values.push_back("1.1");
+  bad_values.push_back("1pb");
+  bad_values.push_back("1eb");
   stringstream ss;
   ss << UINT64_MAX;
   bad_values.push_back(ss.str());
diff --git a/be/src/util/parse-util.cc b/be/src/util/parse-util.cc
index d17cf3e..75253d7 100644
--- a/be/src/util/parse-util.cc
+++ b/be/src/util/parse-util.cc
@@ -40,6 +40,12 @@ int64_t ParseUtil::ParseMemSpec(const string& mem_spec_str, bool* is_percent,
     number_str_len--;
   }
   switch (*suffix_char) {
+    case 't':
+    case 'T':
+      // Terabytes.
+      number_str_len--;
+      multiplier = 1024L * 1024L * 1024L * 1024L;
+      break;
     case 'g':
     case 'G':
       // Gigabytes.
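
The hunk above adds the terabyte case to ParseUtil::ParseMemSpec(). As a rough
standalone illustration of the suffix handling (SuffixMultiplier is an invented
name, and the real function also deals with trailing 'b'/'B', percentages and
overflow), note that 'p' for petabytes remains unsupported, which is why the
tests and docs in this patch switch their invalid examples from 1tb to 1pb.

    #include <cctype>
    #include <cstdint>
    #include <iostream>

    // Byte multiplier for a size suffix, or -1 if the suffix is unsupported.
    int64_t SuffixMultiplier(char suffix) {
      switch (std::tolower(static_cast<unsigned char>(suffix))) {
        case 't': return 1LL << 40;  // terabytes (new in this change)
        case 'g': return 1LL << 30;
        case 'm': return 1LL << 20;
        case 'k': return 1LL << 10;
        case 'b': return 1;
        default: return -1;          // e.g. 'p' -> parse error
      }
    }

    int main() {
      std::cout << 8 * SuffixMultiplier('T') << "\n";  // 8 terabytes in bytes
      std::cout << SuffixMultiplier('p') << "\n";      // -1: still rejected
      return 0;
    }
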
diff --git a/common/thrift/generate_error_codes.py b/common/thrift/generate_error_codes.py
index 5e1070d..bb45c3d 100755
--- a/common/thrift/generate_error_codes.py
+++ b/common/thrift/generate_error_codes.py
@@ -308,7 +308,8 @@ error_codes = (
   ("SCRATCH_ALLOCATION_FAILED", 101, "Could not create files in any configured scratch "
    "directories (--scratch_dirs=$0) on backend '$1'. $2 of scratch is currently in "
    "use by this Impala Daemon ($3 by this query). See logs for previous errors that "
-   "may have prevented creating or writing scratch files."),
+   "may have prevented creating or writing scratch files. The following directories "
+   "were at capacity: $4"),
 
   ("SCRATCH_READ_TRUNCATED", 102, "Error reading $0 bytes from scratch file '$1' "
    "on backend $2 at offset $3: could only read $4 bytes"),
diff --git a/common/thrift/metrics.json b/common/thrift/metrics.json
index bd30674..1ce9264 100644
--- a/common/thrift/metrics.json
+++ b/common/thrift/metrics.json
@@ -2151,6 +2151,16 @@
     "key": "tmp-file-mgr.scratch-space-bytes-used-high-water-mark"
   },
   {
+    "description": "The current total spilled bytes for a single scratch directory.",
+    "contexts": [
+      "IMPALAD"
+    ],
+    "label": "Per-directory scratch space bytes used",
+    "units": "BYTES",
+    "kind": "GAUGE",
+    "key": "tmp-file-mgr.scratch-space-bytes-used.dir-$0"
+  },
+  {
     "description": "Number of senders waiting for receiving fragment to initialize",
     "contexts": [
       "IMPALAD"
diff --git a/docs/topics/impala_mem_limit.xml b/docs/topics/impala_mem_limit.xml
index 2bd89e5..d61edf3 100644
--- a/docs/topics/impala_mem_limit.xml
+++ b/docs/topics/impala_mem_limit.xml
@@ -167,10 +167,10 @@ MEM_LIMIT set to 3mb
     </p>
 
 <codeblock rev="">
-[localhost:21000] > set mem_limit=3tb;
-MEM_LIMIT set to 3tb
+[localhost:21000] > set mem_limit=3pb;
+MEM_LIMIT set to 3pb
 [localhost:21000] > select 5;
-ERROR: Failed to parse query memory limit from '3tb'.
+ERROR: Failed to parse query memory limit from '3pb'.
 
 [localhost:21000] > set mem_limit=xyz;
 MEM_LIMIT set to xyz


[impala] 01/02: IMPALA-8627: Enable catalog-v2 in tests

Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 39613c8226aeb48f639bccc361f002c7085cf75a
Author: Vihang Karajgaonkar <vi...@cloudera.com>
AuthorDate: Fri Jul 26 16:55:33 2019 -0700

    IMPALA-8627: Enable catalog-v2 in tests
    
    This patch enables catalog-v2 by default in all the tests.
    
    Test fixes:
    1. Modified test_observability, which fails on catalog-v2 since
    the profile emits different metadata load events. The test now looks for
    the right events in the profile depending on whether catalog-v2 is
    enabled or not.
    2. The TableName.java constructor allows non-lowercased
    table and database names. This causes problems in the local catalog
    cache, which expects table names to always be lowercase. More
    details on this failure are available in IMPALA-8627. The patch makes
    sure that loadTable requests in the local catalog do an explicit
    conversion of the table name to lowercase to get around the issue.
    3. Fixes the JdbcTest that checks for the existence of the table comment in
    the getTables metadata JDBC call. In catalog-v2, since the columns are not
    requested, the LocalTable is not loaded, and hence the test needs to be
    modified to check whether catalog-v2 is enabled.
    4. Skips test_sanity, which creates a Hive db and issues an invalidate
    metadata to make it visible in the catalog. Unfortunately, in catalog-v2
    there is currently no way to see a newly created database when event
    polling is disabled.
    5. Similar to (4) above, test_metadata_query_statements.py creates a Hive
    db and issues an invalidate metadata. The test runs QueryTest/describe-db,
    which is split in two: one part checks the Hive db and the other contains
    the rest of the queries from the original describe-db. The split makes it
    possible to execute the test only partially when catalog-v2 is enabled.
    
    Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
    Reviewed-on: http://gerrit.cloudera.org:8080/13933
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 docker/catalogd/Dockerfile                         |  2 +-
 docker/impalad_coord_exec/Dockerfile               |  2 +-
 docker/impalad_coordinator/Dockerfile              |  2 +-
 .../org/apache/impala/catalog/local/LocalDb.java   |  5 +-
 .../java/org/apache/impala/service/JdbcTest.java   |  7 ++-
 .../java/org/apache/impala/testutil/TestUtils.java | 36 ++++++++++++
 .../queries/QueryTest/describe-db.test             | 21 -------
 .../queries/QueryTest/describe-hive-db.test        | 30 ++++++++++
 tests/common/environ.py                            | 15 ++++-
 tests/common/skip.py                               |  6 ++
 tests/hs2/hs2_test_suite.py                        | 67 ++++++++++++++++------
 tests/hs2/test_hs2.py                              | 19 +++---
 tests/metadata/test_hms_integration.py             |  5 +-
 tests/metadata/test_metadata_query_statements.py   | 13 ++++-
 tests/metadata/test_refresh_partition.py           |  2 +-
 tests/query_test/test_observability.py             | 44 ++++++++++++--
 16 files changed, 208 insertions(+), 68 deletions(-)

diff --git a/docker/catalogd/Dockerfile b/docker/catalogd/Dockerfile
index 5473c12..ce9ca5a 100644
--- a/docker/catalogd/Dockerfile
+++ b/docker/catalogd/Dockerfile
@@ -25,5 +25,5 @@ EXPOSE 25020
 ENTRYPOINT ["/opt/impala/bin/daemon_entrypoint.sh", "/opt/impala/bin/catalogd",\
      "-log_dir=/opt/impala/logs",\
      "-abort_on_config_error=false", "-state_store_host=statestored",\
-     "-catalog_topic_mode=full", "-hms_event_polling_interval_s=0",\
+     "-catalog_topic_mode=minimal", "-hms_event_polling_interval_s=0",\
      "-invalidate_tables_on_memory_pressure=true"]
diff --git a/docker/impalad_coord_exec/Dockerfile b/docker/impalad_coord_exec/Dockerfile
index 542734b..24e024f 100644
--- a/docker/impalad_coord_exec/Dockerfile
+++ b/docker/impalad_coord_exec/Dockerfile
@@ -32,4 +32,4 @@ ENTRYPOINT ["/opt/impala/bin/daemon_entrypoint.sh", "/opt/impala/bin/impalad",\
      "-log_dir=/opt/impala/logs",\
      "-abort_on_config_error=false", "-state_store_host=statestored",\
      "-catalog_service_host=catalogd", "-mem_limit_includes_jvm=true",\
-     "-use_local_catalog=false", "--rpc_use_loopback=true"]
+     "-use_local_catalog=true", "--rpc_use_loopback=true"]
diff --git a/docker/impalad_coordinator/Dockerfile b/docker/impalad_coordinator/Dockerfile
index b12814d..8e07f9f 100644
--- a/docker/impalad_coordinator/Dockerfile
+++ b/docker/impalad_coordinator/Dockerfile
@@ -32,5 +32,5 @@ ENTRYPOINT ["/opt/impala/bin/daemon_entrypoint.sh", "/opt/impala/bin/impalad",\
      "-log_dir=/opt/impala/logs",\
      "-abort_on_config_error=false", "-state_store_host=statestored",\
      "-catalog_service_host=catalogd", "-is_executor=false", \
-     "-mem_limit_includes_jvm=true", "-use_local_catalog=false", \
+     "-mem_limit_includes_jvm=true", "-use_local_catalog=true", \
      "--rpc_use_loopback=true"]
diff --git a/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java b/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
index 2dcce7a..f14e255 100644
--- a/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
+++ b/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
@@ -115,7 +115,10 @@ class LocalDb implements FeDb {
   }
 
   @Override
-  public FeTable getTable(String tblName) {
+  public FeTable getTable(String tableName) {
+    // the underlying layers of the cache expect all table names to be in lowercase
+    String tblName = Preconditions.checkNotNull(tableName,
+        "Received a null table name").toLowerCase();
     FeTable tbl = getTableIfCached(tblName);
     if (tbl instanceof LocalIncompleteTable) {
       // The table exists but hasn't been loaded yet.
diff --git a/fe/src/test/java/org/apache/impala/service/JdbcTest.java b/fe/src/test/java/org/apache/impala/service/JdbcTest.java
index 0c91de6..e11feca 100644
--- a/fe/src/test/java/org/apache/impala/service/JdbcTest.java
+++ b/fe/src/test/java/org/apache/impala/service/JdbcTest.java
@@ -35,6 +35,7 @@ import java.util.List;
 import java.util.Map;
 
 import org.apache.impala.testutil.ImpalaJdbcClient;
+import org.apache.impala.testutil.TestUtils;
 import org.apache.impala.util.Metrics;
 import org.junit.AfterClass;
 import org.junit.Before;
@@ -414,7 +415,11 @@ public class JdbcTest extends JdbcTestBase {
     assertTrue(rs.next());
     assertEquals("Incorrect table name", "jdbc_column_comments_test",
         rs.getString("TABLE_NAME"));
-    assertEquals("Incorrect table comment", "table comment", rs.getString("REMARKS"));
+    // if this is a catalog-v2 cluster the getTables call does not load the localTable
+    // since it does not request columns. See IMPALA-8606 for more details
+    if (!TestUtils.isCatalogV2Enabled("localhost", 25020)) {
+      assertEquals("Incorrect table comment", "table comment", rs.getString("REMARKS"));
+    }
   }
 
   @Test
diff --git a/fe/src/test/java/org/apache/impala/testutil/TestUtils.java b/fe/src/test/java/org/apache/impala/testutil/TestUtils.java
index 9a1c852..3849b2e 100644
--- a/fe/src/test/java/org/apache/impala/testutil/TestUtils.java
+++ b/fe/src/test/java/org/apache/impala/testutil/TestUtils.java
@@ -17,15 +17,19 @@
 
 package org.apache.impala.testutil;
 
+import com.fasterxml.jackson.databind.ObjectMapper;
 import com.google.common.base.Preconditions;
+import java.io.IOException;
 import java.io.StringReader;
 import java.io.StringWriter;
+import java.net.URL;
 import java.text.SimpleDateFormat;
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Calendar;
 import java.util.Collections;
 import java.util.Date;
+import java.util.LinkedHashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.Scanner;
@@ -393,4 +397,36 @@ public class TestUtils {
         "IMPALA_HIVE_MAJOR_VERSION"));
     return Integer.parseInt(hiveMajorVersion);
   }
+
+  /**
+   * Checks if the catalog server running on the given host and port has
+   * catalog-v2 enabled.
+   * @return
+   * @throws IOException
+   */
+  public static boolean isCatalogV2Enabled(String host, int port) throws IOException {
+    Preconditions.checkNotNull(host);
+    Preconditions.checkState(port >= 0);
+    String topicMode = getConfigValue(new URL(String.format("http://%s:%s"
+            + "/varz?json", host, port)), "catalog_topic_mode");
+    Preconditions.checkNotNull(topicMode);
+    return topicMode.equals("minimal");
+  }
+
+  /**
+   * Gets a flag value from the given URL. Useful for scrubbing the catalog/coordinator
+   * varz JSON output to look for interesting configurations.
+   */
+  private static String getConfigValue(URL url, String key) throws IOException {
+    Map<Object, Object> map = new ObjectMapper().readValue(url, Map.class);
+    if (map.containsKey("flags")) {
+      Preconditions.checkState(map.containsKey("flags"));
+      ArrayList<LinkedHashMap<String, String>> flags =
+          (ArrayList<LinkedHashMap<String, String>>) map.get("flags");
+      for (LinkedHashMap<String, String> flag : flags) {
+        if (flag.getOrDefault("name", "").equals(key)) return flag.get("current");
+      }
+    }
+    return null;
+  }
 }
diff --git a/testdata/workloads/functional-query/queries/QueryTest/describe-db.test b/testdata/workloads/functional-query/queries/QueryTest/describe-db.test
index 97159ab..c08ebdd 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/describe-db.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/describe-db.test
@@ -8,27 +8,6 @@ describe database default
 string, string, string
 ====
 ---- QUERY
-# Test printing of hive_test_desc_db database.
-describe database hive_test_desc_db
----- RESULTS
-'hive_test_desc_db','$NAMENODE/test-warehouse/hive_test_desc_db.db','test comment'
----- TYPES
-string, string, string
-====
----- QUERY
-# Test printing of hive_test_desc_db database with extended information.
-describe database extended hive_test_desc_db
----- RESULTS
-'hive_test_desc_db','$NAMENODE/test-warehouse/hive_test_desc_db.db','test comment'
-'Parameter: ','',''
-'','$USER','USER'
-'Owner: ','',''
-'','e','2.82'
-'','pi','3.14'
----- TYPES
-string, string, string
-====
----- QUERY
 describe database extended impala_test_desc_db1
 ---- RESULTS
 '','$USER','USER'
diff --git a/testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test b/testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test
new file mode 100644
index 0000000..10d9101
--- /dev/null
+++ b/testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test
@@ -0,0 +1,30 @@
+====
+---- QUERY
+# Test printing of default database.
+describe database default
+---- RESULTS
+'default','$NAMENODE/test-warehouse','Default Hive database'
+---- TYPES
+string, string, string
+====
+---- QUERY
+# Test printing of hive_test_desc_db database.
+describe database hive_test_desc_db
+---- RESULTS
+'hive_test_desc_db','$NAMENODE/test-warehouse/hive_test_desc_db.db','test comment'
+---- TYPES
+string, string, string
+====
+---- QUERY
+# Test printing of hive_test_desc_db database with extended information.
+describe database extended hive_test_desc_db
+---- RESULTS
+'hive_test_desc_db','$NAMENODE/test-warehouse/hive_test_desc_db.db','test comment'
+'Parameter: ','',''
+'','$USER','USER'
+'Owner: ','',''
+'','e','2.82'
+'','pi','3.14'
+---- TYPES
+string, string, string
+====
diff --git a/tests/common/environ.py b/tests/common/environ.py
index 64582a7..36b4a8a 100644
--- a/tests/common/environ.py
+++ b/tests/common/environ.py
@@ -333,8 +333,7 @@ class ImpalaTestClusterProperties(object):
     return self._runtime_flags
 
   def is_catalog_v2_cluster(self):
-    """Whether we use CATALOG_V2 options, including local catalog and HMS notifications.
-    For now, assume that --use_local_catalog=true implies that the others are enabled."""
+    """Checks whether we use local catalog."""
     try:
       key = "use_local_catalog"
       # --use_local_catalog is hidden so does not appear in JSON if disabled.
@@ -346,6 +345,18 @@ class ImpalaTestClusterProperties(object):
         return False
       raise
 
+  def is_event_polling_enabled(self):
+    """Whether we use HMS notifications to automatically refresh catalog service.
+    Checks if --hms_event_polling_interval_s is set to non-zero value"""
+    try:
+      key = "hms_event_polling_interval_s"
+      # --use_local_catalog is hidden so does not appear in JSON if disabled.
+      return key in self.runtime_flags and int(self.runtime_flags[key]["current"]) > 0
+    except Exception:
+      if self.is_remote_cluster():
+        # IMPALA-8553: be more tolerant of failures on remote cluster builds.
+        LOG.exception("Failed to get flags from web UI, assuming catalog V1")
+        return False
 
 def build_flavor_timeout(default_timeout, slow_build_timeout=None,
         asan_build_timeout=None, code_coverage_build_timeout=None):
diff --git a/tests/common/skip.py b/tests/common/skip.py
index 0547049..afad729 100644
--- a/tests/common/skip.py
+++ b/tests/common/skip.py
@@ -264,6 +264,12 @@ class SkipIfCatalogV2:
       reason="IMPALA-7539: support HDFS permission checks for LocalCatalog")
 
   @classmethod
+  def impala_7506(self):
+    return pytest.mark.skipif(
+      IMPALA_TEST_CLUSTER_PROPERTIES.is_catalog_v2_cluster(),
+      reason="IMPALA-7506: Support global INVALIDATE METADATA on fetch-on-demand impalad")
+
+  @classmethod
   def hms_event_polling_enabled(self):
     return pytest.mark.skipif(
       IMPALA_TEST_CLUSTER_PROPERTIES.is_catalog_v2_cluster(),
diff --git a/tests/hs2/hs2_test_suite.py b/tests/hs2/hs2_test_suite.py
index d627316..f89aaac 100644
--- a/tests/hs2/hs2_test_suite.py
+++ b/tests/hs2/hs2_test_suite.py
@@ -26,33 +26,62 @@ from thrift.protocol import TBinaryProtocol
 from tests.common.impala_test_suite import ImpalaTestSuite, IMPALAD_HS2_HOST_PORT
 from time import sleep, time
 
+
+def add_session_helper(self, protocol_version, conf_overlay, close_session, fn):
+  """Helper function used in the various needs_session decorators before to set up
+  a session, call fn(), then optionally tear down the session."""
+  open_session_req = TCLIService.TOpenSessionReq()
+  open_session_req.username = getuser()
+  open_session_req.configuration = dict()
+  if conf_overlay is not None:
+    open_session_req.configuration = conf_overlay
+  open_session_req.client_protocol = protocol_version
+  resp = self.hs2_client.OpenSession(open_session_req)
+  HS2TestSuite.check_response(resp)
+  self.session_handle = resp.sessionHandle
+  assert protocol_version <= resp.serverProtocolVersion
+  try:
+    fn()
+  finally:
+    if close_session:
+      close_session_req = TCLIService.TCloseSessionReq()
+      close_session_req.sessionHandle = resp.sessionHandle
+      HS2TestSuite.check_response(self.hs2_client.CloseSession(close_session_req))
+    self.session_handle = None
+
 def needs_session(protocol_version=
                   TCLIService.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V6,
                   conf_overlay=None,
-                  close_session=True):
+                  close_session=True,
+                  cluster_properties=None):
   def session_decorator(fn):
     """Decorator that establishes a session and sets self.session_handle. When the test is
     finished, the session is closed.
     """
     def add_session(self):
-      open_session_req = TCLIService.TOpenSessionReq()
-      open_session_req.username = getuser()
-      open_session_req.configuration = dict()
-      if conf_overlay is not None:
-        open_session_req.configuration = conf_overlay
-      open_session_req.client_protocol = protocol_version
-      resp = self.hs2_client.OpenSession(open_session_req)
-      HS2TestSuite.check_response(resp)
-      self.session_handle = resp.sessionHandle
-      assert protocol_version <= resp.serverProtocolVersion
-      try:
-        fn(self)
-      finally:
-        if close_session:
-          close_session_req = TCLIService.TCloseSessionReq()
-          close_session_req.sessionHandle = resp.sessionHandle
-          HS2TestSuite.check_response(self.hs2_client.CloseSession(close_session_req))
-        self.session_handle = None
+      add_session_helper(self, protocol_version, conf_overlay, close_session,
+          lambda: fn(self))
+    return add_session
+
+  return session_decorator
+
+
+# same as needs_session but takes in cluster_properties as an argument
+# cluster_properties is defined as a fixture in conftest.py which allows us
+# to pass it as an argument to a test. However, it does not work well with
+# decorators without installing new modules.
+# Ref: https://stackoverflow.com/questions/19614658
+def needs_session_cluster_properties(protocol_version=
+                  TCLIService.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V6,
+                  conf_overlay=None,
+                  close_session=True):
+  def session_decorator(fn):
+    """Decorator that establishes a session and sets self.session_handle. When the test is
+    finished, the session is closed.
+    """
+    def add_session(self, cluster_properties, unique_database):
+      add_session_helper(self, protocol_version, conf_overlay, close_session,
+          lambda: fn(self, cluster_properties, unique_database))
     return add_session
 
   return session_decorator
diff --git a/tests/hs2/test_hs2.py b/tests/hs2/test_hs2.py
index 0294b13..223e29c 100644
--- a/tests/hs2/test_hs2.py
+++ b/tests/hs2/test_hs2.py
@@ -30,7 +30,7 @@ from tests.common.environ import ImpalaTestClusterProperties
 from tests.common.skip import SkipIfDockerizedCluster
 from tests.hs2.hs2_test_suite import (HS2TestSuite, needs_session,
     operation_id_to_query_id, create_session_handle_without_secret,
-    create_op_handle_without_secret)
+    create_op_handle_without_secret, needs_session_cluster_properties)
 from TCLIService import TCLIService
 
 LOG = logging.getLogger('test_hs2')
@@ -436,15 +436,13 @@ class TestHS2(HS2TestSuite):
         self.session_handle)
     TestHS2.check_invalid_session(self.hs2_client.GetSchemas(get_schemas_req))
 
-  @pytest.mark.execute_serially
-  @needs_session()
-  def test_get_tables(self):
+  @needs_session_cluster_properties()
+  def test_get_tables(self, cluster_properties, unique_database):
     """Basic test for the GetTables() HS2 method. Needs to execute serially because
     the test depends on controlling whether a table is loaded or not and other
     concurrent tests loading or invalidating tables could interfere with it."""
-    # TODO: unique_database would be better, but it doesn't work with @needs_session
-    # at the moment.
     table = "__hs2_column_comments_test"
+    self.execute_query("use {0}".format(unique_database))
     self.execute_query("drop table if exists {0}".format(table))
     self.execute_query("""
         create table {0} (a int comment 'column comment')
@@ -452,7 +450,7 @@ class TestHS2(HS2TestSuite):
     try:
       req = TCLIService.TGetTablesReq()
       req.sessionHandle = self.session_handle
-      req.schemaName = "default"
+      req.schemaName = unique_database
       req.tableName = table
 
       # Execute the request twice, the first time with the table unloaded and the second
@@ -470,13 +468,14 @@ class TestHS2(HS2TestSuite):
         table_type = results.columns[3].stringVal.values[0]
         table_remarks = results.columns[4].stringVal.values[0]
         assert table_cat == ''
-        assert table_schema == "default"
+        assert table_schema == unique_database
         assert table_name == table
         assert table_type == "TABLE"
         if i == 0:
           assert table_remarks == ""
         else:
-          assert table_remarks == "table comment"
+          if not cluster_properties.is_catalog_v2_cluster():
+            assert table_remarks == "table comment"
         # Ensure the table is loaded for the second iteration.
         self.execute_query("describe {0}".format(table))
 
@@ -484,7 +483,7 @@ class TestHS2(HS2TestSuite):
       invalid_req = TCLIService.TGetTablesReq()
       invalid_req.sessionHandle = create_session_handle_without_secret(
           self.session_handle)
-      invalid_req.schemaName = "default"
+      invalid_req.schemaName = unique_database
       invalid_req.tableName = table
       TestHS2.check_invalid_session(self.hs2_client.GetTables(invalid_req))
     finally:
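
A note on the column layout read in the loop above: GetTables returns a
columnar TRowSet whose string columns are, in order, TABLE_CAT, TABLE_SCHEM,
TABLE_NAME, TABLE_TYPE and REMARKS. A minimal helper for unpacking the first
row (the helper name is hypothetical; the indices mirror the assertions in the
test):

    def first_table_row(results):
      # Columnar TRowSet: results.columns[i] holds all values for column i.
      return (results.columns[0].stringVal.values[0],   # TABLE_CAT
              results.columns[1].stringVal.values[0],   # TABLE_SCHEM
              results.columns[2].stringVal.values[0],   # TABLE_NAME
              results.columns[3].stringVal.values[0],   # TABLE_TYPE
              results.columns[4].stringVal.values[0])   # REMARKS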
diff --git a/tests/metadata/test_hms_integration.py b/tests/metadata/test_hms_integration.py
index f7d2595..9b1fd04 100644
--- a/tests/metadata/test_hms_integration.py
+++ b/tests/metadata/test_hms_integration.py
@@ -32,7 +32,7 @@ from subprocess import call
 from tests.common.environ import HIVE_MAJOR_VERSION
 from tests.common.impala_test_suite import ImpalaTestSuite
 from tests.common.skip import (SkipIfS3, SkipIfABFS, SkipIfADLS, SkipIfHive2,
-    SkipIfIsilon, SkipIfLocal)
+    SkipIfIsilon, SkipIfLocal, SkipIfCatalogV2)
 from tests.common.test_dimensions import (
     create_single_exec_option_dimension,
     create_uncompressed_text_dimension)
@@ -57,7 +57,10 @@ class TestHmsIntegrationSanity(ImpalaTestSuite):
     cls.ImpalaTestMatrix.add_dimension(
         create_uncompressed_text_dimension(cls.get_workload()))
 
+  # Skip this test if catalog-v2 is enabled, since global invalidate is not
+  # supported. TODO: this can be re-enabled when event polling is turned on.
   @pytest.mark.execute_serially
+  @SkipIfCatalogV2.impala_7506()
   def test_sanity(self, vector, cluster_properties):
     """Verifies that creating a catalog entity (database, table) in Impala using
     'IF NOT EXISTS' while the entity exists in HMS, does not throw an error."""
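
For context on the marker used above: skip decorators of this shape are
typically built on pytest.mark.skipif with a condition derived from the cluster
configuration. The sketch below is hypothetical and is not the actual
tests/common/skip.py implementation; running_against_catalog_v2() is an
invented stand-in for whatever check the test infrastructure performs.

    import pytest

    def running_against_catalog_v2():
      # Hypothetical stand-in; in practice the answer would come from the test
      # environment (e.g. by inspecting the impalad/catalogd startup flags).
      return False

    class SkipIfCatalogV2:
      """Marks tests that cannot run when local catalog (catalog-v2) is enabled."""
      @classmethod
      def impala_7506(cls):
        return pytest.mark.skipif(
            running_against_catalog_v2(),
            reason="IMPALA-7506: global INVALIDATE METADATA is not supported")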
diff --git a/tests/metadata/test_metadata_query_statements.py b/tests/metadata/test_metadata_query_statements.py
index 93fb111..313e2c3 100644
--- a/tests/metadata/test_metadata_query_statements.py
+++ b/tests/metadata/test_metadata_query_statements.py
@@ -169,13 +169,20 @@ class TestMetadataQueryStatements(ImpalaTestSuite):
                           "location \"" + get_fs_path("/test2.db") + "\"")
       self.run_stmt_in_hive("create database hive_test_desc_db comment 'test comment' "
                            "with dbproperties('pi' = '3.14', 'e' = '2.82')")
-      if cluster_properties.is_catalog_v2_cluster():
-        # Using local catalog + HMS event processor - wait until the database shows up.
+      if cluster_properties.is_event_polling_enabled():
+        # Using HMS event processor - wait until the database shows up.
         self.wait_for_db_to_appear("hive_test_desc_db", timeout_s=30)
-      else:
+      elif not cluster_properties.is_catalog_v2_cluster():
+        # The Hive-created database is not yet visible at this point.
         # Using traditional catalog - need to invalidate to pick up hive-created db.
         self.client.execute("invalidate metadata")
+      else:
+        # In local catalog mode, global invalidate metadata is not supported.
+        # TODO: once IMPALA-7506 is fixed, re-enable global invalidate for catalog-v2.
+        pass
       self.run_test_case('QueryTest/describe-db', vector)
+      if not cluster_properties.is_catalog_v2_cluster():
+        self.run_test_case('QueryTest/describe-hive-db', vector)
     finally:
       self.__test_describe_db_cleanup()
 
diff --git a/tests/metadata/test_refresh_partition.py b/tests/metadata/test_refresh_partition.py
index dc197ef..67901da 100644
--- a/tests/metadata/test_refresh_partition.py
+++ b/tests/metadata/test_refresh_partition.py
@@ -134,7 +134,7 @@ class TestRefreshPartition(ImpalaTestSuite):
     # Make sure it still shows the same result before refreshing
     result = self.client.execute("select count(*) from %s" % table_name)
     valid_counts = [0]
-    if cluster_properties.is_catalog_v2_cluster():
+    if cluster_properties.is_event_polling_enabled():
       # HMS notifications may pick up added partition racily.
       valid_counts.append(1)
     assert int(result.data[0]) in valid_counts
diff --git a/tests/query_test/test_observability.py b/tests/query_test/test_observability.py
index 2012bab..fe2d92b 100644
--- a/tests/query_test/test_observability.py
+++ b/tests/query_test/test_observability.py
@@ -293,19 +293,51 @@ class TestObservability(ImpalaTestSuite):
     runtime_profile = self.execute_query(query).runtime_profile
     self.__verify_profile_event_sequence(event_regexes, runtime_profile)
 
-  def test_query_profile_contains_query_compilation_metadata_load_events(self):
+  def test_query_profile_contains_query_compilation_metadata_load_events(self,
+        cluster_properties):
     """Test that the Metadata load started and finished events appear in the query
     profile when Catalog cache is evicted."""
     invalidate_query = "invalidate metadata functional.alltypes"
     select_query = "select * from functional.alltypes"
     self.execute_query(invalidate_query).runtime_profile
     runtime_profile = self.execute_query(select_query).runtime_profile
-    event_regexes = [r'Query Compilation:',
-        r'Metadata load started:',
-        r'Metadata load finished. loaded-tables=.*/.* load-requests=.* '
+    # Depending on whether this is a catalog-v2 cluster or not, some of the
+    # metadata loading events are different.
+    if not cluster_properties.is_catalog_v2_cluster():
+      load_event_regexes = [r'Query Compilation:', r'Metadata load started:',
+          r'Metadata load finished. loaded-tables=.*/.* load-requests=.* '
             r'catalog-updates=.*:',
-        r'Analysis finished:']
-    self.__verify_profile_event_sequence(event_regexes, runtime_profile)
+          r'Analysis finished:']
+    else:
+      load_event_regexes = [
+        r'Frontend:',
+        r'CatalogFetch.ColumnStats.Misses',
+        r'CatalogFetch.ColumnStats.Requests',
+        r'CatalogFetch.ColumnStats.Time',
+        # The value of this counter may or may not be present if it is zero.
+        r'CatalogFetch.Config.Misses|CatalogFetch.Config.Hits',
+        r'CatalogFetch.Config.Requests',
+        r'CatalogFetch.Config.Time',
+        r'CatalogFetch.DatabaseList.Hits',
+        r'CatalogFetch.DatabaseList.Requests',
+        r'CatalogFetch.DatabaseList.Time',
+        r'CatalogFetch.PartitionLists.Misses',
+        r'CatalogFetch.PartitionLists.Requests',
+        r'CatalogFetch.PartitionLists.Time',
+        r'CatalogFetch.Partitions.Hits',
+        r'CatalogFetch.Partitions.Misses',
+        r'CatalogFetch.Partitions.Requests',
+        r'CatalogFetch.Partitions.Time',
+        r'CatalogFetch.RPCs.Bytes',
+        r'CatalogFetch.RPCs.Requests',
+        r'CatalogFetch.RPCs.Time',
+        r'CatalogFetch.TableNames.Hits',
+        r'CatalogFetch.TableNames.Requests',
+        r'CatalogFetch.TableNames.Time',
+        r'CatalogFetch.Tables.Misses',
+        r'CatalogFetch.Tables.Requests',
+        r'CatalogFetch.Tables.Time']
+    self.__verify_profile_event_sequence(load_event_regexes, runtime_profile)
 
   def test_query_profile_contains_query_compilation_metadata_cached_event(self):
     """Test that the Metadata cache available event appears in the query profile when