You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by jo...@apache.org on 2022/10/17 21:52:13 UTC

[impala] branch master updated (35dc24fbc -> dc912f016)

This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


    from 35dc24fbc IMPALA-10148: Cleanup cores in TestHooksStartupFail
     new 1b59d32ef Revert "IMPALA-11617: Pool service should be made aware of processing cost limit"
     new 5c5708114 IMPALA-10715: test decimal min max filters failed in exhaustive run
     new dc912f016 IMPALA-11655: Impala should set write mode "merge-on-read" by default

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/scheduling/admission-controller.cc          |  8 --
 be/src/scheduling/admission-controller.h           |  1 -
 common/thrift/ImpalaInternalService.thrift         |  5 --
 common/thrift/metrics.json                         | 10 ---
 docs/topics/impala_admission_config.xml            |  4 -
 .../apache/impala/analysis/CreateTableStmt.java    | 23 ++++++
 .../org/apache/impala/catalog/IcebergTable.java    |  6 ++
 .../apache/impala/service/CatalogOpExecutor.java   | 24 ++++++
 .../java/org/apache/impala/service/Frontend.java   |  4 -
 .../java/org/apache/impala/util/IcebergUtil.java   | 11 +++
 .../org/apache/impala/util/RequestPoolService.java | 10 +--
 .../apache/impala/util/TestRequestPoolService.java |  9 +--
 fe/src/test/resources/llama-site-2-groups.xml      | 10 ---
 fe/src/test/resources/llama-site-test.xml          |  4 -
 fe/src/test/resources/llama-site-test2.xml         |  4 -
 .../test/resources/mem-limit-test-llama-site.xml   | 24 ------
 fe/src/test/resources/minicluster-llama-site.xml   |  7 --
 .../queries/QueryTest/iceberg-alter.test           | 85 ++++++++++++++++++++++
 .../queries/QueryTest/show-create-table.test       | 74 +++++++++++++++++++
 tests/common/resource_pool_config.py               |  4 +-
 tests/custom_cluster/test_executor_groups.py       |  7 +-
 tests/query_test/test_runtime_filters.py           |  5 +-
 www/admission_controller.tmpl                      |  5 --
 23 files changed, 237 insertions(+), 107 deletions(-)


[impala] 01/03: Revert "IMPALA-11617: Pool service should be made aware of processing cost limit"

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 1b59d32eff725664a7a224d980f08ea4d245d63b
Author: Wenzhe Zhou <wz...@cloudera.com>
AuthorDate: Thu Oct 13 16:23:53 2022 +0000

    Revert "IMPALA-11617: Pool service should be made aware of processing cost limit"
    
    This reverts commit 1d62bddb84753a040a46687391d9c797d849bd0e.
    
    Change-Id: I1ebf5ff9685072079e18497d869d06b2c55153fe
    Reviewed-on: http://gerrit.cloudera.org:8080/19139
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
    Reviewed-by: Wenzhe Zhou <wz...@cloudera.com>
---
 be/src/scheduling/admission-controller.cc          |  8 --------
 be/src/scheduling/admission-controller.h           |  1 -
 common/thrift/ImpalaInternalService.thrift         |  5 -----
 common/thrift/metrics.json                         | 10 ---------
 docs/topics/impala_admission_config.xml            |  4 ----
 .../java/org/apache/impala/service/Frontend.java   |  4 ----
 .../org/apache/impala/util/RequestPoolService.java | 10 ++-------
 .../apache/impala/util/TestRequestPoolService.java |  9 ++++----
 fe/src/test/resources/llama-site-2-groups.xml      | 10 ---------
 fe/src/test/resources/llama-site-test.xml          |  4 ----
 fe/src/test/resources/llama-site-test2.xml         |  4 ----
 .../test/resources/mem-limit-test-llama-site.xml   | 24 ----------------------
 fe/src/test/resources/minicluster-llama-site.xml   |  7 -------
 tests/common/resource_pool_config.py               |  4 +---
 tests/custom_cluster/test_executor_groups.py       |  7 +++----
 www/admission_controller.tmpl                      |  5 -----
 16 files changed, 10 insertions(+), 106 deletions(-)

diff --git a/be/src/scheduling/admission-controller.cc b/be/src/scheduling/admission-controller.cc
index c2957baa9..1e31813ba 100644
--- a/be/src/scheduling/admission-controller.cc
+++ b/be/src/scheduling/admission-controller.cc
@@ -139,8 +139,6 @@ const string POOL_MIN_QUERY_MEM_LIMIT_METRIC_KEY_FORMAT =
   "admission-controller.pool-min-query-mem-limit.$0";
 const string POOL_CLAMP_MEM_LIMIT_QUERY_OPTION_METRIC_KEY_FORMAT =
   "admission-controller.pool-clamp-mem-limit-query-option.$0";
-const string POOL_MAX_QUERY_PROCESSING_COST_LIMIT_METRIC_KEY_FORMAT =
-  "admission-controller.pool-max-query-processing-cost-limit.$0";
 
 // Profile info strings
 const string AdmissionController::PROFILE_INFO_KEY_ADMISSION_RESULT = "Admission result";
@@ -1163,8 +1161,6 @@ void AdmissionController::PoolStats::UpdateConfigMetrics(const TPoolConfig& pool
   metrics_.max_query_mem_limit->SetValue(pool_cfg.max_query_mem_limit);
   metrics_.min_query_mem_limit->SetValue(pool_cfg.min_query_mem_limit);
   metrics_.clamp_mem_limit_query_option->SetValue(pool_cfg.clamp_mem_limit_query_option);
-  metrics_.max_query_processing_cost_limit->SetValue(
-      pool_cfg.max_query_processing_cost_limit);
 }
 
 Status AdmissionController::SubmitForAdmission(const AdmissionRequest& request,
@@ -2351,8 +2347,6 @@ void AdmissionController::PoolStats::ToJson(
       document->GetAllocator());
   pool->AddMember("clamp_mem_limit_query_option",
       metrics_.clamp_mem_limit_query_option->GetValue(), document->GetAllocator());
-  pool->AddMember("max_query_processing_cost_limit",
-      metrics_.max_query_processing_cost_limit->GetValue(), document->GetAllocator());
   pool->AddMember("wait_time_ms_ema", wait_time_ms_ema_, document->GetAllocator());
   Value histogram(kArrayType);
   for (int bucket = 0; bucket < peak_mem_histogram_.size(); bucket++) {
@@ -2437,8 +2431,6 @@ void AdmissionController::PoolStats::InitMetrics() {
       POOL_MIN_QUERY_MEM_LIMIT_METRIC_KEY_FORMAT, 0, name_);
   metrics_.clamp_mem_limit_query_option = parent_->metrics_group_->AddProperty<bool>(
       POOL_CLAMP_MEM_LIMIT_QUERY_OPTION_METRIC_KEY_FORMAT, false, name_);
-  metrics_.max_query_processing_cost_limit = parent_->metrics_group_->AddGauge(
-      POOL_MAX_QUERY_PROCESSING_COST_LIMIT_METRIC_KEY_FORMAT, 0, name_);
 }
 
 void AdmissionController::PopulatePerHostMemReservedAndAdmitted(
diff --git a/be/src/scheduling/admission-controller.h b/be/src/scheduling/admission-controller.h
index f8914c079..fec28d5d4 100644
--- a/be/src/scheduling/admission-controller.h
+++ b/be/src/scheduling/admission-controller.h
@@ -543,7 +543,6 @@ class AdmissionController {
       IntGauge* max_query_mem_limit;
       IntGauge* min_query_mem_limit;
       BooleanProperty* clamp_mem_limit_query_option;
-      IntGauge* max_query_processing_cost_limit;
     };
 
     PoolStats(AdmissionController* parent, const std::string& name)
diff --git a/common/thrift/ImpalaInternalService.thrift b/common/thrift/ImpalaInternalService.thrift
index 59949db26..1f95d7284 100644
--- a/common/thrift/ImpalaInternalService.thrift
+++ b/common/thrift/ImpalaInternalService.thrift
@@ -195,11 +195,6 @@ struct TPoolConfig {
   // maximum, the mt_dop setting is reduced to the maximum. If the max_mt_dop is
   // negative, no limit is enforced.
   9: required i64 max_mt_dop = -1;
-
-  // Maximum processing cost which is the amount of data in bytes to process for a query.
-  // A value of 0 effectively disables the pool. -1 indicates no limit. Default value is
-  // set as -1.
-  10: required i64 max_query_processing_cost_limit = -1;
 }
 
 struct TParseDateStringResult {
diff --git a/common/thrift/metrics.json b/common/thrift/metrics.json
index 5321921d4..a5abd6651 100644
--- a/common/thrift/metrics.json
+++ b/common/thrift/metrics.json
@@ -89,16 +89,6 @@
     "kind": "PROPERTY",
     "key": "admission-controller.pool-clamp-mem-limit-query-option.$0"
   },
-  {
-    "description": "Resource Pool $0 Max Query Processing Cost Limit",
-    "contexts": [
-      "RESOURCE_POOL"
-    ],
-    "label": "Resource Pool $0 Max Query Processing Cost Limit",
-    "units": "BYTES",
-    "kind": "GAUGE",
-    "key": "admission-controller.pool-max-query-processing-cost-limit.$0"
-  },
   {
     "description": "Resource Pool $0 Aggregate Queue Size",
     "contexts": [
diff --git a/docs/topics/impala_admission_config.xml b/docs/topics/impala_admission_config.xml
index 846226afc..5b6ecf5de 100644
--- a/docs/topics/impala_admission_config.xml
+++ b/docs/topics/impala_admission_config.xml
@@ -224,10 +224,6 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
     &lt;name>impala.admission-control.<b>clamp-mem-limit-query-option</b>.root.default.regularPool&lt;/name>
     &lt;value>true&lt;/value>
   &lt;/property>
-  &lt;property>
-    &lt;name>impala.admission-control.<b>max-query-processing-cost-limit</b>.root.default.regularPool&lt;/name>
-    &lt;value>1610612736&lt;/value>&lt;!--1.5GB-->
-  &lt;/property>
 &lt;/configuration>
 </codeblock>
 
diff --git a/fe/src/main/java/org/apache/impala/service/Frontend.java b/fe/src/main/java/org/apache/impala/service/Frontend.java
index a5d523b29..eeb5b59ea 100644
--- a/fe/src/main/java/org/apache/impala/service/Frontend.java
+++ b/fe/src/main/java/org/apache/impala/service/Frontend.java
@@ -1838,10 +1838,6 @@ public class Frontend {
         Preconditions.checkNotNull(poolConfig);
         new_entry.setMax_mem_limit(poolConfig.getMax_query_mem_limit() > 0 ?
             poolConfig.getMax_query_mem_limit() : Long.MAX_VALUE);
-        // TODO IMPALA-11604: set max_processing_cost_limit for executor group set
-        // new_entry.setMax_processing_cost_limit(
-        //   poolConfig.getMax_query_processing_cost_limit() >= 0 ?
-        //   poolConfig.getMax_query_processing_cost_limit() : Long.MAX_VALUE);
       } else {
         // Set to max possible threshold value when there is no pool service
         new_entry.setMax_mem_limit(Long.MAX_VALUE);
diff --git a/fe/src/main/java/org/apache/impala/util/RequestPoolService.java b/fe/src/main/java/org/apache/impala/util/RequestPoolService.java
index 18ea505a9..828fe3792 100644
--- a/fe/src/main/java/org/apache/impala/util/RequestPoolService.java
+++ b/fe/src/main/java/org/apache/impala/util/RequestPoolService.java
@@ -123,10 +123,6 @@ public class RequestPoolService {
   // Key for specifying the "Max mt_dop" configuration of the pool
   private final static String MAX_MT_DOP = "impala.admission-control.max-mt-dop";
 
-  // Keys for the pool max query processing cost limits (in bytes).
-  private final static String MAX_QUERY_PROCESSING_COST_LIMIT_BYTES =
-      "impala.admission-control.max-query-processing-cost-limit";
-
   // String format for a per-pool configuration key. First parameter is the key for the
   // default, e.g. MAX_PLACED_RESERVATIONS_KEY, and the second parameter is the
   // pool name.
@@ -392,18 +388,16 @@ public class RequestPoolService {
           getPoolConfigValue(currentConf, pool, CLAMP_MEM_LIMIT_QUERY_OPTION, true));
       result.setMax_mt_dop(
           getPoolConfigValue(currentConf, pool, MAX_MT_DOP, -1));
-      result.setMax_query_processing_cost_limit(getPoolConfigValue(
-          currentConf, pool, MAX_QUERY_PROCESSING_COST_LIMIT_BYTES, -1));
     }
     if (LOG.isTraceEnabled()) {
       LOG.debug("getPoolConfig(pool={}): max_mem_resources={}, max_requests={},"
               + " max_queued={},  queue_timeout_ms={}, default_query_options={},"
               + " max_query_mem_limit={}, min_query_mem_limit={},"
-              + " clamp_mem_limit_query_option={}, max_query_processing_cost_limit={}",
+              + " clamp_mem_limit_query_option={}",
           pool, result.max_mem_resources, result.max_requests, result.max_queued,
           result.queue_timeout_ms, result.default_query_options,
           result.max_query_mem_limit, result.min_query_mem_limit,
-          result.clamp_mem_limit_query_option, result.max_query_processing_cost_limit);
+          result.clamp_mem_limit_query_option);
     }
     return result;
   }
diff --git a/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java b/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
index be4bdc889..efc9412ab 100644
--- a/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
+++ b/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
@@ -205,7 +205,7 @@ public class TestRequestPoolService {
         10000L, "mem_limit=1024m,query_timeout_s=10");
     checkPoolConfigResult("root.queueB", 5, 10, -1, 30000L, "mem_limit=1024m");
     checkPoolConfigResult("root.queueC", 5, 10, 1024 * ByteUnits.MEGABYTE, 30000L,
-            "mem_limit=1024m", 1000, 10, false, 1000);
+            "mem_limit=1024m", 1000, 10, false);
   }
 
   @Test
@@ -213,7 +213,7 @@ public class TestRequestPoolService {
     createPoolService(ALLOCATION_FILE_EMPTY, LLAMA_CONFIG_FILE_EMPTY);
     Assert.assertEquals("root.userA", poolService_.assignToPool("", "userA"));
     Assert.assertTrue(poolService_.hasAccess("root.userA", "userA"));
-    checkPoolConfigResult("root", -1, 200, -1, null, "", 0, 0, true, -1);
+    checkPoolConfigResult("root", -1, 200, -1, null, "", 0, 0, true);
   }
 
   @Ignore("IMPALA-4868") @Test
@@ -309,7 +309,7 @@ public class TestRequestPoolService {
   private void checkPoolConfigResult(String pool, long expectedMaxRequests,
       long expectedMaxQueued, long expectedMaxMem, Long expectedQueueTimeoutMs,
       String expectedQueryOptions, long max_query_mem_limit, long min_query_mem_limit,
-      boolean clamp_mem_limit_query_option, long max_query_processing_cost_limit) {
+      boolean clamp_mem_limit_query_option) {
     TPoolConfig expectedResult = new TPoolConfig();
     expectedResult.setMax_requests(expectedMaxRequests);
     expectedResult.setMax_queued(expectedMaxQueued);
@@ -317,7 +317,6 @@ public class TestRequestPoolService {
     expectedResult.setMax_query_mem_limit(max_query_mem_limit);
     expectedResult.setMin_query_mem_limit(min_query_mem_limit);
     expectedResult.setClamp_mem_limit_query_option(clamp_mem_limit_query_option);
-    expectedResult.setMax_query_processing_cost_limit(max_query_processing_cost_limit);
     if (expectedQueueTimeoutMs != null) {
       expectedResult.setQueue_timeout_ms(expectedQueueTimeoutMs);
     }
@@ -332,7 +331,7 @@ public class TestRequestPoolService {
       long expectedMaxQueued, long expectedMaxMem, Long expectedQueueTimeoutMs,
       String expectedQueryOptions) {
     checkPoolConfigResult(pool, expectedMaxRequests, expectedMaxQueued, expectedMaxMem,
-        expectedQueueTimeoutMs, expectedQueryOptions, 0, 0, true, -1);
+        expectedQueueTimeoutMs, expectedQueryOptions, 0, 0, true);
   }
 
   private void checkPoolConfigResult(String pool, long expectedMaxRequests,
diff --git a/fe/src/test/resources/llama-site-2-groups.xml b/fe/src/test/resources/llama-site-2-groups.xml
index 5328edcb2..1a48a847c 100644
--- a/fe/src/test/resources/llama-site-2-groups.xml
+++ b/fe/src/test/resources/llama-site-2-groups.xml
@@ -23,11 +23,6 @@
    <!-- 0MB -->
     <value>0</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.small</name>
-    <!-- 64 MB -->
-    <value>67108864</value>
-  </property>
   <property>
     <name>impala.admission-control.max-query-mem-limit.root.large</name>
    <!-- 8 PB -->
@@ -38,9 +33,4 @@
    <!-- 64MB+1 -->
     <value>67108865</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.large</name>
-    <!-- 8 PB -->
-    <value>9007199254740992</value>
-  </property>
 </configuration>
diff --git a/fe/src/test/resources/llama-site-test.xml b/fe/src/test/resources/llama-site-test.xml
index 47c61b281..373870516 100644
--- a/fe/src/test/resources/llama-site-test.xml
+++ b/fe/src/test/resources/llama-site-test.xml
@@ -55,8 +55,4 @@
     <name>impala.admission-control.clamp-mem-limit-query-option.root.queueC</name>
     <value>false</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.queueC</name>
-    <value>1000</value>
-  </property>
 </configuration>
diff --git a/fe/src/test/resources/llama-site-test2.xml b/fe/src/test/resources/llama-site-test2.xml
index b2fa18761..8890d994d 100644
--- a/fe/src/test/resources/llama-site-test2.xml
+++ b/fe/src/test/resources/llama-site-test2.xml
@@ -65,10 +65,6 @@
     <name>impala.admission-control.min-query-mem-limit.root.queueD</name>
     <value>10</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.queueD</name>
-    <value>62914560</value>
-  </property>
   <property>
     <name>llama.am.throttling.maximum.placed.reservations.root.queueD</name>
     <value>6</value>
diff --git a/fe/src/test/resources/mem-limit-test-llama-site.xml b/fe/src/test/resources/mem-limit-test-llama-site.xml
index a36135ed4..2c4838838 100644
--- a/fe/src/test/resources/mem-limit-test-llama-site.xml
+++ b/fe/src/test/resources/mem-limit-test-llama-site.xml
@@ -13,10 +13,6 @@
     <name>impala.admission-control.clamp-mem-limit-query-option.root.regularPoolWithoutClamping</name>
     <value>false</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.regularPoolWithoutClamping</name>
-    <value>1610612736</value><!--1.5GB-->
-  </property>
   <!--poolLowMinLimit pool config-->
   <property>
     <name>impala.admission-control.min-query-mem-limit.root.poolLowMinLimit</name>
@@ -40,10 +36,6 @@
     <name>impala.admission-control.clamp-mem-limit-query-option.root.regularPool</name>
     <value>true</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.regularPool</name>
-    <value>1610612736</value><!--1.5GB-->
-  </property>
   <!--regularPoolNoMinLimit pool config-->
   <property>
     <name>impala.admission-control.max-query-mem-limit.root.regularPoolNoMinLimit</name>
@@ -57,10 +49,6 @@
     <name>impala.admission-control.clamp-mem-limit-query-option.root.regularPoolNoMinLimit</name>
     <value>true</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.regularPoolNoMinLimit</name>
-    <value>1610612736</value><!--1.5GB-->
-  </property>
   <!--poolNoMemLimits pool config-->
   <property>
     <name>impala.admission-control.max-query-mem-limit.root.poolNoMemLimits</name>
@@ -70,10 +58,6 @@
     <name>impala.admission-control.min-query-mem-limit.root.poolNoMemLimits</name>
     <value>0</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.poolNoMemLimits</name>
-    <value>-1</value>
-  </property>
   <!--maxLessThanMinLimit pool config-->
   <property>
     <name>impala.admission-control.max-query-mem-limit.root.maxLessThanMinLimit</name>
@@ -83,10 +67,6 @@
     <name>impala.admission-control.min-query-mem-limit.root.maxLessThanMinLimit</name>
     <value>100001</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.maxLessThanMinLimit</name>
-    <value>100000</value>
-  </property>
   <!--maxMemLessThanMinLimit pool config-->
   <property>
     <name>impala.admission-control.min-query-mem-limit.root.maxMemLessThanMinLimit</name>
@@ -101,10 +81,6 @@
     <name>impala.admission-control.min-query-mem-limit.root.invalidTestPool</name>
     <value>26214400</value>
   </property>
-  <property>
-    <name>impala.admission-control.max-query-processing-cost-limit.root.invalidTestPool</name>
-    <value>-1</value>
-  </property>
   <property>
     <name>llama.am.throttling.maximum.placed.reservations.root.invalidTestPool</name>
     <value>1</value>
diff --git a/fe/src/test/resources/minicluster-llama-site.xml b/fe/src/test/resources/minicluster-llama-site.xml
index ec5194634..affc9a484 100644
--- a/fe/src/test/resources/minicluster-llama-site.xml
+++ b/fe/src/test/resources/minicluster-llama-site.xml
@@ -19,13 +19,6 @@
     <name>impala.admission-control.clamp-mem-limit-query-option.root.default</name>
     <value>false</value>
   </property>
-  <property>
-    <!-- Max processing cost given to queries by default. This will allow
-         running one large query and multiple small queries on a typical
-         minicluster where each impalad has ~7gb of memory. -->
-    <name>impala.admission-control.max-query-processing-cost-limit.root.default</name>
-    <value>4294967296</value><!--4GB-->
-  </property>
   <!-- We need to increase the pool queue timeout to avoid flakiness from queries
        getting stuck behind queries from other tests and timed out. Set to a
        very high value to avoid failures unless queries are genuinely stuck. -->
diff --git a/tests/common/resource_pool_config.py b/tests/common/resource_pool_config.py
index 47e9b9827..47a37b5f3 100644
--- a/tests/common/resource_pool_config.py
+++ b/tests/common/resource_pool_config.py
@@ -29,9 +29,7 @@ class ResourcePoolConfig(object):
 
   # Mapping of config strings used in the llama_site file with those used on the impala
   # metrics debug page. Add to this dictionary if other configs are need for tests.
-  CONFIG_TO_METRIC_STR_MAPPING = {
-      'max-query-mem-limit': 'pool-max-query-mem-limit',
-      'max-query-processing-cost-limit': 'pool-max-query-processing-cost-limit'}
+  CONFIG_TO_METRIC_STR_MAPPING = {'max-query-mem-limit': 'pool-max-query-mem-limit'}
 
   """'impala_service' should point to an impalad to be used for running queries.
   'ac_service' should point to the service running the admission controller and is used
diff --git a/tests/custom_cluster/test_executor_groups.py b/tests/custom_cluster/test_executor_groups.py
index 2603948cf..e0f945e1d 100644
--- a/tests/custom_cluster/test_executor_groups.py
+++ b/tests/custom_cluster/test_executor_groups.py
@@ -695,10 +695,9 @@ class TestExecutorGroups(CustomClusterTestSuite):
                                  "resources")
     # Define two group sets: small and large
     fs_allocation_path = os.path.join(RESOURCES_DIR, "fair-scheduler-2-groups.xml")
-    # Define the min-query-mem-limit, max-query-mem-limit and
-    # max-query-processing-cost-limit properties of the two sets:
-    # small: [0, 64MB, 64MB]
-    # large: [64MB+1Byte, 8PB, 8PB]
+    # Define the min-query-mem-limit and max-query-mem-limit properties of the two sets:
+    # small: [0, 64MB]
+    # large: [64MB+1Byte, 8PB]
     llama_site_path = os.path.join(RESOURCES_DIR, "llama-site-2-groups.xml")
     # Start with a regular admission config with multiple pools and no resource limits.
     self._restart_coordinators(num_coordinators=1,
diff --git a/www/admission_controller.tmpl b/www/admission_controller.tmpl
index be97886bd..b51c3dab1 100644
--- a/www/admission_controller.tmpl
+++ b/www/admission_controller.tmpl
@@ -36,7 +36,6 @@ Example of json received from the impala server
             "max_query_mem_limit": 0,
             "min_query_mem_limit": 0,
             "clamp_mem_limit_query_option": true,
-            "max_query_processing_cost_limit": -1,
             "wait_time_ms_EMA": 0.0,
             "histogram": [
                 [
@@ -262,10 +261,6 @@ Time since last statestore update containing admission control topic state (ms):
       <td>Clamp MEM_LIMIT query option</td>
       <td>{{clamp_mem_limit_query_option}}</td>
     </tr>
-    <tr>
-      <td>Query processing cost</td>
-      <td>{{max_query_processing_cost_limit}}</td>
-    </tr>
   </table>
 
   <h4>Queued queries in order of being queued (submitted to this coordinator)</h4>


[impala] 03/03: IMPALA-11655: Impala should set write mode "merge-on-read" by default

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit dc912f016e4804331e5a9fe2960f8c89376195b3
Author: Zoltan Borok-Nagy <bo...@cloudera.com>
AuthorDate: Thu Oct 13 18:02:46 2022 +0200

    IMPALA-11655: Impala should set write mode "merge-on-read" by default
    
    Similarly to HIVE-26596 Impala should set merge-on-read write mode
    for V2 tables, unless otherwise specified:
    
    * during table creation with 'format-version'='2'
    * during alter table set tblproperties 'format-version'='2'
    
    We do so because in the foreseeable future Impala will only support
    merge-on-read (on the write-side, on the read side copy-on-write is
    also supported). Also, currently Hive only supports merge-on-read.
    
    Testing:
     * e2e tests added
    
    Change-Id: Icaa32472cde98e21fb23f5461175db1bf401db3d
    Reviewed-on: http://gerrit.cloudera.org:8080/19138
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 .../apache/impala/analysis/CreateTableStmt.java    | 23 ++++++
 .../org/apache/impala/catalog/IcebergTable.java    |  6 ++
 .../apache/impala/service/CatalogOpExecutor.java   | 24 ++++++
 .../java/org/apache/impala/util/IcebergUtil.java   | 11 +++
 .../queries/QueryTest/iceberg-alter.test           | 85 ++++++++++++++++++++++
 .../queries/QueryTest/show-create-table.test       | 74 +++++++++++++++++++
 6 files changed, 223 insertions(+)

diff --git a/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java b/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
index c011b3bfd..819981a84 100644
--- a/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
+++ b/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
@@ -624,6 +624,7 @@ public class CreateTableStmt extends StatementBase {
     putGeneratedProperty(IcebergTable.KEY_STORAGE_HANDLER,
         IcebergTable.ICEBERG_STORAGE_HANDLER);
     putGeneratedProperty(TableProperties.ENGINE_HIVE_ENABLED, "true");
+    addMergeOnReadPropertiesIfNeeded();
 
     String fileformat = getTblProperties().get(IcebergTable.ICEBERG_FILE_FORMAT);
     TIcebergFileFormat icebergFileFormat = IcebergUtil.getIcebergFileFormat(fileformat);
@@ -652,6 +653,28 @@ public class CreateTableStmt extends StatementBase {
     validateIcebergTableProperties(catalog);
   }
 
+  /**
+   * When creating an Iceberg table that supports row-level modifications
+   * (format-version >= 2) we set write modes to "merge-on-read" which is the write
+   * mode Impala will eventually support (IMPALA-11664).
+   */
+  private void addMergeOnReadPropertiesIfNeeded() {
+    Map<String, String> tblProps = getTblProperties();
+    String formatVersion = tblProps.get(TableProperties.FORMAT_VERSION);
+    if (formatVersion == null ||
+        Integer.valueOf(formatVersion) < IcebergTable.ICEBERG_FORMAT_V2) {
+      return;
+    }
+
+    // Only add "merge-on-read" if none of the write modes are specified.
+    final String MERGE_ON_READ = IcebergTable.MERGE_ON_READ;
+    if (!IcebergUtil.isAnyWriteModeSet(tblProps)) {
+      putGeneratedProperty(TableProperties.DELETE_MODE, MERGE_ON_READ);
+      putGeneratedProperty(TableProperties.UPDATE_MODE, MERGE_ON_READ);
+      putGeneratedProperty(TableProperties.MERGE_MODE, MERGE_ON_READ);
+    }
+  }
+
   private void validateIcebergParquetCompressionCodec(
       TIcebergFileFormat icebergFileFormat) throws AnalysisException {
     if (icebergFileFormat != TIcebergFileFormat.PARQUET) {
diff --git a/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java b/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
index 2b36bfa55..b10449d23 100644
--- a/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
@@ -75,6 +75,10 @@ public class IcebergTable extends Table implements FeIcebergTable {
   // Iceberg catalog type key in tblproperties
   public static final String ICEBERG_CATALOG = "iceberg.catalog";
 
+  // Iceberg format version numbers
+  public static final int ICEBERG_FORMAT_V1 = 1;
+  public static final int ICEBERG_FORMAT_V2 = 2;
+
   // Iceberg table catalog location key in tblproperties when using HadoopCatalog
   // This property is necessary for both managed and external Iceberg table with
   // 'hadoop.catalog'
@@ -94,6 +98,8 @@ public class IcebergTable extends Table implements FeIcebergTable {
   public static final String PARQUET_COMPRESSION_LEVEL =
       "write.parquet.compression-level";
 
+  public static final String MERGE_ON_READ = "merge-on-read";
+
   // Default values for parquet compression codec.
   public static final THdfsCompression DEFAULT_PARQUET_COMPRESSION_CODEC =
       THdfsCompression.SNAPPY;
diff --git a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
index cc734318f..04e525c6d 100755
--- a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
+++ b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
@@ -80,6 +80,7 @@ import org.apache.hadoop.hive.metastore.api.SerDeInfo;
 import org.apache.hadoop.hive.metastore.api.StorageDescriptor;
 import org.apache.hadoop.io.IOUtils;
 import org.apache.hadoop.util.StringUtils;
+import org.apache.iceberg.TableProperties;
 import org.apache.iceberg.catalog.TableIdentifier;
 import org.apache.iceberg.mr.Catalogs;
 import org.apache.impala.analysis.AlterTableSortByStmt;
@@ -1412,10 +1413,33 @@ public class CatalogOpExecutor {
     TAlterTableSetTblPropertiesParams setPropsParams =
         params.getSet_tbl_properties_params();
     if (setPropsParams.getTarget() != TTablePropertyType.TBL_PROPERTY) return false;
+
+    addMergeOnReadPropertiesIfNeeded(tbl, setPropsParams.getProperties());
     IcebergCatalogOpExecutor.setTblProperties(iceTxn, setPropsParams.getProperties());
     return true;
   }
 
+  /**
+   * Iceberg format from V2 supports row-level modifications. We set write modes to
+   * "merge-on-read" which is the write mode Impala will eventually
+   * support (IMPALA-11664). Unless the user specified otherwise in the table properties.
+   */
+  private void addMergeOnReadPropertiesIfNeeded(IcebergTable tbl,
+      Map<String, String> properties) {
+    String formatVersion = properties.get(TableProperties.FORMAT_VERSION);
+    if (formatVersion == null ||
+        Integer.valueOf(formatVersion) < IcebergTable.ICEBERG_FORMAT_V2) {
+      return;
+    }
+    if (!IcebergUtil.isAnyWriteModeSet(properties) &&
+        !IcebergUtil.isAnyWriteModeSet(tbl.getMetaStoreTable().getParameters())) {
+      final String MERGE_ON_READ = IcebergTable.MERGE_ON_READ;
+      properties.put(TableProperties.DELETE_MODE, MERGE_ON_READ);
+      properties.put(TableProperties.UPDATE_MODE, MERGE_ON_READ);
+      properties.put(TableProperties.MERGE_MODE, MERGE_ON_READ);
+    }
+  }
+
   /**
    * Unsets table properties for an Iceberg table. Returns true on success, returns false
    * if the operation is not applicable at the Iceberg table level, e.g. setting SERDE
diff --git a/fe/src/main/java/org/apache/impala/util/IcebergUtil.java b/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
index b4b4a7bda..f3185c4bb 100644
--- a/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
+++ b/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
@@ -51,6 +51,7 @@ import org.apache.iceberg.Schema;
 import org.apache.iceberg.Snapshot;
 import org.apache.iceberg.StructLike;
 import org.apache.iceberg.Table;
+import org.apache.iceberg.TableProperties;
 import org.apache.iceberg.TableScan;
 import org.apache.iceberg.Transaction;
 import org.apache.iceberg.catalog.TableIdentifier;
@@ -832,6 +833,16 @@ public class IcebergUtil {
     return IcebergTable.UNSET_PARQUET_ROW_GROUP_SIZE;
   }
 
+  /**
+   * @return true if any of the write modes is being set in 'tblProperties'.
+   */
+  public static boolean isAnyWriteModeSet(Map<String, String> tblProperties) {
+    return
+        tblProperties.get(TableProperties.DELETE_MODE) != null ||
+        tblProperties.get(TableProperties.UPDATE_MODE) != null ||
+        tblProperties.get(TableProperties.MERGE_MODE) != null;
+  }
+
   public static Long parseParquetPageSize(Map<String, String> tblProperties,
       String property, String descr, StringBuilder errMsg) {
     if (tblProperties.containsKey(property)) {
diff --git a/testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test b/testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
index 12518af2b..df2e7adb9 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
@@ -351,3 +351,88 @@ DESCRIBE FORMATTED iceberg_changing_parq_tblprops;
 ---- TYPES
 string, string, string
 ====
+---- QUERY
+CREATE TABLE iceberg_upgrade_v2_no_write_mode (i INT) STORED AS ICEBERG;
+DESCRIBE FORMATTED iceberg_upgrade_v2_no_write_mode;
+---- RESULTS: VERIFY_IS_NOT_IN
+'','write.delete.mode   ','merge-on-read       '
+'','write.update.mode   ','merge-on-read       '
+'','write.merge.mode    ','merge-on-read       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+# Setting format-version to 1 doesn't add write modes.
+ALTER TABLE iceberg_upgrade_v2_no_write_mode SET TBLPROPERTIES('format-version'='1');
+DESCRIBE FORMATTED iceberg_upgrade_v2_no_write_mode;
+---- RESULTS: VERIFY_IS_NOT_IN
+'','write.delete.mode   ','merge-on-read       '
+'','write.update.mode   ','merge-on-read       '
+'','write.merge.mode    ','merge-on-read       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+ALTER TABLE iceberg_upgrade_v2_no_write_mode SET TBLPROPERTIES('format-version'='2');
+DESCRIBE FORMATTED iceberg_upgrade_v2_no_write_mode;
+---- RESULTS: VERIFY_IS_SUBSET
+'','write.delete.mode   ','merge-on-read       '
+'','write.update.mode   ','merge-on-read       '
+'','write.merge.mode    ','merge-on-read       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+CREATE TABLE iceberg_upgrade_v2_delete_mode (i INT) STORED AS ICEBERG;
+ALTER TABLE iceberg_upgrade_v2_delete_mode
+SET TBLPROPERTIES('format-version'='2', 'write.delete.mode'='copy-on-write');
+DESCRIBE FORMATTED iceberg_upgrade_v2_delete_mode;
+---- RESULTS: VERIFY_IS_SUBSET
+'','write.delete.mode   ','copy-on-write       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+DESCRIBE FORMATTED iceberg_upgrade_v2_delete_mode;
+---- RESULTS: VERIFY_IS_NOT_IN
+'','write.update.mode   ','merge-on-read       '
+'','write.merge.mode    ','merge-on-read       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+CREATE TABLE iceberg_upgrade_v2_update_mode (i INT) STORED AS ICEBERG;
+ALTER TABLE iceberg_upgrade_v2_update_mode
+SET TBLPROPERTIES('format-version'='2', 'write.update.mode'='copy-on-write');
+DESCRIBE FORMATTED iceberg_upgrade_v2_update_mode;
+---- RESULTS: VERIFY_IS_SUBSET
+'','write.update.mode   ','copy-on-write       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+DESCRIBE FORMATTED iceberg_upgrade_v2_update_mode;
+---- RESULTS: VERIFY_IS_NOT_IN
+'','write.delete.mode   ','merge-on-read       '
+'','write.merge.mode    ','merge-on-read       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+CREATE TABLE iceberg_upgrade_v2_merge_mode (i INT) STORED AS ICEBERG;
+ALTER TABLE iceberg_upgrade_v2_merge_mode
+SET TBLPROPERTIES('format-version'='2', 'write.merge.mode'='merge-on-read');
+DESCRIBE FORMATTED iceberg_upgrade_v2_merge_mode;
+---- RESULTS: VERIFY_IS_SUBSET
+'','write.merge.mode    ','merge-on-read       '
+---- TYPES
+string, string, string
+====
+---- QUERY
+DESCRIBE FORMATTED iceberg_upgrade_v2_merge_mode;
+---- RESULTS: VERIFY_IS_NOT_IN
+'','write.update.mode   ','merge-on-read       '
+'','write.delete.mode   ','merge-on-read       '
+---- TYPES
+string, string, string
+====
diff --git a/testdata/workloads/functional-query/queries/QueryTest/show-create-table.test b/testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
index ed1809229..c2485aafa 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
@@ -938,3 +938,77 @@ LOCATION '$$location_uri$$'
 TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
 'engine.hive.enabled'='true', 'table_type'='ICEBERG')
 ====
+---- CREATE_TABLE
+# Creating V1 tables explicitly should not set 'merge-on-read' write modes if no write mode is
+# specified.
+CREATE TABLE ice_explicit_v1 (i int)
+STORED AS ICEBERG
+TBLPROPERTIES('format-version'='1')
+---- RESULTS-HIVE-3
+CREATE EXTERNAL TABLE show_create_table_test_db.ice_explicit_v1 (i INT NULL)
+STORED AS ICEBERG
+LOCATION '$$location_uri$$'
+TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
+ 'engine.hive.enabled'='true', 'table_type'='ICEBERG')
+====
+---- CREATE_TABLE
+# Creating V2 tables should set 'merge-on-read' write modes if no write mode is specified.
+CREATE TABLE ice_v2 (i int)
+STORED AS ICEBERG
+TBLPROPERTIES('format-version'='2')
+---- RESULTS-HIVE-3
+CREATE EXTERNAL TABLE show_create_table_test_db.ice_v2 (i INT NULL)
+STORED AS ICEBERG
+LOCATION '$$location_uri$$'
+TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
+ 'engine.hive.enabled'='true', 'table_type'='ICEBERG', 'write.delete.mode'='merge-on-read',
+ 'write.update.mode'='merge-on-read', 'write.merge.mode'='merge-on-read')
+====
+---- CREATE_TABLE
+# Creating V2 tables should not set write mode if user specified any of it to any value.
+CREATE TABLE ice_v2_explicit_delete (i int)
+STORED AS ICEBERG
+TBLPROPERTIES('format-version'='2', 'write.delete.mode'='merge-on-read')
+---- RESULTS-HIVE-3
+CREATE EXTERNAL TABLE show_create_table_test_db.ice_v2_explicit_delete (i INT NULL)
+STORED AS ICEBERG
+LOCATION '$$location_uri$$'
+TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
+ 'engine.hive.enabled'='true', 'table_type'='ICEBERG', 'write.delete.mode'='merge-on-read')
+====
+---- CREATE_TABLE
+# Creating V2 tables should not set write mode if user specified any of it to any value.
+CREATE TABLE ice_v2_explicit_delete_2 (i int)
+STORED AS ICEBERG
+TBLPROPERTIES('format-version'='2', 'write.delete.mode'='copy-on-write')
+---- RESULTS-HIVE-3
+CREATE EXTERNAL TABLE show_create_table_test_db.ice_v2_explicit_delete_2 (i INT NULL)
+STORED AS ICEBERG
+LOCATION '$$location_uri$$'
+TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
+ 'engine.hive.enabled'='true', 'table_type'='ICEBERG', 'write.delete.mode'='copy-on-write')
+====
+---- CREATE_TABLE
+# Creating V2 tables should not set write mode if user specified any of it to any value.
+CREATE TABLE ice_v2_explicit_update (i int)
+STORED AS ICEBERG
+TBLPROPERTIES('format-version'='2', 'write.update.mode'='copy-on-write')
+---- RESULTS-HIVE-3
+CREATE EXTERNAL TABLE show_create_table_test_db.ice_v2_explicit_update (i INT NULL)
+STORED AS ICEBERG
+LOCATION '$$location_uri$$'
+TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
+ 'engine.hive.enabled'='true', 'table_type'='ICEBERG', 'write.update.mode'='copy-on-write')
+====
+---- CREATE_TABLE
+# Creating V2 tables should not set write mode if user specified any of it to any value.
+CREATE TABLE ice_v2_explicit_merge (i int)
+STORED AS ICEBERG
+TBLPROPERTIES('format-version'='2', 'write.merge.mode'='copy-on-write')
+---- RESULTS-HIVE-3
+CREATE EXTERNAL TABLE show_create_table_test_db.ice_v2_explicit_merge (i INT NULL)
+STORED AS ICEBERG
+LOCATION '$$location_uri$$'
+TBLPROPERTIES ('external.table.purge'='TRUE', 'write.format.default'='parquet',
+ 'engine.hive.enabled'='true', 'table_type'='ICEBERG', 'write.merge.mode'='copy-on-write')
+====


[impala] 02/03: IMPALA-10715: test decimal min max filters failed in exhaustive run

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 5c5708114c685f9d99d4b4c1e842bf0f11bfeb84
Author: Qifan Chen <qc...@cloudera.com>
AuthorDate: Wed Oct 12 13:08:47 2022 -0400

    IMPALA-10715: test decimal min max filters failed in exhaustive run
    
    This patch enables only the min/max filters in decimal min/max filter
    test so that some of the non-qualifying rows can be returned from the
    kudu scanners. Previously, the tests allows bloom filters to filter
    out rows at the kudu scanner level which prevents non-qualfying rows
    to arrive at the hash join node. Such non-qualifying rows are required
    by the test.
    
    With the patch, the test passes in the exhaustive mode. The patch also
    refactors the above logic for the entire TestMinMaxFilters test so that
    each test case in it will only get the min/max filter.
    
    Testing:
     - Unit test
     - Core test
    
    Change-Id: I20da28f780a27c6fdd917116e7c14d46d2a5db0f
    Reviewed-on: http://gerrit.cloudera.org:8080/19132
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/query_test/test_runtime_filters.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/query_test/test_runtime_filters.py b/tests/query_test/test_runtime_filters.py
index e48289bd0..c285a1335 100644
--- a/tests/query_test/test_runtime_filters.py
+++ b/tests/query_test/test_runtime_filters.py
@@ -220,9 +220,12 @@ class TestMinMaxFilters(ImpalaTestSuite):
     # Enable query option ASYNC_CODEGEN for slow build
     if build_runs_slowly:
       add_exec_option_dimension(cls, "async_codegen", 1)
+    # IMPALA-10715. Enable only min/max since the bloom filters will return
+    # rows only satisfying the join predicates. This test requires the return
+    # of non-qualifying rows to succeed.
+    add_exec_option_dimension(cls, "ENABLED_RUNTIME_FILTER_TYPES", "MIN_MAX")
 
   def test_min_max_filters(self, vector):
-    self.execute_query("SET ENABLED_RUNTIME_FILTER_TYPES=MIN_MAX")
     self.execute_query("SET MINMAX_FILTER_THRESHOLD=0.5")
     self.run_test_case('QueryTest/min_max_filters', vector,
         test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)})