Posted to commits@impala.apache.org by ta...@apache.org on 2020/10/01 17:39:08 UTC
[impala] 02/03: IMPALA-10193: Limit the memory usage for the whole
test cluster
This is an automated email from the ASF dual-hosted git repository.
tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit a0a25a61c302d864315daa7f09827b37a37419d5
Author: fifteencai <fi...@tencent.com>
AuthorDate: Wed Sep 30 13:03:08 2020 +0800
IMPALA-10193: Limit the memory usage for the whole test cluster
This patch introduces a new way to limit the memory usage of both the
mini-cluster and the CDH cluster.
Without this limit, the clusters are prone to being killed when they run
in Docker containers whose memory limit is lower than the host's memory size.
For example, the mini-cluster may be running in a
container limited to 32GB by cgroups, while the host machine has
128GB. In that case, if the container is started with the
'--privileged' command-line argument, both the mini-cluster and the CDH
cluster compute their mem_limit from 128GB rather than 32GB. They are then
killed when they try to allocate more memory than the container allows.
Currently, the mem-limit estimation algorithms for Impalad and Node
Manager are different:
for Impalad: mem_limit = 0.7 * sys_mem / cluster_size (default is 3)
for Node Manager:
1. Set aside 24GB, then fit the remainder into the thresholds below.
2. The minimum limit is 4GB and the maximum limit is 48GB.
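The two existing estimates can be sketched as follows. This is a minimal illustration built from the description above and the 12GB per-impalad cap visible in the diff; the function names and the explicit memory parameters are hypothetical (the real code reads system memory itself), and the sketch is written for Python 3, whereas the patched scripts are Python 2.

```python
GB = 1024 ** 3


def impalad_mem_limit_bytes(sys_mem_bytes, cluster_size=3):
    """Hypothetical helper: 70% of system memory split across impalads,
    capped at 12GB per impalad."""
    mem_limit = int(0.7 * sys_mem_bytes / cluster_size)
    return min(12 * GB, mem_limit)


def yarn_nm_ram_mb(sys_ram_mb):
    """Hypothetical helper: leave 24GB for other services, then clamp the
    remainder to the range [4GB, 48GB]. All values are in MB."""
    return min(max(sys_ram_mb - 24 * 1024, 4096), 48 * 1024)


# Limits computed from a 128GB host, regardless of any container limit.
print(impalad_mem_limit_bytes(128 * GB) // GB)  # 12
print(yarn_nm_ram_mb(128 * 1024))               # 49152
```

On a 128GB host this yields 12GB per impalad (36GB for three impalads) and 48GB for the Node Manager, which together are far more than a 32GB container can supply.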
To guard against over-consumption, we
- Added a new environment variable, IMPALA_CLUSTER_MAX_MEM_GB.
- Modified the algorithm in 'bin/start-impala-cluster.py' so that it
takes IMPALA_CLUSTER_MAX_MEM_GB, when set, rather than sys_mem into account.
- Modified the logic in
'testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py'
similarly, so that IMPALA_CLUSTER_MAX_MEM_GB substitutes for sys_mem.
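The effect of the override can be sketched as below. This is a simplified Python 3 illustration rather than the patched code: the helper names and the `env` parameter are assumptions added so the example runs without psutil and without touching the real environment, and the physical size is passed in as a whole number of GB.

```python
import os

GB = 1024 ** 3


def available_mem_gb(physical_mem_gb, env=None):
    """Hypothetical helper: use IMPALA_CLUSTER_MAX_MEM_GB when it is set,
    otherwise fall back to the physical memory size (integer GB)."""
    env = os.environ if env is None else env
    return int(env.get("IMPALA_CLUSTER_MAX_MEM_GB", str(physical_mem_gb)))


def impalad_mem_limit_bytes(avail_gb, cluster_size=3):
    # 70% of the available memory split across impalads, capped at 12GB.
    return min(12 * GB, int(0.7 * avail_gb * GB / cluster_size))


def yarn_nm_ram_mb(avail_gb):
    # Leave 24GB for other services, then clamp to [4GB, 48GB] (in MB).
    return min(max(avail_gb * 1024 - 24 * 1024, 4096), 48 * 1024)


# A 128GB host whose container is limited to 32GB.
avail = available_mem_gb(128, env={"IMPALA_CLUSTER_MAX_MEM_GB": "32"})
print(avail)                  # 32
print(yarn_nm_ram_mb(avail))  # 8192
```

With the variable set to 32, each impalad is limited to roughly 7.5GB and the Node Manager to 8192MB. With it unset, the computation falls back to the host's physical memory as before.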
Testing: this patch worked in a 32GB Docker container running on a 128GB
host machine. All 1188 unit tests passed.
Change-Id: I8537fd748e279d5a0e689872aeb4dbfd0c84dc93
Reviewed-on: http://gerrit.cloudera.org:8080/16522
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
bin/impala-config.sh | 3 +++
bin/start-impala-cluster.py | 6 ++++--
.../cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py | 3 ++-
3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/bin/impala-config.sh b/bin/impala-config.sh
index e0998c1..5d9b8a6 100755
--- a/bin/impala-config.sh
+++ b/bin/impala-config.sh
@@ -112,6 +112,9 @@ unset IMPALA_LLVM_URL
export IMPALA_LLVM_ASAN_VERSION=5.0.1-p3
unset IMPALA_LLVM_ASAN_URL
+# Maximum memory available for mini-cluster and CDH cluster
+export IMPALA_CLUSTER_MAX_MEM_GB
+
# LLVM stores some files in subdirectories that are named after what
# version it thinks it is. We might think it is 5.0.1-p1, based on a
# patch we have applied, but LLVM thinks its version is 5.0.1.
diff --git a/bin/start-impala-cluster.py b/bin/start-impala-cluster.py
index c708ce4..452700c 100755
--- a/bin/start-impala-cluster.py
+++ b/bin/start-impala-cluster.py
@@ -430,7 +430,7 @@ def build_kerberos_args(daemon):
def compute_impalad_mem_limit(cluster_size):
# Set mem_limit of each impalad to the smaller of 12GB or
- # 1/cluster_size (typically 1/3) of 70% of system memory.
+ # 1/cluster_size (typically 1/3) of 70% of available memory.
#
# The default memory limit for an impalad is 80% of the total system memory. On a
# mini-cluster with 3 impalads that means 240%. Since having an impalad be OOM killed
@@ -442,7 +442,9 @@ def compute_impalad_mem_limit(cluster_size):
# memory choice here to max out at 12GB. This should be sufficient for tests.
#
# Beware that ASAN builds use more memory than regular builds.
- mem_limit = int(0.7 * psutil.virtual_memory().total / cluster_size)
+ physical_mem_gb = psutil.virtual_memory().total / 1024 / 1024 / 1024
+ available_mem = int(os.getenv("IMPALA_CLUSTER_MAX_MEM_GB", str(physical_mem_gb)))
+ mem_limit = int(0.7 * available_mem * 1024 * 1024 * 1024 / cluster_size)
return min(12 * 1024 * 1024 * 1024, mem_limit)
class MiniClusterOperations(object):
diff --git a/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py b/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py
index 0987925..b286da4 100644
--- a/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py
+++ b/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py
@@ -33,11 +33,12 @@ def _get_system_ram_mb():
def _get_yarn_nm_ram_mb():
sys_ram = _get_system_ram_mb()
+ available_ram_gb = int(os.getenv("IMPALA_CLUSTER_MAX_MEM_GB", str(sys_ram / 1024)))
# Fit into the following envelope:
# - need 4GB at a bare minimum
# - leave at least 24G for other services
# - don't need more than 48G
- ret = min(max(sys_ram - 24 * 1024, 4096), 48 * 1024)
+ ret = min(max(available_ram_gb * 1024 - 24 * 1024, 4096), 48 * 1024)
print >>sys.stderr, "Configuring Yarn NM to use {0}MB RAM".format(ret)
return ret