Posted to commits@impala.apache.org by ta...@apache.org on 2019/05/31 16:04:45 UTC

[impala] 04/05: IMPALA-8491: Non-root user in container

This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 8cfd18ae89683869392693541f7ba05c1013076f
Author: Tim Armstrong <ta...@cloudera.com>
AuthorDate: Tue May 28 10:47:46 2019 -0700

    IMPALA-8491: Non-root user in container
    
    Set a default USER in the Dockerfile per best practices so that
    consumers of the container don't accidentally run as root.
    The default user is "impala" if the container is run in docker
    without specifying a user.
    
    Various frameworks, including kubernetes, will run the container with
    an arbitrary user and group ID set.
    
    This causes issues with some Hadoop libraries, which depend on the
    user having a name. An arbitrary user ID generally has no name,
    because inside the container usernames are resolved with the
    container's /etc/passwd.
    
    To work around this, the entrypoint script checks if the current
    user has a name and if not, assigns it one (either dummyuser or
    $HADOOP_USER_NAME).
    
    Remove the umask setting that was required to make logs modifiable
    by the host user - this is not needed for our tests since the
    host and container users now match up.
    
    Also run apt-get clean in Dockerfile to reduce cruft in the
    image.
    
    Change-Id: I0bea9f44a8199851ed04fbef8caf4a2350ae2c0e
    Reviewed-on: http://gerrit.cloudera.org:8080/13451
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 bin/start-impala-cluster.py   |  5 ++++-
 docker/daemon_entrypoint.sh   | 12 ++++++++----
 docker/impala_base/Dockerfile | 10 +++++++++-
 3 files changed, 21 insertions(+), 6 deletions(-)
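
For readers skimming the patch, the user handling described in the commit
message can be sketched in Python roughly as follows. This is an
illustration only and is not part of the commit: the helper names
(docker_user_args, has_passwd_name, fallback_user_name, passwd_entry) are
hypothetical, and the real name check is the shell test `if ! whoami` in
docker/daemon_entrypoint.sh, which this sketch approximates with a lookup
in the password database.

```python
import os
import pwd


def docker_user_args(uid, gid):
    """Build the `docker run` arguments that run a container as uid:gid."""
    return ["--user", "{0}:{1}".format(uid, gid)]


def has_passwd_name(uid):
    """Return True if uid resolves to a name in the local password database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def fallback_user_name(env):
    """Mirror ${HADOOP_USER_NAME:-dummyuser}: default when unset or empty."""
    return env.get("HADOOP_USER_NAME") or "dummyuser"


def passwd_entry(name, uid, gid, home="/opt/impala", shell="/bin/bash"):
    """Format an /etc/passwd line in the layout the entrypoint appends."""
    return "{0}:x:{1}:{2}:,,,:{3}:{4}".format(name, uid, gid, home, shell)


if __name__ == "__main__":
    uid, gid = os.getuid(), os.getgid()
    print(docker_user_args(uid, gid))
    # Only a nameless user needs a synthesized passwd line.
    if not has_passwd_name(uid):
        print(passwd_entry(fallback_user_name(os.environ), uid, gid))
```

The sketch only prints the line it would add; the entrypoint script itself
appends the entry to /etc/passwd, which the Dockerfile change makes
writable for that purpose.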

diff --git a/bin/start-impala-cluster.py b/bin/start-impala-cluster.py
index 455efcb..8b95af3 100755
--- a/bin/start-impala-cluster.py
+++ b/bin/start-impala-cluster.py
@@ -567,6 +567,9 @@ class DockerMiniClusterOperations(object):
       os.makedirs(log_dir)
     mount_args += ["--mount", "type=bind,src={0},dst=/opt/impala/logs".format(log_dir)]
 
+    # Run the container as the current user.
+    user_args = ["--user", "{0}:{1}".format(os.getuid(), os.getgid())]
+
     # Allow loading LZO plugin, if built.
     lzo_lib_dir = os.path.join(IMPALA_LZO, "build")
     if os.path.isdir(lzo_lib_dir):
@@ -577,7 +580,7 @@ class DockerMiniClusterOperations(object):
     if mem_limit is not None:
       mem_limit_args = ["--memory", str(mem_limit)]
     LOG.info("Running container {0}".format(container_name))
-    run_cmd = (["docker", "run", "-d"] + env_args + port_args + ["--network",
+    run_cmd = (["docker", "run", "-d"] + env_args + port_args + user_args + ["--network",
       self.network_name, "--name", container_name, "--network-alias", host_name] +
       mount_args + mem_limit_args + [image_tag] + args)
     LOG.info("Running command {0}".format(run_cmd))
diff --git a/docker/daemon_entrypoint.sh b/docker/daemon_entrypoint.sh
index d3b7a18..1bc7752 100755
--- a/docker/daemon_entrypoint.sh
+++ b/docker/daemon_entrypoint.sh
@@ -47,10 +47,14 @@ echo "LD_LIBRARY_PATH: $LD_LIBRARY_PATH"
 # Default to 2GB heap. Allow overriding by externally-set JAVA_TOOL_OPTIONS.
 export JAVA_TOOL_OPTIONS="-Xmx2g $JAVA_TOOL_OPTIONS"
 
-# Allow any files written by the container to be modified by other users.
-# This is required when mounting the log directory to a host directory.
-# TODO: IMPALA-8491: once running as non-root user, this may not be needed.
-umask 000
+# Various Hadoop libraries depend on having a username. If we're running under
+# an unknown username, create an entry in the password file for this user.
+if ! whoami ; then
+  export USER=${HADOOP_USER_NAME:-dummyuser}
+  echo "${USER}:x:$(id -u):$(id -g):,,,:/opt/impala:/bin/bash" >> /etc/passwd
+  whoami
+  cat /etc/passwd
+fi
 
 "$@"
 EXIT_CODE=$?
diff --git a/docker/impala_base/Dockerfile b/docker/impala_base/Dockerfile
index fbbb562..3f9bee0 100644
--- a/docker/impala_base/Dockerfile
+++ b/docker/impala_base/Dockerfile
@@ -23,7 +23,15 @@ FROM ubuntu:16.04
 RUN apt-get update && \
   apt-get install -y openjdk-8-jre-headless \
   libsasl2-2 libsasl2-modules libsasl2-modules-gssapi-mit \
-  tzdata liblzo2-2
+  tzdata liblzo2-2 && \
+  apt-get clean
+
+# Use a non-privileged impala user to run the daemons in the container.
+# That user should own everything in the /opt/impala subdirectory.
+RUN groupadd -r impala && useradd --no-log-init -r -g impala impala && \
+    mkdir -p /opt/impala && chown impala /opt/impala && \
+    chmod ugo+w /etc/passwd
+USER impala
 
 # Copy build artifacts required for the daemon processes.
 # Need to have multiple copy commands to preserve directory structure.