Posted to commits@spark.apache.org by yi...@apache.org on 2023/06/02 02:27:11 UTC

[spark-docker] branch master updated: [SPARK-43368] Use `libnss_wrapper` to fake passwd entry

This is an automated email from the ASF dual-hosted git repository.

yikun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark-docker.git


The following commit(s) were added to refs/heads/master by this push:
     new c07ae18  [SPARK-43368] Use `libnss_wrapper` to fake passwd entry
c07ae18 is described below

commit c07ae18355678370fd270bedb8b39ab2aceb5ac2
Author: Yikun Jiang <yi...@gmail.com>
AuthorDate: Fri Jun 2 10:27:01 2023 +0800

    [SPARK-43368] Use `libnss_wrapper` to fake passwd entry
    
    ### What changes were proposed in this pull request?
    Use `libnss_wrapper` to fake a passwd entry instead of modifying `/etc/passwd`, resolving the random-UID problem. The fake passwd entry is only set up for the driver/executor commands; for a pass-through command such as `bash`, no fake entry is created.
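
    The gating is a small change to the entrypoint's `case` statement. A simplified sketch of the control flow is below; `CMD` and `attempt_setup_fake_passwd_entry` are defined in the full diff further down, and the real script also wraps the exec with `switch_spark_if_root`:

    ```bash
    case "$1" in
      driver | executor)
        # Spark-on-K8s role: fake a passwd entry for anonymous UIDs, then run under tini
        attempt_setup_fake_passwd_entry
        exec /usr/bin/tini -s -- "${CMD[@]}"
        ;;
      *)
        # Pass-through mode (e.g. `bash`): no fake passwd entry is set up
        exec "$@"
        ;;
    esac
    ```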
    
    ### Why are the changes needed?
    Previously, the entrypoint appended an entry for the current UID directly to `/etc/passwd`, mainly to handle the [OpenShift anonymous random `uid` case](https://github.com/docker-library/official-images/pull/13089#issuecomment-1534706523) (see also https://github.com/apache-spark-on-k8s/spark/pull/404). However, this requires overly broad write permissions on `/etc/passwd`, which is a potential security issue.
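
    For reference, the removed flow looked roughly like the sketch below (see the deleted lines in the diff): the image made `/etc/passwd` group-writable at build time so the entrypoint could append an entry for the anonymous UID at run time.

    ```bash
    # Build time (Dockerfile): /etc/passwd was made group-writable
    #   chgrp root /etc/passwd && chmod ug+rw /etc/passwd
    # Run time (entrypoint.sh): append an entry if the current UID is unknown
    myuid="$(id -u)"
    mygid="$(id -g)"
    if ! getent passwd "$myuid" > /dev/null; then
        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
    fi
    ```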
    
    Following the DOI reviewer's [suggestion](https://github.com/docker-library/official-images/pull/13089#issuecomment-1561793792), it is better to solve this with [libnss_wrapper](https://cwrap.org/nss_wrapper.html), a library that fakes a passwd entry via the `LD_PRELOAD`, `NSS_WRAPPER_PASSWD`, and `NSS_WRAPPER_GROUP` environment variables. For example, when the random UID is `1000`, the environment looks like:
    
    ```
    spark6f41b8e5be9b:/opt/spark/work-dir$ id -u
    1000
    spark6f41b8e5be9b:/opt/spark/work-dir$ id -g
    1000
    spark6f41b8e5be9b:/opt/spark/work-dir$ whoami
    spark
    spark6f41b8e5be9b:/opt/spark/work-dir$ echo $LD_PRELOAD
    /usr/lib/libnss_wrapper.so
    spark6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_PASSWD
    /tmp/tmp.r5x4SMX35B
    spark6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.r5x4SMX35B
    spark:x:1000:1000:${SPARK_USER_NAME:-anonymous uid}:/opt/spark:/bin/false
    spark6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_GROUP
    /tmp/tmp.XcnnYuD68r
    spark6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.XcnnYuD68r
    spark:x:1000:
    ```
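
    A minimal, standalone sketch (not part of this patch) of the same mechanism, using the `/usr/lib/libnss_wrapper.so` path installed by the Ubuntu `libnss-wrapper` package:

    ```bash
    # Create temporary passwd/group files describing the current (possibly anonymous) UID/GID
    passwd_file="$(mktemp)"
    group_file="$(mktemp)"
    echo "spark:x:$(id -u):$(id -g):anonymous uid:/opt/spark:/bin/false" > "$passwd_file"
    echo "spark:x:$(id -g):" > "$group_file"

    # With nss_wrapper preloaded, NSS lookups are answered from the files above,
    # so `whoami` reports "spark" even if the UID has no /etc/passwd entry.
    LD_PRELOAD=/usr/lib/libnss_wrapper.so \
    NSS_WRAPPER_PASSWD="$passwd_file" \
    NSS_WRAPPER_GROUP="$group_file" \
    whoami
    ```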
    
    ### Does this PR introduce _any_ user-facing change?
    Yes, the entrypoint now sets fake NSS environment variables rather than changing `/etc/passwd`.
    
    ### How was this patch tested?
    #### 1. Without `attempt_setup_fake_passwd_entry`, the user is `I have no name!`
    ```
    # docker run -it --rm --user 1000:1000  spark-test bash
    groups: cannot find name for group ID 1000
    I have no name!998110cd5a26:/opt/spark/work-dir$
    I have no name!0fea1d27d67d:/opt/spark/work-dir$ id -u
    1000
    I have no name!0fea1d27d67d:/opt/spark/work-dir$ id -g
    1000
    I have no name!0fea1d27d67d:/opt/spark/work-dir$ whoami
    whoami: cannot find name for user ID 1000
    ```
    
    #### 2. Manually stub `attempt_setup_fake_passwd_entry`; the user is `spark`.
    2.1 Apply a temporary change to the pass-through cmd branch
    
    ```patch
    diff --git a/entrypoint.sh.template b/entrypoint.sh.template
    index 08fc925..77d5b04 100644
    --- a/entrypoint.sh.template
    +++ b/entrypoint.sh.template
    @@ -118,6 +118,7 @@ case "$1" in
    
       *)
         # Non-spark-on-k8s command provided, proceeding in pass-through mode...
    +    attempt_setup_fake_passwd_entry
         exec "$"
         ;;
     esac
    ```
    
    2.2 Build and run the image, specifying a random UID/GID of 1000
    
    ```bash
    $ docker build . -t spark-test
    $ docker run -it --rm --user 1000:1000  spark-test bash
    # the user is set to spark rather than an unknown user
    spark6f41b8e5be9b:/opt/spark/work-dir$
    spark6f41b8e5be9b:/opt/spark/work-dir$ id -u
    1000
    spark6f41b8e5be9b:/opt/spark/work-dir$ id -g
    1000
    spark6f41b8e5be9b:/opt/spark/work-dir$ whoami
    spark
    
    ```
    
    ```
    # NSS env vars are set correctly
    spark6f41b8e5be9b:/opt/spark/work-dir$ echo $LD_PRELOAD
    /usr/lib/libnss_wrapper.so
    spark6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_PASSWD
    /tmp/tmp.r5x4SMX35B
    spark6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.r5x4SMX35B
    spark:x:1000:1000:${SPARK_USER_NAME:-anonymous uid}:/opt/spark:/bin/false
    spark6f41b8e5be9b:/opt/spark/work-dir$ echo $NSS_WRAPPER_GROUP
    /tmp/tmp.XcnnYuD68r
    spark6f41b8e5be9b:/opt/spark/work-dir$ cat /tmp/tmp.XcnnYuD68r
    spark:x:1000:
    ```
    
    #### 3. If an existing user is specified (such as `spark` or `root`), no fake setup is performed
    ```bash
    # docker run -it --rm --user 0  spark-test bash
    roote5bf55d4df22:/opt/spark/work-dir# echo $LD_PRELOAD
    
    ```
    
    ```bash
    # docker run -it --rm  spark-test bash
    sparkdef8d8ca4e7d:/opt/spark/work-dir$ echo $LD_PRELOAD
    
    ```
    
    Closes #45 from Yikun/SPARK-43368.
    
    Authored-by: Yikun Jiang <yi...@gmail.com>
    Signed-off-by: Yikun Jiang <yi...@gmail.com>
---
 3.4.0/scala2.12-java11-ubuntu/Dockerfile    |  3 +--
 3.4.0/scala2.12-java11-ubuntu/entrypoint.sh | 41 +++++++++++++++++------------
 Dockerfile.template                         |  3 +--
 entrypoint.sh.template                      | 41 +++++++++++++++++------------
 4 files changed, 50 insertions(+), 38 deletions(-)

diff --git a/3.4.0/scala2.12-java11-ubuntu/Dockerfile b/3.4.0/scala2.12-java11-ubuntu/Dockerfile
index a680106..aa754b7 100644
--- a/3.4.0/scala2.12-java11-ubuntu/Dockerfile
+++ b/3.4.0/scala2.12-java11-ubuntu/Dockerfile
@@ -24,7 +24,7 @@ RUN groupadd --system --gid=${spark_uid} spark && \
 RUN set -ex; \
     apt-get update; \
     ln -s /lib /lib64; \
-    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu; \
+    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu libnss-wrapper; \
     mkdir -p /opt/spark; \
     mkdir /opt/spark/python; \
     mkdir -p /opt/spark/examples; \
@@ -33,7 +33,6 @@ RUN set -ex; \
     touch /opt/spark/RELEASE; \
     chown -R spark:spark /opt/spark; \
     echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su; \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd; \
     rm -rf /var/cache/apt/*; \
     rm -rf /var/lib/apt/lists/*
 
diff --git a/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh b/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh
index 6def3f9..08fc925 100755
--- a/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh
+++ b/3.4.0/scala2.12-java11-ubuntu/entrypoint.sh
@@ -15,23 +15,28 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
-# Check whether there is a passwd entry for the container UID
-myuid=$(id -u)
-mygid=$(id -g)
-# turn off -e for getent because it will return error code in anonymous uid case
-set +e
-uidentry=$(getent passwd $myuid)
-set -e
-
-# If there is no passwd entry for the container UID, attempt to create one
-if [ -z "$uidentry" ] ; then
-    if [ -w /etc/passwd ] ; then
-        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
-    else
-        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
-    fi
-fi
+attempt_setup_fake_passwd_entry() {
+  # Check whether there is a passwd entry for the container UID
+  local myuid; myuid="$(id -u)"
+  # If there is no passwd entry for the container UID, attempt to fake one
+  # You can also refer to the https://github.com/docker-library/official-images/pull/13089#issuecomment-1534706523
+  # It's to resolve OpenShift random UID case.
+  # See also: https://github.com/docker-library/postgres/pull/448
+  if ! getent passwd "$myuid" &> /dev/null; then
+      local wrapper
+      for wrapper in {/usr,}/lib{/*,}/libnss_wrapper.so; do
+        if [ -s "$wrapper" ]; then
+          NSS_WRAPPER_PASSWD="$(mktemp)"
+          NSS_WRAPPER_GROUP="$(mktemp)"
+          export LD_PRELOAD="$wrapper" NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
+          local mygid; mygid="$(id -g)"
+          printf 'spark:x:%s:%s:${SPARK_USER_NAME:-anonymous uid}:%s:/bin/false\n' "$myuid" "$mygid" "$SPARK_HOME" > "$NSS_WRAPPER_PASSWD"
+          printf 'spark:x:%s:\n' "$mygid" > "$NSS_WRAPPER_GROUP"
+          break
+        fi
+      done
+  fi
+}
 
 if [ -z "$JAVA_HOME" ]; then
   JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
@@ -85,6 +90,7 @@ case "$1" in
       --deploy-mode client
       "$@"
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;
@@ -105,6 +111,7 @@ case "$1" in
       --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
       --podName $SPARK_EXECUTOR_POD_NAME
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;
diff --git a/Dockerfile.template b/Dockerfile.template
index d1188bc..fc67534 100644
--- a/Dockerfile.template
+++ b/Dockerfile.template
@@ -24,7 +24,7 @@ RUN groupadd --system --gid=${spark_uid} spark && \
 RUN set -ex; \
     apt-get update; \
     ln -s /lib /lib64; \
-    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu; \
+    apt install -y gnupg2 wget bash tini libc6 libpam-modules krb5-user libnss3 procps net-tools gosu libnss-wrapper; \
     mkdir -p /opt/spark; \
     mkdir /opt/spark/python; \
     mkdir -p /opt/spark/examples; \
@@ -33,7 +33,6 @@ RUN set -ex; \
     touch /opt/spark/RELEASE; \
     chown -R spark:spark /opt/spark; \
     echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su; \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd; \
     rm -rf /var/cache/apt/*; \
     rm -rf /var/lib/apt/lists/*
 
diff --git a/entrypoint.sh.template b/entrypoint.sh.template
index 6def3f9..08fc925 100644
--- a/entrypoint.sh.template
+++ b/entrypoint.sh.template
@@ -15,23 +15,28 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-
-# Check whether there is a passwd entry for the container UID
-myuid=$(id -u)
-mygid=$(id -g)
-# turn off -e for getent because it will return error code in anonymous uid case
-set +e
-uidentry=$(getent passwd $myuid)
-set -e
-
-# If there is no passwd entry for the container UID, attempt to create one
-if [ -z "$uidentry" ] ; then
-    if [ -w /etc/passwd ] ; then
-        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
-    else
-        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
-    fi
-fi
+attempt_setup_fake_passwd_entry() {
+  # Check whether there is a passwd entry for the container UID
+  local myuid; myuid="$(id -u)"
+  # If there is no passwd entry for the container UID, attempt to fake one
+  # You can also refer to the https://github.com/docker-library/official-images/pull/13089#issuecomment-1534706523
+  # It's to resolve OpenShift random UID case.
+  # See also: https://github.com/docker-library/postgres/pull/448
+  if ! getent passwd "$myuid" &> /dev/null; then
+      local wrapper
+      for wrapper in {/usr,}/lib{/*,}/libnss_wrapper.so; do
+        if [ -s "$wrapper" ]; then
+          NSS_WRAPPER_PASSWD="$(mktemp)"
+          NSS_WRAPPER_GROUP="$(mktemp)"
+          export LD_PRELOAD="$wrapper" NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
+          local mygid; mygid="$(id -g)"
+          printf 'spark:x:%s:%s:${SPARK_USER_NAME:-anonymous uid}:%s:/bin/false\n' "$myuid" "$mygid" "$SPARK_HOME" > "$NSS_WRAPPER_PASSWD"
+          printf 'spark:x:%s:\n' "$mygid" > "$NSS_WRAPPER_GROUP"
+          break
+        fi
+      done
+  fi
+}
 
 if [ -z "$JAVA_HOME" ]; then
   JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
@@ -85,6 +90,7 @@ case "$1" in
       --deploy-mode client
       "$@"
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;
@@ -105,6 +111,7 @@ case "$1" in
       --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
       --podName $SPARK_EXECUTOR_POD_NAME
     )
+    attempt_setup_fake_passwd_entry
     # Execute the container CMD under tini for better hygiene
     exec $(switch_spark_if_root) /usr/bin/tini -s -- "${CMD[@]}"
     ;;

