You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@zeppelin.apache.org by jo...@apache.org on 2021/10/07 08:21:21 UTC

[zeppelin] branch master updated: [ZEPPELIN-5548] Startup scripts should consistently load Hadoop classpaths

This is an automated email from the ASF dual-hosted git repository.

jongyoul pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/master by this push:
     new d423437  [ZEPPELIN-5548] Startup scripts should consistently load Hadoop classpaths
d423437 is described below

commit d4234375260c6e8a1f02b74dc1d00e99d8ca19eb
Author: llamasoft <ll...@rm-rf.email>
AuthorDate: Tue Oct 5 18:46:50 2021 -0400

    [ZEPPELIN-5548] Startup scripts should consistently load Hadoop classpaths
    
    ### What is this PR for?
    The `bin/zeppelin.sh` startup script doesn't fully load the Hadoop classpaths in the same way that `bin/zeppelin-daemon.sh` does. As a result, anything that depends on the Hadoop classpaths being loaded (e.g. Shiro's KnoxJwtRealm) will fail.
    
    ### What type of PR is it?
    Bug Fix/Improvement
    
    ### How should this be tested?
    * Verify that `hadoop classpath` values are added to Zeppelin's java process's classpath.
    * Verify that Shiro's KnoxJwtRealm loads correctly when running `USE_HADOOP=true HADOOP_CONF_DIR=/path/to/conf bin/zeppelin.sh`
    * Verify that the Zeppelin Docker image still works as expected.
    
    ### Notes:
    * The classpath loading logic was copied directly from `bin/zeppelin-daemon.sh`.
    * The Zeppelin Docker image uses `bin/zeppelin.sh` as its startup script.  This PR will make it easier for users to extend the Docker image by adding Hadoop to it.  Currently, adding Hadoop requires extensive modifications to get the classpath to load correctly.
    
    ### Questions:
    * `bin/zeppelin-daemon.sh` makes loading `HADOOP_CONF_DIR` conditional on `USE_HADOOP` but `bin/zeppelin.sh` previously didn't.  Should `zeppelin.sh` be made to work the same as `zeppelin-daemon.sh` or should it retain loading `HADOOP_CONF_DIR` without checking `USE_HADOOP` for backwards compatibility?
    * If this PR looks good, could you kindly apply the 'hacktoberfest-accepted' label? 😅
    
    Author: llamasoft <ll...@rm-rf.email>
    
    Closes #4244 from llamasoft/hadoop-classpath and squashes the following commits:
    
    68e174200 [llamasoft] Make Hadoop classpath loading consistent
---
 bin/zeppelin.sh | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/bin/zeppelin.sh b/bin/zeppelin.sh
index ef85eaa..efd3aae 100755
--- a/bin/zeppelin.sh
+++ b/bin/zeppelin.sh
@@ -30,7 +30,7 @@ if [ -f /proc/self/cgroup ] && [ -n "$(command -v getent)" ]; then
         set +e
         uidentry="$(getent passwd "$myuid")"
         set -e
-        
+
         # If there is no passwd entry for the container UID, attempt to create one
         if [ -z "$uidentry" ] ; then
             if [ -w /etc/passwd ] ; then
@@ -115,8 +115,18 @@ addJarInDir "${ZEPPELIN_HOME}/zeppelin-web-angular/target/lib"
 
 ZEPPELIN_CLASSPATH="$CLASSPATH:$ZEPPELIN_CLASSPATH"
 
-if [[ -n "${HADOOP_CONF_DIR}" ]] && [[ -d "${HADOOP_CONF_DIR}" ]]; then
-  ZEPPELIN_CLASSPATH+=":${HADOOP_CONF_DIR}"
+## Add hadoop jars when env USE_HADOOP is true
+if [[ "${USE_HADOOP}" != "false"  ]]; then
+  if [[ -z "${HADOOP_CONF_DIR}" ]]; then
+    echo "Please specify HADOOP_CONF_DIR if USE_HADOOP is true"
+  else
+    ZEPPELIN_CLASSPATH+=":${HADOOP_CONF_DIR}"
+    if ! [ -x "$(command -v hadoop)" ]; then
+      echo 'hadoop command is not in PATH when HADOOP_CONF_DIR is specified.'
+    else
+      ZEPPELIN_CLASSPATH+=":`hadoop classpath`"
+    fi
+  fi
 fi
 
 if [[ ! -d "${ZEPPELIN_LOG_DIR}" ]]; then