Posted to commits@impala.apache.org by cs...@apache.org on 2019/06/07 12:54:12 UTC
[impala] 01/02: Fix integration of kudu-hive.jar
This is an automated email from the ASF dual-hosted git repository.
csringhofer pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit 90d84425292532bd0e18aaa9851cc4ca01fd4bce
Author: Thomas Tauber-Marshall <tm...@cloudera.com>
AuthorDate: Thu Jun 6 10:53:32 2019 -0700
Fix integration of kudu-hive.jar
IMPALA-8503 added downloading kudu-hive.jar and adding it to
HADOOP_CLASSPATH in run-hive-server.sh to allow the Hive Metastore to
start with Kudu's HMS plugin.
There are two problems with this that are fixed by this patch:
- Previously, we fully specified the expected jar filename based on the
value of IMPALA_KUDU_JAVA_VERSION when adding it to HADOOP_CLASSPATH,
but this is overly restrictive for users who may wish to override
this value in impala-config-branch.sh to build their own branch with
a different version of the kudu-hive.jar. This patch relaxes this
restriction by adding any jar containing the string kudu-hive in
IMPALA_KUDU_JAVA_HOME to HADOOP_CLASSPATH.
- In bootstrap_toolchain, we don't download a package if its directory
already exists. Since the 'kudu' and 'kudu-java' packages download
to the same directory, this led to a race condition where
'kudu-java' might not be downloaded if 'kudu' had already been
unpacked when it started. This patch fixes this by inspecting the
contents of the Kudu package directory to look for specific files
expected for each Kudu package.
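As an illustration of the second fix, here is a minimal standalone sketch of
the per-package check. It is not the code in bootstrap_toolchain.py: the
helper name already_downloaded and the temporary directory are hypothetical,
and only the three conditions mirror the patch.

```python
import glob
import os
import tempfile


def already_downloaded(component_name, pkg_directory):
    """Return True if the files expected for this component are present.

    Hypothetical helper for illustration; the patch inlines these checks.
    """
    if component_name == "kudu-java":
        # The kudu-java tarball contributes jars to the shared directory.
        return len(glob.glob(os.path.join(pkg_directory, "*jar"))) > 0
    if component_name == "kudu":
        # The kudu tarball always contains a 'debug' directory.
        return os.path.exists(os.path.join(pkg_directory, "debug"))
    # Every other component unpacks into a directory of its own.
    return os.path.isdir(pkg_directory)


# Simulate 'kudu' having been unpacked before 'kudu-java' starts.
shared_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(shared_dir, "debug"))

kudu_present = already_downloaded("kudu", shared_dir)
kudu_java_present = already_downloaded("kudu-java", shared_dir)
print(kudu_present, kudu_java_present)
```

With a plain os.path.isdir check, both calls would report the package as
present and 'kudu-java' would be skipped. With the per-package check, only
'kudu' is reported as present.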
Change-Id: I4ac79c3e9b8625ba54145dba23c69fd5117f35c7
Reviewed-on: http://gerrit.cloudera.org:8080/13542
Reviewed-by: Thomas Marshall <tm...@cloudera.com>
Reviewed-by: Hao Hao <ha...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
bin/bootstrap_toolchain.py | 19 +++++++++++++++++--
bin/impala-config.sh | 1 +
testdata/bin/run-hive-server.sh | 6 +++---
3 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/bin/bootstrap_toolchain.py b/bin/bootstrap_toolchain.py
index ae0b7d3..6be838f 100755
--- a/bin/bootstrap_toolchain.py
+++ b/bin/bootstrap_toolchain.py
@@ -40,6 +40,7 @@
#
# python bootstrap_toolchain.py
import logging
+import glob
import multiprocessing.pool
import os
import random
@@ -418,10 +419,24 @@ def download_cdh_components(toolchain_root, cdh_components, url_prefix):
component_name = component.name
if component.name == "kudu-java":
component_name = "kudu"
+
+ # Check if the directory already exists, and skip downloading it if it does. Since
+ # the kudu and kudu-java tarballs unpack to the same directory, we check for files
+ # in that directory expected for each package. TODO: if we change how the Kudu
+ # tarballs are packaged we can remove this special case.
pkg_directory = package_directory(cdh_components_home, component_name,
component.version)
- if os.path.isdir(pkg_directory):
- return
+ if component.name == "kudu-java":
+ if len(glob.glob("%s/*jar" % pkg_directory)) > 0:
+ return
+ elif component.name == "kudu":
+ # Regardless of the actual build type, the 'kudu' tarball will always contain a
+ # 'debug' and a 'release' directory.
+ if os.path.exists(os.path.join(pkg_directory, "debug")):
+ return
+ else:
+ if os.path.isdir(pkg_directory):
+ return
platform_label = ""
# Kudu is the only component that's platform dependent.
diff --git a/bin/impala-config.sh b/bin/impala-config.sh
index 7851ff2..2b01c7b 100755
--- a/bin/impala-config.sh
+++ b/bin/impala-config.sh
@@ -669,6 +669,7 @@ else
export IMPALA_KUDU_VERSION=${IMPALA_KUDU_VERSION-"84086fe"}
export IMPALA_KUDU_HOME=${IMPALA_TOOLCHAIN}/kudu-$IMPALA_KUDU_VERSION
fi
+export IMPALA_KUDU_JAVA_HOME=${CDH_COMPONENTS_HOME}/kudu-$IMPALA_KUDU_VERSION
# Set $THRIFT_HOME to the Thrift directory in toolchain.
export THRIFT_HOME="${IMPALA_TOOLCHAIN}/thrift-${IMPALA_THRIFT_VERSION}"
diff --git a/testdata/bin/run-hive-server.sh b/testdata/bin/run-hive-server.sh
index daa7bad..e53c58f 100755
--- a/testdata/bin/run-hive-server.sh
+++ b/testdata/bin/run-hive-server.sh
@@ -102,9 +102,9 @@ fi
# Add kudu-hive.jar to the Hive Metastore classpath, so that Kudu's HMS
# plugin can be loaded.
-FILE_NAME="${CDH_COMPONENTS_HOME}/kudu-${IMPALA_KUDU_JAVA_VERSION}/\
-kudu-hive-${IMPALA_KUDU_JAVA_VERSION}.jar"
-export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${FILE_NAME}
+for file in ${IMPALA_KUDU_JAVA_HOME}/*kudu-hive*jar; do
+ export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${file}
+done
# Default to skip validation on Kudu tables if KUDU_SKIP_HMS_PLUGIN_VALIDATION
# is unset.
export KUDU_SKIP_HMS_PLUGIN_VALIDATION=${KUDU_SKIP_HMS_PLUGIN_VALIDATION:-1}
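
For readers who want to try the relaxed jar matching outside of
run-hive-server.sh, the following is a rough Python sketch of the same idea.
The function name kudu_hive_classpath, the jar file names, and the version
number in them are assumptions made for the example. One difference from the
shell loop: Python's glob returns an empty list when nothing matches, whereas
an unmatched pattern in bash stays as the literal pattern string unless
nullglob is set.

```python
import glob
import os
import tempfile


def kudu_hive_classpath(classpath, kudu_java_home):
    """Append every jar whose name contains 'kudu-hive' to the classpath.

    Hypothetical helper; mirrors the *kudu-hive*jar pattern from the patch.
    """
    pattern = os.path.join(kudu_java_home, "*kudu-hive*jar")
    for jar in sorted(glob.glob(pattern)):
        classpath = classpath + ":" + jar
    return classpath


# Build a throwaway directory with one matching and one non-matching jar.
# The file names and version are made up for this example.
home = tempfile.mkdtemp()
open(os.path.join(home, "kudu-hive-1.10.0.jar"), "w").close()
open(os.path.join(home, "kudu-client-1.10.0.jar"), "w").close()

result = kudu_hive_classpath("/base", home)
print(result)
```

Only the kudu-hive jar is appended, regardless of the version embedded in its
file name, and a directory with no matching jar leaves the classpath
unchanged.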