Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/09/28 02:39:00 UTC

[jira] [Commented] (IMPALA-10870) Add Apache Hive 3.1.2 to the minicluster

    [ https://issues.apache.org/jira/browse/IMPALA-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421126#comment-17421126 ] 

ASF subversion and git services commented on IMPALA-10870:
----------------------------------------------------------

Commit 1306c58f29dee4f8f4c0ab18327bc19557a9156e in impala's branch refs/heads/master from Fucun Chu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1306c58 ]

IMPALA-10870: Add Apache Hive 3.1.2 to the minicluster

This patch modifies the minicluster script to optionally use Apache
Hive 3.1.2 instead of CDP Hive 3.1.3.

In order to make sure that existing setups don't break, this is
enabled via an environment variable override to bin/impala-config.sh.
When the environment variable USE_APACHE_HIVE is set to true, the
bootstrap_toolchain script downloads the Apache Hive 3.1.2 tarballs and
extracts them into the toolchain directory. These binaries are used to
start the Hive services (HiveServer2 and metastore). The default is
CDP Hive 3.1.3.
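
The override described above can be pictured as a small shell fragment.
This is only a sketch, not the actual contents of bin/impala-config.sh:
the function name select_hive_version is hypothetical, and only
USE_APACHE_HIVE and the two version numbers come from this commit message.

```shell
# Hypothetical helper: prints the Hive version the minicluster would use.
# Only USE_APACHE_HIVE and the version strings come from the commit
# message; the function itself is an assumption made for illustration.
select_hive_version() {
  if [ "${USE_APACHE_HIVE:-false}" = "true" ]; then
    echo "3.1.2"   # Apache Hive, opt-in
  else
    echo "3.1.3"   # CDP Hive, the default when the variable is unset
  fi
}
```

Defaulting an unset variable to false is what keeps existing setups on
CDP Hive without any change on their side.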

Since CDP Hive 3 uses some features of Apache Hive 4, this patch uses
a different database name so that it is easy to switch between an
environment that uses the CDP Hive 3.1.3 metastore and one that uses
the Apache Hive 3.1.2 metastore.
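
One way to keep the two metastores apart is to derive the database name
from the same switch. The sketch below is an assumption for illustration:
the function name, the base name passed in, and the "_apache" suffix are
all hypothetical, since the commit message only states that the names
differ.

```shell
# Hypothetical sketch: pick a metastore database name per Hive flavour.
# The "_apache" suffix is an assumption, not the name used by the patch.
metastore_db_name() {
  base="$1"
  if [ "${USE_APACHE_HIVE:-false}" = "true" ]; then
    printf '%s_apache\n' "$base"
  else
    printf '%s\n' "$base"
  fi
}
```

With separate names, neither flavour ever opens a schema that the other
one initialized.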

To start a minicluster that uses Apache Hive 3.1.2, users
should follow the steps below:

1. Make sure that the minicluster, if running, is stopped before you run
the following commands.
2. Open a new terminal and run the following commands.
> export USE_APACHE_HIVE=true
> source bin/impala-config.sh
> bin/bootstrap_toolchain.py
  The above command downloads the Apache Hive 3.1.2 tarballs and
extracts them into the toolchain/apache_components directory.

> rm $HIVE_HOME/lib/guava-*jar
> cp $HADOOP_HOME/share/hadoop/hdfs/lib/guava-*.jar $HIVE_HOME/lib/
  The above commands work around HIVE-22915.

> bin/create-test-configuration.sh -create_metastore
  Pass "-create_metastore" only the first time you run the above step,
so that a new metastore db is created and the Apache Hive 3.1.2 schema
is initialized.

> testdata/bin/run-all.sh
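
The steps above can be collected into two shell functions. This is a
sketch under stated assumptions: the function names and the FIRST_RUN
variable are hypothetical and not part of the Impala scripts. It is meant
to be run under bash from the Impala source root with the minicluster
stopped. Unlike the commands above, it uses "rm -f" so that a re-run does
not fail once the bundled jar is already gone.

```shell
# Replace Hive's bundled Guava jar with the one shipped with Hadoop
# (the HIVE-22915 workaround). Arguments: <hive_home> <hadoop_home>.
swap_guava() {
  hive_home="$1"
  hadoop_home="$2"
  # -f: do not fail if the jar was already removed by an earlier run.
  rm -f "$hive_home"/lib/guava-*.jar
  cp "$hadoop_home"/share/hadoop/hdfs/lib/guava-*.jar "$hive_home/lib/"
}

# Hypothetical wrapper around the documented command sequence.
# Set FIRST_RUN=true (an assumed variable) on the first run only, so that
# the metastore db is created and the schema is initialized once.
start_apache_hive_minicluster() {
  export USE_APACHE_HIVE=true
  . bin/impala-config.sh
  bin/bootstrap_toolchain.py
  swap_guava "$HIVE_HOME" "$HADOOP_HOME"
  if [ "${FIRST_RUN:-false}" = "true" ]; then
    bin/create-test-configuration.sh -create_metastore
  else
    bin/create-test-configuration.sh
  fi
  testdata/bin/run-all.sh
}
```

A first run would then look like "FIRST_RUN=true
start_apache_hive_minicluster", and later runs would omit FIRST_RUN.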

Follow-up:
 - Add MetastoreShim to support Apache Hive 3.x in IMPALA-10871

Tests:
 - Made sure that the cluster comes up with Apache Hive 3.1.2 when the
   steps above are performed.
 - Made sure that existing scripts work as they do currently when
   USE_APACHE_HIVE is not set.

Change-Id: I1978909589ecacb15d32d874e97f050a85adf1f6
Reviewed-on: http://gerrit.cloudera.org:8080/17793
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add Apache Hive 3.1.2 to the minicluster
> ----------------------------------------
>
>                 Key: IMPALA-10870
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10870
>             Project: IMPALA
>          Issue Type: Task
>            Reporter: Quanlong Huang
>            Assignee: Fucun Chu
>            Priority: Major
>
> The minicluster uses Hive from CDP versions. To be able to build and test on Apache Hive 3.1.2, we can extend bootstrap_toolchain.py to download the Apache Hive release, or just add Apache Hive 3.1.2 to a fake GBN.
> It's still not clear whether Apache Hive 3.1.2 has compatibility issues with Hadoop in CDP versions. If so, we need to add an Apache Hadoop version as well.


