Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2019/09/13 01:42:00 UTC

[jira] [Commented] (IMPALA-8586) bin/bootstrap_toolchain.py should support URL environment variables for CDP components

    [ https://issues.apache.org/jira/browse/IMPALA-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928969#comment-16928969 ] 

ASF subversion and git services commented on IMPALA-8586:
---------------------------------------------------------

Commit da0ab1d41ae777fbd7094d44628dfee1ff0fc8fe in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=da0ab1d ]

IMPALA-8586: Support download URLs for CDP

bin/bootstrap_toolchain.py has accumulated complexity over time.
CDH, CDP, and the native toolchain all use different download
machinery and naming. One feature that is needed on the CDP side
is the ability to specify the download URL in an IMPALA_*_URL
environment variable.

This adds that support and refactors CDH and native toolchain
downloads to use the new system. This is essentially a rewrite
of bin/bootstrap_toolchain.py.
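
As an illustration of the IMPALA_*_URL override described above, here is a
minimal sketch, not the code from this commit. The helper name
resolve_download_url, its parameters, and the rule for deriving the variable
name from a component name are all assumptions made for this example.

```python
import os


def resolve_download_url(component, default_url, environ=None):
    """Return the IMPALA_<COMPONENT>_URL override if set, else default_url.

    Hypothetical helper: the name, signature, and variable-naming rule are
    assumptions for illustration, not taken from bin/bootstrap_toolchain.py.
    """
    env = os.environ if environ is None else environ
    # Assumed naming rule: upper-case the component and replace dashes.
    var_name = "IMPALA_{0}_URL".format(component.upper().replace("-", "_"))
    override = env.get(var_name, "")
    # An unset or empty variable falls back to the computed default URL.
    return override if override else default_url
```

Passing the environment as an argument keeps the lookup testable without
modifying os.environ.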

Currently, there are multiple phases of downloads, each with their
own download functions and peculiarities to account for package
names and destinations for downloads. This changes the logic
so that a package will generate a DownloadUnpackTarball that is
completely resolved. It contains everything about what to download
and where to put it as well as a needs_download() function and a
download() function. Once there is a list of DownloadUnpackTarball
objects, they can all be downloaded and unpacked in a single phase.
This implements different types of packages as subclasses of
DownloadUnpackTarball. Since most subclasses want to be able to
construct URLs and archive names using templates, the
TemplatedDownloadUnpackTarball takes the same arguments as
DownloadUnpackTarball along with a map of template substitutions,
which are applied to all string arguments.
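
The structure described above could be sketched roughly as follows. This is
an outline under stated assumptions, not the implementation in this commit:
the class names, needs_download(), and download() come from the description,
while the constructor arguments, the pkg_directory() helper, the
directory-exists check, and packages_needing_download() are hypothetical.

```python
import os


class DownloadUnpackTarball(object):
    """Fully resolved description of one archive to download and unpack."""

    def __init__(self, url, archive_name, destination_basedir, directory_name):
        # Argument names are assumptions made for this sketch.
        self.url = url
        self.archive_name = archive_name
        self.destination_basedir = destination_basedir
        self.directory_name = directory_name

    def pkg_directory(self):
        # Hypothetical helper: where the unpacked package would live.
        return os.path.join(self.destination_basedir, self.directory_name)

    def needs_download(self):
        # Assumption: a package counts as present once its directory exists.
        return not os.path.isdir(self.pkg_directory())

    def download(self):
        # Fetching and extracting the archive is omitted from this sketch.
        raise NotImplementedError("download/unpack omitted from this sketch")


class TemplatedDownloadUnpackTarball(DownloadUnpackTarball):
    """Same arguments plus a map of substitutions applied to every string."""

    def __init__(self, url_tmpl, archive_name_tmpl, destination_basedir_tmpl,
                 directory_name_tmpl, template_subs):
        super(TemplatedDownloadUnpackTarball, self).__init__(
            url_tmpl.format(**template_subs),
            archive_name_tmpl.format(**template_subs),
            destination_basedir_tmpl.format(**template_subs),
            directory_name_tmpl.format(**template_subs))


def packages_needing_download(packages):
    # Single phase: every package answers the same question the same way.
    return [pkg for pkg in packages if pkg.needs_download()]
```

Because each object is fully resolved at construction time, the caller can
filter and process one flat list without per-package special cases.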

Kudu requires special handling and gets its own set of subclasses
to handle various subtleties like toolchain vs CDH Kudu, the Kudu
stub, and making sure that the "kudu" package and the "kudu-java"
package don't confuse each other.

As part of this change, USE_CDP_HIVE=true now uses the CDP version
of HBase rather than always using the CDH version.

Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Reviewed-on: http://gerrit.cloudera.org:8080/13432
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> bin/bootstrap_toolchain.py should support URL environment variables for CDP components
> --------------------------------------------------------------------------------------
>
>                 Key: IMPALA-8586
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8586
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 3.3.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Critical
>
> To make it easy to override the Hadoop components used for building and developing Impala, it is useful to be able to specify custom versions/URLs for downloading components. These already exist for CDH through its use of IMPALA_*_URL environment variables. The same should be possible with CDP components.
> This is also useful to allow trying out newer component packages before updating to a newer build number.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
