Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2020/05/29 00:42:40 UTC

[Impala-ASF-CR] WIP - IMPALA-9793: Impala quickstart cluster with docker-compose

Hello Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15966

to look at the new patch set (#6).

Change subject: WIP - IMPALA-9793: Impala quickstart cluster with docker-compose
......................................................................

WIP - IMPALA-9793: Impala quickstart cluster with docker-compose

What works:
* A single-node cluster can be started with docker-compose.
* HMS data is stored in a Derby database in a docker volume.
* Filesystem data is stored in a shared docker volume, using the
  localfs support in the Hadoop client.
* A Kudu cluster with a single master can be optionally added on
  to the Impala cluster.
* TPC-DS data can be loaded automatically by a data loading container.

We need to set up a docker network called impala-quickstart-network
because docker-compose insists on generating network names with
underscores. Those names become part of each container's FQDN, and
Java's URL parsing rejects the resulting, technically invalid, domain
names.
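
A quick way to see which names are affected is to check them against
the hostname rule (only letters, digits and hyphens in each label).
The sketch below is illustrative and not part of this patch:
is_valid_hostname is a hypothetical helper, and docker_default is an
assumed example of the kind of network name docker-compose would
generate for a project called "docker".

```shell
# Hypothetical helper, not part of the patch. Returns 0 when every
# dot-separated label of the name contains only letters, digits and
# hyphens and does not start or end with a hyphen; returns 1 otherwise.
is_valid_hostname() {
  name=$1
  [ -n "$name" ] || return 1
  while [ -n "$name" ]; do
    # Take the text before the first dot as the current label.
    label=${name%%.*}
    case "$label" in
      ''|-*|*-|*[!A-Za-z0-9-]*) return 1 ;;
    esac
    # Move on to the remainder after the first dot, if there is one.
    case "$name" in
      *.*) name=${name#*.} ;;
      *) name='' ;;
    esac
  done
  return 0
}

is_valid_hostname "impala-quickstart-network" && echo "ok: impala-quickstart-network"
is_valid_hostname "docker_default" || echo "rejected: docker_default"
```

The hyphenated name passes and the underscored one is rejected, which
is why the network is created by hand with an explicit name.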

How to run:

  docker network create -d bridge impala-quickstart-network
  export IMPALA_QUICKSTART_IP=$(docker network inspect impala-quickstart-network -f '{{(index .IPAM.Config 0).Gateway}}')
  # Set this if you want to use my prebuilt images from my DockerHub repo. Remove
  # to use locally built images.
  export IMPALA_QUICKSTART_IMAGE_PREFIX="timgarmstrong/"
  # To just start a cluster with no data and no Kudu.
  docker-compose -f docker/quickstart.yml up -d
  # To load data in the background into Parquet and Kudu formats.
  docker-compose -f docker/quickstart.yml -f docker/quickstart-kudu.yml \
                 -f docker/quickstart-load-data.yml up -d
  # To follow the data loading process.
  docker logs -f docker_data-loader_1

  # To run queries.
  pip install impala-shell
  impala-shell --protocol=hs2 -i localhost

How to build containers:

  ./buildall.sh -release -noclean -notests -ninja
  ninja quickstart_hms_image quickstart_data_loader_image docker_images

Overrides:
* KUDU_QUICKSTART_VERSION - defaults to 1.12.0, can be overridden to a
  different tag.
* IMPALA_QUICKSTART_VERSION - defaults to latest, can be overridden to a
  different tag.
* IMPALA_QUICKSTART_IMAGE_PREFIX - defaults to using local images, change to
  "timgarmstrong/" to use my prebuilt images.

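The effect of these overrides can be sketched with shell default
expansion. This is only an illustration of the defaults listed above,
not the mechanism the patch uses (the compose files may substitute the
variables themselves); resolve_quickstart_overrides is a hypothetical
helper name.

```shell
# Sketch only: print each override, falling back to the documented
# default when the variable is unset or empty.
resolve_quickstart_overrides() {
  echo "KUDU_QUICKSTART_VERSION=${KUDU_QUICKSTART_VERSION:-1.12.0}"
  echo "IMPALA_QUICKSTART_VERSION=${IMPALA_QUICKSTART_VERSION:-latest}"
  # An empty prefix means locally built images are used.
  echo "IMPALA_QUICKSTART_IMAGE_PREFIX=${IMPALA_QUICKSTART_IMAGE_PREFIX:-}"
}

resolve_quickstart_overrides
```

Running it before and after exporting a variable shows which values
docker-compose would be started with.
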
TODO:
* Factor out some of the CMake into a docker build wrapper script to
  set up tags correctly.
* Clean up and separate out the patch for --hostname.
* Consider how to better integrate with the Kudu quickstart.
* Upload the latest version of the containers before merging.

Change-Id: Ifc0b862af40a368381ada7ec2a355fe4b0aa778c
---
M docker/CMakeLists.txt
M docker/daemon_entrypoint.sh
A docker/quickstart-kudu.yml
A docker/quickstart-load-data.yml
A docker/quickstart.yml
A docker/quickstart_conf/hive-site.xml
A docker/quickstart_data_loader/Dockerfile
A docker/quickstart_data_loader/data-load-entrypoint.sh
A docker/quickstart_data_loader/load_tpcds_kudu.sql
A docker/quickstart_data_loader/load_tpcds_parquet.sql
A docker/quickstart_hms/Dockerfile
A docker/quickstart_hms/hms-entrypoint.sh
12 files changed, 2,806 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15966/6
-- 
To view, visit http://gerrit.cloudera.org:8080/15966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifc0b862af40a368381ada7ec2a355fe4b0aa778c
Gerrit-Change-Number: 15966
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>