Posted to user@impala.apache.org by Antoni Ivanov <ai...@vmware.com> on 2019/10/25 12:59:01 UTC

Using Local or Embedded Impala for Testing

Hi, 

We’d like to test against Impala locally, so that we can build integration tests against a running (even if mock) version of Impala.
What options do we have?

What we've found so far:
* The best option we've seen is a Docker container with Impala. We have found a few:
   * https://github.com/tomwhite/docker-impala
   * https://hub.docker.com/r/parrotstream/impala
But they are not necessarily kept up to date, and they are also a bit heavier in terms of resources than necessary.

Is there a way to start an embedded Impala, or something like https://github.com/sakserv/hadoop-mini-clusters,
that can be used for testing purposes?


Thanks,
Antoni 

Re: Using Local or Embedded Impala for Testing

Posted by Tim Armstrong <ta...@cloudera.com>.
The most actively maintained method of running a self-contained Impala
cluster is the development environment:
https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch.
That runs the Impala processes plus all the dependent services on a single
node (HDFS, HMS, Kudu, etc).

We test that automatically on Ubuntu 16.04 and all Impala developers
actively use it. That environment has the scripts to start up all the
required services and Impala. The main catch is that it requires building
Impala, which is automated by the script but takes a while the first time
around because there are a lot of dependencies to download and a lot of
C++ to compile.
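
For integration tests against that environment, a test harness usually has
to wait until impalad is accepting connections before it issues queries. A
minimal sketch of such a wait in Python follows. It is not part of the
Impala scripts: the helper name wait_for_port is hypothetical, and the port
in the usage note (21050, impalad's usual HiveServer2 port) is an assumption
that should be checked against your own configuration.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=120.0, interval_s=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before timeout_s elapsed.
    Hypothetical helper for test setup; it only checks that the port is
    open, not that Impala is ready to run queries.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A test fixture could call wait_for_port("localhost", 21050) after starting
the services and skip or fail the suite if it returns False. An open port
is only a coarse readiness signal, so a trivial query such as SELECT 1 is a
reasonable follow-up check.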

This page has instructions for how to create that dev environment *inside*
a docker container, if your host OS is not Ubuntu 16.04 or you want it
contained:
https://cwiki.apache.org/confluence/display/IMPALA/Impala+Development+Environment+inside+Docker

There's an alternative set of docker containers that puts each service in
its own container, but that probably doesn't help you too much since it
would require having an existing HDFS, HMS, etc plus configuring the
containers yourself:
https://cwiki.apache.org/confluence/display/IMPALA/Build+and+Test+for+Daemon+Docker+Containers.
A few months back I built some docker containers off a random commit on
master and pushed them to Docker Hub in the following
repos: timgarmstrong/asf-master-{statestored,catalogd,impalad_coordinator,impalad_executor,impalad_coord_exec}.
Those are very much use at your own risk, but if you wanted to inspect the
containers, that could get you started.
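
The brace pattern above is shorthand for five separate repos. As a small
sketch, assuming you want the full names for inspection, the following
Python expands the pattern and prints one docker pull command per repo. The
helper expand_braces is hypothetical, and no image tag is given because the
message does not name one; whether an untagged pull works for these repos
is not confirmed here.

```python
import re


def expand_braces(pattern):
    """Expand a single {a,b,c} group in pattern into a list of strings."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    return [head + alt + tail for alt in match.group(1).split(",")]


# Repo pattern copied from the message above.
REPOS = expand_braces(
    "timgarmstrong/asf-master-"
    "{statestored,catalogd,impalad_coordinator,"
    "impalad_executor,impalad_coord_exec}"
)

for repo in REPOS:
    # Printed only; run the commands yourself if you want the images.
    print("docker pull " + repo)
```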

On Fri, Oct 25, 2019 at 5:59 AM Antoni Ivanov <ai...@vmware.com> wrote:

> Hi,
>
> We’d like to test with Impala locally. So we can build integration tests
> against running (if mock) version of Impala
> What options do we have ?
>
> The way we've found so far is:
> * So far the best option we've seen are docker container with Impala. We
> have found a few:
>    * https://github.com/tomwhite/docker-impala
>    * https://hub.docker.com/r/parrotstream/impala
> But they are not necessarily well updated and also are a bit heavier in
> term of resources than necessary
>
> Is there way to start embedded impala or something like
> https://github.com/sakserv/hadoop-mini-clusters
> that can be used for testing purposes
>
>
> Thanks,
> Antoni
>
