Posted to dev@impala.apache.org by 徐传印 <xu...@hust.edu.cn> on 2018/03/16 15:21:46 UTC

questions about resource management of impala

Hi, community:

I'm wondering how the Impala team thinks about resource management for Impala in a Hadoop cluster, such as integration with YARN. I've learned that there was a feature called ‘LLAMA’ in earlier versions of Impala, but it has been removed for some unknown reason.

I have an Impala cluster with 50 nodes, and it is quite idle at times, so I want to deploy YARN on it to run some Spark jobs. That leads to the above question about resource management.

Thanks~

Re: questions about resource management of impala

Posted by Tim Armstrong <ta...@cloudera.com>.
Hi,
 It's extremely common to run YARN and Impala on the same cluster.
Currently we recommend setting up the cluster to divide CPU and memory
resources between YARN and Impala statically. The idea is to set Impala's
and YARN's memory limits to share the available memory, and then optionally
to use Linux CGroups to control CPU usage. In practice I know a lot of
people use Cloudera Manager to set up such a configuration.
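
For example, a static split on a node with 128 GB of RAM might look roughly
like the sketch below. The numbers and the 60/56 GB split are purely
illustrative, not a recommendation, and you'd adjust them for your own
hardware and workload:

    # impalad startup flag: cap Impala's memory on each node
    --mem_limit=60g

    # yarn-site.xml on each NodeManager: cap what YARN may hand out
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>57344</value>  <!-- ~56 GB, leaving headroom for the OS and HDFS -->
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>16</value>
    </property>

    # optional: weight CPU between the two services with cgroups
    # (cpu.shares are relative weights, not percentages; needs libcgroup tools)
    sudo cgcreate -g cpu:/impala
    sudo cgcreate -g cpu:/yarn
    sudo cgset -r cpu.shares=512 impala
    sudo cgset -r cpu.shares=512 yarn

You'd still need to place the impalad and NodeManager processes into those
cgroups (e.g. with cgclassify or systemd slices); Cloudera Manager's static
service pools automate essentially this kind of partitioning for you.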

We don't currently have a solution that will dynamically scale up and down
Impala's memory usage. The LLAMA integration attempted to do that but was
removed because it did not work as intended. We're likely to have some
improvements in that area in the future, but in the shorter term we're more
focused on improving admission control.

- Tim


On Fri, Mar 16, 2018 at 8:21 AM, 徐传印 <xu...@hust.edu.cn> wrote:

> Hi, community:
>
> I'm wondering how the Impala team thinks about resource management for
> Impala in a Hadoop cluster, such as integration with YARN. I've learned
> that there was a feature called ‘LLAMA’ in earlier versions of Impala, but
> it has been removed for some unknown reason.
>
> I have an Impala cluster with 50 nodes, and it is quite idle at times, so
> I want to deploy YARN on it to run some Spark jobs. That leads to the
> above question about resource management.
>
> Thanks~
