Posted to user@impala.apache.org by Fawze Abujaber <fa...@gmail.com> on 2018/05/17 17:32:32 UTC

Impala resident memory

Hi Guys,

I have a cluster with 54 nodes, and I configured a 35 GB memory limit for
the Impala daemon. At all times, each node holds roughly 10-20 GB as
resident memory.
I'm aware that this memory does not affect running queries or the Impala
daemon memory limit, but it is memory held by the JVM, and it can affect
other services running on the same node, such as HDFS and NodeManager.

My Questions:

1) Is there a way to free up this memory without restarting the Impala
service?
2) Is there a configuration parameter that I can use to disable this? Or is
it a system parameter related to the JVM?
3) What is the impact if I limit the JVM memory per node? Will this affect
Impala performance?
4) Where is this memory used by Impala? Even its name suggests that it is
useless and can't be used (what is the purpose of this memory?).


-- 
Take Care
Fawze Abujaber

Re: Impala resident memory

Posted by Fawze Abujaber <fa...@gmail.com>.
Hi Tim,

Thanks for your response; I really appreciate it.

Please see my response inline.

On Fri, 18 May 2018 at 19:58 Tim Armstrong <ta...@cloudera.com> wrote:

> It depends if it's backend or JVM memory. Generally the backend memory
> will shrink down when queries aren't executing, but may still occupy a few
> GB.
>

How can I tell whether it is backend or JVM memory? The resident memory
across these nodes while no queries are running is 850 GB.

How can I find out where this memory resides, and whether it is used by the
coordinators?

Wouldn't using dedicated coordinators affect the whole Impala cluster and
put more queries at risk of failing?

>
> If it's JVM memory and the memory is actually unused, then it's really
> up to the JVM to GC. It's possible that the memory is actually in use
> storing the catalog - all coordinators store a copy of the catalog cache in
> memory. If this is the problem, you can use our "dedicated coordinator"
> feature to reduce the number of impala daemons that are coordinators. If
> you have 54 nodes, making some impala daemons executors only prevents the
> catalog cache from being sent to those daemons. Clients won't be able to
> submit queries to the non-coordinators but I don't think 54 coordinators is
> necessary for most workloads. See
> https://impala.apache.org/docs/build/html/topics/impala_scalability.html
> for more info.
>
> > 2) Is there a configuration parameter that I can use to disable this? Or
> > is it a system parameter related to the JVM?
> You can set some JVM parameters via JAVA_TOOL_OPTIONS, but this may not
> address the problem directly.
>

Is this a system-level configuration, or can I apply it across the nodes
using Cloudera Manager?



> > 3) What is the impact if I limit the JVM memory per node? Will this
> > affect Impala performance?
> If the JVM memory is too small to fit the catalog cache, you will run into
> problems, e.g. all queries failing.
>
How can I find out how much memory the catalog cache is using?

>
> > 4) Where is this memory used by Impala? Even its name suggests that it
> > is useless and can't be used (what is the purpose of this memory?).
> I don't quite understand this statement. What memory are you referring to?
> What name?
>

I am referring to the resident memory.

--
Take Care
Fawze Abujaber

Re: Impala resident memory

Posted by Tim Armstrong <ta...@cloudera.com>.
It depends if it's backend or JVM memory. Generally the backend memory will
shrink down when queries aren't executing, but may still occupy a few GB.
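
As a rough sketch of how the two kinds of memory could be told apart, the
Python below splits a flat snapshot of daemon memory metrics into a JVM part
and a backend part. The metric names (`jvm.heap.committed-usage-bytes`,
`tcmalloc.physical-bytes-reserved`) are assumptions here and should be
checked against the metrics page of your own impalad version; the sample
numbers are made up for illustration.

```python
GIB = 1024 ** 3

# Assumed metric names; verify them against your impalad's metrics page.
JVM_KEY = "jvm.heap.committed-usage-bytes"
BACKEND_KEY = "tcmalloc.physical-bytes-reserved"


def split_memory(metrics):
    """Return (jvm_gib, backend_gib) from a flat {name: bytes} snapshot."""
    jvm_gib = metrics.get(JVM_KEY, 0) / GIB
    backend_gib = metrics.get(BACKEND_KEY, 0) / GIB
    return jvm_gib, backend_gib


# Made-up snapshot for illustration only.
sample = {JVM_KEY: 12 * GIB, BACKEND_KEY: 3 * GIB}
jvm_gib, backend_gib = split_memory(sample)
print(f"JVM heap: {jvm_gib:.1f} GiB, backend: {backend_gib:.1f} GiB")
```

If the JVM share dominates while no queries are running, the catalog cache
discussed below is the more likely cause than backend buffers.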

If it's JVM memory and the memory is actually unused, then it's really
up to the JVM to GC. It's possible that the memory is actually in use
storing the catalog - all coordinators store a copy of the catalog cache in
memory. If this is the problem, you can use our "dedicated coordinator"
feature to reduce the number of impala daemons that are coordinators. If
you have 54 nodes, making some impala daemons executors only prevents the
catalog cache from being sent to those daemons. Clients won't be able to
submit queries to the non-coordinators but I don't think 54 coordinators is
necessary for most workloads. See
https://impala.apache.org/docs/build/html/topics/impala_scalability.html
for more info.
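
To make the dedicated-coordinator idea concrete, here is a minimal sketch
that plans roles for a 54-node cluster. It assumes the impalad startup flags
`--is_coordinator` and `--is_executor` described in the scalability page
linked above; the host names and the choice of three coordinators are
hypothetical, not a recommendation.

```python
def role_flags(hosts, num_coordinators):
    """Map each host to impalad role flags; the first N hosts coordinate."""
    flags = {}
    for i, host in enumerate(hosts):
        if i < num_coordinators:
            # Dedicated coordinator: holds the catalog cache, runs no fragments.
            flags[host] = ["--is_coordinator=true", "--is_executor=false"]
        else:
            # Executor only: does not receive the catalog cache.
            flags[host] = ["--is_coordinator=false", "--is_executor=true"]
    return flags


# Hypothetical host names for a 54-node cluster.
hosts = [f"node{i:02d}.example.com" for i in range(1, 55)]
plan = role_flags(hosts, num_coordinators=3)
executors = sum(1 for f in plan.values() if "--is_executor=true" in f)
print(len(plan) - executors, "coordinators,", executors, "executors")
```

With this split, clients would have to connect to one of the coordinator
hosts only, as noted above.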

> 2) Is there a configuration parameter that I can use to disable this? Or
> is it a system parameter related to the JVM?
You can set some JVM parameters via JAVA_TOOL_OPTIONS, but this may not
address the problem directly.
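
For illustration, a small sketch of how a heap cap could be passed through
JAVA_TOOL_OPTIONS when building the environment for a process. `-Xmx` is the
standard JVM maximum-heap option; the helper name `env_with_heap_cap` and
the 8 GB value are hypothetical, and how the variable reaches the daemon
depends on how your deployment starts it.

```python
import os


def env_with_heap_cap(base_env, max_heap_gb):
    """Return a copy of base_env with -Xmx appended to JAVA_TOOL_OPTIONS."""
    env = dict(base_env)
    existing = env.get("JAVA_TOOL_OPTIONS", "").strip()
    cap = f"-Xmx{max_heap_gb}g"
    # Keep any options that were already set and add the heap cap at the end.
    env["JAVA_TOOL_OPTIONS"] = f"{existing} {cap}".strip()
    return env


# Hypothetical 8 GB cap; it must stay large enough for the catalog cache.
env = env_with_heap_cap(os.environ, max_heap_gb=8)
print(env["JAVA_TOOL_OPTIONS"])
```

The resulting dictionary can be handed to `subprocess.Popen(..., env=env)`
or written into whatever service configuration launches the daemon.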

> 3) What is the impact if I limit the JVM memory per node? Will this
> affect Impala performance?
If the JVM memory is too small to fit the catalog cache, you will run into
problems, e.g. all queries failing.

> 4) Where is this memory used by Impala? Even its name suggests that it is
> useless and can't be used (what is the purpose of this memory?).
I don't quite understand this statement. What memory are you referring to?
What name?
