Posted to user@impala.apache.org by Fawze Abujaber <fa...@gmail.com> on 2018/06/16 11:31:35 UTC

Impala query can't be submitted if the estimated memory exceeds the configured memory

Hi Community ,

In recent Impala versions, Impala estimates the memory required for a
query, and if the estimated requirement exceeds the configured memory (or
the configured memory per pool), Impala does not submit the query. The
problem is that the estimate is often far from the actual usage, especially
for queries on tables without stats: the estimate can be 31 GB per node
while the actual usage is 1 or 2 GB. That means that to submit such a query
I would need at least 1.5 TB of memory configured, which I find excessive.
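
(When the gap is mainly down to missing stats, computing stats usually
brings the estimate much closer to the actual usage, e.g.:

    COMPUTE STATS sales_fact;   -- table name here is just a placeholder

but that is not always practical for ad-hoc tables.)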

I'm curious to know whether this behavior (not submitting a query when the
estimated memory requirement exceeds the configured memory) could be made
configurable, i.e. optional.

Such an issue can block the use of Impala dynamic resource pools.
-- 
Take Care
Fawze Abujaber

Re: Impala query can't be submitted if the estimated memory exceeds the configured memory

Posted by Jeszy <je...@gmail.com>.
That sounds weird (maybe a reporting issue?). Please send a profile that
shows the mem_limit being set together with the error message that refers
to the estimated memory instead - thanks!
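
In case it helps: the full profile can be pulled from impala-shell right
after the query finishes in the same session, or from the coordinator's
debug web UI. A minimal impala-shell sketch (host and query are
placeholders):

    [coordinator:21000] > select count(*) from sales_fact;  -- the problematic query
    [coordinator:21000] > profile;  -- prints the runtime profile of the last query

Alternatively, the /queries page of the coordinator web UI (port 25000 by
default) links to the profile of each recent query.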


Re: Impala query can't be submitted if the estimated memory exceeds the configured memory

Posted by Fawze Abujaber <fa...@gmail.com>.
Hi Jeszy,

Thanks for your quick response.

I'm using Impala 2.10. I think that because I'm using Max Memory per pool
together with the Default Query Memory Limit, the estimate still has to be
taken into account in order to work out how many queries can run
concurrently, and if the estimate exceeds the Max Memory the query will not
be submitted.

In my case I have both values set and I'm still getting errors about the
query needing more memory than is available, though it doesn't happen very
often.
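
(To put illustrative numbers on it, assuming the usual admission control
arithmetic: with Max Memory = 100 GB for the pool and a Default Query
Memory Limit of 2 GB, a query running on 10 nodes counts as 2 GB x 10 =
20 GB against the pool, so roughly 100 / 20 = 5 such queries can run
concurrently before new ones are queued or rejected. The 31 GB estimate
should only come into play when no per-query limit is in effect.)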

-- 
Take Care
Fawze Abujaber

Re: Impala query can't be submitted if the estimated memory exceeds the configured memory

Posted by Jeszy <je...@gmail.com>.
Hey Fawze,

Default Query Memory Limit applies here, yes.
If you submit a query to a pool with that setting, you should see
something like this in the profile:
Query Options (set by configuration): MEM_LIMIT=X

(YMMV based on version - what version are you running on?)
If MEM_LIMIT is present in that line, Impala will (should) disregard estimates.
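
For example, with a 2 GB pool default I'd expect something like

    Query Options (set by configuration): MEM_LIMIT=2147483648

near the top of the profile (the value being in bytes; 2147483648 = 2 GB).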

Thanks!


Re: Impala query can't be submitted if the estimated memory exceeds the configured memory

Posted by Fawze Abujaber <fa...@gmail.com>.
Hi Jeszy,

Thanks for your response. Indeed, this is what I was thinking about, but I
already have the Default Query Memory Limit and Max Memory set per pool,
which I think should be enough to cover this, shouldn't it? Or should I
pass mem_limit in the pool's default query options?

-- 
Take Care
Fawze Abujaber

Re: Impala query can't be submitted if the estimated memory exceeds the configured memory

Posted by Jeszy <je...@gmail.com>.
Hello Fawze,

Disabling this, per se, is not an option, but an equally simple workaround
is using MEM_LIMIT.

The estimates are often very far from actual memory usage and shouldn't be
relied on - a best practice is to set MEM_LIMIT as a query option
(preferably with a default value configured for each pool). Having that set
will cause Impala to ignore the estimates and rely on this limit for
admission control purposes. This works decently for well-understood
workloads (i.e. where the memory consumption is known to fit within certain
limits). For ad-hoc workloads, if the query can't be executed within the
default limit of the pool, you can override the limit on a per-query basis
(just issue 'set MEM_LIMIT=...' before running the query).
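
A minimal example of the per-query override (4g is just an illustrative
value, size it for your workload):

    set MEM_LIMIT=4g;
    select ... ;   -- the memory-hungry query goes here

The override applies for the rest of the session, so change it again (or
start a new session) if you want the pool default to apply to later
queries.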

HTH
