Posted to user@impala.apache.org by Fawze Abujaber <fa...@gmail.com> on 2018/05/20 11:09:08 UTC

Does Impala DDL and SET operations consume resources

Hi Community,

Do DDL operations like ALTER, DROP, and CREATE consume resources? And do
SET operations like set resource_pool=xxx also consume resources?

I'm aware these operations are quick, but when they run from interfaces
like Hue or MSTR through ODBC, they keep running until they hit the
timeout, which may take more than a few minutes.

-- 
Take Care
Fawze Abujaber

Re: Does Impala DDL and SET operations consume resources

Posted by Jim Apple <jb...@cloudera.com>.
If this is specific to the Cloudera-provided xDBC drivers, the
Cloudera forums are probably the best place to proceed:

community.cloudera.com/t5/Interactive-Short-cycle-SQL/bd-p/Impala


Re: Does Impala DDL and SET operations consume resources

Posted by Fawze Abujaber <fa...@gmail.com>.
Hi Mostafa,

Can these preliminary statements be avoided?

Is there any configuration or SET statement that can bypass them?

Take Care
Fawze Abujaber

Re: Does Impala DDL and SET operations consume resources

Posted by Mostafa Mokhtar <mm...@cloudera.com>.
@Lars Volker <lv...@cloudera.com>
Many JDBC/ODBC drivers issue show tables & describe statements ahead of
executing a query by default.
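
To check how many of these preliminary statements a client actually sends, one option is to capture the statement text for a session and count the shapes mentioned in this thread. The following is only a minimal sketch: the `captured` list is made-up sample input, `probe_kind` and `summarize` are hypothetical helper names, and the patterns are an assumption based on the statements listed by Sunil, not a documented list of what any driver emits.

```python
import re
from collections import Counter

# Statement shapes reported in this thread. This set is an illustrative
# assumption, not a documented list of what any driver sends.
PROBE_PATTERNS = {
    "show tables": re.compile(r"^\s*show\s+tables\b", re.IGNORECASE),
    "describe": re.compile(r"^\s*describe\b", re.IGNORECASE),
    "use": re.compile(r"^\s*use\s+\w+", re.IGNORECASE),
    "limit 0 select": re.compile(
        r"^\s*select\b.*\blimit\s+0\s*;?\s*$", re.IGNORECASE | re.DOTALL
    ),
}


def probe_kind(statement):
    """Return the probe label for a statement, or None if no pattern matches."""
    for label, pattern in PROBE_PATTERNS.items():
        if pattern.search(statement):
            return label
    return None


def summarize(statements):
    """Count statements by probe kind; everything else is a 'user query'."""
    return Counter(probe_kind(s) or "user query" for s in statements)


# Made-up sample input for illustration only.
captured = [
    "use dwh;",
    "show tables",
    "describe table",
    "select column1,column2 from table limit 0",
    "select column1,column2 from table where column1 > 10",
]

for kind, count in summarize(captured).items():
    print(f"{kind}: {count}")
```

If most of the captured statements fall into the probe buckets, that would point at the client tool or driver doing schema discovery rather than at the user's own queries.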




Re: Does Impala DDL and SET operations consume resources

Posted by Lars Volker <lv...@cloudera.com>.
As far as I know the driver should not generate additional statements. Can
you share what software you're using to connect to Impala through the
driver? I suspect that that software generated these queries, possibly to
do some schema discovery.

Cheers, Lars


Re: Does Impala DDL and SET operations consume resources

Posted by Jim Apple <jb...@cloudera.com>.
I don’t think I understand the statement. Under what conditions are
additional DDL statements generated by the driver? What exact query did you
enter and what was generated instead?


Re: Does Impala DDL and SET operations consume resources

Posted by Sunil Parmar <su...@gmail.com>.
The Impala JDBC driver generates additional DDL statements.

select column1,column2 from table limit 0
or
show tables
or
use dwh;
or
describe table

If DDL statements are expensive, is there a way to avoid this?

Sunil Parmar



Re: Does Impala DDL and SET operations consume resources

Posted by Tim Armstrong <ta...@cloudera.com>.
SET is very cheap because it just changes a value in the user's session.
There's no interaction with any other services.

DDL operations can be a lot more expensive, although they don't compete
with executing queries for resources. For the most part those DDL
operations you mentioned consume resources in Java, generate load on
metadata services like the HDFS NameNode and Hive Metastore, and can block
other DDL operations. We don't have great visibility at the moment into
the resources consumed by metadata operations.
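
As a rough illustration of the distinction above, the sketch below sorts statements into a session-only bucket (SET) and a metadata bucket (the DDL and metadata statements named in this thread). The bucket names, the keyword lists, and the `cost_bucket` helper are assumptions made for this example. It looks only at the leading keywords and does not measure any actual resource usage.

```python
# Keyword lists taken from the statements discussed in this thread; treating
# them as two buckets is an assumption made for illustration.
SESSION_ONLY = ("set",)
METADATA_DDL = ("alter", "drop", "create", "invalidate metadata", "refresh")


def _starts_with_keyword(normalized, keyword):
    """True if the statement is the keyword itself or starts with it as a word."""
    return normalized == keyword or normalized.startswith(keyword + " ")


def cost_bucket(statement):
    """Classify a statement by its leading keyword(s)."""
    # Drop a trailing semicolon, lowercase, and collapse runs of whitespace.
    normalized = " ".join(statement.strip().rstrip(";").lower().split())
    if any(_starts_with_keyword(normalized, k) for k in SESSION_ONLY):
        return "session-only"
    if any(_starts_with_keyword(normalized, k) for k in METADATA_DDL):
        return "metadata DDL"
    return "other"


for stmt in (
    "set resource_pool=xxx",
    "ALTER TABLE t RECOVER PARTITIONS",
    "invalidate metadata",
    "select 1",
):
    print(f"{stmt!r}: {cost_bucket(stmt)}")
```

Statements in the "metadata DDL" bucket are the ones that, per the explanation above, may load the HDFS NameNode and Hive Metastore or queue behind other DDL operations.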


Re: Does Impala DDL and SET operations consume resources

Posted by Fawze Abujaber <fa...@gmail.com>.
Thanks Tim,

If I understood you correctly, the SET operation affects the client Java
memory and is not counted against the Impala daemon memory limit, right?

Is this also true for INVALIDATE METADATA, REFRESH, and ALTER TABLE
... RECOVER PARTITIONS? Are all of these client operations? Do they use
any resources assigned to the Impala daemon or Impala resource pools?

If they are client operations, then I can track the resources used with
the Linux top command. If they take any resources from the Impala daemon
memory limit or a resource pool, I would be happy to know where I can track
the resource usage of these DDL operations.

Take Care
Fawze Abujaber

Re: Does Impala DDL and SET operations consume resources

Posted by Tim Armstrong <ta...@cloudera.com>.
"SET" is very cheap - it only affects the client session on the Impala
server that you're connected to. DDL operations are often more expensive
because they require updating metadata globally. That can sometimes involve
a bit of work (e.g. gathering metadata about existing files on HDFS) or can
involve the operation getting queued behind other metadata operations.
