Posted to user@hive.apache.org by Huang Meilong <im...@outlook.com> on 2016/12/01 14:15:36 UTC

Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

Hi all,


I found an enum TableType in the package org.apache.hadoop.hive.metastore. What is the difference between MANAGED_TABLE and EXTERNAL_TABLE?


If I set the table type to EXTERNAL_TABLE when creating a table, will the table actually be an EXTERNAL TABLE?


I found the code that determines whether a table is external in MetaStoreUtils.java:

https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java#L1425


So what is EXTERNAL_TABLE in TableType actually for?




Re: Reply: Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

Posted by Patrick Duin <pa...@gmail.com>.
Hi,

I've noticed the same thing. We also set the table parameter to make sure
the table is external:
replica.putToParameters("EXTERNAL", "TRUE")

I'm not sure whether the tableType is actually used anywhere. We set it
anyway, along with the table parameter, just to be sure when using the
Metastore API.

I have no clue whether it is a bug, but setting the table parameter will work.
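
One reading of the behaviour described in this thread is that the metastore treats the "EXTERNAL" table parameter as authoritative and reconciles the stored table type with it. The sketch below models that reading in plain Java so it can run without Hive on the classpath. It is not Hive code: the class and method names (TableTypeSketch, storedType) are hypothetical, and the case-insensitive comparison is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class TableTypeSketch {
    // Stand-ins for the TableType enum values discussed in the thread.
    static final String MANAGED = "MANAGED_TABLE";
    static final String EXTERNAL = "EXTERNAL_TABLE";

    /**
     * Returns the type that would be stored, ASSUMING the EXTERNAL table
     * parameter wins over the requested table type.
     */
    static String storedType(String requestedType, Map<String, String> params) {
        boolean isExternal = params != null
                && "TRUE".equalsIgnoreCase(params.get("EXTERNAL"));
        if (MANAGED.equals(requestedType) && isExternal) {
            return EXTERNAL;
        }
        if (EXTERNAL.equals(requestedType) && !isExternal) {
            return MANAGED;
        }
        return requestedType;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        // Type set, parameter missing: the model falls back to MANAGED_TABLE.
        System.out.println(storedType(EXTERNAL, params));
        // Type and parameter both set: the model keeps EXTERNAL_TABLE.
        params.put("EXTERNAL", "TRUE");
        System.out.println(storedType(EXTERNAL, params));
    }
}
```

Under this model, setting only the table type yields MANAGED_TABLE, which matches what was observed with getTable, while adding the parameter yields EXTERNAL_TABLE.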

Patrick

2016-12-02 8:38 GMT+01:00 Huang Meilong <im...@outlook.com>:

> Thanks for your detailed explanations.
>
> I'm not asking about the concepts of INTERNAL TABLE and EXTERNAL TABLE in
> Hive. I'm just confused about what the value "EXTERNAL_TABLE" in the class
> org.apache.hadoop.hive.metastore.TableType means.
>
> I created a table with the table type set to TableType.EXTERNAL_TABLE by
> calling the Hive metastore API like this:
>
> final HiveMetaStoreClient client = new HiveMetaStoreClient(c);
> Table newTable = new Table();
> newTable.setTableType(TableType.EXTERNAL_TABLE.toString());
> ...
> client.createTable(newTable);
>
> Then I inspected the created table with the Hive metastore API:
>
> client.getTable("dbName", "tableName");
>
> The table type is still "MANAGED_TABLE". Is this a bug in the
> metastore API?

Reply: Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

Posted by Huang Meilong <im...@outlook.com>.
Thanks for your detailed explanations.


I'm not asking about the concepts of INTERNAL TABLE and EXTERNAL TABLE in Hive. I'm just confused about what the value "EXTERNAL_TABLE" in the class org.apache.hadoop.hive.metastore.TableType means.


I created a table with the table type set to TableType.EXTERNAL_TABLE by calling the Hive metastore API like this:


final HiveMetaStoreClient client = new HiveMetaStoreClient(c);
Table newTable = new Table();
newTable.setTableType(TableType.EXTERNAL_TABLE.toString());
...
client.createTable(newTable);


Then I inspected the created table with the Hive metastore API:

client.getTable("dbName", "tableName");


The table type is still "MANAGED_TABLE". Is this a bug in the metastore API?





________________________________
From: Mich Talebzadeh <mi...@gmail.com>
Sent: December 2, 2016, 5:29
To: user
Subject: Re: Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

Adding to Alan's points: external tables are often used as a staging area, for example to ingest data from an HDFS location on a daily basis and load that data into Hive managed tables. The location of the external table can be changed to point to a new HDFS directory (created by, say, Flume) through ALTER TABLE, as below:

ALTER TABLE ${DATABASE}.EXTERNALMARKETDATA set location 'hdfs://rhes564:9000/data/prices/${TODAY}';


HTH





Re: Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

Posted by Mich Talebzadeh <mi...@gmail.com>.
Adding to Alan's points: external tables are often used as a staging area,
for example to ingest data from an HDFS location on a daily basis and load
that data into Hive managed tables. The location of the external table can
be changed to point to a new HDFS directory (created by, say, Flume)
through ALTER TABLE, as below:

ALTER TABLE ${DATABASE}.EXTERNALMARKETDATA set location
'hdfs://rhes564:9000/data/prices/${TODAY}';
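
As a rough illustration of how a daily job might fill in the ${DATABASE} and ${TODAY} placeholders, the sketch below builds the same statement in Java. The database name "mydb", the class and method names, and the yyyy-MM-dd directory naming are all assumptions; the thread does not say how ${TODAY} is formatted or how the statement is submitted.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class AlterLocationSketch {
    /** Builds an ALTER TABLE ... SET LOCATION statement for one day's directory. */
    static String alterLocationSql(String database, String table,
                                   String baseUri, LocalDate day) {
        // Assumption: one directory per day, named yyyy-MM-dd.
        String today = day.format(DateTimeFormatter.ISO_LOCAL_DATE);
        return String.format("ALTER TABLE %s.%s SET LOCATION '%s/%s'",
                database, table, baseUri, today);
    }

    public static void main(String[] args) {
        // "mydb" is a hypothetical database name used only for illustration.
        System.out.println(alterLocationSql("mydb", "EXTERNALMARKETDATA",
                "hdfs://rhes564:9000/data/prices", LocalDate.of(2016, 12, 2)));
    }
}
```

The resulting string would still have to be run through whatever client the job already uses, such as beeline or a JDBC connection.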


HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 1 December 2016 at 17:53, Alan Gates <al...@gmail.com> wrote:

> Hive does not assume that it owns the data for an external table.  Thus
> when an external table is dropped, the data is not deleted.  People often
> use this as a way to load data into a directory in HDFS and then “cast” a
> table structure over it by creating an external table with that directory
> as its location.
>
> Alan.

Re: Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

Posted by Alan Gates <al...@gmail.com>.
Hive does not assume that it owns the data for an external table.  Thus when an external table is dropped, the data is not deleted.  People often use this as a way to load data into a directory in HDFS and then “cast” a table structure over it by creating an external table with that directory as its location.
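
The drop behaviour described above can be illustrated with a small in-memory model. This is only a sketch: the class DropTableSketch, its fields, and the table names and paths are hypothetical stand-ins, and no Hive or HDFS API is involved.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DropTableSketch {
    // Hypothetical stand-ins: table name -> location, and the set of
    // directories that currently exist on the file system.
    private final Map<String, String> tableLocations = new HashMap<>();
    private final Set<String> externalTables = new HashSet<>();
    final Set<String> directories = new HashSet<>();

    void createTable(String name, String location, boolean external) {
        tableLocations.put(name, location);
        if (external) {
            externalTables.add(name);
        }
        directories.add(location);
    }

    /** Drops the table's metadata; deletes its data only if the table is managed. */
    void dropTable(String name) {
        String location = tableLocations.remove(name);
        boolean external = externalTables.remove(name);
        if (location != null && !external) {
            directories.remove(location);
        }
    }

    boolean tableExists(String name) {
        return tableLocations.containsKey(name);
    }

    public static void main(String[] args) {
        DropTableSketch hive = new DropTableSketch();
        hive.createTable("managed_prices", "/warehouse/managed_prices", false);
        hive.createTable("staging_prices", "/data/prices", true);
        hive.dropTable("managed_prices");
        hive.dropTable("staging_prices");
        // Only the external table's directory is still present.
        System.out.println(hive.directories);
    }
}
```

Both tables disappear from the metadata, but only the managed table's directory is removed, which is why an external table can safely be layered over a directory that other tools also write to.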

Alan.

> On Dec 1, 2016, at 06:15, Huang Meilong <im...@outlook.com> wrote:
>
> Hi all,
>
> I found an enum TableType in the package org.apache.hadoop.hive.metastore. What is the difference between MANAGED_TABLE and EXTERNAL_TABLE?
>
> If I set the table type to EXTERNAL_TABLE when creating a table, will the table actually be an EXTERNAL TABLE?
>
> I found the code that determines whether a table is external in MetaStoreUtils.java:
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java#L1425
>
> So what is EXTERNAL_TABLE in TableType actually for?