You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafodion.apache.org by Rohit Jain <ro...@esgyn.com> on 2015/07/06 21:33:44 UTC

RE: [jira] [Created] (TRAFODION-19) External tables for HIVE

Can I talk to you? (510) 651-7183 or different number?

-----Original Message-----
From: Roberta Marton (JIRA) [mailto:jira@apache.org] 
Sent: Monday, July 6, 2015 2:09 PM
To: issues@trafodion.incubator.apache.org
Subject: [jira] [Created] (TRAFODION-19) External tables for HIVE

Roberta Marton created TRAFODION-19:
---------------------------------------

             Summary: External tables for HIVE
                 Key: TRAFODION-19
                 URL: https://issues.apache.org/jira/browse/TRAFODION-19
             Project: Apache Trafodion
          Issue Type: New Feature
            Reporter: Roberta Marton


Trafodion supports selecting, loading from, listing, and describing native HIVE tables.  HIVE tables are identified by specifying a special catalog and schema named HIVE.HIVE. 

To select from a HIVE table named t; specify an implicit or explicit name, such as HIVE.HIVE.t, in a Trafodion SQL statement. 

set schema HIVE.HIVE;
select * from t; -- implicit table name set schema trafodion.seabase; select * from HIVE.HIVE.t; -- explicit table name

Trafodion interprets the special HIVE.HIVE catalog and schema name to be a native HIVE table.  During preparation of an SQL statement referencing a native HIVE table, Trafodion contacts HIVE and obtains a description of the table.  It then creates an internal description (NATable) of the HIVE table.  The NATable definition is used by the compiler and code generation process to prepare the plan. Trafodion does not store any details in Trafodion metadata.
 
Several Trafodion commands today, would work more effectively if we allow native HIVE table to be partially described in Trafodion.  That is, store their definitions in Trafodion metadata.  This JIRA describes a proposal to allow HIVE tables to be registered in Trafodion metadata by specifying the “CREATE EXTERNAL <table> TABLE …” syntax.

Proposal

Allow HIVE tables to be registered in Trafodion metadata through the EXTERNAL TABLE create option.

CREATE EXTERNAL TABLE [IF NOT EXISTS] table LIKE hive-source-table; DROP EXTERNAL TABLE [IF EXISTS] table;

hive-source-table  - native HIVE table to be registered in the Trafodion metadata.  The hive-source-table has to exist.

table - table stored in Trafodion metadata.  Initially, the table name should be the same as the hive-source-table name.  
<Is there any added value to make the table name different than the hive-source-table name?>

The default catalog and schema names for HIVE tables are HIVE.HIVE (defined in ComSmallDefs as HIVE_SYSTEM_CATALOG and HIVE_SYSTEM_SCHEMA).

To change the description, the external table needs to be dropped then recreated.  ALTER EXTERNAL TABLE is not supported.

The following command’s behavior changes for external tables:
•	UPDATE STATISTICS – an external table can be used to gather statistics for HIVE tables
•	SHOWDDL – will now be allowed on external tables
•	GRANT and REVOKE – privileges can be specified for external tables
•	SELECT and LOAD – will be allowed on external tables

Design notes:

When an external table is created, Trafodion:
    Reads its metadata to see if the table already exists. 
          If exists and IF NOT EXISTS is specified, returns success
          If exists and IF NOT EXISTS is not specified, returns an error
    Gets the table description from HIVE
          If the source table exists, it reads HIVE metadata through existing interfaces
          If the source table does not exist, returns an error
    Stores the description in its metadata

When an external table is dropped, Trafodion:
    Reads its metadata to see if the table exists
        If it does not exists and IF EXISTS is specified, returns success
        If it does not exists and IF EXISTS is not specied, returns success
    Removes the table description from its metadata found in the system schema (_MD_) and privilege manager schema (“_PRIVMGR_MD_”), and removing data from the histograms table (HIVE).

When a query is executed, Trafodion:
    Gets the description of the table by reading metadata
        If not defined then behavior is the same as today, information is retrieved directly from HIVE.
        If it is defined then metadata information is retrieved from Trafodion.
    Creates a NATable structure from the table description, a new flag will be added to indicate whether or not it is an external table.
    Gets histogram information
        If histograms exist, then they are used during plan generation
        If no histograms exist, then default histograms are generated.
        Only external tables can have histograms
    Generates and executes the query – similar to current behavior

External tables can be the target of an UPDATE STATISTICS command.   When an external table is specified in an UPDATE STATISTICS command, Trafodion:
    gets the table description via the NATable structure as defined above
    returns an error If it is not an external table
    gathers statistics by accessing the HIVE table 
    stores the results in histogram metadata in the HIVE schema

The NATable structure contains privilege information that is checked during compilation.  If authorization is enabled, the generation of the NATable structure gathers privileges:
    if it is an external table
        If privileges exist in Trafodion metadata, they are returned
        If not, then HIVE metadata is accessed to retrieve privileges
    If it is not an external table, HIVE metadata is accessed to retrieve privileges

Today, SHOWDDL returns an error if a HIVE table is specified.  For external tables, SHOWDDL displays the description of HIVE tables from data gathered from Trafodion metadata.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [jira] [Created] (TRAFODION-19) External tables for HIVE

Posted by Roberta Marton <ro...@esgyn.com>.
Sure, what to try one of the new Google tools or call me at 510-220-8517?

   Roberta

On Mon, Jul 6, 2015 at 12:33 PM, Rohit Jain <ro...@esgyn.com> wrote:

> Can I talk to you? (510) 651-7183 or different number?
>
> -----Original Message-----
> From: Roberta Marton (JIRA) [mailto:jira@apache.org]
> Sent: Monday, July 6, 2015 2:09 PM
> To: issues@trafodion.incubator.apache.org
> Subject: [jira] [Created] (TRAFODION-19) External tables for HIVE
>
> Roberta Marton created TRAFODION-19:
> ---------------------------------------
>
>              Summary: External tables for HIVE
>                  Key: TRAFODION-19
>                  URL: https://issues.apache.org/jira/browse/TRAFODION-19
>              Project: Apache Trafodion
>           Issue Type: New Feature
>             Reporter: Roberta Marton
>
>
> Trafodion supports selecting, loading from, listing, and describing native
> HIVE tables.  HIVE tables are identified by specifying a special catalog
> and schema named HIVE.HIVE.
>
> To select from a HIVE table named t; specify an implicit or explicit name,
> such as HIVE.HIVE.t, in a Trafodion SQL statement.
>
> set schema HIVE.HIVE;
> select * from t; -- implicit table name set schema trafodion.seabase;
> select * from HIVE.HIVE.t; -- explicit table name
>
> Trafodion interprets the special HIVE.HIVE catalog and schema name to be a
> native HIVE table.  During preparation of an SQL statement referencing a
> native HIVE table, Trafodion contacts HIVE and obtains a description of the
> table.  It then creates an internal description (NATable) of the HIVE
> table.  The NATable definition is used by the compiler and code generation
> process to prepare the plan. Trafodion does not store any details in
> Trafodion metadata.
>
> Several Trafodion commands today, would work more effectively if we allow
> native HIVE table to be partially described in Trafodion.  That is, store
> their definitions in Trafodion metadata.  This JIRA describes a proposal to
> allow HIVE tables to be registered in Trafodion metadata by specifying the
> “CREATE EXTERNAL <table> TABLE …” syntax.
>
> Proposal
>
> Allow HIVE tables to be registered in Trafodion metadata through the
> EXTERNAL TABLE create option.
>
> CREATE EXTERNAL TABLE [IF NOT EXISTS] table LIKE hive-source-table; DROP
> EXTERNAL TABLE [IF EXISTS] table;
>
> hive-source-table  - native HIVE table to be registered in the Trafodion
> metadata.  The hive-source-table has to exist.
>
> table - table stored in Trafodion metadata.  Initially, the table name
> should be the same as the hive-source-table name.
> <Is there any added value to make the table name different than the
> hive-source-table name?>
>
> The default catalog and schema names for HIVE tables are HIVE.HIVE
> (defined in ComSmallDefs as HIVE_SYSTEM_CATALOG and HIVE_SYSTEM_SCHEMA).
>
> To change the description, the external table needs to be dropped then
> recreated.  ALTER EXTERNAL TABLE is not supported.
>
> The following command’s behavior changes for external tables:
> •       UPDATE STATISTICS – an external table can be used to gather
> statistics for HIVE tables
> •       SHOWDDL – will now be allowed on external tables
> •       GRANT and REVOKE – privileges can be specified for external tables
> •       SELECT and LOAD – will be allowed on external tables
>
> Design notes:
>
> When an external table is created, Trafodion:
>     Reads its metadata to see if the table already exists.
>           If exists and IF NOT EXISTS is specified, returns success
>           If exists and IF NOT EXISTS is not specified, returns an error
>     Gets the table description from HIVE
>           If the source table exists, it reads HIVE metadata through
> existing interfaces
>           If the source table does not exist, returns an error
>     Stores the description in its metadata
>
> When an external table is dropped, Trafodion:
>     Reads its metadata to see if the table exists
>         If it does not exists and IF EXISTS is specified, returns success
>         If it does not exists and IF EXISTS is not specied, returns success
>     Removes the table description from its metadata found in the system
> schema (_MD_) and privilege manager schema (“_PRIVMGR_MD_”), and removing
> data from the histograms table (HIVE).
>
> When a query is executed, Trafodion:
>     Gets the description of the table by reading metadata
>         If not defined then behavior is the same as today, information is
> retrieved directly from HIVE.
>         If it is defined then metadata information is retrieved from
> Trafodion.
>     Creates a NATable structure from the table description, a new flag
> will be added to indicate whether or not it is an external table.
>     Gets histogram information
>         If histograms exist, then they are used during plan generation
>         If no histograms exist, then default histograms are generated.
>         Only external tables can have histograms
>     Generates and executes the query – similar to current behavior
>
> External tables can be the target of an UPDATE STATISTICS command.   When
> an external table is specified in an UPDATE STATISTICS command, Trafodion:
>     gets the table description via the NATable structure as defined above
>     returns an error If it is not an external table
>     gathers statistics by accessing the HIVE table
>     stores the results in histogram metadata in the HIVE schema
>
> The NATable structure contains privilege information that is checked
> during compilation.  If authorization is enabled, the generation of the
> NATable structure gathers privileges:
>     if it is an external table
>         If privileges exist in Trafodion metadata, they are returned
>         If not, then HIVE metadata is accessed to retrieve privileges
>     If it is not an external table, HIVE metadata is accessed to retrieve
> privileges
>
> Today, SHOWDDL returns an error if a HIVE table is specified.  For
> external tables, SHOWDDL displays the description of HIVE tables from data
> gathered from Trafodion metadata.
>
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
>