You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "Roberta Marton (JIRA)" <ji...@apache.org> on 2015/10/16 21:43:05 UTC

[jira] [Commented] (TRAFODION-19) External tables for HIVE

    [ https://issues.apache.org/jira/browse/TRAFODION-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961261#comment-14961261 ] 

Roberta Marton commented on TRAFODION-19:
-----------------------------------------

The work for the JIRA has been completed.
JIRA 1536 has been written to handle maintaining consistency between the native table with its Trafodion counterpart.

> External tables for HIVE
> ------------------------
>
>                 Key: TRAFODION-19
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-19
>             Project: Apache Trafodion
>          Issue Type: New Feature
>          Components: sql-general
>            Reporter: Roberta Marton
>            Assignee: Roberta Marton
>         Attachments: external_table.docx
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Trafodion supports selecting, loading from, listing, and describing native HIVE tables.  HIVE tables are identified by specifying a special catalog and schema named HIVE.HIVE. 
> To select from a HIVE table named t; specify an implicit or explicit name, such as HIVE.HIVE.t, in a Trafodion SQL statement. 
> set schema HIVE.HIVE; 
> select * from t; -- implicit table name 
> set schema trafodion.seabase; 
> select * from HIVE.HIVE.t; -- explicit table name
> Trafodion interprets the special HIVE.HIVE catalog and schema name to be a native HIVE table.  During preparation of an SQL statement referencing a native HIVE table, Trafodion contacts HIVE and obtains a description of the table.  It then creates an internal description (NATable) of the HIVE table.  The NATable definition is used by the compiler and code generation process to prepare the plan. Trafodion does not store any details in Trafodion metadata.
>  
> Several Trafodion commands today, would work more effectively if we allow native HIVE table to be partially described in Trafodion.  That is, store their definitions in Trafodion metadata.  This JIRA describes a proposal to allow HIVE tables to be registered in Trafodion metadata by specifying the “CREATE EXTERNAL <table> TABLE …” syntax.
> Proposal
> Allow HIVE tables to be registered in Trafodion metadata through the EXTERNAL TABLE create option.
> CREATE EXTERNAL TABLE [IF NOT EXISTS] table LIKE hive-source-table;
> DROP EXTERNAL TABLE [IF EXISTS] table;
> hive-source-table  - native HIVE table to be registered in the Trafodion metadata.  The hive-source-table has to exist.
> table - table stored in Trafodion metadata.  Initially, the table name should be the same as the hive-source-table name.  
> <Is there any added value to make the table name different than the hive-source-table name?>
> The default catalog and schema names for HIVE tables are HIVE.HIVE (defined in ComSmallDefs as HIVE_SYSTEM_CATALOG and HIVE_SYSTEM_SCHEMA).
> To change the description, the external table needs to be dropped then recreated.  ALTER EXTERNAL TABLE is not supported.
> The following command’s behavior changes for external tables:
> •	UPDATE STATISTICS – an external table can be used to gather statistics for HIVE tables
> •	SHOWDDL – will now be allowed on external tables
> •	GRANT and REVOKE – privileges can be specified for external tables
> •	SELECT and LOAD – will be allowed on external tables
> Design notes:
> When an external table is created, Trafodion:
>     Reads its metadata to see if the table already exists. 
>           If exists and IF NOT EXISTS is specified, returns success
>           If exists and IF NOT EXISTS is not specified, returns an error
>     Gets the table description from HIVE
>           If the source table exists, it reads HIVE metadata through existing interfaces
>           If the source table does not exist, returns an error
>     Stores the description in its metadata
> When an external table is dropped, Trafodion:
>     Reads its metadata to see if the table exists
>         If it does not exists and IF EXISTS is specified, returns success
>         If it does not exists and IF EXISTS is not specied, returns success
>     Removes the table description from its metadata found in the system schema (_MD_) and privilege manager schema (“_PRIVMGR_MD_”), and removing data from the histograms table (HIVE).
> When a query is executed, Trafodion:
>     Gets the description of the table by reading metadata
>         If not defined then behavior is the same as today, information is retrieved directly from HIVE.
>         If it is defined then metadata information is retrieved from Trafodion.
>     Creates a NATable structure from the table description, a new flag will be added to indicate whether or not it is an external table.
>     Gets histogram information
>         If histograms exist, then they are used during plan generation
>         If no histograms exist, then default histograms are generated.
>         Only external tables can have histograms
>     Generates and executes the query – similar to current behavior
> External tables can be the target of an UPDATE STATISTICS command.   When an external table is specified in an UPDATE STATISTICS command, Trafodion:
>     gets the table description via the NATable structure as defined above
>     returns an error If it is not an external table
>     gathers statistics by accessing the HIVE table 
>     stores the results in histogram metadata in the HIVE schema
> The NATable structure contains privilege information that is checked during compilation.  If authorization is enabled, the generation of the NATable structure gathers privileges:
>     if it is an external table
>         If privileges exist in Trafodion metadata, they are returned
>         If not, then HIVE metadata is accessed to retrieve privileges
>     If it is not an external table, HIVE metadata is accessed to retrieve privileges
> Today, SHOWDDL returns an error if a HIVE table is specified.  For external tables, SHOWDDL displays the description of HIVE tables from data gathered from Trafodion metadata.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)