You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/09/04 19:38:45 UTC

[jira] [Commented] (TRAFODION-19) External tables for HIVE

    [ https://issues.apache.org/jira/browse/TRAFODION-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731119#comment-14731119 ] 

ASF GitHub Bot commented on TRAFODION-19:
-----------------------------------------

GitHub user robertamarton opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/74

    Support for features described in JIRA TRAFODION-19 && TRAFODION-1483

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/robertamarton/incubator-trafodion master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/74.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #74
    
----
commit 2e0b1ddf10efbd09fe027ce3af00aef7ba821271
Author: Roberta Marton <ro...@esgyn.com>
Date:   2015-09-04T17:09:32Z

    JIRA TRAFODION-19 && TRAFODION-1483
    
    JIRA TRAFODION-19:
    
    Made the following changes:
    
    Removed support for locklength from the parser
    Test and fix issues with create table variants
    Made recommended changes as mentioned in the initial delivery
      Rename some of the variable names to make them more informative
      Assume prefix length for HIVE and HBASE tables can be different and verify code works correctly
      Fix commented code to be correct and remove lines no longer needed
      Update error text for messages 1180 to include external table type
    
    Beefed up the new regression test (hive/TEST009)
    Check column list from hive descriptor with external table description and return an error if there is a mismatch
    Add an error if the user specifies a different target schema name than the source
    Allow someone with privileges to drop HIVE exernal schemas
    
    JIRA TRAFODION-1483:
    
    Added support for external HBase tables similar to Hive table support

commit 569045b4fb816e598f5ec1e1bed45894aebf199d
Author: Roberta Marton <ro...@esgyn.com>
Date:   2015-09-04T17:11:13Z

    Merge branch 'master' of github.com:apache/incubator-trafodion

----


> External tables for HIVE
> ------------------------
>
>                 Key: TRAFODION-19
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-19
>             Project: Apache Trafodion
>          Issue Type: New Feature
>            Reporter: Roberta Marton
>            Assignee: Roberta Marton
>         Attachments: external_table.docx
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Trafodion supports selecting, loading from, listing, and describing native HIVE tables.  HIVE tables are identified by specifying a special catalog and schema named HIVE.HIVE. 
> To select from a HIVE table named t; specify an implicit or explicit name, such as HIVE.HIVE.t, in a Trafodion SQL statement. 
> set schema HIVE.HIVE; 
> select * from t; -- implicit table name 
> set schema trafodion.seabase; 
> select * from HIVE.HIVE.t; -- explicit table name
> Trafodion interprets the special HIVE.HIVE catalog and schema name to be a native HIVE table.  During preparation of an SQL statement referencing a native HIVE table, Trafodion contacts HIVE and obtains a description of the table.  It then creates an internal description (NATable) of the HIVE table.  The NATable definition is used by the compiler and code generation process to prepare the plan. Trafodion does not store any details in Trafodion metadata.
>  
> Several Trafodion commands today, would work more effectively if we allow native HIVE table to be partially described in Trafodion.  That is, store their definitions in Trafodion metadata.  This JIRA describes a proposal to allow HIVE tables to be registered in Trafodion metadata by specifying the “CREATE EXTERNAL <table> TABLE …” syntax.
> Proposal
> Allow HIVE tables to be registered in Trafodion metadata through the EXTERNAL TABLE create option.
> CREATE EXTERNAL TABLE [IF NOT EXISTS] table LIKE hive-source-table;
> DROP EXTERNAL TABLE [IF EXISTS] table;
> hive-source-table  - native HIVE table to be registered in the Trafodion metadata.  The hive-source-table has to exist.
> table - table stored in Trafodion metadata.  Initially, the table name should be the same as the hive-source-table name.  
> <Is there any added value to make the table name different than the hive-source-table name?>
> The default catalog and schema names for HIVE tables are HIVE.HIVE (defined in ComSmallDefs as HIVE_SYSTEM_CATALOG and HIVE_SYSTEM_SCHEMA).
> To change the description, the external table needs to be dropped then recreated.  ALTER EXTERNAL TABLE is not supported.
> The following command’s behavior changes for external tables:
> •	UPDATE STATISTICS – an external table can be used to gather statistics for HIVE tables
> •	SHOWDDL – will now be allowed on external tables
> •	GRANT and REVOKE – privileges can be specified for external tables
> •	SELECT and LOAD – will be allowed on external tables
> Design notes:
> When an external table is created, Trafodion:
>     Reads its metadata to see if the table already exists. 
>           If exists and IF NOT EXISTS is specified, returns success
>           If exists and IF NOT EXISTS is not specified, returns an error
>     Gets the table description from HIVE
>           If the source table exists, it reads HIVE metadata through existing interfaces
>           If the source table does not exist, returns an error
>     Stores the description in its metadata
> When an external table is dropped, Trafodion:
>     Reads its metadata to see if the table exists
>         If it does not exists and IF EXISTS is specified, returns success
>         If it does not exists and IF EXISTS is not specied, returns success
>     Removes the table description from its metadata found in the system schema (_MD_) and privilege manager schema (“_PRIVMGR_MD_”), and removing data from the histograms table (HIVE).
> When a query is executed, Trafodion:
>     Gets the description of the table by reading metadata
>         If not defined then behavior is the same as today, information is retrieved directly from HIVE.
>         If it is defined then metadata information is retrieved from Trafodion.
>     Creates a NATable structure from the table description, a new flag will be added to indicate whether or not it is an external table.
>     Gets histogram information
>         If histograms exist, then they are used during plan generation
>         If no histograms exist, then default histograms are generated.
>         Only external tables can have histograms
>     Generates and executes the query – similar to current behavior
> External tables can be the target of an UPDATE STATISTICS command.   When an external table is specified in an UPDATE STATISTICS command, Trafodion:
>     gets the table description via the NATable structure as defined above
>     returns an error If it is not an external table
>     gathers statistics by accessing the HIVE table 
>     stores the results in histogram metadata in the HIVE schema
> The NATable structure contains privilege information that is checked during compilation.  If authorization is enabled, the generation of the NATable structure gathers privileges:
>     if it is an external table
>         If privileges exist in Trafodion metadata, they are returned
>         If not, then HIVE metadata is accessed to retrieve privileges
>     If it is not an external table, HIVE metadata is accessed to retrieve privileges
> Today, SHOWDDL returns an error if a HIVE table is specified.  For external tables, SHOWDDL displays the description of HIVE tables from data gathered from Trafodion metadata.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)