Posted to issues@trafodion.apache.org by "Rohit Jain (JIRA)" <ji...@apache.org> on 2015/09/04 19:46:46 UTC

[jira] [Commented] (TRAFODION-1483) External tables for Native HBase tables

    [ https://issues.apache.org/jira/browse/TRAFODION-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731144#comment-14731144 ] 

Rohit Jain commented on TRAFODION-1483:
---------------------------------------

One of the challenges when mapping Trafodion tables to HBase is that HBase may have rows that do not quite follow the relational model.  That is, a row key may not just represent a single logical row in a single relational table.  

One use case is where there may be repeating groups that more logically belong in a separate table.  I am currently working with a customer that essentially has a transaction table with two such repeating groups.  That is, this HBase table would map to three relational tables joined by the row key.  The groups may vary in their repetition or may be fixed (perhaps representing columns for each quarter of the year).  There could also be just a single repeating column instead of a group of columns.  The latter is supported in ANSI SQL using collections (arrays and lists).
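
As a rough illustration of the repeating-group case, one HBase row could be split into a parent row plus child rows that all share the row key.  This is a sketch only: the column family names (txn, item, pay), the "<name>_<n>" qualifier convention, and the helper split_row are made up for the example and do not come from the customer's schema or from Trafodion.

```python
from collections import defaultdict

# Hypothetical HBase row: one row key, flat "family:qualifier" -> value cells.
# Repeating groups are encoded with a numeric suffix on the qualifier.
hbase_row = {
    "row_key": "txn-0001",
    "cells": {
        "txn:customer": "C42",
        "txn:date": "2015-09-04",
        "item:sku_1": "TV-55",
        "item:qty_1": "1",
        "item:sku_2": "BOOK-9",
        "item:qty_2": "3",
        "pay:method_1": "card",
        "pay:amount_1": "812.50",
    },
}

# Assumption: these two families hold the repeating groups.
REPEATING_FAMILIES = {"item", "pay"}


def split_row(row):
    """Split one HBase row into a parent row and one child row per group occurrence."""
    key = row["row_key"]
    parent = {"row_key": key}
    groups = {family: defaultdict(dict) for family in REPEATING_FAMILIES}

    for name, value in row["cells"].items():
        family, qualifier = name.split(":", 1)
        if family in REPEATING_FAMILIES:
            # "sku_2" -> column "sku", occurrence 2
            column, _, seq = qualifier.rpartition("_")
            groups[family][int(seq)][column] = value
        else:
            parent[qualifier] = value

    # Every child row carries the row key so the three tables can be joined on it.
    children = {
        family: [
            {"row_key": key, "seq": seq, **columns}
            for seq, columns in sorted(occurrences.items())
        ]
        for family, occurrences in groups.items()
    }
    return parent, children


parent, children = split_row(hbase_row)
```

Here parent stands in for the transaction table and children["item"] and children["pay"] for the two tables split out of the repeating groups.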

Another scenario is where different rows in the table have completely independent columns.  For example, consider an items table for an online seller.  They might have a Trafodion table to represent the structured aspects of their data, such as description, price, etc.  They might then store other attributes for the item in an HBase table, depending on what the item is.  So a TV may have attributes like type of TV, screen size, resolution, etc. that are very different from those of a book, with attributes such as ISBN, author, etc.  We need to be able to support such tables as external tables as well, in order to provide seamless integration with Trafodion.  But in this use case that cannot be done by mapping HBase column names to a Trafodion relational representation of those columns, since there could literally be thousands of such column names, created relatively dynamically (as the online seller adds more types of items to sell, or variants with different attributes).
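
One way to picture the problem and a possible alternative shape (a sketch under assumptions, not a proposal for the actual implementation): rather than one relational column per HBase column name, each cell becomes its own (row key, attribute name, value) row, so the relational shape stays fixed no matter how many attribute names appear.  The attr column family, the sample items, and the helper to_cell_rows are hypothetical.

```python
# Hypothetical item rows: each item carries a different set of attribute columns.
items = {
    "item-tv-1": {
        "attr:type": "LED",
        "attr:screen_size": "55in",
        "attr:resolution": "4K",
    },
    "item-book-7": {
        "attr:isbn": "ISBN-PLACEHOLDER",
        "attr:author": "A. Writer",
    },
}


def to_cell_rows(rows):
    """Yield one (row_key, attribute, value) tuple per HBase cell."""
    for row_key, cells in sorted(rows.items()):
        for name, value in sorted(cells.items()):
            _family, attribute = name.split(":", 1)
            yield (row_key, attribute, value)


cell_rows = list(to_cell_rows(items))

# The set of attribute names can keep growing without changing the
# three-column shape of cell_rows.
attribute_names = sorted({attribute for _key, attribute, _value in cell_rows})
```

With this shape, a new item type with new attributes only adds rows, never columns, at the cost of pushing the interpretation of each attribute onto the query.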

We need to decide how, in both the short and the long run, we would address this scenario, while making it easier for users to integrate HBase tables with Trafodion using external table syntax.

> External tables for Native HBase tables
> ---------------------------------------
>
>                 Key: TRAFODION-1483
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1483
>             Project: Apache Trafodion
>          Issue Type: New Feature
>          Components: sql-general
>    Affects Versions: 2.0-incubating
>            Reporter: Roberta Marton
>            Assignee: Roberta Marton
>             Fix For: 2.0-incubating
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Trafodion supports selecting, loading from, listing, and describing native HBase tables. HBase tables are identified by specifying a special catalog called HBASE and schemas called "_ROW_" and "_CELL_".
> Example to select from a native HBase table:
> select left(row_id, 10) as row_id, left(column_display(column_details, ('teams:team_number', 'games:visitor_team', 'games:game_time')), 100) as cols from hbase."_ROW_"."baseball";
> Trafodion interprets a table in the special HBASE catalog as a native HBase table. During preparation of an SQL statement referencing a native HBase table, Trafodion contacts HBase and obtains a description of the table. It then creates an internal description (NATable) of the HBase table. The NATable definition is used by the compiler and code generation process to prepare the plan. Trafodion does not store any details in Trafodion metadata.
> Several Trafodion commands would work more effectively today if native HBase tables could be partially described in Trafodion, that is, if their definitions were stored in Trafodion metadata. This JIRA describes a proposal to allow native HBase tables to be registered in Trafodion metadata by specifying the “CREATE EXTERNAL TABLE <table> …” syntax.
> Proposal
> Allow native HBase tables to be registered in Trafodion metadata through the EXTERNAL TABLE create option.
> CREATE EXTERNAL TABLE [IF NOT EXISTS] table FOR hbase-source-table;
> DROP EXTERNAL TABLE [IF EXISTS] table FOR hbase-source-table;
> hbase-source-table - native HBase table to be registered in the Trafodion metadata. The hbase-source-table has to exist.
> table - table stored in Trafodion metadata. Initially, the table name should be the same as the hbase-source-table name. 
> The default catalog name for HBase tables is HBASE (defined in ComSmallDefs as HBASE_SYSTEM_CATALOG).
> To change the description, the external table needs to be dropped then recreated. ALTER EXTERNAL TABLE is not supported.
> The behavior of the following commands changes for external tables:
> •	UPDATE STATISTICS – an external table can be used to gather statistics for HIVE tables
> •	SHOWDDL – will now be allowed on external tables
> •	GRANT and REVOKE – privileges can be specified for external tables
> •	SELECT and LOAD – will be allowed on external tables
> This is a sister project to JIRA TRAFODION-19 and is implemented in much the same way.


