Posted to issues@trafodion.apache.org by "Hans Zeller (JIRA)" <ji...@apache.org> on 2016/05/12 00:22:13 UTC

[jira] [Updated] (TRAFODION-50) LP Blueprint: cmp-presplit-unsalted-tables - Add SQL syntax to pre-split unsalted tables into regions

     [ https://issues.apache.org/jira/browse/TRAFODION-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hans Zeller updated TRAFODION-50:
---------------------------------
    Fix Version/s:     (was: 2.0-incubating)

> LP Blueprint: cmp-presplit-unsalted-tables - Add SQL syntax to pre-split unsalted tables into regions
> -----------------------------------------------------------------------------------------------------
>
>                 Key: TRAFODION-50
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-50
>             Project: Apache Trafodion
>          Issue Type: New Feature
>          Components: sql-cmp
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>            Priority: Critical
>
> Currently, Trafodion creates every table as a single-region HBase table unless the table is salted; a salted table has one region per salt bucket. We would like to add SQL syntax that allows unsalted tables to be pre-split as well. We would like to offer two new table options, SPLIT BY and PARTITION BY. Both allow split keys to be specified. SPLIT BY simply pre-splits the table, with no special split policy. PARTITION BY additionally adds a prefix split policy that ensures that all rows with common PARTITION BY column values remain in the same region.
> Example:
> create table lineitem(
>    l_orderkey int not null,
>    l_linenum int not null,
>    ...
>    primary key (l_orderkey, l_linenum))
> PARTITION BY (l_orderkey)
> (add first key (10000),
>  add first key (20000)
> );
> This will create three regions, containing the l_orderkey values [<min>, 10000), [10000, 20000) and [20000, <max>). It will also add a prefix split policy for a 4-byte prefix, the length of a non-nullable INT column. HBase will still be able to split the regions further, but we are guaranteed that all rows for a given l_orderkey are located in the same region.
> If SPLIT BY is used instead of PARTITION BY, the same regions will be created, but no custom split policy will be added. Note that the PARTITION BY / SPLIT BY column(s) have to be a prefix of the clustering key of the table. PARTITION BY and SPLIT BY are not allowed for salted tables. We may or may not support pre-splitting of divisioned tables in the first release.
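
The region layout and the prefix rule described above can be sketched in a few lines of Python. This is only an illustration of the logic at the level of column values, not Trafodion or HBase code. The function names region_index and is_clustering_key_prefix are hypothetical, and the sketch assumes that split keys compare like plain integers, ignoring how keys are actually encoded into bytes.

```python
from bisect import bisect_right


def region_index(split_keys, key):
    """Return the 0-based region that holds `key`.

    `split_keys` must be sorted ascending. Region 0 covers [<min>, split_keys[0]),
    region i covers [split_keys[i-1], split_keys[i]), and the last region
    covers [split_keys[-1], <max>). Hypothetical helper for illustration only.
    """
    # bisect_right counts the split keys that are <= key, so a key equal to a
    # split key lands in the region that starts at that split key.
    return bisect_right(split_keys, key)


def is_clustering_key_prefix(split_columns, clustering_key):
    """Check that the SPLIT BY / PARTITION BY columns form a non-empty
    prefix of the clustering key (hypothetical helper)."""
    n = len(split_columns)
    return n > 0 and list(clustering_key[:n]) == list(split_columns)


# Values taken from the lineitem example in the description.
split_keys = [10000, 20000]
clustering_key = ["l_orderkey", "l_linenum"]

for orderkey in (1, 9999, 10000, 19999, 20000, 123456):
    print(orderkey, "-> region", region_index(split_keys, orderkey))

print(is_clustering_key_prefix(["l_orderkey"], clustering_key))
print(is_clustering_key_prefix(["l_linenum"], clustering_key))
```

Because the region is chosen from l_orderkey alone, every row with the same l_orderkey maps to the same initial region in this sketch, regardless of l_linenum. Keeping that property after HBase splits a region further is what the prefix split policy added by PARTITION BY is meant to provide.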


