Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2019/08/28 15:27:00 UTC
[jira] [Resolved] (KUDU-2224) Kudu Partition Dynamic Creation on Insertion
[ https://issues.apache.org/jira/browse/KUDU-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke resolved KUDU-2224.
-------------------------------
Fix Version/s: NA
Resolution: Duplicate
> Kudu Partition Dynamic Creation on Insertion
> --------------------------------------------
>
> Key: KUDU-2224
> URL: https://issues.apache.org/jira/browse/KUDU-2224
> Project: Kudu
> Issue Type: New Feature
> Affects Versions: 1.4.0
> Reporter: Sailesh Patel
> Assignee: HeLifu
> Priority: Minor
> Fix For: NA
>
>
> Add an option to specify a simpler partitioning directive whereby Kudu creates partitions on the fly, instead of requiring the manual creation of additional partitions as described in:
> https://kudu.apache.org/2016/08/23/new-range-partitioning-features.html
>
> https://kudu.apache.org/docs/kudu_impala_integration.html#partitioning_tables
> "Non-Covering Range Partitions"
>
> +Requirement:+
> When a table is created, a partitioning rule specifies the granularity (the size of each range partition), and a new partition is created at insert time when none exists for the inserted value.
> Example proposal:
> CREATE TABLE sample_table (ts TIMESTAMP, eventid BIGINT, somevalue STRING, PRIMARY KEY(ts,eventid) )
> PARTITION BY
> RANGE(ts) GRANULARITY= 86400000000000 START = 1104537600000000
> STORED AS KUDU;
> - Maybe an optional END
> - The START shows where the partition granularity builds from
> -----
> Use case
> - Time series data where timestamps arrive out of order, can catch up from sometimes years in the past, and have unpredictable values. Each event carries a timestamp (say epoch nanoseconds or epoch milliseconds), with partitions based upon a range of that timestamp (typically day or hour granularity).
> Currently, we script the creation of partitions in advance of the data we receive, but if that script fails for any reason, the inserts fail. Also, if we receive unexpected data with a timestamp far in the past and no partition exists for it, the insert fails.
> Opening this Jira enhancement for discussion.
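
For illustration only, the partition lookup implied by the quoted GRANULARITY/START proposal can be sketched in Python. This is not Kudu or Impala code: the names partition_bounds and AutoPartitionedTable are hypothetical, and an in-memory dict stands in for the table. In the quoted example, GRANULARITY equals 86400 x 10^9 while START equals 1104537600 x 10^6, so the two values appear to use different units; the sketch assumes microseconds for both.

```python
# Hypothetical sketch: none of these names are Kudu or Impala APIs.
DAY_US = 86_400 * 1_000_000            # one day, assuming microsecond units
START_US = 1_104_537_600 * 1_000_000   # START value from the quoted proposal


def partition_bounds(ts, granularity=DAY_US, start=START_US):
    """Return the [lower, upper) range partition that would hold ts."""
    # Floor division also handles values earlier than start (negative offsets).
    index = (ts - start) // granularity
    lower = start + index * granularity
    return lower, lower + granularity


class AutoPartitionedTable:
    """In-memory stand-in for a table that creates partitions on insert."""

    def __init__(self, granularity=DAY_US, start=START_US):
        self.granularity = granularity
        self.start = start
        self.partitions = {}  # lower bound -> rows stored in that partition

    def insert(self, ts, row):
        lower, _ = partition_bounds(ts, self.granularity, self.start)
        # Create the partition if it does not exist yet, then store the row.
        self.partitions.setdefault(lower, []).append(row)
        return lower
```

With this rule, a late-arriving row whose timestamp precedes START still maps to a well-defined range, which is the out-of-order case described in the use case above.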
--
This message was sent by Atlassian Jira
(v8.3.2#803003)