Posted to issues@kudu.apache.org by "Sailesh Patel (JIRA)" <ji...@apache.org> on 2017/11/23 23:49:01 UTC
[jira] [Created] (KUDU-2224) Kudu Partition Dynamic Creation on Insertion
Sailesh Patel created KUDU-2224:
-----------------------------------
Summary: Kudu Partition Dynamic Creation on Insertion
Key: KUDU-2224
URL: https://issues.apache.org/jira/browse/KUDU-2224
Project: Kudu
Issue Type: New Feature
Affects Versions: 1.4.0
Reporter: Sailesh Patel
Priority: Minor
Add an option to specify a simpler partitioning directive, whereby Kudu creates partitions on the fly, instead of requiring manual intervention to create additional partitions as described in:
https://kudu.apache.org/2016/08/23/new-range-partitioning-features.html
https://kudu.apache.org/docs/kudu_impala_integration.html#partitioning_tables
"Non-Covering Range Partitions"
+Requirement:+
When a table is created, a partitioning rule specifies the partition granularity, and a new partition is created:
- at insert time, when none exists for the inserted value.
Proposed syntax, e.g.:
CREATE TABLE sample_table (ts TIMESTAMP, eventid BIGINT, somevalue STRING, PRIMARY KEY(ts,eventid) )
PARTITION BY
RANGE(ts) GRANULARITY= 86400000000000 START = 1104537600000000
STORED AS KUDU;
- Maybe an optional END
- START indicates where the partition granularity builds from
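To make the proposed rule concrete, here is a minimal sketch of how a range partition could be derived from an inserted value. It is not Kudu's implementation: the function name partition_bounds is hypothetical, START is assumed to be an alignment anchor rather than a lower limit, and GRANULARITY, START and the inserted value are all assumed to share one unit (microseconds in this example).

```python
def partition_bounds(ts, granularity, start):
    """Return the half-open range [lower, upper) of the bucket containing ts.

    Hypothetical helper: buckets are `granularity` wide and aligned to `start`.
    Values earlier than `start` still map to an aligned bucket, because
    Python's floor division rounds toward negative infinity.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    index = (ts - start) // granularity
    lower = start + index * granularity
    return lower, lower + granularity


# Assumed units: microseconds since the Unix epoch.
DAY_US = 86_400_000_000
START_US = 1_104_537_600_000_000  # 2005-01-01 00:00:00 UTC

# A value just over one day after START falls in the second daily bucket.
bounds_after = partition_bounds(START_US + DAY_US + 5, DAY_US, START_US)

# A value just before START falls in the bucket that ends at START.
bounds_before = partition_bounds(START_US - 1, DAY_US, START_US)
```

Under this reading, an optional END would simply reject any value whose bucket lies beyond it.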
-----
Use case
- Time series data where timestamps arrive out of order, can catch up from sometimes years in the past, and are unpredictable. Each event carries a timestamp (say, epoch nanoseconds or epoch milliseconds), with partitions based on a range of that timestamp (typically day or hour granularity).
Currently, we script the creation of partitions in advance of receiving data, but if that script fails for any reason, the insert fails. Also, if we receive unexpected data with a timestamp far in the past for which no partition exists, the insert fails.
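The scripted workaround described above might look roughly like the following sketch, which works out which range partitions a batch of rows still needs and emits the corresponding statements. The function name missing_partition_statements and the idea of tracking existing partitions as a set of lower bounds are assumptions for illustration. The statement form follows Impala's ALTER TABLE ... ADD RANGE PARTITION syntax for Kudu tables, but the integer bounds are an assumption; a TIMESTAMP column may need bounds written as timestamp literals.

```python
def missing_partition_statements(table, timestamps, existing, granularity, start):
    """Build ADD RANGE PARTITION statements for buckets not yet created.

    `existing` is a set of lower bounds of partitions that already exist
    (hypothetical bookkeeping; a real script would query the table instead).
    """
    needed = set()
    for ts in timestamps:
        # Align the value to its bucket's lower bound, anchored at `start`.
        lower = start + ((ts - start) // granularity) * granularity
        if lower not in existing:
            needed.add(lower)
    return [
        f"ALTER TABLE {table} ADD IF NOT EXISTS RANGE PARTITION "
        f"{lower} <= VALUES < {lower + granularity}"
        for lower in sorted(needed)
    ]


# Toy values: buckets of width 10 anchored at 0, with [0, 10) already present.
statements = missing_partition_statements(
    "sample_table", [3, 25, 27, -4], {0}, 10, 0
)
```

Running such a step before every insert batch is the manual intervention this request aims to remove: a late-arriving value such as -4 above only succeeds if the script has run first.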
Opening this Jira enhancement for discussion.