Posted to issues@kudu.apache.org by "Alan Jackoway (JIRA)" <ji...@apache.org> on 2017/05/04 20:24:04 UTC
[jira] [Created] (KUDU-1994) Automatically Create New Range Partitions When Needed
Alan Jackoway created KUDU-1994:
-----------------------------------
Summary: Automatically Create New Range Partitions When Needed
Key: KUDU-1994
URL: https://issues.apache.org/jira/browse/KUDU-1994
Project: Kudu
Issue Type: Improvement
Affects Versions: 1.3.0
Reporter: Alan Jackoway
We have a few Kudu tables where we use a range-partitioned timestamp as part of the key. The intent is to preserve data locality for data that is likely to be scanned together, such as events in a time series.
Currently we create these tables with partitions that look like this:
{noformat}
RANGE (ts) (
PARTITION 0 <= VALUES < 1420088400000,
PARTITION 1420088400000 <= VALUES < 1427860800000,
PARTITION 1427860800000 <= VALUES < 1435723200000,
PARTITION 1435723200000 <= VALUES < 1443672000000,
PARTITION 1443672000000 <= VALUES < 1451624400000,
PARTITION 1451624400000 <= VALUES < 1459483200000,
PARTITION 1459483200000 <= VALUES < 1467345600000,
PARTITION 1467345600000 <= VALUES < 1475294400000,
PARTITION 1475294400000 <= VALUES < 1483246800000,
PARTITION 1483246800000 <= VALUES < 1491033600000,
PARTITION 1491033600000 <= VALUES < 1498896000000,
PARTITION 1498896000000 <= VALUES < 1506844800000
)
{noformat}
The problem is that, as time goes on, we have to either create empty partitions before any data is written to them, or risk forgetting to create a partition and having writes of new data fail.
Ideally, Kudu would have a way to indicate the size of the partitions (in this example 3 months converted to milliseconds) and then automatically create new partitions when new data comes in that needs the partition.
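To illustrate the bookkeeping being requested, here is a minimal Python sketch of how a client could work out which range partition an incoming timestamp needs. It is not Kudu code and calls no Kudu API; the function names are hypothetical. It also assumes calendar quarters aligned to UTC, which is a simplification: the boundaries listed above need not coincide with UTC quarter starts.

```python
from datetime import datetime, timezone


def quarter_bounds_ms(ts_ms):
    """Return (lower, upper) epoch-millisecond bounds of the UTC calendar
    quarter containing ts_ms. Lower is inclusive, upper is exclusive.
    Assumption: quarters are aligned to UTC."""
    dt = datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc)
    first_month = 3 * ((dt.month - 1) // 3) + 1  # 1, 4, 7 or 10
    lower = datetime(dt.year, first_month, 1, tzinfo=timezone.utc)
    if first_month == 10:
        upper = datetime(dt.year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        upper = datetime(dt.year, first_month + 3, 1, tzinfo=timezone.utc)
    return int(lower.timestamp()) * 1000, int(upper.timestamp()) * 1000


def missing_partition(ts_ms, existing):
    """Return the bounds of the partition that would have to be created
    before writing ts_ms, or None if an existing range already covers it.
    `existing` is a list of (lower, upper) pairs in epoch milliseconds
    (hypothetical client-side bookkeeping, not read from Kudu)."""
    for lower, upper in existing:
        if lower <= ts_ms < upper:
            return None
    return quarter_bounds_ms(ts_ms)


# One existing partition covering Q1 2015 (UTC).
existing = [(1420070400000, 1427846400000)]
print(missing_partition(1420070400000, existing))  # None: already covered
print(missing_partition(1427846400000, existing))  # (1427846400000, 1435708800000)
```

A writer would run a check like this before each batch and add the returned range to the table when the result is not None. The request in this issue is for Kudu to do the equivalent itself, given only the partition size.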
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)