Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/01/13 07:52:00 UTC

[jira] [Commented] (KUDU-1945) Support generation of surrogate primary keys (or tables with no PK)

    [ https://issues.apache.org/jira/browse/KUDU-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17676507#comment-17676507 ] 

ASF subversion and git services commented on KUDU-1945:
-------------------------------------------------------

Commit 6ac6578d9763c3e7856b313f50aa117acef6299c in kudu's branch refs/heads/master from Abhishek Chennaka
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=6ac6578d9 ]

KUDU-1945 Auto-Incrementing Column

This patch adds a new column specification named auto_incrementing.
The new column's ColumnSchema type is INT64 rather than UINT64
because Impala doesn't support UINT64. These columns are populated
on the server side with a monotonically increasing counter. The
counter is local to each tablet, i.e. each tablet has its own
auto-incrementing counter. This is a step towards supporting tables
with non-unique primary keys or, for tables with just one tablet,
a table-wide unique key.
Upon receiving a write request, the leader replica:
1. Fills in the replicate message with the auto-incrementing counter
   value for the first write op.
2. Populates the auto-incrementing key into the rows being inserted
   during the prepare phase.
3. Sends out the replicate message with the auto-incrementing counter
   and the original write request. The followers perform the same
   set of steps to populate the auto-incrementing column.
The patch also adds the C++ client-side changes needed for basic
writes and reads of this column, so that end-to-end tests can run.
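
The leader/follower steps above can be sketched roughly as follows.
This is a minimal illustration and not the code from the patch: the
names TabletReplica, ReplicateMsg, Row, LeaderWrite and FollowerApply
are hypothetical, and starting the counter at 1 is an assumption. It
only shows how carrying the counter value of the first write op in
the replicated message lets every replica assign the same keys.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; these are not Kudu's classes.
struct Row {
  int64_t auto_incrementing_id;  // INT64, as the commit message states
  std::string payload;
};

struct ReplicateMsg {
  int64_t first_counter_value;        // counter value for the first write op
  std::vector<std::string> payloads;  // stands in for the original write request
};

class TabletReplica {
 public:
  // Leader path: record the counter value for the first row in the
  // message (step 1), populate the rows locally (step 2), and return
  // the message that would be sent to the followers (step 3).
  ReplicateMsg LeaderWrite(std::vector<std::string> payloads) {
    ReplicateMsg msg{counter_ + 1, std::move(payloads)};
    Prepare(msg);
    return msg;
  }

  // Follower path: run the same prepare step on the received message.
  void FollowerApply(const ReplicateMsg& msg) { Prepare(msg); }

  const std::vector<Row>& rows() const { return rows_; }

 private:
  // Assigns consecutive values starting from the one in the message.
  void Prepare(const ReplicateMsg& msg) {
    // The counter must only move forward within a tablet.
    assert(msg.first_counter_value > counter_);
    int64_t next = msg.first_counter_value;
    for (const std::string& payload : msg.payloads) {
      rows_.push_back(Row{next++, payload});
    }
    counter_ = next - 1;
  }

  int64_t counter_ = 0;  // assumption: per-tablet counter, first value is 1
  std::vector<Row> rows_;
};
```

Because each TabletReplica object owns its counter, two tablets in
this sketch can hand out the same value, which is why the column
alone is only table-wide unique for single-tablet tables.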

Change-Id: I1dbde9095da78f6d1bd00adcc0a6e7dd63082bbc
Reviewed-on: http://gerrit.cloudera.org:8080/19097
Reviewed-by: Wenzhe Zhou <wz...@cloudera.com>
Tested-by: Alexey Serbin <al...@apache.org>
Reviewed-by: Alexey Serbin <al...@apache.org>


> Support generation of surrogate primary keys (or tables with no PK)
> -------------------------------------------------------------------
>
>                 Key: KUDU-1945
>                 URL: https://issues.apache.org/jira/browse/KUDU-1945
>             Project: Kudu
>          Issue Type: New Feature
>          Components: client, master, tablet
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: roadmap-candidate
>
> Many use cases have data where there is no "natural" primary key. For example, a web log use case mostly cares about partitioning and not about precise sorting by timestamp, and timestamps themselves are not necessarily unique. Rather than forcing users to come up with their own surrogate primary keys, Kudu should support some kind of "auto_increment" equivalent which generates primary keys on insertion. Alternatively, Kudu could support tables which are partitioned but not internally sorted.
> The advantages would be:
> - Kudu can pick primary keys on insertion to guarantee that there is no compaction required on the table (e.g. always assign a new key higher than any existing key in the local tablet). This can improve write throughput substantially, especially compared to naive PK generation schemes that a user might pick, such as UUID, which would generate a uniform random-insert workload (the worst case for performance).
> - Make Kudu easier to use for such use cases (no extra client code necessary).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)