Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/10/20 01:43:00 UTC
[jira] [Commented] (KUDU-3353) Support setnx semantic on column

    [ https://issues.apache.org/jira/browse/KUDU-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620672#comment-17620672 ] 

ASF subversion and git services commented on KUDU-3353:
-------------------------------------------------------

Commit f20dcf57ad76e9b1bb57fe60b27ea3a8f02df233 in kudu's branch refs/heads/master from Yingchun Lai
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=f20dcf57a ]

KUDU-3353 [schema] Add an immutable attribute on column schema (part 3)

This is a follow-up to b6eedb224f715ad86378a92d25f09c2084b0e2b7.
This patch contains the Java client-side changes
of the "new column attribute IMMUTABLE" feature,
including:
1. Adds a new 'immutable(boolean immutable)' method to
   class ColumnSchemaBuilder to add/remove IMMUTABLE
   attribute to/from a column.
2. Adds a new 'isImmutable()' method to class
   ColumnSchema to check if the attribute is set for
   a column schema.
3. Adds a new 'hasImmutableColumns()' method to class
   Schema to check if there's at least one immutable
   column for a table schema.
4. Adds a new 'changeImmutable(String name, boolean immutable)'
   method to class AlterTableOptions to change the
   immutable attribute for a column.
5. Adds a new UpsertIgnore operation in the client API:
   use the newly added KuduTable.newUpsertIgnore() to
   create a new instance of such operation.
   Both UpsertIgnore and UpdateIgnore operations can be used
   to ignore errors on updating cells of immutable columns.
6. Adds unit tests to cover the newly introduced functionality.

Change-Id: Ifdfdcd123296803a3b5e856ec5eaac49c05b7f8d
Reviewed-on: http://gerrit.cloudera.org:8080/18993
Tested-by: Alexey Serbin <al...@apache.org>
Reviewed-by: Alexey Serbin <al...@apache.org>
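
The difference between a plain upsert and the new UpsertIgnore operation on a table with an immutable column can be illustrated with a toy in-memory model. This is only a sketch and not Kudu client code: the class ImmutableColumnSketch and its methods are hypothetical, and it assumes that a plain upsert touching an immutable cell of an existing row is rejected as a whole, while the "ignore" variant skips that cell and applies the remaining columns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of an IMMUTABLE column with upsert vs. upsert-ignore; NOT the Kudu client API. */
class ImmutableColumnSketch {
    private final Set<String> immutableColumns;
    // Primary key -> (column name -> cell value).
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    ImmutableColumnSketch(Set<String> immutableColumns) {
        this.immutableColumns = immutableColumns;
    }

    /** Plain upsert: returns false (error) if it would touch an immutable cell of an existing row. */
    boolean upsert(String key, Map<String, Object> cells) {
        return apply(key, cells, false);
    }

    /** Upsert-ignore: skips immutable cells of an existing row and applies the rest. */
    boolean upsertIgnore(String key, Map<String, Object> cells) {
        return apply(key, cells, true);
    }

    private boolean apply(String key, Map<String, Object> cells, boolean ignoreImmutable) {
        Map<String, Object> existing = rows.get(key);
        if (existing == null) {
            // No such row yet: the upsert behaves like an insert, so every cell is written.
            rows.put(key, new HashMap<>(cells));
            return true;
        }
        Map<String, Object> updated = new HashMap<>(existing);
        for (Map.Entry<String, Object> e : cells.entrySet()) {
            if (immutableColumns.contains(e.getKey())) {
                if (!ignoreImmutable) {
                    return false; // assumed behavior: the whole operation is rejected
                }
                continue; // ignore the change to the immutable cell
            }
            updated.put(e.getKey(), e.getValue());
        }
        rows.put(key, updated);
        return true;
    }

    Object get(String key, String column) {
        Map<String, Object> row = rows.get(key);
        return row == null ? null : row.get(column);
    }
}
```

In this model, the first write of a row sets the immutable cell, a later upsertIgnore() for the same key leaves that cell untouched while updating the other columns, and a later plain upsert() that includes the immutable column reports an error.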


> Support setnx semantic on column
> --------------------------------
>
>                 Key: KUDU-3353
>                 URL: https://issues.apache.org/jira/browse/KUDU-3353
>             Project: Kudu
>          Issue Type: New Feature
>          Components: api, server
>            Reporter: Yingchun Lai
>            Assignee: Yingchun Lai
>            Priority: Major
>
> h1. Motivation
> In some usage scenarios, a Kudu table has a column with "create time" semantics: it holds the timestamp at which the row was created. The other columns keep their usual semantics, for example user properties such as age and address.
> Neither the upstream system nor the Kudu user knows whether a row already exists, and every cell holds the latest ingested value, for example from an event stream.
> Without the "create time" column, a Kudu user can write to the table with UPSERT operations, and every column that carries data overwrites the old data. With the "create time" column, however, the cell is overwritten by subsequent UPSERT ops, which is not what we expect.
> To achieve the goal, we have to read the column back to check whether it is NULL. If it is NULL, we include the cell in the row; if it is not NULL, we drop the cell from the data before the UPSERT, to avoid overwriting "create time".
> This is expensive. Is there a way to avoid the read from Kudu?
> h1. Resolution
> We can implement a column schema attribute with "update if null" semantics: cell data in a changelist updates the base data if the latter is NULL, and the update is ignored if it is not NULL.
> With that, Kudu can be used as before; the only change is to define the column as "update if null" when creating the table or adding the column.
>  
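
The "update if null" rule proposed in the issue description can be sketched with a single in-memory row. This is a minimal illustration under stated assumptions, not Kudu code: the class UpdateIfNullSketch, its upsert() method, and the column names are hypothetical, and a missing map entry stands in for a NULL cell.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of the proposed "update if null" column semantic; NOT Kudu code. */
class UpdateIfNullSketch {
    // One row: column name -> cell value (an absent entry means NULL).
    private final Map<String, Object> row = new HashMap<>();
    private final String setOnceColumn;

    UpdateIfNullSketch(String setOnceColumn) {
        this.setOnceColumn = setOnceColumn;
    }

    /** Ordinary columns are overwritten; the set-once column is written only while it is NULL. */
    void upsert(Map<String, Object> cells) {
        for (Map.Entry<String, Object> e : cells.entrySet()) {
            boolean isSetOnce = e.getKey().equals(setOnceColumn);
            if (isSetOnce && row.get(e.getKey()) != null) {
                continue; // base cell already has a value: ignore the update
            }
            row.put(e.getKey(), e.getValue());
        }
    }

    Object get(String column) {
        return row.get(column);
    }

    public static void main(String[] args) {
        UpdateIfNullSketch r = new UpdateIfNullSketch("create_time");
        r.upsert(Map.of("create_time", 1000L, "age", 30)); // first event creates the row
        r.upsert(Map.of("create_time", 2000L, "age", 31)); // later event for the same row
        System.out.println(r.get("create_time") + " " + r.get("age")); // prints "1000 31"
    }
}
```

The writer sends the same kind of upsert for every event and never reads the row first; the decision to keep the original "create time" is made where the data is stored, which is the read the issue wants to avoid.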



--
This message was sent by Atlassian Jira
(v8.20.10#820010)