Posted to issues@flink.apache.org by "Fangliang Liu (Jira)" <ji...@apache.org> on 2022/06/30 08:26:00 UTC

[jira] [Comment Edited] (FLINK-27275) Support partial insert in flink-connector-jdbc

    [ https://issues.apache.org/jira/browse/FLINK-27275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532669#comment-17532669 ] 

Fangliang Liu edited comment on FLINK-27275 at 6/30/22 8:25 AM:
----------------------------------------------------------------

The data structure of `reduceBuffer` in `TableBufferReducedStatementExecutor` needs to be changed. Otherwise, when the same primary key is updated several times within one batch, only the last record is kept, so partial insert cannot be guaranteed when the batch size is greater than 1.
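
To illustrate the point, here is a minimal sketch of the two buffering strategies. It is a simplified model, not the Flink class: rows are plain `String[]` arrays (`user_id`, `A`, `B`), a null means the field was not provided, and the names `PartialUpsertBufferSketch`, `keepLast` and `keepAll` are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of a buffer keyed by primary key; not the Flink class itself. */
final class PartialUpsertBufferSketch {

    /** One row per key: a later row for the same key replaces the earlier one. */
    static Map<Long, String[]> keepLast(List<String[]> rows) {
        Map<Long, String[]> buffer = new LinkedHashMap<>();
        for (String[] row : rows) {
            buffer.put(Long.parseLong(row[0]), row);
        }
        return buffer;
    }

    /** All rows per key, in arrival order, so each one can be written separately. */
    static Map<Long, List<String[]>> keepAll(List<String[]> rows) {
        Map<Long, List<String[]>> buffer = new LinkedHashMap<>();
        for (String[] row : rows) {
            buffer.computeIfAbsent(Long.parseLong(row[0]), k -> new ArrayList<>()).add(row);
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Two partial updates for user_id = 1: the first sets A, the second sets B.
        List<String[]> batch = Arrays.asList(
                new String[] {"1", "a1", null},
                new String[] {"1", null, "b1"});

        // With keepLast, only the second row survives, so the update to A is lost.
        System.out.println(Arrays.toString(keepLast(batch).get(1L)));

        // With keepAll, both rows are still available when the batch is flushed.
        System.out.println(keepAll(batch).get(1L).size());
    }
}
```

With `keepLast`, the row that sets `A` is dropped before the flush, which is the behaviour described above. Keeping a list per key costs more memory per batch but preserves every partial update.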


was (Author: liufangliang):
In addition, the data structure of `reduceBuffer` in `TableBufferReducedStatementExecutor` needs to be changed so that it keeps multiple records for the same primary key.
I have implemented partial update, and it works fine in production.

> Support partial insert in flink-connector-jdbc
> ----------------------------------------------
>
>                 Key: FLINK-27275
>                 URL: https://issues.apache.org/jira/browse/FLINK-27275
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / JDBC
>    Affects Versions: 1.14.3
>            Reporter: Fangliang Liu
>            Priority: Major
>
> I use the following statements to create a Flink job.
> If a field in the source data is null but the corresponding field in MySQL is not null, the field is overwritten with null when the record is written to the database. To keep the historical value from being overwritten, the job has to read that value from the database and write it back again, which is very costly for us.
> So I think the choice of whether to write a null field to the database can be left to the user.
> {code:sql}
>  CREATE TABLE IF NOT EXISTS t_source (
>     `user_id`  bigint,
>     `A` string,
>     `B` string,
>     `C` string,
>     `flag` varchar(256)
> )WITH (
>     'connector' = 'kafka',
>     'format' = 'canal-json',
>     'scan.startup.mode' = 'latest-offset',
>     ... ...
> );
> CREATE TABLE IF NOT EXISTS t_sink (
>     `user_id`  bigint,
>     `A` string,
>     `B` string,
>     `C` string,
>     `flag` varchar(256),
>     PRIMARY KEY (`user_id`) NOT ENFORCED
> )WITH (
>     'connector' = 'jdbc',
>     'url' = 'jdbc:mysql://xx.xx.xx.xx:xxx/test',
>     'table-name' = 'user',
>     ... ...
>  ); 
> INSERT INTO t_sink(
>     `user_id`,
>     `A`,
>     `B`,
>     `C`,
>     `flag`
> ) SELECT  `user_id`, `A`, `B`, `C`, `flag` FROM t_source;
> {code}
>  
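
One way to picture the requested behaviour is a statement builder that leaves null fields out of the upsert. The sketch below is only an illustration under assumptions: it targets MySQL's `INSERT ... ON DUPLICATE KEY UPDATE` syntax, and the class `PartialUpsertSqlSketch` and method `buildUpsert` are hypothetical names, not part of flink-connector-jdbc.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that builds a MySQL upsert covering only non-null fields. */
final class PartialUpsertSqlSketch {

    static String buildUpsert(String table, String keyColumn, Map<String, Object> row) {
        // Keep the key column plus every column that actually carries a value.
        List<String> columns = new ArrayList<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (e.getKey().equals(keyColumn) || e.getValue() != null) {
                columns.add(e.getKey());
            }
        }
        String names = columns.stream()
                .map(c -> "`" + c + "`")
                .collect(Collectors.joining(", "));
        String marks = columns.stream()
                .map(c -> "?")
                .collect(Collectors.joining(", "));
        String updates = columns.stream()
                .filter(c -> !c.equals(keyColumn))
                .map(c -> "`" + c + "` = VALUES(`" + c + "`)")
                .collect(Collectors.joining(", "));
        if (updates.isEmpty()) {
            // Nothing to update: fall back to a no-op assignment on the key.
            updates = "`" + keyColumn + "` = `" + keyColumn + "`";
        }
        return "INSERT INTO `" + table + "` (" + names + ") VALUES (" + marks
                + ") ON DUPLICATE KEY UPDATE " + updates;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the column order of the sink table definition.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("user_id", 1L);
        row.put("A", "a1");
        row.put("B", null);
        row.put("C", null);
        row.put("flag", null);
        System.out.println(buildUpsert("user", "user_id", row));
    }
}
```

Because the column list depends on which fields are null, rows with different null patterns produce different statements, so they cannot all share a single prepared statement in one batch.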



--
This message was sent by Atlassian Jira
(v8.20.10#820010)