Posted to users@hudi.apache.org by vtygoss <vt...@126.com> on 2021/07/12 11:38:15 UTC
hudi hard deletion in flink sql AND detect deletions of upstream
Hi,
I have two questions:
1. How do I specify hard deletion in hudi-flink-bundle-0.9.0?
2. How can a downstream hudi-flink SQL streaming job detect deletion events? The downstream jobs need to detect deletions in the input Hudi table and act accordingly.
I tried using org.apache.hudi.common.model.EmptyHoodieRecordPayload, but it does not seem to perform a real deletion; instead it appears to emit null values for the non-primary-key fields? I am not sure. BTW, the class EmptyHoodieRecordPayload lacks a constructor that takes a parameter of class "org.apache.hudi.common.util.Option".
Please offer some advice, thank you very much!
Best Regards!
```
CREATE TABLE t3(
uuid VARCHAR(20),
name VARCHAR(10),
age INT,
ts TIMESTAMP(3),
`partition` VARCHAR(20),
primary key(uuid) not enforced
)
PARTITIONED BY (`partition`)
WITH (
'connector' = 'hudi',
'path' = 'hdfs://bruneihealth/user/data/db/hudi_flink/t3',
'table.type' = 'MERGE_ON_READ',
'read.tasks' = '1',
'read.streaming.enabled' = 'true',
'read.streaming.check-interval' = '1',
'hoodie.datasource.write.partitionpath.field'='_hoodie_partition_path',
'write.payload.class' = 'org.apache.hudi.common.model.EmptyHoodieRecordPayload',
'compaction.async.enabled'='false'
);
```
Re: hudi hard deletion in flink sql AND detect deletions of upstream
Posted by Danny Chan <da...@apache.org>.
Hi vtygoss ~
By default, when consuming DELETEs from a CDC stream, the Flink writer
nullifies the payload instance so that the write handle recognizes these
DELETEs and performs a HARD delete: nothing is written to the file.
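As a rough illustration of that default path, here is a minimal sketch of
feeding a CDC stream into the Hudi table. It assumes t3 is declared as in the
original message but without the 'write.payload.class' override, and that the
changes arrive on Kafka in Debezium JSON. The source table name, topic,
bootstrap servers and group id are all hypothetical placeholders, not values
from this thread.

```sql
-- Hypothetical CDC source; every name and address below is a placeholder.
-- Assumes the upstream payload encodes `ts` in a form the JSON format can parse.
CREATE TABLE t3_cdc_source (
  uuid VARCHAR(20),
  name VARCHAR(10),
  age INT,
  ts TIMESTAMP(3),
  `partition` VARCHAR(20)
) WITH (
  'connector' = 'kafka',
  'topic' = 't3_changes',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 't3_hudi_writer',
  'scan.startup.mode' = 'earliest-offset',
  -- A changelog format, so the source emits INSERT/UPDATE/DELETE rows.
  'format' = 'debezium-json'
);

-- DELETE rows from the changelog are passed to the Hudi writer, which,
-- per the explanation above, should treat them as hard deletes.
INSERT INTO t3
SELECT uuid, name, age, ts, `partition`
FROM t3_cdc_source;
```

With this shape no custom payload class should be needed: the delete
semantics come from the row kind carried by the changelog stream rather than
from the table configuration.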
If you want to detect DELETEs downstream, you may need to wait for
HUDI-1771, which will keep DELETEs with proper change flags.
Best,
Danny Chan