Posted to jira@kafka.apache.org by "Luke Chen (Jira)" <ji...@apache.org> on 2022/05/23 02:34:00 UTC

[jira] [Commented] (KAFKA-13926) Proposal to have "HasField" predicate for kafka connect

    [ https://issues.apache.org/jira/browse/KAFKA-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540736#comment-17540736 ] 

Luke Chen commented on KAFKA-13926:
-----------------------------------

[~kumudkumartirupati], one reminder: if you're going to add a field to the public API, you should create a KIP and make sure it is publicly discussed and voted on. Please see this page for more detail: [https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]

Please let me know if you have any problems. Thanks.

> Proposal to have "HasField" predicate for kafka connect
> -------------------------------------------------------
>
>                 Key: KAFKA-13926
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13926
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Kumud Kumar Srivatsava Tirupati
>            Assignee: Kumud Kumar Srivatsava Tirupati
>            Priority: Major
>
> Hello,
> Today's Connect predicates enable checks on the record metadata. However, this can be limiting, considering that many built-in and custom transformations that we (the community) use are more key/value centric.
> Some use-cases this can solve:
>  * Data type conversions of certain pre-identified fields for records coming from multiple datasets, applied only if those fields exist. [Ex: TimestampConverter can be run only if the specified date field exists, irrespective of the record metadata]
>  * Skipping certain transforms if a given field does or does not exist. Many built-in transforms raise exceptions (Ex: the InsertField transform if the field already exists), thereby breaking the task. Giving users this control enables them to configure for such cases deliberately.
>  * Even though some built-in transforms explicitly handle these cases, running them would still be an unnecessary pass-through.
>  * Considering that each connector usually deals with multiple datasets (even hundreds for a database CDC connector), metadata-centric predicate checking is limiting when it comes to such pre-identified custom metadata fields in the records.
> I know some of these cases can be handled within the transforms themselves, but that defeats the purpose of having predicates.
> We have built this predicate for our own use and have found it extremely helpful. Please let me know your thoughts so that I can raise a PR.
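
To make the proposal above concrete, here is a minimal, dependency-free sketch of the kind of check a "HasField" predicate might perform. It is not the reporter's implementation: the class name HasFieldSketch, the method hasField, and the dotted-path syntax for nested fields are all hypothetical. It assumes schemaless record values represented as nested Maps; a real predicate would put similar logic inside the test(record) method of a class implementing org.apache.kafka.connect.transforms.predicates.Predicate, and would also need to handle Struct values via their schema.

```java
import java.util.Map;

/**
 * Dependency-free sketch of the check a "HasField" predicate might perform.
 * Hypothetical: the class name, method name, and dotted-path syntax are
 * illustrative assumptions, not part of Kafka Connect or of the proposal.
 */
final class HasFieldSketch {

    private HasFieldSketch() {
    }

    /**
     * Returns true if every segment of the dotted path resolves to a key in
     * nested Map values (the shape of a schemaless Connect record value).
     * A key that is present with a null value counts as existing.
     */
    static boolean hasField(Object value, String fieldPath) {
        Object current = value;
        for (String segment : fieldPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return false; // ran out of structure before the path ended
            }
            Map<?, ?> map = (Map<?, ?>) current;
            if (!map.containsKey(segment)) {
                return false; // this path segment is missing
            }
            current = map.get(segment);
        }
        return true;
    }
}
```

With a check like this exposed as a predicate, a transform such as TimestampConverter could be attached through Connect's existing per-transform "predicate" and "negate" settings, so that it runs only when the field is present (or only when it is absent).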


