Posted to dev@kafka.apache.org by Sarah Story <sa...@sarahstoryengineering.com> on 2021/11/24 16:26:01 UTC

[Discuss] KIP-803: Add Task ID and Connector Name to Connect Task Context

I have written a KIP for adding Task ID and Connector name to the Connect
API task context.

Here is the doc: https://cwiki.apache.org/confluence/x/4pKqCw

Looking forward to hearing all of your thoughts!

Re: [Discuss] KIP-803: Add Task ID and Connector Name to Connect Task Context

Posted by Jordan Bull <jo...@gmail.com>.
Hi all,

I'd like to chime in with my interest in seeing this change! At the risk
of getting a bit off topic, I had been planning to propose a similar KIP
that exposes the connector name and task ID to converters and
transformers, since they run into the same issue: they cannot easily
identify the task that their logging and telemetry should be associated
with. Whatever mechanism exposes this info to the task should also be the
one that exposes it to converters and transformers (if approved). This is
particularly relevant to this conversation because the workaround that
Chris mentioned in his second bullet point cannot be performed by the
connector for converters and transformers in all cases, since converters
are sometimes instantiated using the worker config and other times using
the connector config. I don't want to derail this KIP with a discussion
around exposing these fields to converters and transformers, but I think
it's worth considering whether advocating the workaround would block these
other use cases.

Thanks for the KIP, Sarah (and Ryanne for KIP-438)!
Jordan

On Tue, Dec 7, 2021 at 2:27 PM Chris Egerton <ch...@confluent.io.invalid>
wrote:

> Hi Sarah,
>
> Thanks for the KIP! I have two major thoughts:
>
> 1. Adding new methods like this to the Connect API comes with the risk that
> connectors that invoke them become incompatible with older versions of
> Connect. For example, if I updated my connector to use the newly-proposed
> SourceTaskContext::connector method, it would fail with a NoSuchMethodError
> when run on Connect version 3.1 (which will not have this feature). We've
> been careful to document this limitation in any newly-introduced methods in
> the past (see KIP-610 [1] and KIP-618 [2], for example), but even then,
> there's still risk with anything like this and we may not want to expand
> the API if the benefits don't outweigh the costs.
>
> 2. It's already possible to implement this logic directly in a connector
> today, without any changes to the Connect framework. Every connector can
> learn its own name in Connector::start by reading the "name" configuration
> property, and can then choose to pass that information along to its tasks
> as part of the configs it generates in Connector::taskConfigs. And, in the
> same way, connectors can choose to provide IDs for each task in the task
> configs that they generate. If this isn't sufficient for your use cases, it
> should be documented as a rejected alternative.
>
> BTW, it looks like this aims to accomplish something very similar or even
> identical to KIP-438 [3]. Ryanne Dolan (the author of that KIP) may want to
> weigh in here; I've CC'd them.
>
> [1] -
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors#KIP610:ErrorReportinginSinkConnectors-Method
> (see javadocs for SinkTaskContext::errantRecordReporter, paragraph starting
> with "This method was added in Apache Kafka 2.6")
> [2] -
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-ConnectorAPIexpansions
> (see javadocs for SourceTaskContext::transactionContext, paragraph starting
> with "This method was added in Apache Kafka 3.0")
> [3] -
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-438%3A+Expose+task%2C+connector+IDs+in+Connect+API
>
> Cheers,
>
> Chris

Re: [Discuss] KIP-803: Add Task ID and Connector Name to Connect Task Context

Posted by Chris Egerton <ch...@confluent.io.INVALID>.
Hi Sarah,

Thanks for the KIP! I have two major thoughts:

1. Adding new methods like this to the Connect API comes with the risk that
connectors that invoke them become incompatible with older versions of
Connect. For example, if I updated my connector to use the newly-proposed
SourceTaskContext::connector method, it would fail with a NoSuchMethodError
when run on Connect version 3.1 (which will not have this feature). We've
been careful to document this limitation in any newly-introduced methods in
the past (see KIP-610 [1] and KIP-618 [2], for example), but even then,
there's still risk with anything like this and we may not want to expand
the API if the benefits don't outweigh the costs.

2. It's already possible to implement this logic directly in a connector
today, without any changes to the Connect framework. Every connector can
learn its own name in Connector::start by reading the "name" configuration
property, and can then choose to pass that information along to its tasks
as part of the configs it generates in Connector::taskConfigs. And, in the
same way, connectors can choose to provide IDs for each task in the task
configs that they generate. If this isn't sufficient for your use cases, it
should be documented as a rejected alternative.
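
For anyone following along, the workaround in point 2 could look roughly
like the sketch below. It is a standalone model with no Connect dependency:
the class name NameForwardingConnector and the two "example.*" config keys
are made up for illustration, and a real connector would put this logic in
its own Connector::start and Connector::taskConfigs overrides. The task ID
here is simply one the connector chooses for itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Standalone model of the workaround. A real connector would extend a
// Connect connector class and override start(...) and taskConfigs(...).
public class NameForwardingConnector {
    // Hypothetical keys chosen for this sketch; they are not defined by Connect.
    public static final String CONNECTOR_NAME_KEY = "example.connector.name";
    public static final String TASK_ID_KEY = "example.task.id";

    private Map<String, String> connectorProps = new HashMap<>();

    // Mirrors Connector::start: keep a copy of the connector's configuration,
    // which includes the "name" property.
    public void start(Map<String, String> props) {
        this.connectorProps = new HashMap<>(props);
    }

    // Mirrors Connector::taskConfigs: build one config map per task and add
    // the connector name plus a connector-chosen task ID to each of them.
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        String name = connectorProps.get("name");
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            Map<String, String> taskProps = new HashMap<>(connectorProps);
            taskProps.put(CONNECTOR_NAME_KEY, name);
            taskProps.put(TASK_ID_KEY, Integer.toString(i));
            configs.add(taskProps);
        }
        return configs;
    }

    public static void main(String[] args) {
        NameForwardingConnector connector = new NameForwardingConnector();
        connector.start(Map.of("name", "my-connector"));
        // Each task would read these two keys from the props passed to its start method.
        for (Map<String, String> cfg : connector.taskConfigs(2)) {
            System.out.println(cfg.get(CONNECTOR_NAME_KEY) + "-" + cfg.get(TASK_ID_KEY));
        }
    }
}
```

Each task can then read the two keys from the properties it is started
with and use them to tag its logging and telemetry.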

BTW, it looks like this aims to accomplish something very similar or even
identical to KIP-438 [3]. Ryanne Dolan (the author of that KIP) may want to
weigh in here; I've CC'd them.

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors#KIP610:ErrorReportinginSinkConnectors-Method
(see javadocs for SinkTaskContext::errantRecordReporter, paragraph starting
with "This method was added in Apache Kafka 2.6")
[2] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-ConnectorAPIexpansions
(see javadocs for SourceTaskContext::transactionContext, paragraph starting
with "This method was added in Apache Kafka 3.0")
[3] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-438%3A+Expose+task%2C+connector+IDs+in+Connect+API
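
To make the risk in point 1 concrete, here is a rough sketch of how a task
might guard a call to a newly added context method and fall back to a value
passed through its config. Everything in it is a stand-in: TaskContextLike
is a hypothetical interface rather than the real SourceTaskContext, the
"example.connector.name" key is invented, and the "old runtime" case is
simulated by throwing NoSuchMethodError by hand.

```java
import java.util.Map;

public class GuardedContextLookup {
    // Hypothetical stand-in for a task context that offers the proposed method.
    public interface TaskContextLike {
        String connector();
    }

    // Hypothetical config key a connector could use to forward its own name.
    public static final String FALLBACK_NAME_KEY = "example.connector.name";

    // Try the new context method first; if the runtime does not provide it,
    // fall back to the name the connector placed in the task config.
    public static String resolveConnectorName(TaskContextLike context,
                                              Map<String, String> taskProps) {
        try {
            return context.connector();
        } catch (NoSuchMethodError | NoClassDefFoundError e) {
            return taskProps.get(FALLBACK_NAME_KEY);
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(FALLBACK_NAME_KEY, "my-connector");
        // Simulates a runtime that has the method.
        TaskContextLike newRuntime = () -> "name-from-context";
        // Simulates an older runtime, where the call fails at link time.
        TaskContextLike oldRuntime = () -> {
            throw new NoSuchMethodError("connector()");
        };
        System.out.println(resolveConnectorName(newRuntime, props));
        System.out.println(resolveConnectorName(oldRuntime, props));
    }
}
```

A guard like this keeps the connector deployable on older runtimes, but
every connector author has to remember to write it, which is part of the
cost being weighed above.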

Cheers,

Chris

On Tue, Dec 7, 2021 at 10:07 AM Sarah Story <sa...@sarahstoryengineering.com>
wrote:

> Hi all! Just wanted to bump this KIP for adding Task ID and Connector Name
> to the task context. It's a small change, and I'd love some feedback on it!
>
> Thanks!
> Sarah

Re: [Discuss] KIP-803: Add Task ID and Connector Name to Connect Task Context

Posted by Sarah Story <sa...@sarahstoryengineering.com>.
Hi all! Just wanted to bump this KIP for adding Task ID and Connector Name
to the task context. It's a small change, and I'd love some feedback on it!

Thanks!
Sarah

On Wed, Nov 24, 2021 at 10:26 AM Sarah Story <
sarah@sarahstoryengineering.com> wrote:

> I have written a KIP for adding Task ID and Connector name to the Connect
> API task context.
>
> Here is the doc: https://cwiki.apache.org/confluence/x/4pKqCw
>
> Looking forward to hearing all of your thoughts!
>