Posted to jira@kafka.apache.org by "Sandeep Tamhankar (JIRA)" <ji...@apache.org> on 2018/05/25 18:28:00 UTC

[jira] [Commented] (KAFKA-6811) Tasks should have access to connector and task metadata

    [ https://issues.apache.org/jira/browse/KAFKA-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491129#comment-16491129 ] 

Sandeep Tamhankar commented on KAFKA-6811:
------------------------------------------

I'd suggest taking this one step further: I am working on a sink connector to push data to an external system. Establishing a connection/session in that system has a non-trivial cost, but once established, that session can be shared by all tasks. Currently, the task has no reference to the owning connector object, nor does the connector have the ability to send non-string objects to the task (e.g. Connector.taskConfigs only supports string values in the map).

Thus, a Connector developer has two choices:
 # Have each task create its own session to the external system.
 # Have a static member (session) on the Connector class that the Task can access when needed.

Option 1 is inefficient and unnecessarily uses resources in the Connector process as well as the external system.

Option 2 gets ugly fast: say you have two instances of the Connector with different configurations (connecting to different instances of the external service, for different topics). A single static member is no longer appropriate; you need a static map, keyed on some unique identifier (specified in the connector config), to distinguish the session of one Connector instance from that of another. You also need to expose a static method so that tasks can access the session that is appropriate for them. Doable, yes, but quite a bit of arm-twisting.
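
To make the Option 2 workaround concrete, here is a minimal sketch of such a static registry. It assumes the tasks run in the same JVM as their connector, which is not guaranteed when Connect distributes tasks across workers. ExternalSession, SessionRegistry and the idea of passing the id through a string config entry are hypothetical names for illustration, not part of the Kafka Connect API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the external system's expensive session object.
final class ExternalSession {
    final String endpoint;

    ExternalSession(String endpoint) {
        this.endpoint = endpoint;
    }
}

// Static registry holding one shared session per connector instance, keyed on
// a unique id taken from the connector config (assumed, e.g. a "session.id" entry).
final class SessionRegistry {
    private static final Map<String, ExternalSession> SESSIONS = new ConcurrentHashMap<>();

    private SessionRegistry() {}

    // Intended to be called from the connector's start(): creates the session
    // only if none is registered under this id yet.
    static ExternalSession register(String id, Supplier<ExternalSession> factory) {
        return SESSIONS.computeIfAbsent(id, k -> factory.get());
    }

    // Intended to be called from the task's start(): the id reaches the task
    // as a plain string in its config map.
    static ExternalSession lookup(String id) {
        ExternalSession session = SESSIONS.get(id);
        if (session == null) {
            throw new IllegalStateException("No session registered for " + id);
        }
        return session;
    }

    // Intended to be called from the connector's stop().
    static void unregister(String id) {
        SESSIONS.remove(id);
    }
}
```

Even in this small form, the connector has to invent an id, copy it into every task config, and manage the registry's lifecycle by hand, which is the arm-twisting described above.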

> Tasks should have access to connector and task metadata
> -------------------------------------------------------
>
>                 Key: KAFKA-6811
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6811
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Jeremy Custenborder
>            Priority: Major
>
> As a connector developer it would be nice to have access to more metadata from within a (Source|Sink)Task. For example, I could use this to log task-specific data. There are several connectors where I only run a single task but would be able to do taskId() % totalTasks() for partitioning.
> At a high level, I'm thinking of something like this:
> {code:java}
> String connectorName();
> int taskId();
> int totalTasks();
> {code}
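
For illustration, the accessors proposed in the issue description could support modulo-based work assignment and task-specific log prefixes roughly as follows. This is only a sketch: TaskMetadata and WorkAssignment are hypothetical names, and the interface simply mirrors the three proposed methods rather than any existing Kafka Connect API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface mirroring the three accessors proposed in the issue.
interface TaskMetadata {
    String connectorName();
    int taskId();
    int totalTasks();
}

final class WorkAssignment {
    private WorkAssignment() {}

    // Keeps the items whose list index maps to this task under modulo assignment,
    // so every item is handled by exactly one of the totalTasks() tasks.
    static <T> List<T> itemsForTask(List<T> items, TaskMetadata meta) {
        List<T> mine = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            if (i % meta.totalTasks() == meta.taskId()) {
                mine.add(items.get(i));
            }
        }
        return mine;
    }

    // Builds a prefix that identifies the task in log lines.
    static String logPrefix(TaskMetadata meta) {
        return "[" + meta.connectorName() + " task " + meta.taskId() + "/" + meta.totalTasks() + "]";
    }
}
```

The sketch assumes task ids run from 0 to totalTasks() - 1; the issue does not state the numbering.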


