Posted to jira@kafka.apache.org by "Paul Davidson (JIRA)" <ji...@apache.org> on 2018/10/10 20:30:01 UTC

[jira] [Commented] (KAFKA-5061) client.id should be set for Connect producers/consumers

    [ https://issues.apache.org/jira/browse/KAFKA-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645507#comment-16645507 ] 

Paul Davidson commented on KAFKA-5061:
--------------------------------------

This bug became urgent for us at Salesforce because we needed full task-level monitoring in production for Mirus (our open source Kafka Source Connector). As a workaround we created a patch to meet our specific needs, and it has been running successfully in production since March. I would really like to see this bug resolved, so I have submitted a pull request ([https://github.com/apache/kafka/pull/5775]) with our battle-tested patch.

The change is similar to the one Satyajit submitted, but instead of using a default based on the group id, it appends the task id to the client id when a “unique.client.id” property is set to true (false by default). This lets us control the client id independently of the group id, so we can use a custom prefix on our producer client ids, which is useful for monitoring.
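
To make the behaviour concrete, here is a minimal sketch of the idea, not the code from the PR. The class name, the helper method, the "-" separator, and the choice to leave the config untouched when no client.id is configured are all assumptions made for illustration; only the “unique.client.id” flag and the "append the task id" behaviour come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientIdSketch {

    // Hypothetical helper (not from the PR): returns a copy of the producer
    // config in which client.id has the task id appended, but only when the
    // unique-client-id flag is on and a client.id is actually configured.
    public static Map<String, Object> withTaskClientId(Map<String, Object> producerProps,
                                                       boolean uniqueClientId,
                                                       String taskId) {
        Map<String, Object> result = new HashMap<>(producerProps);
        Object base = result.get("client.id");
        if (uniqueClientId && base != null) {
            // Assumed format: "<configured client.id>-<task id>"
            result.put("client.id", base + "-" + taskId);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("client.id", "mirus-producer"); // custom prefix, example value

        // Flag on: each task gets its own client id.
        System.out.println(withTaskClientId(props, true, "my-connector-0").get("client.id"));
        // Flag off (the default): the configured client id is used unchanged.
        System.out.println(withTaskClientId(props, false, "my-connector-0").get("client.id"));
    }
}
```

With the flag off, behaviour is unchanged, so anyone already relying on the existing client ids for monitoring should not be affected.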

[~ewencp] this may not address all your concerns, but it solved an urgent issue for us in a real-world use case. Let me know where the gaps are and I will try to help find a general solution. If the new Worker property in the PR requires a KIP, let me know and I can help create one. There are no unit tests at present, but I am happy to add them if the change otherwise looks OK.

> client.id should be set for Connect producers/consumers
> -------------------------------------------------------
>
>                 Key: KAFKA-5061
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5061
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 0.10.2.1
>            Reporter: Ewen Cheslack-Postava
>            Priority: Major
>              Labels: needs-kip, newbie++
>
> In order to properly monitor individual tasks using the producer and consumer metrics, we need to have the framework disambiguate them. Currently when we create producers (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L362) and create consumers (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L371-L394) the client ID is not being set. You can override it for the entire worker via worker-level producer/consumer overrides, but you can't get per-task metrics.
> There are a couple of things we might want to consider doing here:
> 1. Provide default client IDs based on the worker group ID + task ID (providing uniqueness for multiple connect clusters up to the scope of the Kafka cluster they are operating on). This seems ideal since it's a good default; however it is a public-facing change and may need a KIP. Normally I would be less worried about this, but some folks may be relying on picking up metrics without this being set, in which case such a change would break their monitoring.
> 2. Allow overriding client.id on a per-connector basis. I'm not sure if this will really be useful or not -- it lets you differentiate between metrics for different connectors' tasks, but within a connector, all metrics would go to a single client.id. On the other hand, this makes the tasks act as a single group from the perspective of broker handling of client IDs.


