Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2022/08/08 14:54:39 UTC

[GitHub] [kafka] C0urante commented on a diff in pull request #12490: KAFKA-14147: Prevent maps from growing monotonically in KafkaConfigBackingStore

C0urante commented on code in PR #12490:
URL: https://github.com/apache/kafka/pull/12490#discussion_r940338098


##########
connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java:
##########
@@ -853,6 +853,9 @@ private void processConnectorConfigRecord(String connectorName, SchemaAndValue v
                 connectorConfigs.remove(connectorName);
                 connectorTaskCounts.remove(connectorName);
                 taskConfigs.keySet().removeIf(taskId -> taskId.connector().equals(connectorName));
+                deferredTaskUpdates.remove(connectorName);
+                connectorTaskCountRecords.remove(connectorName);

Review Comment:
   Unfortunately, we can't safely remove this entry. There may still be zombie tasks running for this connector that would need to be fenced out before bringing up new tasks if it got recreated.
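   
   To make the concern concrete, here is a minimal sketch of why the task count record has to outlive the connector. It is a simplified, hypothetical model and not the actual `KafkaConfigBackingStore` code: the class `TaskCountRecordSketch` and the methods `recordTaskCount`, `deleteConnector`, and `tasksToFence` are assumptions made up for illustration, and only the map names mirror the fields in the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model for illustration; NOT the real KafkaConfigBackingStore.
class TaskCountRecordSketch {
    // Live task counts; cleared when a connector is deleted.
    private final Map<String, Integer> connectorTaskCounts = new HashMap<>();
    // Task count from the most recent task count record; intentionally kept on delete.
    private final Map<String, Integer> connectorTaskCountRecords = new HashMap<>();

    // Assumed helper: remember how many tasks were last brought up for a connector.
    void recordTaskCount(String connector, int taskCount) {
        connectorTaskCounts.put(connector, taskCount);
        connectorTaskCountRecords.put(connector, taskCount);
    }

    // Deleting a connector drops its live state but not its task count record,
    // because tasks from the deleted incarnation may still be running as zombies.
    void deleteConnector(String connector) {
        connectorTaskCounts.remove(connector);
        // Deliberately no connectorTaskCountRecords.remove(connector) here.
    }

    // How many task producers from a previous incarnation may need to be fenced
    // before new tasks are started; 0 if no record is known for this connector.
    int tasksToFence(String connector) {
        return connectorTaskCountRecords.getOrDefault(connector, 0);
    }
}
```

   In this sketch, a connector that ran three tasks and was then deleted still reports three tasks to fence if it is recreated under the same name. Removing the record on delete would make that value 0, and the recreated connector would have no way to know that older tasks might still be alive.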
   
   One of the [rejected alternatives in KIP-618](https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-Cleanupoftaskcountrecordsonconnectordeletion) was to perform a final round of fencing for source connectors after they were deleted. The rationale was that the work would be unnecessary if the connector were never recreated, and would provide little advantage if it were (since the total number of required fencings would remain unchanged).
   
   The point that these task count records can grow monotonically is an interesting one, but I'm not sure it's enough to tip the balance towards a fence-on-delete approach. We don't wipe offsets for deleted source connectors, and although this has been a pain point for many users who would like to reset those offsets without having to rename their connectors, to my knowledge, the storage cost for these offsets has not been an issue.
   
   Have you observed practical fallout from this issue, or is this more of a general exercise in code cleanliness?


