Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2022/07/01 07:35:14 UTC

[GitHub] [kafka] cadonna commented on a diff in pull request #12337: KAFKA-10199: Remove main consumer from store changelog reader

cadonna commented on code in PR #12337:
URL: https://github.com/apache/kafka/pull/12337#discussion_r911669806


##########
streams/src/main/java/org/apache/kafka/streams/processor/internals/StoreChangelogReader.java:
##########
@@ -693,23 +689,34 @@ private void maybeInitTaskTimeoutOrThrow(final Set<Task> tasks,
         tasks.forEach(t -> t.maybeInitTaskTimeoutOrThrow(now, cause));
     }
 
-    private Map<TopicPartition, Long> committedOffsetForChangelogs(final Map<TaskId, Task> tasks,
-                                                                   final Set<TopicPartition> partitions) {
-        final Map<TopicPartition, Long> committedOffsets;
+    private Map<TopicPartition, Long> committedOffsetForChangelogs(final Map<TaskId, Task> tasks, final Set<TopicPartition> partitions) {
+        if (partitions.isEmpty()) {
+            return Collections.emptyMap();
+        }
+
         try {
-            committedOffsets = fetchCommittedOffsets(partitions, mainConsumer);
+            // those which do not have a committed offset would default to 0
+            final ListConsumerGroupOffsetsOptions options = new ListConsumerGroupOffsetsOptions();
+            options.topicPartitions(new ArrayList<>(partitions));
+            options.requireStable(true);
+            final Map<TopicPartition, Long> committedOffsets = adminClient.listConsumerGroupOffsets(groupId, options)
+                    .partitionsToOffsetAndMetadata().get().entrySet()
+                    .stream()
+                    .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue() == null ? 0L : e.getValue().offset()));
+
             clearTaskTimeout(getTasksFromPartitions(tasks, partitions));
-        } catch (final TimeoutException timeoutException) {
-            log.debug("Could not fetch all committed offsets for {}, will retry in the next run loop", partitions);
-            maybeInitTaskTimeoutOrThrow(getTasksFromPartitions(tasks, partitions), timeoutException);
+            return committedOffsets;
+        } catch (final TimeoutException | InterruptedException | ExecutionException retriableException) {
+            log.debug("Could not retrieve the committed offset for partitions {} due to {}, will retry in the next run loop",

Review Comment:
   This typo was introduced from my earlier suggestion. Sorry for that!
   ```suggestion
               log.debug("Could not retrieve the committed offsets for partitions {} due to {}, will retry in the next run loop",
   ```
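For context on the diff above: the `toMap` collector maps partitions that have no committed offset (which come back as `null` from `partitionsToOffsetAndMetadata()`) to a default of `0L`. A minimal standalone sketch of that defaulting step, using plain `Long` values instead of Kafka's `OffsetAndMetadata` (class and method names here are illustrative, not from the Kafka codebase):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public final class OffsetDefaulting {

    // Partitions without a committed offset arrive with a null value;
    // map them to 0L so downstream restoration logic always sees a number.
    // Note: the source map must allow null values (e.g. HashMap), and the
    // value mapper must never itself return null, or toMap throws NPE.
    static Map<String, Long> defaultMissingToZero(final Map<String, Long> committed) {
        return committed.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> e.getValue() == null ? 0L : e.getValue()));
    }

    public static void main(final String[] args) {
        final Map<String, Long> committed = new HashMap<>();
        committed.put("changelog-0", 42L);
        committed.put("changelog-1", null); // no committed offset yet
        System.out.println(defaultMissingToZero(committed));
    }
}
```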



##########
streams/src/main/java/org/apache/kafka/streams/processor/internals/StoreChangelogReader.java:
##########
@@ -697,7 +693,7 @@ private Map<TopicPartition, Long> committedOffsetForChangelogs(final Map<TaskId,
                                                                    final Set<TopicPartition> partitions) {
         final Map<TopicPartition, Long> committedOffsets;
         try {
-            committedOffsets = fetchCommittedOffsets(partitions, mainConsumer);

Review Comment:
   I think re-throwing the `InterruptedException` is cleaner. Otherwise we defeat the purpose of interrupting a thread, especially in this shutdown case where we use interruption specifically to interrupt any actions where possible. The old code path should not be affected since we control the stream thread and we never interrupt it.
   The only drawback I see is that we need to add the `throws` clause to the calls up the stack and handle (i.e., ignore) the `InterruptedException` in the old code path in the stream thread.
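To illustrate the point being made: when a blocking call throws `InterruptedException`, the thread's interrupt status is cleared, so catching and swallowing it hides the shutdown request from the caller. Propagating the exception (or at least restoring the flag with `Thread.currentThread().interrupt()`) keeps that signal alive. A self-contained sketch of the two options (class and method names are illustrative, not from the Kafka codebase):

```java
public final class InterruptDemo {

    // Option preferred in the review: declare `throws InterruptedException`
    // so the caller (e.g. the stream thread's run loop) decides what to do.
    static void blockingCall() throws InterruptedException {
        Thread.sleep(1); // any interruptible call; throws immediately if
                         // the thread is already interrupted
    }

    public static void main(final String[] args) {
        Thread.currentThread().interrupt();     // simulate a shutdown interrupt
        boolean sawInterrupt = false;
        try {
            blockingCall();
        } catch (final InterruptedException e) {
            sawInterrupt = true;
            // Old-code-path handling: ignore the exception but restore the
            // interrupt flag, since the exception being thrown cleared it.
            Thread.currentThread().interrupt();
        }
        System.out.println("sawInterrupt=" + sawInterrupt);
        System.out.println("flagStillSet=" + Thread.currentThread().isInterrupted());
    }
}
```

Without the `Thread.currentThread().interrupt()` in the catch block, `flagStillSet` would be `false` and any later interruptible call on the same thread would block normally, which is exactly the "defeat the purpose of interrupting" scenario described above.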



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org