You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/07/10 12:52:22 UTC

[GitHub] [kafka] dajac commented on a change in pull request #10973: KAFKA-13033: COORDINATOR_NOT_AVAILABLE should be unmapped

dajac commented on a change in pull request #10973:
URL: https://github.com/apache/kafka/pull/10973#discussion_r667335981



##########
File path: clients/src/main/java/org/apache/kafka/clients/admin/internals/AlterConsumerGroupOffsetsHandler.java
##########
@@ -136,21 +142,28 @@ private void handleError(
     ) {
         switch (error) {
             case GROUP_AUTHORIZATION_FAILED:
-                log.error("Received authorization failure for group {} in `OffsetCommit` response", groupId,
-                        error.exception());
+                log.error("Received authorization failure for group {} in `{}` response", groupId,
+                        apiName(), error.exception());
                 failed.put(groupId, error.exception());
                 break;
             case COORDINATOR_LOAD_IN_PROGRESS:
+                // If the coordinator is in the middle of loading, then we just need to retry
+                log.debug("`{}` request for group {} failed because the coordinator" +
+                    " is still in the process of loading state. Will retry.", apiName(), groupId);
+                break;
             case COORDINATOR_NOT_AVAILABLE:
             case NOT_COORDINATOR:
-                log.debug("OffsetCommit request for group {} returned error {}. Will retry", groupId, error);
+                // If the coordinator is unavailable or there was a coordinator change, then we unmap
+                // the key so that we retry the `FindCoordinator` request
+                log.debug("`{}` request for group {} returned error {}. " +
+                    "Will attempt to find the coordinator again and retry.", apiName(), groupId, error);
                 unmapped.add(groupId);
                 break;
             default:
-                log.error("Received unexpected error for group {} in `OffsetCommit` response",
-                        groupId, error.exception());
-                failed.put(groupId, error.exception(
-                        "Received unexpected error for group " + groupId + " in `OffsetCommit` response"));
+                final String unexpectedErrorMsg = String.format("Received unexpected error for group %s in `%s` response",
+                    groupId, apiName());
+                log.error(unexpectedErrorMsg, error.exception());
+                failed.put(groupId, error.exception(unexpectedErrorMsg));

Review comment:
       The error handling bugs me a bit. `failed` will result in failing the future eventually, right? The issue is that we received errors either for the group and for the partition in the response and we don't really differentiate them here. For instance, imagine that a partition has an `UNKNOWN_TOPIC_OR_PARTITION` error. Putting it to `failed` with the `groupId` could fail the future and thus results in providing that error for all partitions. This is not correct.
   
   It seems to me that we should differentiate the group level errors from the partition level errors here or we should consider all of them as partition level errors. What do you think?
   
   Also, I think that we should handle all the expected errors here. The default error message here is wrong. There are many errors which expect but which are not handled.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org