You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/09/16 23:58:11 UTC

[GitHub] [kafka] ableegoldman opened a new pull request #9295: MINOR: standardize rebalance related logging for easy discovery & debugging

ableegoldman opened a new pull request #9295:
URL: https://github.com/apache/kafka/pull/9295


   Some minor logging adjustments to standardize the grammar of rebalance related messages and make it easy to query the logs for quick debugging results


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] guozhangwang commented on a change in pull request #9295: MINOR: standardize rebalance related logging for easy discovery & debugging

Posted by GitBox <gi...@apache.org>.
guozhangwang commented on a change in pull request #9295:
URL: https://github.com/apache/kafka/pull/9295#discussion_r491484721



##########
File path: clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java
##########
@@ -598,11 +598,12 @@ public void handle(JoinGroupResponse joinResponse, RequestFuture<ByteBuffer> fut
                     }
                 }
             } else if (error == Errors.COORDINATOR_LOAD_IN_PROGRESS) {
-                log.debug("Attempt to join group rejected since coordinator {} is loading the group.", coordinator());
+                log.debug("JoinGroup failed: Coordinator {} is loading the group.", coordinator());
                 // backoff and retry
                 future.raise(error);
             } else if (error == Errors.UNKNOWN_MEMBER_ID) {
-                log.debug("Attempt to join group failed due to unknown member id with {}.", sentGeneration);
+                log.info("JoinGroup failed: {} Need to re-join the group. Sent generation was {}",

Review comment:
       Why promote it from debug to info while leaving others debug?

##########
File path: clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java
##########
@@ -769,24 +771,27 @@ public void handle(SyncGroupResponse syncResponse,
                 if (error == Errors.GROUP_AUTHORIZATION_FAILED) {
                     future.raise(GroupAuthorizationException.forGroupId(rebalanceConfig.groupId));
                 } else if (error == Errors.REBALANCE_IN_PROGRESS) {
-                    log.debug("SyncGroup failed because the group began another rebalance");
+                    log.info("SyncGroup failed: The group began another rebalance. Need to re-join the group. " +
+                                 "Sent generation was {}", sentGeneration);
                     future.raise(error);
                 } else if (error == Errors.FENCED_INSTANCE_ID) {
                     // for sync-group request, even if the generation has changed we would not expect the instance id
                     // gets fenced, and hence we always treat this as a fatal error
-                    log.error("SyncGroup with {} failed because the group instance id {} has been fenced by another instance",
-                        sentGeneration, rebalanceConfig.groupInstanceId);
+                    log.error("SyncGroup failed: The group instance id {} has been fenced by another instance. " +
+                        "Sent generation was {}", rebalanceConfig.groupInstanceId, sentGeneration);
                     future.raise(error);
                 } else if (error == Errors.UNKNOWN_MEMBER_ID
                         || error == Errors.ILLEGAL_GENERATION) {
-                    log.info("SyncGroup with {} failed: {}, would request re-join", sentGeneration, error.message());
+                    log.info("SyncGroup failed: {} Need to re-join the group. Sent generation was {}",
+                            error.message(), sentGeneration);
                     if (generationUnchanged())
                         resetGenerationOnResponseError(ApiKeys.SYNC_GROUP, error);
 
                     future.raise(error);
                 } else if (error == Errors.COORDINATOR_NOT_AVAILABLE
                         || error == Errors.NOT_COORDINATOR) {
-                    log.debug("SyncGroup failed: {}, marking coordinator unknown", error.message());
+                    log.debug("SyncGroup failed: {} Marking coordinator unknown. Sent generation was {}",

Review comment:
       Ditto here, if we think we should pay attention to any errors excluding things like coordinator loading in progress let's just make them all info.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9295: MINOR: standardize rebalance related logging for easy discovery & debugging

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9295:
URL: https://github.com/apache/kafka/pull/9295#discussion_r495292893



##########
File path: clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java
##########
@@ -769,24 +771,27 @@ public void handle(SyncGroupResponse syncResponse,
                 if (error == Errors.GROUP_AUTHORIZATION_FAILED) {
                     future.raise(GroupAuthorizationException.forGroupId(rebalanceConfig.groupId));
                 } else if (error == Errors.REBALANCE_IN_PROGRESS) {
-                    log.debug("SyncGroup failed because the group began another rebalance");
+                    log.info("SyncGroup failed: The group began another rebalance. Need to re-join the group. " +
+                                 "Sent generation was {}", sentGeneration);
                     future.raise(error);
                 } else if (error == Errors.FENCED_INSTANCE_ID) {
                     // for sync-group request, even if the generation has changed we would not expect the instance id
                     // gets fenced, and hence we always treat this as a fatal error
-                    log.error("SyncGroup with {} failed because the group instance id {} has been fenced by another instance",
-                        sentGeneration, rebalanceConfig.groupInstanceId);
+                    log.error("SyncGroup failed: The group instance id {} has been fenced by another instance. " +
+                        "Sent generation was {}", rebalanceConfig.groupInstanceId, sentGeneration);
                     future.raise(error);
                 } else if (error == Errors.UNKNOWN_MEMBER_ID
                         || error == Errors.ILLEGAL_GENERATION) {
-                    log.info("SyncGroup with {} failed: {}, would request re-join", sentGeneration, error.message());
+                    log.info("SyncGroup failed: {} Need to re-join the group. Sent generation was {}",
+                            error.message(), sentGeneration);
                     if (generationUnchanged())
                         resetGenerationOnResponseError(ApiKeys.SYNC_GROUP, error);
 
                     future.raise(error);
                 } else if (error == Errors.COORDINATOR_NOT_AVAILABLE
                         || error == Errors.NOT_COORDINATOR) {
-                    log.debug("SyncGroup failed: {}, marking coordinator unknown", error.message());
+                    log.debug("SyncGroup failed: {} Marking coordinator unknown. Sent generation was {}",

Review comment:
       Sounds good




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] guozhangwang merged pull request #9295: MINOR: standardize rebalance related logging for easy discovery & debugging

Posted by GitBox <gi...@apache.org>.
guozhangwang merged pull request #9295:
URL: https://github.com/apache/kafka/pull/9295


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9295: MINOR: standardize rebalance related logging for easy discovery & debugging

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9295:
URL: https://github.com/apache/kafka/pull/9295#issuecomment-695141073


   Unrelated `kafka.api.ConsumerBounceTest failed`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9295: MINOR: standardize rebalance related logging for easy discovery & debugging

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9295:
URL: https://github.com/apache/kafka/pull/9295#issuecomment-693727341


   @guozhangwang 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org