You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/11/10 01:38:07 UTC

[GitHub] [kafka] ableegoldman opened a new pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

ableegoldman opened a new pull request #9489:
URL: https://github.com/apache/kafka/pull/9489


   This message absolutely floods the logs, especially in an eos application where the commit interval is just 100ms. It's definitely a useful message but I don't think there's any justification for it being at the INFO level when it's logged 10 times a second. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman closed pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman closed pull request #9489:
URL: https://github.com/apache/kafka/pull/9489


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-757068925


   Cherrypicked to 2.7


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman merged pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman merged pull request #9489:
URL: https://github.com/apache/kafka/pull/9489


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] wcarlson5 commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
wcarlson5 commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r520961368



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,14 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this when we actually processed some records to avoid flooding INFO-level logs when there are
+            // no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Finished processing {} records, going to poll again", totalProcessed);
+            } else {
+                log.debug("Processed zero records, going to poll again");

Review comment:
       Is it important to make the distinction between "Processed zero records, going to poll again" and "Finished processing 0 records, going to poll again"?
   
   Also if we are not saying its processed anything maybe we should log what it is pulling from




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-727043278


   Merged to trunk. Waiting to cherrypick to 2.7 branch until 2.7.0 release has completed


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] vvcephei commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
vvcephei commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r522203482



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,16 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this at INFO when we actually processed some records to avoid flooding INFO-level logs when
+            // there are no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Processed {} records for tasks {}, polling for more records now",
+                         totalProcessed, taskManager.activeTaskIds());
+            } else {
+                log.debug("Processed {} records for tasks {}, polling for more records now",

Review comment:
       This line looks about the same as L665. I'm wondering if we really need the "else" in this case.

##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,16 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this at INFO when we actually processed some records to avoid flooding INFO-level logs when
+            // there are no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Processed {} records for tasks {}, polling for more records now",

Review comment:
       Since this is at the end of runOnce, I'm wondering if it also makes sense to log whether we committed or punctuated, and whether/how many records we polled at the beginning of the method.
   
   Basically, it seems like, if it's a good idea to log some information once per cycle, then it's probably a good idea to summarize everything you'd want to know.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-724391259


   That's fair. Ok I'll go forward with demoting this and just add a new INFO-level log that's less frequent, maybe at the max poll interval


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-727042137


   Two flaky tests: 
   `PurgeRepartitionTopicIntegrationTest.shouldRestoreState`
   `SslSelectorTest.testCloseOldestConnection`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ghmulti commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ghmulti commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-723911887


   There is a lot of useful information on an INFO level, but as you mentioned - having that line from several threads with 100ms frequency makes it way too noisy (especially when container is running on a cluster with log forwarder, which leads to waste of resources). Even though it is not complicated to change the log level configuration for that particular class, it is kind of inconvenient and prevents from using out-of-the box default configs.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman removed a comment on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman removed a comment on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-715639072


   @guozhangwang @vvcephei  WDYT? I'm generally all for more logs but this is pretty extreme 🙂 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] abbccdda commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
abbccdda commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r520959174



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,14 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this when we actually processed some records to avoid flooding INFO-level logs when there are
+            // no records on the input topics
+            if (totalProcessed > 0) {

Review comment:
       Could we log the owned task here as well?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] wcarlson5 commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
wcarlson5 commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r521055615



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,14 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this when we actually processed some records to avoid flooding INFO-level logs when there are
+            // no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Finished processing {} records, going to poll again", totalProcessed);
+            } else {
+                log.debug("Processed zero records, going to poll again");

Review comment:
       Looks good




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-757068925


   Cherrypicked to 2.7


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r522579641



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,16 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this at INFO when we actually processed some records to avoid flooding INFO-level logs when
+            // there are no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Processed {} records for tasks {}, polling for more records now",
+                         totalProcessed, taskManager.activeTaskIds());
+            } else {
+                log.debug("Processed {} records for tasks {}, polling for more records now",

Review comment:
       Fair enough




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-725045320


   Thoughts? @wcarlson5 @vvcephei @abbccdda 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-715639072


   @guozhangwang @vvcephei  WDYT? I'm generally all for more logs but this is pretty extreme 🙂 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r522591013



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,16 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this at INFO when we actually processed some records to avoid flooding INFO-level logs when
+            // there are no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Processed {} records for tasks {}, polling for more records now",

Review comment:
       Yeah good call. I promoted one of the debug logs you added earlier to log the number of records polled at INFO. I think this should be reasonable but we can always demote it again if it's still showing up too frequently




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-723347370


   Eh, I'm going back and forth on this. It's definitely valuable information to have in the absence of any other logging within the main Streams loop at the INFO level


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ghmulti commented on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ghmulti commented on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-723038111


   hello @ableegoldman, I wonder why did you close the pr?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ghmulti edited a comment on pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ghmulti edited a comment on pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#issuecomment-723911887


   There is a lot of useful information on an INFO level, but as you mentioned - having that line from several threads with 100ms frequency makes it way too noisy (especially when container is running on a cluster with log forwarder, which leads to waste of resources). Even though it is not complicated to change the log level configuration for that particular class, it is kind of inconvenient and prevents from using out-of-the-box default configs.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9489: MINOR: demote "Committing task offsets" log to DEBUG

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9489:
URL: https://github.com/apache/kafka/pull/9489#discussion_r520970972



##########
File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -703,6 +703,14 @@ void runOnce() {
             // we record the ratio out of the while loop so that the accumulated latency spans over
             // multiple iterations with reasonably large max.num.records and hence is less vulnerable to outliers
             taskManager.recordTaskProcessRatio(totalProcessLatency, now);
+
+            // Only log this when we actually processed some records to avoid flooding INFO-level logs when there are
+            // no records on the input topics
+            if (totalProcessed > 0) {
+                log.info("Finished processing {} records, going to poll again", totalProcessed);
+            } else {
+                log.debug("Processed zero records, going to poll again");

Review comment:
       Reworked it a bit, lmk what you think




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org