You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/08/11 20:04:26 UTC

[GitHub] [kafka] lbradstreet opened a new pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

lbradstreet opened a new pull request #11199:
URL: https://github.com/apache/kafka/pull/11199


   When the high watermark is contained in a non-active segment, we are not correctly bounding it by the hwm. This means that uncommitted records may overwrite committed data.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] junrao merged pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

Posted by GitBox <gi...@apache.org>.
junrao merged pull request #11199:
URL: https://github.com/apache/kafka/pull/11199


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] lbradstreet commented on pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

Posted by GitBox <gi...@apache.org>.
lbradstreet commented on pull request #11199:
URL: https://github.com/apache/kafka/pull/11199#issuecomment-898750464


   > @lbradstreet : Thanks for the updated PR. Were the java 16 tests ok?
   
   @junrao it looks like it is one of the issues where System.exit was called without the Exit.exit override.
   ```
   [2021-08-13T19:08:33.299Z] * What went wrong:
   [2021-08-13T19:08:33.299Z] Execution failed for task ':core:integrationTest'.
   [2021-08-13T19:08:33.299Z] > Process 'Gradle Test Executor 129' finished with non-zero exit value 1
   [2021-08-13T19:08:33.299Z]   This problem might be caused by incorrect test process configuration.
   [2021-08-13T19:08:33.299Z]   Please refer to the test execution section in the User Manual at https://docs.gradle.org/7.1.1/userguide/java_testing.html#sec:test_execution
   ```
   I don't think it's related to this change but if we want we can rerun it to be sure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] junrao commented on a change in pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

Posted by GitBox <gi...@apache.org>.
junrao commented on a change in pull request #11199:
URL: https://github.com/apache/kafka/pull/11199#discussion_r687993862



##########
File path: core/src/main/scala/kafka/log/LogCleanerManager.scala
##########
@@ -595,8 +595,8 @@ private[log] object LogCleanerManager extends Logging {
     // may be cleaned
     val firstUncleanableDirtyOffset: Long = Seq(
 
-      // we do not clean beyond the first unstable offset
-      log.firstUnstableOffset,
+      // we do not clean beyond the last stable offset

Review comment:
       This is an existing issue. But could we update the comment in line 593 to include last stable offset too?

##########
File path: core/src/test/scala/unit/kafka/log/LogCleanerManagerTest.scala
##########
@@ -541,6 +541,29 @@ class LogCleanerManagerTest extends Logging {
     while(log.numberOfSegments < 8)
       log.appendAsLeader(records(log.logEndOffset.toInt, log.logEndOffset.toInt, time.milliseconds()), leaderEpoch = 0)
 
+    log.updateHighWatermark(50)
+
+    val lastCleanOffset = Some(0L)
+    val cleanableOffsets = LogCleanerManager.cleanableOffsets(log, lastCleanOffset, time.milliseconds)
+    assertEquals(0L, cleanableOffsets.firstDirtyOffset, "The first cleanable offset starts at the beginning of the log.")
+    assertEquals(log.highWatermark, cleanableOffsets.firstUncleanableDirtyOffset, "The first uncleanable offset is bounded by the hwm.")

Review comment:
       Since the description of the test says bounded by LSO, should we change log.highWatermark to log.lastStableOffset and the error message accordingly? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] lbradstreet commented on pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

Posted by GitBox <gi...@apache.org>.
lbradstreet commented on pull request #11199:
URL: https://github.com/apache/kafka/pull/11199#issuecomment-898057637


   > @lbradstreet : Thanks for the updated PR. LGTM. Are the 29 test failures related to this PR?
   
   @junrao yeah, they're looking related. Let me clean those up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] lbradstreet commented on a change in pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

Posted by GitBox <gi...@apache.org>.
lbradstreet commented on a change in pull request #11199:
URL: https://github.com/apache/kafka/pull/11199#discussion_r688051673



##########
File path: core/src/main/scala/kafka/log/LogCleanerManager.scala
##########
@@ -595,8 +595,8 @@ private[log] object LogCleanerManager extends Logging {
     // may be cleaned
     val firstUncleanableDirtyOffset: Long = Seq(
 
-      // we do not clean beyond the first unstable offset
-      log.firstUnstableOffset,
+      // we do not clean beyond the last stable offset

Review comment:
       Makes sense. I've updated it with a better explanation of what we bound by.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] lbradstreet commented on a change in pull request #11199: KAFKA-13194: bound cleaning by both LSO and HWM when firstUnstableOffsetMetadata is None

Posted by GitBox <gi...@apache.org>.
lbradstreet commented on a change in pull request #11199:
URL: https://github.com/apache/kafka/pull/11199#discussion_r688051998



##########
File path: core/src/test/scala/unit/kafka/log/LogCleanerManagerTest.scala
##########
@@ -541,6 +541,29 @@ class LogCleanerManagerTest extends Logging {
     while(log.numberOfSegments < 8)
       log.appendAsLeader(records(log.logEndOffset.toInt, log.logEndOffset.toInt, time.milliseconds()), leaderEpoch = 0)
 
+    log.updateHighWatermark(50)
+
+    val lastCleanOffset = Some(0L)
+    val cleanableOffsets = LogCleanerManager.cleanableOffsets(log, lastCleanOffset, time.milliseconds)
+    assertEquals(0L, cleanableOffsets.firstDirtyOffset, "The first cleanable offset starts at the beginning of the log.")
+    assertEquals(log.highWatermark, cleanableOffsets.firstUncleanableDirtyOffset, "The first uncleanable offset is bounded by the hwm.")

Review comment:
       This is what I intended since the lastStableOffset should equal the highWatermark, but I think it makes sense to keep the intent. Instead I have checked that log.highWatermark == log.lastStableOffset in the line prior to it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org