You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/03/24 00:52:00 UTC

[GitHub] [kafka] junrao commented on a change in pull request #10388: KAFKA-12520: Ensure log loading does not truncate producer state unless required

junrao commented on a change in pull request #10388:
URL: https://github.com/apache/kafka/pull/10388#discussion_r600057668



##########
File path: core/src/main/scala/kafka/log/Log.scala
##########
@@ -700,11 +700,13 @@ class Log(@volatile private var _dir: File,
           case _: NoSuchFileException =>
             error(s"Could not find offset index file corresponding to log file ${segment.log.file.getAbsolutePath}, " +
               "recovering segment and rebuilding index files...")
-            recoverSegment(segment)
+            if (segment.validateSegmentAndRebuildIndices() > 0)
+              throw new KafkaStorageException("Found invalid or corrupted messages in segment " + segment.log.file);

Review comment:
       Perhaps we could report the number of invalid bytes in the exception? Ditto below and in `completeSwapOperations()`.

##########
File path: core/src/main/scala/kafka/log/LogSegment.scala
##########
@@ -322,17 +323,14 @@ class LogSegment private[log] (val log: FileRecords,
      offsetIndex.fetchUpperBoundOffset(startOffsetPosition, fetchSize).map(_.offset)
 
   /**
-   * Run recovery on the given segment. This will rebuild the index from the log file and lop off any invalid bytes
-   * from the end of the log and index.
+   * Ensure batches in the segment are valid and rebuild all corresponding indices.
    *
-   * @param producerStateManager Producer state corresponding to the segment's base offset. This is needed to recover
-   *                             the transaction index.
-   * @param leaderEpochCache Optionally a cache for updating the leader epoch during recovery.
-   * @return The number of bytes truncated from the log
+   * @param batchCallbackOpt Optional callback invoked for all valid batches in segment
+   * @return The number of invalid bytes at the end of the segment
    * @throws LogSegmentOffsetOverflowException if the log segment contains an offset that causes the index offset to overflow
    */
   @nonthreadsafe
-  def recover(producerStateManager: ProducerStateManager, leaderEpochCache: Option[LeaderEpochFileCache] = None): Int = {
+  def validateSegmentAndRebuildIndices(batchCallbackOpt: Option[FileChannelRecordBatch => Unit] = None) : Int = {

Review comment:
       It seems this method needs to the logic to trim the indexes at the end?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org