Posted to jira@kafka.apache.org by "Luke Chen (Jira)" <ji...@apache.org> on 2020/08/12 10:58:00 UTC

[jira] [Commented] (KAFKA-8362) LogCleaner gets stuck after partition move between log directories

    [ https://issues.apache.org/jira/browse/KAFKA-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176257#comment-17176257 ] 

Luke Chen commented on KAFKA-8362:
----------------------------------

Currently, when a partition is moved, we append a tombstone record to the source (old) directory, which is expected. I think we can add a collection that tracks the source (old) directories and filter out these records in the *allCleanerCheckpoints* method.
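
A minimal sketch of that filtering idea, not the actual patch: it assumes checkpoints are available as a map from directory path to entries, and that moves are tracked as (source directory, partition) pairs. The names FilterMovedSketch, mergeCheckpoints, and movedFrom are hypothetical, and TopicPartition here is a local stand-in for the Kafka class.

```scala
object FilterMovedSketch {
  // Local stand-in for org.apache.kafka.common.TopicPartition (assumption).
  final case class TopicPartition(topic: String, partition: Int)

  /**
   * Merge per-directory checkpoint entries, skipping any entry that belongs
   * to a directory the partition has been moved away from.
   *
   * @param checkpointsByDir directory path -> checkpoint entries read from that directory
   * @param movedFrom        hypothetical collection of (source directory, partition) pairs
   */
  def mergeCheckpoints(
      checkpointsByDir: Map[String, Map[TopicPartition, Long]],
      movedFrom: Set[(String, TopicPartition)]): Map[TopicPartition, Long] = {
    val kept = for {
      (dir, entries) <- checkpointsByDir.toSeq
      (tp, offset)   <- entries.toSeq
      if !movedFrom.contains((dir, tp)) // drop stale entries left in the source directory
    } yield tp -> offset
    kept.toMap
  }

  def main(args: Array[String]): Unit = {
    val tp = TopicPartition("t", 0)
    val checkpoints = Map(
      "/data/old" -> Map(tp -> 100L), // stale entry in the source directory
      "/data/new" -> Map(tp -> 250L)  // current entry in the destination directory
    )
    // Only the destination entry (offset 250) survives the filter.
    println(mergeCheckpoints(checkpoints, Set(("/data/old", tp))))
  }
}
```

With the stale entry filtered out before merging, the result no longer depends on the order in which the directories are read.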

> LogCleaner gets stuck after partition move between log directories
> ------------------------------------------------------------------
>
>                 Key: KAFKA-8362
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8362
>             Project: Kafka
>          Issue Type: Bug
>          Components: jbod, log cleaner
>            Reporter: Julio Ng
>            Assignee: Luke Chen
>            Priority: Major
>
> When a partition is moved from one directory to another, its checkpoint entry in the cleaner-offset-checkpoint file is not removed from the source directory.
> As a consequence, when we read the last firstDirtyOffset, we might get a stale value from the old checkpoint file.
> Basically, we need to clean up the entry from the checkpoint file in the source directory when the move is completed.
> The current issue is that the code in LogCleanerManager:
> {noformat}
> /**
>  * @return the position processed for all logs.
>  */
> def allCleanerCheckpoints: Map[TopicPartition, Long] = {
>   inLock(lock) {
>     checkpoints.values.flatMap(checkpoint => {
>       try {
>         checkpoint.read()
>       } catch {
>         case e: KafkaStorageException =>
>           error(s"Failed to access checkpoint file ${checkpoint.file.getName} in dir ${checkpoint.file.getParentFile.getAbsolutePath}", e)
>           Map.empty[TopicPartition, Long]
>       }
>     }).toMap
>   }
> }{noformat}
> collapses the offsets when multiple entries exist for the topicPartition
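
To illustrate the collapsing behaviour described above, here is a small standalone sketch. It does not use the Kafka classes: CollapseDemo and merge are hypothetical names, and TopicPartition is a local stand-in. It only shows that flattening several maps and calling toMap keeps the last value seen for a duplicate key.

```scala
object CollapseDemo {
  // Local stand-in for org.apache.kafka.common.TopicPartition (assumption).
  final case class TopicPartition(topic: String, partition: Int)

  // Same shape as the flatMap(...).toMap in allCleanerCheckpoints:
  // when a key appears in more than one map, the last one read wins.
  def merge(perDir: Seq[Map[TopicPartition, Long]]): Map[TopicPartition, Long] =
    perDir.flatMap(entries => entries.toSeq).toMap

  def main(args: Array[String]): Unit = {
    val tp = TopicPartition("t", 0)
    val sourceDir = Map(tp -> 100L) // stale entry left behind after the move
    val destDir   = Map(tp -> 250L) // entry written in the destination directory

    println(merge(Seq(destDir, sourceDir))) // stale offset 100 wins
    println(merge(Seq(sourceDir, destDir))) // current offset 250 wins
  }
}
```

Which entry survives depends purely on read order, so a stale firstDirtyOffset can be returned whenever the source directory's checkpoint happens to be read last.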



--
This message was sent by Atlassian Jira
(v8.3.4#803005)