Posted to jira@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/05/13 16:21:00 UTC

[jira] [Commented] (KAFKA-8351) Log cleaner must handle transactions spanning multiple segments

    [ https://issues.apache.org/jira/browse/KAFKA-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838672#comment-16838672 ] 

ASF GitHub Bot commented on KAFKA-8351:
---------------------------------------

hachikuji commented on pull request #6722: KAFKA-8351; Cleaner should handle transactions spanning multiple segments
URL: https://github.com/apache/kafka/pull/6722
 
 
   When cleaning transactional data, the cleaner must track which transactions still have data in the log so that it does not remove their markers. We had logic to do this, but the tracked state was not carried over when cleaning began for a new group of segments. As a result, the cleaner could incorrectly conclude that a transaction marker was no longer needed. The fix here carries the transactional state across the groups of segments being cleaned.
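
The effect of carrying the state forward can be illustrated with a small, self-contained sketch. This is not the code from the pull request: `TxnCleanerSketch`, `Entry`, `TxnState`, and the `carryState` flag are hypothetical names, and the model is deliberately simplified (one entry per record, only the "does this transaction still have data" check, no offsets or abort handling). It only aims to show why resetting the tracking for each group of segments can drop a marker that is still needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of marker retention during cleaning.
final class TxnCleanerSketch {

    // One log entry: either retained transactional data or a transaction marker.
    static final class Entry {
        final long producerId;
        final boolean isMarker;

        private Entry(long producerId, boolean isMarker) {
            this.producerId = producerId;
            this.isMarker = isMarker;
        }

        static Entry data(long producerId) { return new Entry(producerId, false); }
        static Entry marker(long producerId) { return new Entry(producerId, true); }
    }

    // Producers that have retained data whose marker has not been seen yet.
    static final class TxnState {
        private final Set<Long> ongoing = new HashSet<>();

        void onData(long producerId) { ongoing.add(producerId); }

        // Returns true if the marker must be kept (its transaction still has data).
        boolean onMarker(long producerId) { return ongoing.remove(producerId); }
    }

    // Cleans one group of segments, updating the supplied state as it goes.
    static List<Entry> cleanGroup(List<Entry> group, TxnState state) {
        List<Entry> retained = new ArrayList<>();
        for (Entry e : group) {
            if (!e.isMarker) {
                state.onData(e.producerId);
                retained.add(e);
            } else if (state.onMarker(e.producerId)) {
                retained.add(e);
            }
            // Otherwise the marker is dropped as no longer needed.
        }
        return retained;
    }

    // carryState = false models the bug: tracking restarts for every group.
    static List<Entry> clean(List<List<Entry>> groups, boolean carryState) {
        List<Entry> out = new ArrayList<>();
        TxnState state = new TxnState();
        for (List<Entry> group : groups) {
            if (!carryState) {
                state = new TxnState();
            }
            out.addAll(cleanGroup(group, state));
        }
        return out;
    }

    public static void main(String[] args) {
        // The transaction's data sits in the first group, its marker starts the second.
        List<List<Entry>> groups = Arrays.asList(
                Arrays.asList(Entry.data(1L)),
                Arrays.asList(Entry.marker(1L)));

        System.out.println("state reset per group: " + clean(groups, false).size() + " entries kept");
        System.out.println("state carried over:    " + clean(groups, true).size() + " entries kept");
    }
}
```

In this sketch, resetting the state for each group drops the marker that opens the second group even though its transaction's data survives in the first, whereas carrying the state forward keeps both entries.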
   
   
 


> Log cleaner must handle transactions spanning multiple segments
> ---------------------------------------------------------------
>
>                 Key: KAFKA-8351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8351
>             Project: Kafka
>          Issue Type: Bug
>          Components: log cleaner
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>
> When cleaning transactional data, we have to do some bookkeeping to track which transactions still have data in the log. As long as data remains, we cannot remove the transaction marker. The problem is that this tracking is done at the segment level: the ongoing transaction state is not carried over between segments. So if the first entry in a segment is a marker, we incorrectly clean it. In the worst case, data from a committed transaction could become aborted.


