Posted to jira@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/02/13 19:16:00 UTC

[jira] [Commented] (KAFKA-6559) Iterate record sets before calling Log.append

    [ https://issues.apache.org/jira/browse/KAFKA-6559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362898#comment-16362898 ] 

ASF GitHub Bot commented on KAFKA-6559:
---------------------------------------

toddpalino opened a new pull request #4567: KAFKA-6559: Iterate record sets before calling Log.append 
URL: https://github.com/apache/kafka/pull/4567
 
 
   If a Produce request contains multiple record sets for a single topic-partition, it is better to iterate these before calling Log.append. This is because append will process all the sets together, and therefore will need to reassign offsets even if the offsets for an individual record set are properly formed. By iterating the record sets before calling append, each set can be considered on its own and potentially be appended without reassigning offsets.
   
   3 tests added to cover this:
   - Append a single MemoryRecords that contains multiple batches
   - Append a single MemoryRecords that contains no batches
   - Append a single MemoryRecords that has an empty batch in the middle of valid batches
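
The reasoning above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's actual code: `RecordSet`, `well_formed`, `append_together`, and `append_each` are hypothetical names, a record set is reduced to a list of relative offsets, "properly formed" is assumed to mean the offsets count up 0, 1, 2, ... with no gaps, and skipping an empty set is likewise an assumption. It compares treating all sets in a request as one unit with considering each set on its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecordSet:
    """Hypothetical stand-in for one record set: only its relative offsets."""
    offsets: List[int]


def well_formed(offsets: List[int]) -> bool:
    # Assumed rule: relative offsets must be exactly 0, 1, 2, ... n-1.
    return offsets == list(range(len(offsets)))


def _append(log: List[int], offsets: List[int]) -> bool:
    """Append to the toy log; return True if offsets had to be reassigned."""
    base = len(log)
    if well_formed(offsets):
        # Fast path: keep the existing relative offsets, shift by the base.
        log.extend(base + o for o in offsets)
        return False
    # Slow path: discard the supplied offsets and renumber every record.
    log.extend(base + i for i in range(len(offsets)))
    return True


def append_together(log: List[int], sets: List[RecordSet]) -> int:
    """Treat every set in the request as one unit; count reassignments."""
    combined = [o for s in sets for o in s.offsets]
    if not combined:
        return 0
    return 1 if _append(log, combined) else 0


def append_each(log: List[int], sets: List[RecordSet]) -> int:
    """Consider each set on its own; count how many needed reassignment."""
    reassigned = 0
    for s in sets:
        if not s.offsets:
            continue  # assumption: an empty set contributes nothing
        if _append(log, s.offsets):
            reassigned += 1
    return reassigned
```

In this model, a request holding the sets `[0, 1, 2]`, `[]`, and `[0, 1]` concatenates to `[0, 1, 2, 0, 1]`, which is not well formed as a whole and forces a reassignment, whereas each non-empty set passes the check individually. Both paths leave the same offsets in the toy log; only the amount of rewriting differs.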
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   



> Iterate record sets before calling Log.append
> ---------------------------------------------
>
>                 Key: KAFKA-6559
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6559
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.0.0
>            Reporter: Todd Palino
>            Assignee: Todd Palino
>            Priority: Major
>              Labels: performance
>
> If a Produce request contains multiple record sets for a single topic-partition, it is better to iterate these before calling Log.append. This is because append will process all the sets together, and therefore will need to reassign offsets even if the offsets for an individual record set are properly formed. By iterating the record sets before calling append, each set can be considered on its own and potentially be appended without reassigning offsets.
> While the core Java producer client does not currently operate this way, the protocol permits it, and other clients that aggregate multiple batches into a single Produce request may rely on it.


