Posted to jira@kafka.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/03/12 07:14:00 UTC

[jira] [Commented] (KAFKA-9703) ProducerBatch.split takes up too many resources if the bigBatch is huge

    [ https://issues.apache.org/jira/browse/KAFKA-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057652#comment-17057652 ] 

ASF GitHub Bot commented on KAFKA-9703:
---------------------------------------

jiameixie commented on pull request #8286: KAFKA-9703: Free up resources when splitting huge batches
URL: https://github.com/apache/kafka/pull/8286
 
 
   Method split takes up too many resources and might
   cause an OutOfMemoryError when the bigBatch is huge.
   Call closeForRecordAppends() to free up resources
   such as compression buffers.
   
   Change-Id: Iac6519fcc2e432330b8af2d9f68a8d4d4a07646b
   Signed-off-by: Jiamei Xie <ji...@arm.com>
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> ProducerBatch.split takes up too many resources if the bigBatch is huge
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-9703
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9703
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: jiamei xie
>            Priority: Major
>
> ProducerBatch.split takes up too many resources and might cause an OutOfMemoryError if the bigBatch is huge. How I ran into this issue is described in https://lists.apache.org/list.html?users@kafka.apache.org:lte=1M:MESSAGE_TOO_LARGE
> The following code is what consumes most of the resources:
> {code:java}
>     for (Record record : recordBatch) {
>         assert thunkIter.hasNext();
>         Thunk thunk = thunkIter.next();
>         if (batch == null)
>             batch = createBatchOffAccumulatorForRecord(record, splitBatchSize);
>
>         // A newly created batch can always host the first message.
>         if (!batch.tryAppendForSplit(record.timestamp(), record.key(), record.value(), record.headers(), thunk)) {
>             batches.add(batch);
>             batch = createBatchOffAccumulatorForRecord(record, splitBatchSize);
>             batch.tryAppendForSplit(record.timestamp(), record.key(), record.value(), record.headers(), thunk);
>         }
>     }
> {code}
> Referring to RecordAccumulator#tryAppend, we can likewise call closeForRecordAppends() once a batch is full:
> {code:java}
>     private RecordAppendResult tryAppend(long timestamp, byte[] key, byte[] value, Header[] headers,
>                                          Callback callback, Deque<ProducerBatch> deque, long nowMs) {
>         ProducerBatch last = deque.peekLast();
>         if (last != null) {
>             FutureRecordMetadata future = last.tryAppend(timestamp, key, value, headers, callback, nowMs);
>             if (future == null)
>                 last.closeForRecordAppends();
>             else
>                 return new RecordAppendResult(future, deque.size() > 1 || last.isFull(), false, false);
>         }
>         return null;
>     }
> {code}
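To illustrate the proposed fix, here is a minimal, self-contained sketch of the split loop. The classes below (Batch, its size limit, and the record type) are simplified stand-ins for Kafka's internal ProducerBatch machinery, not the actual implementation; the point is where closeForRecordAppends() is invoked, so a full sub-batch releases its append-side resources (e.g. compression buffers) before the next sub-batch is allocated, instead of holding them until the whole split completes.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Stand-in for ProducerBatch: holds records up to a size limit and
    // tracks whether its append-side resources have been released.
    static class Batch {
        final int limit;
        final List<String> records = new ArrayList<>();
        boolean closedForAppends = false;

        Batch(int limit) { this.limit = limit; }

        // Returns false when the batch is full or closed,
        // mirroring tryAppendForSplit.
        boolean tryAppend(String record) {
            if (closedForAppends || records.size() >= limit) return false;
            records.add(record);
            return true;
        }

        // Stand-in for closeForRecordAppends(): frees append-side buffers.
        void closeForRecordAppends() { closedForAppends = true; }
    }

    static List<Batch> split(List<String> bigBatch, int splitBatchSize) {
        List<Batch> batches = new ArrayList<>();
        Batch batch = null;
        for (String record : bigBatch) {
            if (batch == null)
                batch = new Batch(splitBatchSize);
            if (!batch.tryAppend(record)) {
                // The suggested change: release the full batch's resources
                // immediately, rather than after the entire split finishes.
                batch.closeForRecordAppends();
                batches.add(batch);
                batch = new Batch(splitBatchSize);
                batch.tryAppend(record);
            }
        }
        if (batch != null) {
            batch.closeForRecordAppends();
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> big = new ArrayList<>();
        for (int i = 0; i < 10; i++) big.add("r" + i);
        List<Batch> out = split(big, 4);
        System.out.println(out.size());                  // 3 sub-batches: 4 + 4 + 2
        System.out.println(out.get(0).closedForAppends); // true
    }
}
```

The memory-pressure argument: with N sub-batches produced from one huge batch, the original code keeps all N sets of append-side buffers live simultaneously, while closing each batch as it fills bounds live buffers to one batch at a time.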



--
This message was sent by Atlassian Jira
(v8.3.4#803005)