You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/03/12 07:13:00 UTC

[jira] [Commented] (KAFKA-9700) Negative estimatedCompressionRatio leads to misjudgment about if there is no room

    [ https://issues.apache.org/jira/browse/KAFKA-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057651#comment-17057651 ] 

ASF GitHub Bot commented on KAFKA-9700:
---------------------------------------

jiameixie commented on pull request #8285: KAFKA-9700:Fix negative estimatedCompressionRatio issue
URL: https://github.com/apache/kafka/pull/8285
 
 
   There are cases that currentEstimation is smaller than
   COMPRESSION_RATIO_IMPROVING_STEP and it will get negative
   estimatedCompressionRatio,which leads to misjudgment
   about if there is no room and MESSAGE_TOO_LARGE might occur.
   
   Change-Id: I0932a2a6ca669f673ab5d862d3fe7b2bb6d96ff6
   Signed-off-by: Jiamei Xie <ji...@arm.com>
   
   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Negative estimatedCompressionRatio leads to misjudgment about if there is no room
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-9700
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9700
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>            Reporter: jiamei xie
>            Priority: Major
>
> * When I run the following command 
> bin/kafka-producer-perf-test.sh --topic test --num-records 50000000 --throughput -1 --record-size 5000 --producer-props bootstrap.servers=server04:9092 acks=1 buffer.memory=67108864 batch.size 65536 compression.type=zstd
> There was a warning:
> [2020-03-06 17:36:50,216] WARN [Producer clientId=producer-1] Got error produce response in correlation id 3261 on topic-partition test-1, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender)
> * The batch size(65536) is smaller than max.message.bytes (1048588) .  So it's not the root cause.
> * I added some logs in CompressionRatioEstimator.updateEstimation and found there were negative currentEstimation values.  The following were logs I added
> public static float updateEstimation(String topic, CompressionType type, float observedRatio) {
>     float[] compressionRatioForTopic = getAndCreateEstimationIfAbsent(topic);
>     float currentEstimation = compressionRatioForTopic[type.id];
>     synchronized (compressionRatioForTopic) {
>         if (observedRatio > currentEstimation)
>         {
>                 compressionRatioForTopic[type.id] = Math.max(currentEstimation + COMPRESSION_RATIO_DETERIORATE_STEP, observedRatio);
>         }
>         else if (observedRatio < currentEstimation) {
>                   compressionRatioForTopic[type.id] = currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP;
>                   log.warn("####currentEstimation is {} , COMPRESSION_RATIO_IMPROVING_STEP is {} , compressionRatioForTopic[type.id] is {}, type.id is {}", currentEstimation, COMPRESSION_RATIO_IMPROVING_STEP,compressionRatioForTopic[type.id], type.id);
>         }
>     }
>      return compressionRatioForTopic[type.id];
> }
> The observedRatio is smaller than COMPRESSION_RATIO_IMPROVING_STEP in some cases.  Some I think the else if block should be changed into 
> else if (observedRatio < currentEstimation) {
>                   compressionRatioForTopic[type.id] = Math.max(currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP, observedRatio);
>               }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)