Posted to jira@kafka.apache.org by "jiamei xie (Jira)" <ji...@apache.org> on 2020/03/11 02:01:00 UTC

[jira] [Created] (KAFKA-9700) Negative estimatedCompressionRatio leads to misjudgment about if there is no room

jiamei xie created KAFKA-9700:
---------------------------------

             Summary: Negative estimatedCompressionRatio leads to misjudgment about if there is no room
                 Key: KAFKA-9700
                 URL: https://issues.apache.org/jira/browse/KAFKA-9700
             Project: Kafka
          Issue Type: Bug
          Components: clients
            Reporter: jiamei xie


* When I ran the following command:
bin/kafka-producer-perf-test.sh --topic test --num-records 50000000 --throughput -1 --record-size 5000 --producer-props bootstrap.servers=server04:9092 acks=1 buffer.memory=67108864 batch.size=65536 compression.type=zstd
There was a warning:
[2020-03-06 17:36:50,216] WARN [Producer clientId=producer-1] Got error produce response in correlation id 3261 on topic-partition test-1, splitting and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE (org.apache.kafka.clients.producer.internals.Sender)

* The batch size (65536) is smaller than max.message.bytes (1048588), so the batch size setting is not the root cause.


* I added some logging to CompressionRatioEstimator.updateEstimation and found negative currentEstimation values. This is the method with the logging I added:
public static float updateEstimation(String topic, CompressionType type, float observedRatio) {
    float[] compressionRatioForTopic = getAndCreateEstimationIfAbsent(topic);
    float currentEstimation = compressionRatioForTopic[type.id];
    synchronized (compressionRatioForTopic) {
        if (observedRatio > currentEstimation) {
            compressionRatioForTopic[type.id] = Math.max(currentEstimation + COMPRESSION_RATIO_DETERIORATE_STEP, observedRatio);
        } else if (observedRatio < currentEstimation) {
            compressionRatioForTopic[type.id] = currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP;
            log.warn("####currentEstimation is {}, COMPRESSION_RATIO_IMPROVING_STEP is {}, compressionRatioForTopic[type.id] is {}, type.id is {}", currentEstimation, COMPRESSION_RATIO_IMPROVING_STEP, compressionRatioForTopic[type.id], type.id);
        }
    }
    return compressionRatioForTopic[type.id];
}


In some cases observedRatio is smaller than COMPRESSION_RATIO_IMPROVING_STEP, so repeatedly subtracting the step can push the estimate below zero. I think the else-if block should be changed to:

else if (observedRatio < currentEstimation) {
    compressionRatioForTopic[type.id] = Math.max(currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP, observedRatio);
}
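
To illustrate the difference, here is a minimal standalone sketch that applies both update rules to the same sequence of observations. It is not the Kafka class itself: the class name, the starting estimate, the observed ratio, and the two step constants (0.005 and 0.05) are assumptions chosen for illustration, since the values are not stated above.

```java
public class CompressionRatioSketch {
    // Assumed step sizes, for illustration only.
    static final float IMPROVING_STEP = 0.005f;
    static final float DETERIORATE_STEP = 0.05f;

    /** Update rule as quoted above: no lower bound when the ratio improves. */
    public static float updateUnclamped(float current, float observed) {
        if (observed > current) {
            return Math.max(current + DETERIORATE_STEP, observed);
        } else if (observed < current) {
            return current - IMPROVING_STEP;
        }
        return current;
    }

    /** Update rule with the proposed change: never drop below the observed ratio. */
    public static float updateClamped(float current, float observed) {
        if (observed > current) {
            return Math.max(current + DETERIORATE_STEP, observed);
        } else if (observed < current) {
            return Math.max(current - IMPROVING_STEP, observed);
        }
        return current;
    }

    public static void main(String[] args) {
        float observed = 0.001f;   // hypothetical, highly compressible payload
        float unclamped = 0.012f;  // hypothetical starting estimate
        float clamped = 0.012f;
        for (int i = 1; i <= 3; i++) {
            unclamped = updateUnclamped(unclamped, observed);
            clamped = updateClamped(clamped, observed);
            System.out.printf("update %d: unclamped=%f clamped=%f%n", i, unclamped, clamped);
        }
    }
}
```

With these assumed inputs, the unclamped estimate drops below zero on the third update, while the clamped estimate stops at the observed ratio.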




