Posted to dev@kafka.apache.org by "Daniel Compton (JIRA)" <ji...@apache.org> on 2014/07/03 05:07:24 UTC

[jira] [Updated] (KAFKA-1516) Producer Performance Test sends messages with bytes of 0x0

     [ https://issues.apache.org/jira/browse/KAFKA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Compton updated KAFKA-1516:
----------------------------------

    Description: 
The producer performance test in Kafka sends messages consisting of either [0x0 bytes|https://github.com/apache/kafka/blob/0.8.1/perf/src/main/scala/kafka/perf/ProducerPerformance.scala#L237] or [all X's|https://github.com/apache/kafka/blob/0.8.1/perf/src/main/scala/kafka/perf/ProducerPerformance.scala#L225]. Such uniform payloads compress far better than real data, which skews the compression ratio massively and probably affects performance in other ways.

We want to generate messages that give a more realistic performance profile. Purely random bytes may not be the best solution either, as they won't compress at all and will skew compression times.
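
To make the two extremes concrete, here is a minimal sketch in Python (not the Scala code of the perf tool) that compares how well each kind of payload compresses. It uses zlib (DEFLATE, the algorithm behind gzip) as a stand-in compressor; the 1000-byte message size and the helper name compression_ratio are assumptions chosen for illustration.

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(payload)) / len(payload)


size = 1000  # hypothetical message size in bytes

payloads = {
    "zeros": bytes(size),         # all 0x0 bytes
    "all X": b"X" * size,         # a single repeated character
    "random": os.urandom(size),   # random bytes
}

# Compute the ratio for each payload type and print a small report.
ratios = {name: compression_ratio(data) for name, data in payloads.items()}
for name, ratio in ratios.items():
    print(f"{name:>6}: {ratio:.3f}")
```

The two uniform payloads should shrink to a small fraction of their size, while the random payload should not shrink at all. Realistic data would be expected to land somewhere in between.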

Perhaps a template with random or sequential data injected into it could work. Or maybe I'm overthinking it and we should just go for random bytes. What other options do we have? Others seem to use random bytes, for example [cassandra-stress|https://github.com/zznate/cassandra-stress/blob/master/src/main/java/com/riptano/cassandra/stress/InsertCommand.java#L39].
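
The template idea could look roughly like the Python sketch below. It is only an illustration of the approach, not a proposed patch: the record layout, the field names (id, user, action, token) and the function make_message are all hypothetical, and zlib again stands in for the producer's compression codec.

```python
import json
import random
import zlib


def make_message(seq: int, rng: random.Random) -> bytes:
    """Fill a fixed template with sequential and random values (hypothetical schema)."""
    record = {
        "id": seq,                                        # sequential data
        "user": f"user-{rng.randrange(10_000)}",          # low-entropy random data
        "action": rng.choice(["view", "click", "purchase"]),
        "token": f"{rng.getrandbits(64):016x}",           # high-entropy random data
    }
    return json.dumps(record).encode("utf-8")


rng = random.Random(42)  # fixed seed so test runs are repeatable
batch = b"\n".join(make_message(i, rng) for i in range(1000))

# The repeated keys compress well, the random token does not.
ratio = len(zlib.compress(batch)) / len(batch)
print(f"template batch ratio: {ratio:.3f}")
```

Because the structure repeats while the values vary, such a batch is neither trivially compressible nor incompressible. The mix of fixed and random fields could be tuned to approximate a particular workload.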

  was:
The producer performance test in Kafka sends messages with either [0x0 bytes|https://github.com/apache/kafka/blob/0.8.1/perf/src/main/scala/kafka/perf/ProducerPerformance.scala#L237] or messages with [all X's|https://github.com/apache/kafka/blob/0.8.1/perf/src/main/scala/kafka/perf/ProducerPerformance.scala#L225]. This skews the compression ratio massively and probably affects performance in other ways.

We want to create messages which will give a more realistic performance profile. Using random bytes may not be the best solution as these won't compress at all and will skew compression times.

Perhaps using a template which injects random or sequential data into it could work. Or maybe I'm overthinking it and we should just go for random bytes.


> Producer Performance Test sends messages with bytes of 0x0
> ----------------------------------------------------------
>
>                 Key: KAFKA-1516
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1516
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Daniel Compton
>            Priority: Minor


