Posted to dev@kafka.apache.org by "John Fung (Created) (JIRA)" <ji...@apache.org> on 2012/02/08 06:42:59 UTC

[jira] [Created] (KAFKA-267) Enhance ProducerPerformance to generate unique random checksum

Enhance ProducerPerformance to generate unique random checksum
--------------------------------------------------------------

                 Key: KAFKA-267
                 URL: https://issues.apache.org/jira/browse/KAFKA-267
             Project: Kafka
          Issue Type: Improvement
            Reporter: John Fung
            Assignee: John Fung


This is achieved by specifying the starting number of a range of numbers to be shuffled. The ending number of the range is the starting number + number of messages - 1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Description: 
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. Adding a new optional argument "start-index" to specify the starting number of the range to be shuffled. If this argument is omitted, it defaults to 1, so the change is backward compatible with the existing argument options.
3. The ending number of the range is the starting number + number of messages - 1.
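
The three points above can be sketched as follows. This is only an illustration, not the code from the attached patch: the object name UniqueIdSketch and the method shuffledIds are hypothetical, and the sketch simply shuffles the whole range in memory.

```scala
import scala.util.Random

object UniqueIdSketch {
  // Hypothetical helper: returns every id in
  // [startIndex, startIndex + numMessages - 1] exactly once, in random order.
  def shuffledIds(startIndex: Long, numMessages: Int, rand: Random = new Random()): Vector[Long] = {
    // Build the contiguous range first, so that uniqueness is guaranteed.
    val ids = Vector.tabulate(numMessages)(i => startIndex + i)
    // Then permute it; shuffling never adds or drops an element.
    rand.shuffle(ids)
  }
}
```

With startIndex = 1 and numMessages = 5, the result is some permutation of 1 to 5. Two producer instances can avoid overlapping ids by using disjoint start indexes.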


Other per

  was:
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
3. The ending number of the range is the starting number + number of messages - 1.

    
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other per


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka-267-v5.patch


Changes since v4:

1. Removed the "UniqueRandom" class and its usages.

2. Further simplified the code of the function generateProducerData().
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Comment Edited] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483415#comment-13483415 ] 

Neha Narkhede edited comment on KAFKA-267 at 10/24/12 6:02 PM:
---------------------------------------------------------------

Thanks for patch v5; we are pretty close to checking this in. A few more review comments:

1. Fix the description of the topics argument; it should just say "produce to". Let's remove "consume from". Also, if it is a CSV list, let's add that to the description as well.
2. Fix the description of batch-size. Right now, it just says "size". The same goes for threads, which just says "count".
3. Let's describe the supported compression codecs in the description for compression-codec.
4. Fix the typo in the description for metrics-dir.
5. I think it is valuable to default to async. That is the most common production usage; I doubt we will frequently test one-message-at-a-time.
6. Should message-send-gap be renamed to message-send-gap-ms?
7. Now that we have the metrics CSV reporter, will we use ProducerPerformance to calculate the producer throughput in MB/s and messages/sec at all? I guess we can just enable CSV reporting and rely on our metrics to measure throughput/latency correctly. In that case, we don't need to compute the various metrics, and we can also get rid of the println and the showDetailedStats parameter. With this change, we can view ProducerPerformance as a workload generator only. Not sure if I'm missing something here, though.
8. If the changes specified in #7 are implemented, let's change the system test script to dump the producer metrics for ProducerPerformance through the new metrics-dir option.
9. generateProducerData is a bit unreadable. I recommend computing the messageId and messageSize separately and passing them into generateMessageWithSeqId.
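
One way to read the last suggestion is sketched below. The method names generateProducerData and generateMessageWithSeqId come from this thread, but the signatures, the "id:" prefix, and the 'x' padding layout are assumptions made for illustration, not the code in the patch.

```scala
import java.nio.charset.StandardCharsets
import scala.util.Random

object MessageGenSketch {
  // Assumed layout: the sequence id, a ':' separator, then 'x' padding
  // up to messageSize bytes (never shorter than the id prefix itself).
  def generateMessageWithSeqId(messageId: Long, messageSize: Int): Array[Byte] = {
    val prefix = (messageId.toString + ":").getBytes(StandardCharsets.UTF_8)
    val payload = Array.fill[Byte](math.max(messageSize, prefix.length))('x'.toByte)
    System.arraycopy(prefix, 0, payload, 0, prefix.length)
    payload
  }

  // The id and the size are computed on separate lines and then passed in,
  // instead of being nested inside one long expression.
  def generateProducerData(startIndex: Long, offset: Int, isFixSize: Boolean,
                           maxSize: Int, rand: Random): Array[Byte] = {
    val messageId = startIndex + offset
    val messageSize = if (isFixSize) maxSize else 1 + rand.nextInt(maxSize)
    generateMessageWithSeqId(messageId, messageSize)
  }
}
```

Keeping the two computations apart also makes each of them easy to test on its own.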

                
      was (Author: nehanarkhede):
    Thanks for patch v5, we are pretty close to checking this in. A few more review comments -

1. Fix the description of the topics argument, it should just say "produce to". Let's remove the consume from. Also, if it is a csv list, let's add that to the description as well
2. Fix the description of batch-size. Right now, it just says "size". Same for threads, it just says "count"
3. Let's describe the supported compression codecs in the description for compression-codec.
4. I think it is valuable to default to async. That is the most common production usage, I doubt we will frequently test one-message-at-a-time
5. Should message-send-gap be renamed to message-send-gap-ms ?
6. Now that we have metrics csv reporter, will we use ProducerPerformance to calculate the producer throughput in MB/s, messages/sec at all ? I guess we can just enable csv reporting and rely on our metrics to measure throughput/latency correctly. In that case, we don't need to compute the various metrics and also can get rid of the println and showDetailedStats parameter. With this change, we can view ProducerPerformance like a workload generator only. Not sure if I'm missing something here though.
7. generateProducerData is a bit unreadable. Recommend you separately compute the messageId and messageSize and pass it into generateMessageWithSeqId

                  


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486323#comment-13486323 ] 

Yang Ye commented on KAFKA-267:
-------------------------------

Patch v6 should be the right one; please review that one.
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch, kafka-267-v7.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483293#comment-13483293 ] 

Jun Rao commented on KAFKA-267:
-------------------------------

Thanks for patch v5. Looks good. Some minor comments:

50. In the following line in generateProducerData(), we should allocate a byte array and pass it to the constructor of Message. ByteBuffer.array() gives you the backing array and it may or may not be the size that you allocated.
        new Message(ByteBuffer.allocate(if(config.isFixSize) config.messageSize else 1 + rand.nextInt(config.messageSize)).array())

51. Could you break all long lines into multiple lines?

52. Could you expose producer.num.retries and producer.retry.backoff.ms to the command line and default them to the same value as in ProducerConfig?
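
The concern in comment 50 can be illustrated with plain java.nio, independent of Kafka. The object and method names below are hypothetical. For a freshly allocated heap buffer the backing array usually has the requested length on common JVMs, but the ByteBuffer API does not promise that, and for a slice the two sizes differ visibly.

```scala
import java.nio.ByteBuffer

object BackingArraySketch {
  // Returns (bytes visible through the buffer, length of its backing array).
  def visibleVersusBacking(buffer: ByteBuffer): (Int, Int) =
    (buffer.remaining(), buffer.array().length)

  // A 12-byte slice carved out of a 16-byte heap buffer; it shares the
  // original 16-byte backing array.
  def exampleSlice(): ByteBuffer = {
    val whole = ByteBuffer.allocate(16)
    whole.position(4)
    whole.slice()
  }

  // Allocating the byte array directly makes the payload size explicit.
  def payloadOfSize(size: Int): Array[Byte] = new Array[Byte](size)
}
```

Here visibleVersusBacking(exampleSlice()) yields (12, 16), while payloadOfSize(n) always has exactly n bytes, which is why passing a directly allocated array to the message constructor is the safer choice.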

                


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485042#comment-13485042 ] 

Jun Rao commented on KAFKA-267:
-------------------------------

What we can do in system tests is tune the number of retries and the backoff time in ProducerPerformance according to the failure scenario, so that we expect no data loss on the producer side. Both knobs are exposed in ProducerPerformance now.
                


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480112#comment-13480112 ] 

Jun Rao commented on KAFKA-267:
-------------------------------

Thanks for patch v4. It's better, but still messy. Some more comments:

40. I suggest that we combine initialMessageIdOpt and varyMessageSizeOpt into one option, something like message-mode. It would take one of three mutually exclusive choices: sequence, fix-length, or variable-length.

41. I suggest that batchSizeOpt be only used in async mode. In sync mode, we always send a single message at a time without batching. This will simplify the way that we generate messages. When generating the data, we just need to generate one message at a time, depending on message-mode.

42. The following 2 methods should be moved out of ProducerThread and put in a util class.
generateProducerData
generateMessageWithSeqId

43. UniqueRandom: Why do we need this class? Can't we just use Random.nextBytes()?

44. The following chunk of code is identical to that in Kafka. Could we create a util function to share the code?
      val metricsConfig = new KafkaMetricsConfig(verifiableProps)
      metricsConfig.reporters.foreach(reporterType => {
        val reporter = Utils.createObject[KafkaMetricsReporter](reporterType)
        reporter.init(verifiableProps)
        if (reporter.isInstanceOf[KafkaMetricsReporterMBean])
          Utils.registerMBean(reporter, reporter.asInstanceOf[KafkaMetricsReporterMBean].getMBeanName)
      })

45. There are a few long lines like the following. Let's put .format in a separate line.
            println(("%s, %d, %d, %d, %d, %.2f, %.4f, %d, %.4f").format(formattedReportTime, config.compressionCodec.codec,
                                                                        threadId, config.messageSize, config.batchSize, (bytesSent*1.0)/(1024 * 1024), mbPerSec, nSends, numMessagesPerSec))
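
Comments 40, 41 and 43 together suggest a single mode switch that produces one message at a time. A rough sketch of that shape follows. The option values are the ones proposed in comment 40, but the type names, the parse and nextPayload helpers, and the payload contents are assumptions for illustration only.

```scala
import scala.util.Random

object MessageModeSketch {
  // The three mutually exclusive choices proposed for message-mode.
  sealed trait MessageMode
  case object Sequence extends MessageMode
  case object FixLength extends MessageMode
  case object VariableLength extends MessageMode

  // Hypothetical parser for the command-line value.
  def parse(value: String): MessageMode = value match {
    case "sequence"        => Sequence
    case "fix-length"      => FixLength
    case "variable-length" => VariableLength
    case other => throw new IllegalArgumentException("Unknown message-mode: " + other)
  }

  // Fills a new array of the given size using Random.nextBytes.
  private def randomBytes(size: Int, rand: Random): Array[Byte] = {
    val bytes = new Array[Byte](size)
    rand.nextBytes(bytes)
    bytes
  }

  // Generates the payload for exactly one message, depending on the mode.
  def nextPayload(mode: MessageMode, seqId: Long, messageSize: Int, rand: Random): Array[Byte] =
    mode match {
      case Sequence       => seqId.toString.getBytes("UTF-8")
      case FixLength      => randomBytes(messageSize, rand)
      case VariableLength => randomBytes(1 + rand.nextInt(messageSize), rand)
    }
}
```

Because the modes form a sealed hierarchy, the compiler can warn when a match forgets one of them, which is harder to get from two independent boolean-style options.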

                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-267:
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.8
           Status: Resolved  (was: Patch Available)

+1 on patch v6. Committed to 0.8. 
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>             Fix For: 0.8
>
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch, kafka-267-v7.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Assigned] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye reassigned KAFKA-267:
-----------------------------

    Assignee: Yang Ye  (was: John Fung)
    
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "John Fung (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-267:
----------------------------

    Status: Patch Available  (was: Open)
    
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.


        

[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Description: 
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. Adding a new optional argument "start-index" to specify the starting number of the range to be shuffled. If this argument is omitted, it defaults to 1, so the change is backward compatible with the existing argument options.
3. The ending number of the range is the starting number + number of messages - 1.


Other ProducerPerformance enhancements:
1. producing to multiple topics
2. supporting multiple instances of ProducerPerformance (and distinguishing them)
3. allowing a wait after sending a request

  was:
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
3. The ending number of the range is the starting number + number of messages - 1.


Other per

    
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka_267_additional.diff


This is a follow-up patch to simulate a realistic data load. Previously, we generated sequential messages padded with a run of "x" characters, which yields a very good compression ratio that is not typical of real data. So we want to generate random payloads instead.

For comparison, when sending 400 messages with a batch size of 200, the data size was 48800 bytes before the patch and is 65600 bytes after it.
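
The effect described above can be reproduced in isolation with a small sketch. It does not use Kafka and will not reproduce the byte counts quoted here; GZIP is used only as an assumed stand-in for whatever codec the test ran with, and all names are hypothetical.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream
import scala.util.Random

object CompressionSketch {
  // Size in bytes of the data after GZIP compression.
  def gzipSize(data: Array[Byte]): Int = {
    val out = new ByteArrayOutputStream()
    val gzip = new GZIPOutputStream(out)
    gzip.write(data)
    gzip.close() // finishes the stream and flushes the trailer into `out`
    out.size()
  }

  // The old style of payload: nothing but 'x' padding.
  def paddedPayload(size: Int): Array[Byte] = Array.fill[Byte](size)('x'.toByte)

  // The new style of payload: random bytes, which barely compress.
  def randomPayload(size: Int, rand: Random): Array[Byte] = {
    val bytes = new Array[Byte](size)
    rand.nextBytes(bytes)
    bytes
  }
}
```

Comparing gzipSize(paddedPayload(1000)) with gzipSize(randomPayload(1000, new Random())) shows the padded payload shrinking to a small fraction of its size while the random one does not shrink, which is the gap the follow-up patch is meant to close.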



                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>             Fix For: 0.8
>
>         Attachments: kafka_267_additional.diff, kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch, kafka-267-v7.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka-267-v3.patch


rebase on kafka 42
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483415#comment-13483415 ] 

Neha Narkhede commented on KAFKA-267:
-------------------------------------

Thanks for patch v5, we are pretty close to checking this in. A few more review comments -

1. Fix the description of the topics argument, it should just say "produce to". Let's remove the consume from. Also, if it is a csv list, let's add that to the description as well
2. Fix the description of batch-size. Right now, it just says "size". Same for threads, it just says "count"
3. Let's describe the supported compression codecs in the description for compression-codec.
4. I think it is valuable to default to async. That is the most common production usage, I doubt we will frequently test one-message-at-a-time
5. Should message-send-gap be renamed to message-send-gap-ms ?
6. Now that we have metrics csv reporter, will we use ProducerPerformance to calculate the producer throughput in MB/s, messages/sec at all ? I guess we can just enable csv reporting and rely on our metrics to measure throughput/latency correctly. In that case, we don't need to compute the various metrics and also can get rid of the println and showDetailedStats parameter. With this change, we can view ProducerPerformance like a workload generator only. Not sure if I'm missing something here though.
7. generateProducerData is a bit unreadable. Recommend you separately compute the messageId and messageSize and pass it into generateMessageWithSeqId
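
One possible reading of comment 7, sketched under assumptions: the caller computes the message id and message size first, then passes both to a generator. The class name, the signature, the header layout ("Topic:...:MessageID:...:") and the choice of random lowercase letters as padding are all hypothetical and are not taken from the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical illustration only; the real generateMessageWithSeqId may differ.
final class SeqIdMessageSketch {

    // Builds a payload of exactly messageSize bytes: a header carrying the
    // id, followed by random printable padding.
    static byte[] generateMessageWithSeqId(String topic, long messageId, int messageSize, Random rng) {
        byte[] header = ("Topic:" + topic + ":MessageID:" + messageId + ":")
                .getBytes(StandardCharsets.UTF_8);
        if (messageSize < header.length) {
            throw new IllegalArgumentException("messageSize is smaller than the header");
        }
        byte[] payload = new byte[messageSize];
        System.arraycopy(header, 0, payload, 0, header.length);
        for (int i = header.length; i < messageSize; i++) {
            payload[i] = (byte) ('a' + rng.nextInt(26)); // assumed padding alphabet
        }
        return payload;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Compute the id and the size separately, then pass them in.
        long messageId = 42L;
        int messageSize = 64;
        byte[] msg = generateMessageWithSeqId("test-topic", messageId, messageSize, rng);
        System.out.println(new String(msg, StandardCharsets.UTF_8));
    }
}
```

Keeping the id and size computation outside the generator leaves the generator with a single job, which is the readability point the comment raises.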

                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka-267-v6.patch


Changes in v6:

1. ProducerPerformance is further simplified; a long line is split into two lines.

2. The "showdetai" option is removed from ProducerPerformance.

3. ProducerPerformance uses async by default.

4. The number of messages sent in replication_test case 1 and test 0001 is set to 100,000, and the max log file size is changed to 1024000000, so that only a few log segments are created and the log checksum checking can be sped up.
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473382#comment-13473382 ] 

Jun Rao commented on KAFKA-267:
-------------------------------

I can't find a revision to which the patch applies cleanly. Could you rebase?
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474299#comment-13474299 ] 

Jun Rao commented on KAFKA-267:
-------------------------------

Thanks for patch v3. The code is still pretty messy, mostly for historical reasons. For example, the fixed/variable length option is mixed with seqIdMode and batch size, etc. I suggest that we do the following: (1) Get rid of the seqId mode and always generate sequential ids in the message header. If the user doesn't specify the starting index, seq Id will start from 0. (2) Always pad a message with random bytes, whether the message is fixed length or variable length. (3) Don't batch messages into sets in the sync mode. Instead, send one message at a time in sync mode. (4) The send gap should probably be added after sending a batch of messages, not after each message. If we do all this, the send thread can just have one simple loop as follows:

while(j < messagesPerThread) {
  for each topic {
    msgString = // get the message for a seq id of a given size     
    msg = create Message
    producer.send(msg)
    if(config.messageSendGapMs > 0 && batch size is reached)
      Thread.sleep(config.messageSendGapMs)
  }
}

Also, we need to patch system test since the command line option has changed.
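
The loop outlined above could look roughly like the following sketch. It is an illustration under assumptions, not code from any patch: the class `SendLoopSketch`, the `run` signature, the message text format and the use of `BiConsumer` and `LongConsumer` as stand-ins for the producer and for sleeping are all hypothetical. "Batch size is reached" is interpreted here as a counter of messages sent since the last pause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.LongConsumer;

// Hypothetical illustration of the single send loop; not the patch's code.
final class SendLoopSketch {

    // Sends one message per topic for each sequential id and pauses for
    // messageSendGapMs each time batchSize messages have gone out.
    // Returns the number of pauses taken.
    static int run(List<String> topics, long messagesPerThread, int batchSize,
                   long messageSendGapMs, long startIndex,
                   BiConsumer<String, String> sender, LongConsumer pause) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int pauses = 0;
        long sentSinceLastGap = 0;
        for (long j = 0; j < messagesPerThread; j++) {
            long seqId = startIndex + j; // ids are sequential from the start index
            for (String topic : topics) {
                String msg = "Topic:" + topic + ":MessageID:" + seqId; // assumed format
                sender.accept(topic, msg);
                sentSinceLastGap++;
                if (messageSendGapMs > 0 && sentSinceLastGap >= batchSize) {
                    pause.accept(messageSendGapMs);
                    pauses++;
                    sentSinceLastGap = 0;
                }
            }
        }
        return pauses;
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        LongConsumer realSleep = ms -> {
            try {
                Thread.sleep(ms);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int pauses = run(Arrays.asList("t1", "t2"), 5, 4, 10L, 0L,
                (topic, msg) -> sent.add(msg), realSleep);
        System.out.println(sent.size() + " messages sent, " + pauses + " pauses");
    }
}
```

Passing the sender and the pause as parameters keeps the loop testable without a broker or real sleeping.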

                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485024#comment-13485024 ] 

Jun Rao commented on KAFKA-267:
-------------------------------

Thanks for patch v7. Not sure if printing messages in syncProducer is a good idea. In general, messages sent in syncProducer are not necessarily strings and may not be printable. ProducerPerformance, on the other hand, always sends string messages, which are printable. So, I suggest that we keep the message printing in ProducerPerformance. 
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch, kafka-267-v7.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka-267-v7.patch


Move the logging of successfully sent messages from ProducerPerformance to SyncProducer.
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch, kafka-267-v7.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Commented] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485032#comment-13485032 ] 

Neha Narkhede commented on KAFKA-267:
-------------------------------------

Keeping the printing in ProducerPerformance would work, but we need to have a consistent way of counting the messages that were successfully sent by the async producer, during system testing. Right now, ProducerPerformance prints messages assuming they will get sent but only SyncProducer really knows if the messages were sent successfully. Thoughts ?
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch, kafka-267-v5.patch, kafka-267-v6.patch, kafka-267-v7.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka-267-v2.patch



Refactor a lot of code:
1. support multiple topics
2. support variously sized messages
3. support sleeping after sending a message

4. refactor a lot of code to make it readable

5. can be used for both producing test data and performance testing. For the former usage, one has to give the --initial-message-id argument.
                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "John Fung (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-267:
----------------------------

    Description: 
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
3. The ending number of the range is the starting number + number of messages - 1.

  was:
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. Specifying the starting number of the range to be shuffled.
3. The ending number of the range is the starting number + number of messages - 1.

    
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.


        

[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "John Fung (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-267:
----------------------------

    Description: 
This is achieved by:
1. Adding a new class UniqueRandom to shuffle a range of numbers.
2. Specifying the starting number of the range to be shuffled.
3. The ending number of the range is the starting number + number of messages - 1.

  was:This is achieved by specifying the starting number of a range of number to be shuffled. The ending number of the range is the starting number + number of messages - 1.

        Summary: Enhance ProducerPerformance to generate unique random Long value for payload  (was: Enhance ProducerPerformance to generate unique random checksum)
    
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. Specifying the starting number of the range to be shuffled.
> 3. The ending number of the range is the starting number + number of messages - 1.


        

Re: [jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by wm <wm...@gmail.com>.
On 10/18/2012 09:46 PM, Yang Ye (JIRA) wrote:
>       [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Yang Ye updated KAFKA-267:
> --------------------------
>
>      Attachment: kafka-267-v4.patch
>
>
> create a function "generateProducerData" for common use. It can generate for both sync and async mode, fixed and variable length, sequential id mode and non-sequential id mode.
>
> The main structure of the fun() function of ProducerThread is shrunk to:
>
> For each batch:
>       For each topic:
>             generateProducerData
>             send
>             sleep if required
>       
>
>
>                  
>> Enhance ProducerPerformance to generate unique random Long value for payload
>> ----------------------------------------------------------------------------
>>
>>                  Key: KAFKA-267
>>                  URL: https://issues.apache.org/jira/browse/KAFKA-267
>>              Project: Kafka
>>           Issue Type: Improvement
>>             Reporter: John Fung
>>             Assignee: Yang Ye
>>          Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch
>>
>>
>> This is achieved by:
>> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
>> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
>> 3. The ending number of the range is the starting number + number of messages - 1.
>> Other ProducerPerformance advancement:
>> 1. producing to multiple topics
>> 2. supporting multiple instances of producer performance ( and distinguishes them)
>> 3. allowing waiting some time after sending a request
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
the only reason to sleep is? that's not part of your function. But if 
you're batching your function needs to know where it left off sending 
and where it left off receiving ack's and then you run the batch with an 
error condition on those .... what i mean....generateProducerData, in a 
batch, should do as little and report as much as possible as quickly as 
possible. eh? batches fail because of records. no batcher runs serially 
when the code below is mt safe.

w

[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random Long value for payload

Posted by "Yang Ye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Ye updated KAFKA-267:
--------------------------

    Attachment: kafka-267-v4.patch


create a function "generateProducerData" for common use. It can generate for both sync and async mode, fixed and variable length, sequential id mode and non-sequential id mode.

The main structure of the fun() function of ProducerThread is shrunk to:

For each batch:
     For each topic:
           generateProducerData
           send
           sleep if required
     


                
> Enhance ProducerPerformance to generate unique random Long value for payload
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: Yang Ye
>         Attachments: kafka-267-v1.patch, kafka-267-v2.patch, kafka-267-v3.patch, kafka-267-v4.patch
>
>
> This is achieved by:
> 1. Adding a new class UniqueRandom to shuffle a range of numbers.
> 2. An optional new argument "start-index" is added to specify the starting number of the range to be shuffled. If this argument is omitted, it is defaulted to 1. So it is backward compatible with the argument options.
> 3. The ending number of the range is the starting number + number of messages - 1.
> Other ProducerPerformance advancement: 
> 1. producing to multiple topics
> 2. supporting multiple instances of producer performance ( and distinguishes them)
> 3. allowing waiting some time after sending a request


[jira] [Updated] (KAFKA-267) Enhance ProducerPerformance to generate unique random checksum

Posted by "John Fung (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Fung updated KAFKA-267:
----------------------------

    Attachment: kafka-267-v1.patch
    
> Enhance ProducerPerformance to generate unique random checksum
> --------------------------------------------------------------
>
>                 Key: KAFKA-267
>                 URL: https://issues.apache.org/jira/browse/KAFKA-267
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-267-v1.patch
>
>
> This is achieved by specifying the starting number of a range of number to be shuffled. The ending number of the range is the starting number + number of messages - 1.
