Posted to dev@doris.apache.org by GitBox <gi...@apache.org> on 2019/04/03 11:47:10 UTC

[GitHub] [incubator-doris] morningman opened a new pull request #870: Optimize the consumer assignment of Kafka routine load job

morningman opened a new pull request #870: Optimize the consumer assignment of Kafka routine load job
URL: https://github.com/apache/incubator-doris/pull/870
 
 
   1. Use a data consumer group so that multiple data consumers share a single stream load pipe. This is intended to increase the consuming speed of Kafka messages and to reduce the number of tasks in a routine load job.
   Unfortunately, testing shows that 3 consumers consuming 3 partitions achieve the same consuming rate as 1 consumer consuming 3 partitions. The bottleneck is in fetching messages from Kafka, and I don't know why. So I added a Backend config `max_consumer_num_per_group` to control the number of consumers in a data consumer group; its default value is 1.
   
       * 1 of 3 consumers (consume cost is the time spent in calls to `consumer->consume()`):
   total cost(ms): 20165, consume cost(ms): 18665, received rows: 601807
       * 1 of 1 consumers:
   total cost(ms): 20051, consume cost(ms): 17259, received rows: 1686118
   
   2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load 
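
As a rough illustration of item 2, the sketch below maps a user-supplied offset string to a numeric Kafka offset. It assumes the sentinel values used by librdkafka (`RD_KAFKA_OFFSET_BEGINNING` is -2 and `RD_KAFKA_OFFSET_END` is -1). The helper name `parse_kafka_offset` and the local constants are hypothetical and are not taken from this pull request.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Local constants mirroring librdkafka's RD_KAFKA_OFFSET_BEGINNING (-2)
// and RD_KAFKA_OFFSET_END (-1). Defined here so the sketch compiles
// without the library.
constexpr int64_t kOffsetBeginning = -2;
constexpr int64_t kOffsetEnd = -1;

// Hypothetical helper: converts "OFFSET_BEGINNING", "OFFSET_END", or a
// non-negative decimal string into an offset value.
// Returns false if the text is not a valid offset.
bool parse_kafka_offset(const std::string& text, int64_t* offset) {
    assert(offset != nullptr);
    if (text == "OFFSET_BEGINNING") {
        *offset = kOffsetBeginning;
        return true;
    }
    if (text == "OFFSET_END") {
        *offset = kOffsetEnd;
        return true;
    }
    if (text.empty()) {
        return false;
    }
    char* end = nullptr;
    errno = 0;
    const long long value = std::strtoll(text.c_str(), &end, 10);
    // Reject overflow, trailing garbage, and negative explicit offsets.
    if (errno != 0 || *end != '\0' || value < 0) {
        return false;
    }
    *offset = static_cast<int64_t>(value);
    return true;
}
```

Rejecting negative explicit offsets keeps user input from colliding with the two sentinel values, so those are only reachable through the named keywords.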

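The two measurements in item 1 can be compared by converting each to rows per second of consume time. The sketch below does that arithmetic. It assumes that the "1 of 3 consumers" line is a sample from one consumer and that the other two consumers in the group behaved similarly, which the numbers above do not confirm. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// One measurement as reported above.
struct ConsumeStats {
    int64_t consume_cost_ms;  // time spent in consumer->consume()
    int64_t received_rows;    // rows received by this consumer
};

// Rows received per second of consume time for a single consumer.
double rows_per_second(const ConsumeStats& stats) {
    assert(stats.consume_cost_ms > 0);
    return static_cast<double>(stats.received_rows) * 1000.0 /
           static_cast<double>(stats.consume_cost_ms);
}

// Estimated rate of a whole group, ASSUMING every consumer in the group
// performs like the sampled one.
double estimated_group_rate(const ConsumeStats& sample, int consumers_in_group) {
    assert(consumers_in_group > 0);
    return rows_per_second(sample) * consumers_in_group;
}
```

With the figures above, `rows_per_second({18665, 601807})` is roughly 32,000 rows/s, so three such consumers would total roughly 97,000 rows/s, while `rows_per_second({17259, 1686118})` for the single consumer is also roughly 97,000 rows/s. Under the stated assumption, this matches the observation that the aggregate rate did not improve.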
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
