Posted to users@kafka.apache.org by girija arumugam <gi...@gmail.com> on 2020/12/01 10:53:37 UTC
Re: Regarding framing producer rate in-terms of software as well as
hardware configurations
Adding a few application-level configurations that can affect the producer
rate:
- linger.ms
- batch.size
- buffer.memory
- acks
- compression.type
- num.io.threads (broker side)
- num.network.threads (broker side)
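
To make the interaction between linger.ms and batch.size concrete, here is a minimal Python sketch. It is plain arithmetic, not a Kafka client call. The function name `batch_profile` and the traffic numbers are hypothetical, and it assumes fixed-size, uncompressed records for a single partition, a steady arrival rate, and no per-record overhead.

```python
import math

def batch_profile(message_bytes, messages_per_sec, batch_size=16384, linger_ms=5):
    """Estimate what triggers a send for one partition's batch.

    Assumptions (simplifications, not Kafka internals): fixed-size records,
    no compression, no per-record overhead, steady arrival rate.
    batch_size=16384 matches the producer default; linger_ms=5 is illustrative.
    """
    # Records that fit in one batch; an oversized record is sent on its own.
    records_per_full_batch = max(1, batch_size // message_bytes)
    # Time the arrival rate needs to fill the batch completely.
    fill_ms = records_per_full_batch / messages_per_sec * 1000.0
    if fill_ms <= linger_ms:
        # The batch fills before linger.ms expires, so it is sent full.
        trigger, records = "batch.size", records_per_full_batch
    else:
        # linger.ms expires first, so the batch is sent partially filled.
        trigger = "linger.ms"
        records = max(1, math.floor(messages_per_sec * linger_ms / 1000.0))
    return {
        "trigger": trigger,
        "records_per_batch": records,
        "batch_bytes": records * message_bytes,
    }

high = batch_profile(message_bytes=1024, messages_per_sec=10_000)
low = batch_profile(message_bytes=1024, messages_per_sec=500)
print(high)
print(low)
```

Under these assumptions, a busy partition sends full batches and linger.ms hardly matters, while a quiet partition sends small batches whose size is set by linger.ms. That is one reason the same settings can give very different request sizes across orgs.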
On Mon, Nov 30, 2020 at 3:07 PM girija arumugam <gi...@gmail.com>
wrote:
> Team,
> *Use case:*
> *IMAP*. I have an application in which an org has users who use
> IMAP to send mail, and the mail contents are produced to Kafka.
>
> Here the scaling factors are
>
> 1. The number of orgs can grow from 1 to a million.
> 2. The number of users can grow from 1 to a million.
>
> For this use-case, I need to calculate the producer rate and broker
> response rate for a single machine.
>
> So far, the factors we have identified as affecting the producer rate
> are:
>
> 1. Message size
> 2. Request size
> 3. Request rate overhead
> 4. Request latency
> 5. Round Trip Time
> 6. Number of Sender Threads
> 7. Number of Processor Threads at Broker
> 8. Replication factor
>
> Variables identified at the network layer, kernel, and NIC:
>
> 1. sysctl_wmem
> 2. Tx queues
> 3. Ring Buffer
> 4. Driver Queue
> 5. NAPI Polling
>
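As an illustration of how one of these kernel variables can bound throughput: a single TCP connection cannot keep more unacknowledged data in flight than its send buffer holds, so its rate is capped at roughly the buffer size divided by the round-trip time. The sketch below parses a value in the three-field format of Linux's net.ipv4.tcp_wmem (min, default, max, in bytes) and applies that bound. The sample string, link speed, and RTT are assumed values, and the helper names are hypothetical. The receiver window and congestion window are ignored.

```python
def parse_tcp_wmem(text):
    """Parse the 'min default max' triple (bytes) used by net.ipv4.tcp_wmem."""
    lo, default, hi = (int(field) for field in text.split())
    return {"min": lo, "default": default, "max": hi}

def bdp_bytes(link_bits_per_sec, rtt_sec):
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return link_bits_per_sec / 8 * rtt_sec

def max_rate_bytes_per_sec(send_buffer_bytes, rtt_sec):
    """Rough upper bound on one connection's rate when the send buffer limits it."""
    return send_buffer_bytes / rtt_sec

# Assumed sample values: a tcp_wmem setting, a 10 Gbit/s link, a 1 ms RTT.
wmem = parse_tcp_wmem("4096 16384 4194304")
needed = bdp_bytes(10e9, 0.001)
cap = max_rate_bytes_per_sec(wmem["max"], 0.001)

print(f"buffer needed to fill the link: {needed:.0f} bytes")
print(f"rate cap with max buffer: {cap:.0f} bytes/s")
print("send buffer is the bottleneck:", wmem["max"] < needed)
```

On a Linux host the real setting can be read from /proc/sys/net/ipv4/tcp_wmem and passed to `parse_tcp_wmem` instead of the sample string.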
> Observations made so far:
>
> 1. SocketChannel is the entry point for sending data at the
> application level.
> 2. The sendfile() system call is used to transfer the data.
>
> *Questions*:
>
> 1. How is data transferred from the SocketChannel to the NIC? That is,
> what is the data flow through the network (protocol) layer, kernel,
> network device drivers, and NIC?
> 2. Since each KafkaProducer instance will create a SocketChannel, what
> is the maximum number of producer instances a machine can have while
> still using the network efficiently?
> 3. In addition to the variables listed above:
>    1. Which variables are involved in sending data at the network
>    layer?
>    2. Which variables are involved in sending data in the kernel?
>    3. Which variables are involved in sending data to the NIC?
> 4. How can the producer rate be framed in terms of the variables
> identified in each layer?
> 5. *For a given machine's hardware, how can the producer rate be framed
> precisely in a single formula in terms of hardware- and software-level
> variables?*
>
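One hedged way to approach questions 4 and 5 is a bottleneck model: estimate an upper bound for each layer and take the minimum. The sketch below is a simplification under stated assumptions, not a formula from the Kafka documentation. The function name and all input values are hypothetical. It ignores broker-side limits, replication, compression, and protocol overhead, and `in_flight_per_conn` only loosely corresponds to the producer setting max.in.flight.requests.per.connection.

```python
def producer_rate_upper_bound(
    request_bytes,        # average produce request size
    request_latency_sec,  # time from sending a request to receiving its ack
    in_flight_per_conn,   # requests outstanding per broker connection
    connections,          # broker connections opened by this machine's producers
    nic_bits_per_sec,     # NIC line rate
    buffer_bytes,         # per-connection socket send buffer
    rtt_sec,              # network round-trip time
):
    """Return (limiting_layer, bytes_per_sec) for a simple min-of-bounds model."""
    bounds = {
        # Application: each connection completes its in-flight requests once per latency period.
        "application": connections * in_flight_per_conn * request_bytes / request_latency_sec,
        # Kernel/TCP: each connection moves at most one send buffer per round trip.
        "tcp": connections * buffer_bytes / rtt_sec,
        # NIC: line rate converted from bits to bytes.
        "nic": nic_bits_per_sec / 8,
    }
    limiting = min(bounds, key=bounds.get)
    return limiting, bounds[limiting]

# Assumed example inputs: 64 KiB requests, 5 ms latency, 5 in flight, 3 connections.
common = dict(request_bytes=64 * 1024, request_latency_sec=0.005,
              in_flight_per_conn=5, connections=3,
              buffer_bytes=128 * 1024, rtt_sec=0.001)
layer_1g, rate_1g = producer_rate_upper_bound(nic_bits_per_sec=1e9, **common)
layer_10g, rate_10g = producer_rate_upper_bound(nic_bits_per_sec=10e9, **common)
print(layer_1g, rate_1g)
print(layer_10g, rate_10g)
```

With these made-up inputs the 1 Gbit/s machine is limited by the NIC, while on 10 Gbit/s the application-level request pipeline becomes the limit. Measured values from a load test, for example with the kafka-producer-perf-test tool, would be needed to calibrate or reject such a model.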
>
> Could anyone please help me identify these variables and incorporate
> them into a single formula that frames the producer rate for a
> machine in terms of producer instances?
>
> Thanks in advance.
>
> PS: I have already come across the following documents:
>
> - https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/
> - https://cwiki.apache.org/confluence/display/KAFKA/Performance+testing
> - https://www.slideshare.net/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600
>
>
> Regards,
> Girija A.
>
>
>