Posted to users@kafka.apache.org by "awagle@pch.com" <st...@gmail.com> on 2016/02/29 23:12:16 UTC

Need to understand the graph

Hello,

I have a question about Kafka request timing. As our load test runs longer and the load increases, we see "request queue time" increase as well. What exactly is "request queue time"?

Some basic configuration we are using:

2 Kafka nodes (CPU: 8 CPUs, 2 threads per core; memory: 32 GB RAM)
pkafkaapp01.pchoso.com
pkafkaapp02.pchoso.com

Test duration: 13:05 - 14:05
Messages published: 869156

Load average, CPU, and memory all look fine, so I am not sure what the issue is.

Below are some SPM graphs showing the state of my system.
Here's the 'Requests' graph:
  https://apps.sematext.com/spm-reports/s/lCOJULIKuJ

Re: Need to understand the graph

Posted by Alexis Midon <al...@airbnb.com.INVALID>.
Also, the metrics kafka.network.RequestChannel.RequestQueueSize and
kafka.network.RequestChannel.ResponseQueueSize will show you how saturated
the I/O and network threads are: a request queue sitting near
`queued.max.requests` means the I/O threads cannot keep up, while growing
response queues mean the network (Processor) threads cannot keep up.
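
If you want to spot-check those gauges outside of SPM, here is a rough JMX
sketch in Scala. The port (9999), the hostname, and the exact MBean names are
assumptions; they vary between Kafka versions, so verify them with jconsole
against your broker first.

import javax.management.ObjectName
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

// Quick-and-dirty JMX peek at the queue-size gauges. Hostname, port and the
// MBean names below are assumptions; check them with jconsole.
object QueueSizeCheck {
  def main(args: Array[String]): Unit = {
    val url = new JMXServiceURL(
      "service:jmx:rmi:///jndi/rmi://pkafkaapp01.pchoso.com:9999/jmxrmi")
    val connector = JMXConnectorFactory.connect(url)
    try {
      val mbsc = connector.getMBeanServerConnection
      // Yammer gauges expose their current value through the "Value" attribute.
      val requestQ = mbsc.getAttribute(
        new ObjectName("kafka.network:type=RequestChannel,name=RequestQueueSize"), "Value")
      val responseQ = mbsc.getAttribute(
        new ObjectName("kafka.network:type=RequestChannel,name=ResponseQueueSize"), "Value")
      println(s"RequestQueueSize=$requestQ ResponseQueueSize=$responseQ")
    } finally connector.close()
  }
}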


Re: Need to understand the graph

Posted by Alexis Midon <al...@airbnb.com.INVALID>.
"request queue time" is the time it takes for IO threads to pick up the
request. As you increase the load on your broker, it makes sense to see
higher queue time.
Here are more details on the request/response model in a Kafka broker
(0.8.2).

All requests and responses flow through the RequestChannel (
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/network/RequestChannel.scala
).

The centerpiece of all Kafka request/response handling is
`kafka.network.RequestChannel`. The RequestChannel is a container for all
Kafka Requests waiting to be handled and for the Kafka Responses ready to be
sent back to the clients.
Requests are queued in a single bounded queue; up to `queued.max.requests`
requests will be stored. The response queues (one per Processor thread) are
not bounded.
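
To make the queueing concrete, here is a minimal sketch of that structure in
Scala. It is not the actual Kafka code; SimpleRequestChannel, Request and
Response are made-up names, and queuedMaxRequests plays the role of
`queued.max.requests`.

import java.util.concurrent.{ArrayBlockingQueue, LinkedBlockingQueue}

// Simplified stand-ins for Kafka's request/response objects.
case class Request(processorId: Int, payload: Array[Byte])
case class Response(processorId: Int, payload: Array[Byte])

class SimpleRequestChannel(numProcessors: Int, queuedMaxRequests: Int) {
  // One bounded queue shared by all Processor threads; put() blocks when full.
  private val requestQueue = new ArrayBlockingQueue[Request](queuedMaxRequests)
  // One unbounded response queue per Processor thread.
  private val responseQueues =
    Array.fill(numProcessors)(new LinkedBlockingQueue[Response]())

  // Called by a Processor once a complete request has been read; may block.
  def sendRequest(request: Request): Unit = requestQueue.put(request)

  // Called by a request handler (I/O thread); blocks until a request arrives.
  def receiveRequest(): Request = requestQueue.take()

  // Called when a response is ready for the originating Processor.
  def sendResponse(response: Response): Unit =
    responseQueues(response.processorId).put(response)

  // Polled by each Processor for its own responses; returns null when empty.
  def receiveResponse(processorId: Int): Response =
    responseQueues(processorId).poll()
}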

## First, let's see how the Requests are created.
A SocketServer (created by the main class KafkaServer) starts up multiple
threads to handle network connections in a non-blocking fashion:
- a single Acceptor thread
- `num.network.threads` Processor threads

The Acceptor thread handles OP_ACCEPT events and passes each new socket
channel to one of the Processor threads (using a simple round-robin
algorithm).
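
Roughly, the hand-off looks like the sketch below. It is illustrative only,
not Kafka's real Acceptor: the real one uses a non-blocking Selector
registered for OP_ACCEPT, while this sketch uses a blocking accept() for
brevity, and ConnectionSink/assignNewConnection are made-up names.

import java.nio.channels.{ServerSocketChannel, SocketChannel}

// Minimal interface for whatever will own the connection afterwards
// (the real Processor class plays this role).
trait ConnectionSink {
  def assignNewConnection(channel: SocketChannel): Unit
}

class SimpleAcceptor(serverChannel: ServerSocketChannel,
                     processors: Array[ConnectionSink]) extends Runnable {
  def run(): Unit = {
    var next = 0
    while (true) {
      val channel = serverChannel.accept()          // blocks until a client connects
      processors(next).assignNewConnection(channel) // hand the socket to a Processor
      next = (next + 1) % processors.length         // simple round-robin
    }
  }
}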

A Processor thread has 3 responsibilities (a rough sketch follows the list):
1. register the newly created socket channels passed by the Acceptor thread
with its selector for monitoring
2. read data from the socket channels until a complete Kafka request is
deserialized. The Request is then handed off to the RequestChannel. The
hand-off might block if the request queue is full.
3. poll the Kafka Responses queued in the RequestChannel, and write the
data to the corresponding socket channel. Each Processor has its own response
queue in the RequestChannel so that a response is handled by the same thread
that read the request from the connection.
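
Putting those three responsibilities together, a Processor loop looks roughly
like this. It reuses the SimpleRequestChannel and ConnectionSink types from
the sketches above; request framing and serialization are stubbed out, and
all names are illustrative.

import java.nio.channels.{SelectionKey, Selector, SocketChannel}
import java.util.concurrent.ConcurrentLinkedQueue

class SimpleProcessor(id: Int, channel: SimpleRequestChannel)
  extends Runnable with ConnectionSink {

  private val newConnections = new ConcurrentLinkedQueue[SocketChannel]()
  private val selector = Selector.open()

  // Called by the Acceptor; the socket is registered on the next loop pass.
  def assignNewConnection(socket: SocketChannel): Unit = {
    newConnections.add(socket)
    selector.wakeup()
  }

  def run(): Unit = {
    while (true) {
      // 1. register connections handed over by the Acceptor
      while (!newConnections.isEmpty) {
        val socket = newConnections.poll()
        socket.configureBlocking(false)
        socket.register(selector, SelectionKey.OP_READ)
      }

      // 2. read from ready sockets; once a complete request has been parsed,
      //    hand it to the RequestChannel (put() blocks if the queue is full)
      selector.select(300)
      val keys = selector.selectedKeys().iterator()
      while (keys.hasNext) {
        val key = keys.next(); keys.remove()
        if (key.isReadable)
          readCompleteRequest(key).foreach(channel.sendRequest)
      }

      // 3. drain this Processor's own response queue and write to the sockets
      var response = channel.receiveResponse(id)
      while (response != null) {
        writeToSocket(response)
        response = channel.receiveResponse(id)
      }
    }
  }

  // Framing and (de)serialization are out of scope for this sketch; the real
  // Processor tags each Request with its id so the Response comes back here.
  private def readCompleteRequest(key: SelectionKey): Option[Request] = None
  private def writeToSocket(response: Response): Unit = ()
}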

## Second, let's see how the Responses are built.
During startup, KafkaServer creates an instance of KafkaRequestHandlerPool.
KafkaRequestHandlerPool is a thread pool of `num.io.threads`
KafkaRequestHandler threads (named "kafka-request-handler-").
A KafkaRequestHandler has a fairly simple job: it polls Requests from the
RequestChannel and passes them to KafkaApis (`KafkaApis.handle()`). KafkaApis
dispatches the request and eventually pushes a Response onto the
RequestChannel. The response is then picked up by a Processor thread as
described earlier.
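
A minimal sketch of that handler loop, again reusing SimpleRequestChannel
from above; `handle` stands in for `KafkaApis.handle()`, and the echo wiring
is purely illustrative.

// Illustrative I/O (request handler) thread: take a request, build a response.
class SimpleRequestHandler(channel: SimpleRequestChannel,
                           handle: Request => Response) extends Runnable {
  def run(): Unit = {
    while (true) {
      val request = channel.receiveRequest() // blocks while the request queue is empty
      val response = handle(request)         // stands in for KafkaApis.handle()
      channel.sendResponse(response)         // queued for the originating Processor
    }
  }
}

// Example wiring: a pool of handler threads, as KafkaRequestHandlerPool would create.
object HandlerPoolExample {
  def start(channel: SimpleRequestChannel, numIoThreads: Int): Unit = {
    val echo: Request => Response = r => Response(r.processorId, r.payload)
    (0 until numIoThreads).foreach { i =>
      val t = new Thread(new SimpleRequestHandler(channel, echo),
                         s"kafka-request-handler-$i")
      t.setDaemon(true)
      t.start()
    }
  }
}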


A diagram:
http://postimg.org/image/6gadny6px/

The different stages of request/response handling are measured by the
following metrics:
kafka.network.RequestMetrics.RequestQueueTimeMs - time the request waits in the request queue before an I/O thread picks it up
kafka.network.RequestMetrics.LocalTimeMs - time the request is being processed on this broker
kafka.network.RequestMetrics.RemoteTimeMs - time spent waiting on other brokers (e.g. for follower acks)
kafka.network.RequestMetrics.ResponseQueueTimeMs - time the response waits in the Processor's response queue
kafka.network.RequestMetrics.ResponseSendTimeMs - time taken to write the response back to the client
kafka.network.RequestMetrics.TotalTimeMs - total end-to-end time for the request
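
TotalTimeMs should roughly be the sum of the five stage times, so comparing
the stages tells you where requests spend their time. A tiny illustration
with made-up numbers (in your case RequestQueueTimeMs is the one growing,
which points at the I/O threads not keeping up with the incoming requests):

// Hypothetical per-request averages in milliseconds, for illustration only.
object RequestTimeBreakdown {
  def main(args: Array[String]): Unit = {
    val stages = Seq(
      "RequestQueueTimeMs"  -> 42.0,  // waiting for an I/O thread to pick it up
      "LocalTimeMs"         -> 5.0,   // processing on this broker
      "RemoteTimeMs"        -> 1.0,   // waiting on other brokers
      "ResponseQueueTimeMs" -> 0.5,   // waiting in the Processor's response queue
      "ResponseSendTimeMs"  -> 0.5    // writing the response back to the client
    )
    val total = stages.map(_._2).sum
    stages.sortBy(-_._2).foreach { case (name, ms) =>
      println(f"$name%-22s $ms%6.1f ms  ${100 * ms / total}%5.1f%%")
    }
    println(f"TotalTimeMs (approx.)  $total%6.1f ms")
  }
}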


