Posted to users@kafka.apache.org by Jan Filipiak <Ja...@trivago.com> on 2015/02/22 11:16:40 UTC

Re: High CPU usage of Crc32 on Kafka broker

I just want to bring up the idea of dropping server-side de/recompression 
again. Features like KAFKA-1499 
<https://issues.apache.org/jira/browse/KAFKA-1499> seem to steer the 
project in a different direction, and the fact that tickets like 
KAFKA-845 <https://issues.apache.org/jira/browse/KAFKA-845> are not 
getting much attention gives the same impression. This has been on my 
mind almost 24/7 recently.

The problem I see is that CPUs are not the cheapest part of a new 
server. If you can save a few gigahertz or cores just by making sure 
your configs are the same across all producers, I would always opt for 
that operational overhead instead of bigger servers. I think this would 
usually lower the TCO of Kafka installations.
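
For illustration, a minimal sketch of what "the same configs across all 
producers" means in practice. This assumes the new (0.8.2-style) Java 
producer API; the broker address and topic name are placeholders:

  import java.util.Properties
  import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

  // Every producer in the fleet is configured with the same codec; keeping
  // this consistent across producers is the operational overhead I mean.
  val props = new Properties()
  props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092")
  props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy")
  props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
    "org.apache.kafka.common.serialization.ByteArraySerializer")
  props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
    "org.apache.kafka.common.serialization.ByteArraySerializer")

  val producer = new KafkaProducer[Array[Byte], Array[Byte]](props)
  producer.send(new ProducerRecord[Array[Byte], Array[Byte]]("log.events", "hello".getBytes))
  producer.close()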

I am currently not familiar enough with the codebase to judge whether 
server-side decompression happens before the acknowledgement is sent. 
If it does, sparing the de/recompression would also shave a few 
milliseconds off the response time.

Those are my thoughts about server side de/recompression. It would be 
great if I could get some responses and thoughts back.

Jan



On 07.11.2014 00:23, Jay Kreps wrote:
> I suspect it is possible to save and reuse the CRCs though it might be a
> bit of an invasive change. I suspect the first usage is when we are
> checking the validity of the messages and the second is from when we
> rebuild the compressed message set (I'm assuming you guys are using
> compression because I think we optimize this out otherwise). Technically I
> think the CRCs stay the same.
>
> An alternative approach, though, would be working to remove the need for
> recompression entirely on the broker side by making the offsets in the
> compressed message relative to the base offset of the message set. This is
> a much more invasive change but potentially better as it would also remove
> the recompression done on the broker which is also CPU heavy.
>
> -Jay
>
> On Thu, Nov 6, 2014 at 2:36 PM, Allen Wang <aw...@netflix.com.invalid>
> wrote:
>
>> Sure. Here is the link to the screen shot of jmc with the JTR file loaded:
>>
>> http://picpaste.com/fligh-recorder-crc.png
>>
>>
>>
>> On Thu, Nov 6, 2014 at 2:12 PM, Neha Narkhede <ne...@gmail.com>
>> wrote:
>>
>>> Allen,
>>>
>>> Apache mailing lists don't allow attachments. Could you please link to a
>>> pastebin or something?
>>>
>>> Thanks,
>>> Neha
>>>
>>> On Thu, Nov 6, 2014 at 12:02 PM, Allen Wang <aw...@netflix.com.invalid>
>>> wrote:
>>>
>>>> After digging more into the stack trace obtained from flight recorder
>>>> (which is attached), it seems that Kafka (0.8.1.1) can optimize its usage
>>>> of Crc32. The stack trace shows that Crc32 is invoked twice from
>>>> Log.append(). The first call is from line 231:
>>>>
>>>> val appendInfo = analyzeAndValidateMessageSet(messages)
>>>>
>>>> The second is from line 252 in the same function:
>>>>
>>>> validMessages = validMessages.assignOffsets(offset, appendInfo.codec)
>>>>
>>>> If one of the Crc32 invocations could be eliminated, we would be looking
>>>> at saving at least 7% of CPU usage.
>>>>
>>>> Thanks,
>>>> Allen
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Nov 5, 2014 at 6:32 PM, Allen Wang <aw...@netflix.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Using flight recorder, we have observed high CPU usage of CRC32
>>>>> (kafka.utils.Crc32.update()) on the Kafka broker. It uses as much as
>>>>> 25% of CPU on an instance. Tracking down the stack trace, this method
>>>>> is invoked by ReplicaFetcherThread.
>>>>>
>>>>> Is there any tuning we can do to reduce this?
>>>>>
>>>>> Also on the topic of CPU utilization, we observed that overall CPU
>>>>> utilization is proportional to the AllTopicsBytesInPerSec metric. Does
>>>>> this metric include incoming replication traffic?
>>>>>
>>>>> Thanks,
>>>>> Allen
>>>>>
>>>>>
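
Below is how I read Jay's "save and reuse the CRCs" suggestion from the 
quoted reply above: compute the checksum once while validating and carry 
it along, so the second pass can reuse it. This is just a rough sketch, 
not the actual Log.append code; it uses the JDK CRC32 and made-up names:

  import java.util.zip.CRC32

  // Hypothetical holder that keeps the checksum computed during validation.
  final case class CheckedMessage(bytes: Array[Byte], crc: Long)

  def validateAndRemember(bytes: Array[Byte], storedCrc: Long): Option[CheckedMessage] = {
    val crc = new CRC32()
    crc.update(bytes, 0, bytes.length)
    if (crc.getValue == storedCrc) Some(CheckedMessage(bytes, crc.getValue)) else None
  }

  // When the message set is rebuilt later (e.g. to assign offsets), the cached
  // crc field could be written out directly instead of running Crc32 again.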


Re: High CPU usage of Crc32 on Kafka broker

Posted by Guozhang Wang <wa...@gmail.com>.
Allen,

Regarding the two CRC computation calls, the first one is used to validate
the messages, and the second is only used if we need to re-compress the
data, so logically they are not redundant operations. As Jay said, the
re-compression is actually avoidable, and once it is removed we will no
longer need the second call.
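
To illustrate the flow (this is a rough sketch, not the actual Log.append
code; it uses the JDK CRC32 instead of kafka.utils.Crc32, and the helper
names are made up):

  import java.util.zip.CRC32

  def crcOf(bytes: Array[Byte]): Long = {
    val c = new CRC32()
    c.update(bytes, 0, bytes.length)
    c.getValue
  }

  // Placeholders for the broker's real re-compression and write logic.
  def recompress(bytes: Array[Byte]): Array[Byte] = bytes
  def write(bytes: Array[Byte], crc: Long): Unit = ()

  def append(batch: Array[Byte], storedCrc: Long, needsRecompression: Boolean): Unit = {
    require(crcOf(batch) == storedCrc, "corrupt batch")  // first call: validation
    if (needsRecompression) {
      val rebuilt = recompress(batch)
      write(rebuilt, crcOf(rebuilt))                     // second call: only on re-compress
    } else {
      write(batch, storedCrc)
    }
  }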

Jan,

You can find some more discussions regarding the compression /
de-compression here (
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Enriched+Message+Metadata
)

Guozhang



On Sun, Feb 22, 2015 at 10:46 AM, Jay Kreps <ja...@gmail.com> wrote:

> Here's my summary of the state of the compression discussion:
>
>    1. We all agree that current compression performance isn't very good and
>    it would be nice to improve it.
>    2. This is not entirely due to actual (de)compression; in large part it
>    is inefficiencies in the current implementation. Snappy runs at roughly
>    300 MB/sec/core, so it should not be a bottleneck. We could probably
>    hugely improve performance without any fundamental changes. See:
>    https://issues.apache.org/jira/browse/KAFKA-527
>    3. There are really three separate things that get conflated:
>       1. De-compression on the server
>       2. Re-compression on the server
>       3. De-compression and re-compression in mirror maker
>    4. Getting rid of de-compression on the server is unlikely to happen
>    because de-compression is required to validate the data sent. In the
> very
>    early days of Kafka we did indeed just append whatever the client sent
> us
>    to the binary log without validation. Then we realized that any bug in
> any
>    of the clients in all the languages would potentially corrupt the log
> and
>    potentially thus bring down the whole cluster. You can imagine how we
>    realized this! This is why basically no system in the world appends
> client
>    data directly to its binary on-disk structures. Decompression can
>    potentially be highly optimized, though, by not fully instantiating
>    messages.
>    5. The current compression code re-compresses the data to assign it
>    sequential offsets. It would be possible to improve this by allowing
> some
>    kind of relative offset scheme where the individual messages would have
>    offsets like (-3,-2,-1, 0) and this would be interpreted relative to the
>    offset of the batch. This would let us avoid recompression for
> co-operating
>    clients.
>    6. This would likely require bumping the log version. Prior to doing
>    this we need to have better backwards compatibility support in place to
>    make this kind of upgrade easy to do.
>    7. Optimizing de-compression and re-compression in mm requires having
>    APIs that give you back uncompressed messages and let you send already
>    compressed batches. This might be possible but it would break a lot of
>    things like the proposed filters in mm. We would also need to do this
> in a
>    way that it wasn't too gross of an API.
>
> -Jay
>
> On Sun, Feb 22, 2015 at 2:16 AM, Jan Filipiak <Ja...@trivago.com>
> wrote:
>
> > I just want to bring up the idea of dropping server-side de/recompression
> > again. Features like KAFKA-1499
> > <https://issues.apache.org/jira/browse/KAFKA-1499> seem to steer the
> > project in a different direction, and the fact that tickets like
> > KAFKA-845 <https://issues.apache.org/jira/browse/KAFKA-845> are not
> > getting much attention gives the same impression. This has been on my
> > mind almost 24/7 recently.
> >
> > The problem I see is that CPUs are not the cheapest part of a new server.
> > If you can save a few gigahertz or cores just by making sure your configs
> > are the same across all producers, I would always opt for that operational
> > overhead instead of bigger servers. I think this would usually lower the
> > TCO of Kafka installations.
> >
> > I am currently not familiar enough with the codebase to judge whether
> > server-side decompression happens before the acknowledgement is sent. If
> > it does, sparing the de/recompression would also shave a few milliseconds
> > off the response time.
> >
> > Those are my thoughts about server side de/recompression. It would be
> > great if I could get some responses and thoughts back.
> >
> > Jan
> >
> >
> >
> >
> > On 07.11.2014 00:23, Jay Kreps wrote:
> >
> >> I suspect it is possible to save and reuse the CRCs though it might be a
> >> bit of an invasive change. I suspect the first usage is when we are
> >> checking the validity of the messages and the second is from when we
> >> rebuild the compressed message set (I'm assuming you guys are using
> >> compression because I think we optimize this out otherwise).
> Technically I
> >> think the CRCs stay the same.
> >>
> >> An alternative approach, though, would be working to remove the need for
> >> recompression entirely on the broker side by making the offsets in the
> >> compressed message relative to the base offset of the message set. This
> is
> >> a much more invasive change but potentially better as it would also
> remove
> >> the recompression done on the broker which is also CPU heavy.
> >>
> >> -Jay
> >>
> >> On Thu, Nov 6, 2014 at 2:36 PM, Allen Wang <aw...@netflix.com.invalid>
> >> wrote:
> >>
> >>  Sure. Here is the link to the screen shot of jmc with the JTR file
> >>> loaded:
> >>>
> >>> http://picpaste.com/fligh-recorder-crc.png
> >>>
> >>>
> >>>
> >>> On Thu, Nov 6, 2014 at 2:12 PM, Neha Narkhede <neha.narkhede@gmail.com
> >
> >>> wrote:
> >>>
> >>>  Allen,
> >>>>
> >>>> Apache mailing lists don't allow attachments. Could you please link
> to a
> >>>> pastebin or something?
> >>>>
> >>>> Thanks,
> >>>> Neha
> >>>>
> >>>> On Thu, Nov 6, 2014 at 12:02 PM, Allen Wang <awang@netflix.com.invalid
> >
> >>>> wrote:
> >>>>
> >>>>  After digging more into the stack trace got from flight recorder
> (which
> >>>>>
> >>>> is
> >>>>
> >>>>> attached), it seems that Kafka (0.8.1.1) can optimize the usage of
> >>>>>
> >>>> Crc32.
> >>>
> >>>> The stack trace shows that Crc32 is invoked twice from Log.append().
> >>>>>
> >>>> First
> >>>>
> >>>>> is from the line number 231:
> >>>>>
> >>>>> val appendInfo = analyzeAndValidateMessageSet(messages)
> >>>>>
> >>>>> The second time is from line 252 in the same function:
> >>>>>
> >>>>> validMessages = validMessages.assignOffsets(offset, appendInfo.codec)
> >>>>>
> >>>>> If one of the Crc32 invocation can be eliminated, we are looking at
> >>>>>
> >>>> saving
> >>>>
> >>>>> at least 7% of CPU usage.
> >>>>>
> >>>>> Thanks,
> >>>>> Allen
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Nov 5, 2014 at 6:32 PM, Allen Wang <aw...@netflix.com>
> wrote:
> >>>>>
> >>>>>  Hi,
> >>>>>>
> >>>>>> Using flight recorder, we have observed high CPU usage of CRC32
> >>>>>> (kafka.utils.Crc32.update()) on Kafka broker. It uses as much as 25%
> >>>>>>
> >>>>> of
> >>>
> >>>> CPU
> >>>>
> >>>>> on an instance. Tracking down stack trace, this method is invoked by
> >>>>>> ReplicaFetcherThread.
> >>>>>>
> >>>>>> Is there any tuning we can do to reduce this?
> >>>>>>
> >>>>>> Also on the topic of CPU utilization, we observed that overall CPU
> >>>>>> utilization is proportional to AllTopicsBytesInPerSec metric. Does
> >>>>>>
> >>>>> this
> >>>
> >>>> metric include incoming replication traffic?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Allen
> >>>>>>
> >>>>>>
> >>>>>>
> >
>



-- 
-- Guozhang

Re: High CPU usage of Crc32 on Kafka broker

Posted by Jay Kreps <ja...@gmail.com>.
Here's my summary of the state of the compression discussion:

   1. We all agree that current compression performance isn't very good and
   it would be nice to improve it.
   2. This is not entirely due to actual (de)compression; in large part it
   is inefficiencies in the current implementation. Snappy runs at roughly
   300 MB/sec/core, so it should not be a bottleneck. We could probably
   hugely improve performance without any fundamental changes. See:
   https://issues.apache.org/jira/browse/KAFKA-527
   3. There are really three separate things that get conflated:
      1. De-compression on the server
      2. Re-compression on the server
      3. De-compression and re-compression in mirror maker
   4. Getting rid of de-compression on the server is unlikely to happen
   because de-compression is required to validate the data sent. In the very
   early days of Kafka we did indeed just append whatever the client sent us
   to the binary log without validation. Then we realized that any bug in any
   of the clients in all the languages would potentially corrupt the log and
   potentially thus bring down the whole cluster. You can imagine how we
   realized this! This is why basically no system in the world appends client
   data directly to its binary on-disk structures. Decompression can
   potentially be highly optimized, though, by not fully instantiating
   messages.
   5. The current compression code re-compresses the data to assign it
   sequential offsets. It would be possible to improve this by allowing some
   kind of relative offset scheme where the individual messages would have
   offsets like (-3, -2, -1, 0) that would be interpreted relative to the
   offset of the batch (see the sketch after this list). This would let us
   avoid recompression for co-operating clients.
   6. This would likely require bumping the log version. Prior to doing
   this we need to have better backwards compatibility support in place to
   make this kind of upgrade easy to do.
   7. Optimizing de-compression and re-compression in mm requires having
   APIs that give you back uncompressed messages and let you send already
   compressed batches. This might be possible but it would break a lot of
   things like the proposed filters in mm. We would also need to do this in a
   way that it wasn't too gross of an API.
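
To make point 5 concrete, here is a tiny sketch of the relative-offset idea.
This is not the real message format, just an illustration of the bookkeeping:

  // Each message in a batch stores an offset relative to the batch, e.g.
  // -3, -2, -1, 0 for a four-message batch whose last message carries the
  // batch offset.
  case class BatchMessage(relativeOffset: Int, payload: Array[Byte])

  // The broker only has to pick the batch offset at append time; the
  // compressed payload is untouched, so no recompression is needed.
  def absoluteOffset(batchOffset: Long, m: BatchMessage): Long =
    batchOffset + m.relativeOffset

  val batch = Seq(
    BatchMessage(-3, "a".getBytes), BatchMessage(-2, "b".getBytes),
    BatchMessage(-1, "c".getBytes), BatchMessage(0, "d".getBytes))

  val assignedBatchOffset = 1042L  // chosen by the broker at append time
  val offsets = batch.map(m => absoluteOffset(assignedBatchOffset, m))
  // offsets == Seq(1039, 1040, 1041, 1042)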

-Jay

On Sun, Feb 22, 2015 at 2:16 AM, Jan Filipiak <Ja...@trivago.com>
wrote:

> I just want to bring up the idea of dropping server-side de/recompression
> again. Features like KAFKA-1499
> <https://issues.apache.org/jira/browse/KAFKA-1499> seem to steer the
> project in a different direction, and the fact that tickets like
> KAFKA-845 <https://issues.apache.org/jira/browse/KAFKA-845> are not
> getting much attention gives the same impression. This has been on my
> mind almost 24/7 recently.
>
> The problem I see is that CPUs are not the cheapest part of a new server.
> If you can save a few gigahertz or cores just by making sure your configs
> are the same across all producers, I would always opt for that operational
> overhead instead of bigger servers. I think this would usually lower the
> TCO of Kafka installations.
>
> I am currently not familiar enough with the codebase to judge whether
> server-side decompression happens before the acknowledgement is sent. If
> it does, sparing the de/recompression would also shave a few milliseconds
> off the response time.
>
> Those are my thoughts about server side de/recompression. It would be
> great if I could get some responses and thoughts back.
>
> Jan
>
>
>
>
> On 07.11.2014 00:23, Jay Kreps wrote:
>
>> I suspect it is possible to save and reuse the CRCs though it might be a
>> bit of an invasive change. I suspect the first usage is when we are
>> checking the validity of the messages and the second is from when we
>> rebuild the compressed message set (I'm assuming you guys are using
>> compression because I think we optimize this out otherwise). Technically I
>> think the CRCs stay the same.
>>
>> An alternative approach, though, would be working to remove the need for
>> recompression entirely on the broker side by making the offsets in the
>> compressed message relative to the base offset of the message set. This is
>> a much more invasive change but potentially better as it would also remove
>> the recompression done on the broker which is also CPU heavy.
>>
>> -Jay
>>
>> On Thu, Nov 6, 2014 at 2:36 PM, Allen Wang <aw...@netflix.com.invalid>
>> wrote:
>>
>>  Sure. Here is the link to the screen shot of jmc with the JTR file
>>> loaded:
>>>
>>> http://picpaste.com/fligh-recorder-crc.png
>>>
>>>
>>>
>>> On Thu, Nov 6, 2014 at 2:12 PM, Neha Narkhede <ne...@gmail.com>
>>> wrote:
>>>
>>>  Allen,
>>>>
>>>> Apache mailing lists don't allow attachments. Could you please link to a
>>>> pastebin or something?
>>>>
>>>> Thanks,
>>>> Neha
>>>>
>>>> On Thu, Nov 6, 2014 at 12:02 PM, Allen Wang <aw...@netflix.com.invalid>
>>>> wrote:
>>>>
>>>>  After digging more into the stack trace got from flight recorder (which
>>>>>
>>>> is
>>>>
>>>>> attached), it seems that Kafka (0.8.1.1) can optimize the usage of
>>>>>
>>>> Crc32.
>>>
>>>> The stack trace shows that Crc32 is invoked twice from Log.append().
>>>>>
>>>> First
>>>>
>>>>> is from the line number 231:
>>>>>
>>>>> val appendInfo = analyzeAndValidateMessageSet(messages)
>>>>>
>>>>> The second time is from line 252 in the same function:
>>>>>
>>>>> validMessages = validMessages.assignOffsets(offset, appendInfo.codec)
>>>>>
>>>>> If one of the Crc32 invocation can be eliminated, we are looking at
>>>>>
>>>> saving
>>>>
>>>>> at least 7% of CPU usage.
>>>>>
>>>>> Thanks,
>>>>> Allen
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 5, 2014 at 6:32 PM, Allen Wang <aw...@netflix.com> wrote:
>>>>>
>>>>>  Hi,
>>>>>>
>>>>>> Using flight recorder, we have observed high CPU usage of CRC32
>>>>>> (kafka.utils.Crc32.update()) on Kafka broker. It uses as much as 25%
>>>>>>
>>>>> of
>>>
>>>> CPU
>>>>
>>>>> on an instance. Tracking down stack trace, this method is invoked by
>>>>>> ReplicaFetcherThread.
>>>>>>
>>>>>> Is there any tuning we can do to reduce this?
>>>>>>
>>>>>> Also on the topic of CPU utilization, we observed that overall CPU
>>>>>> utilization is proportional to AllTopicsBytesInPerSec metric. Does
>>>>>>
>>>>> this
>>>
>>>> metric include incoming replication traffic?
>>>>>>
>>>>>> Thanks,
>>>>>> Allen
>>>>>>
>>>>>>
>>>>>>
>