Posted to users@kafka.apache.org by Nicolas MOTTE <ni...@amadeus.com> on 2017/03/08 20:40:42 UTC

Performance and Encryption

Hi everyone,

I understand that one of the reasons Kafka performs well is its use of zero-copy.

I often hear that when encryption is enabled, Kafka has to copy the data into user space to decode the message, which has a big impact on performance.

If that is true, I don't get why the message has to be decoded by Kafka. I would assume that whether the message is encrypted or not, Kafka simply receives it, appends it to the file, and when a consumer wants to read it, simply reads at the right offset...

Also, I'm wondering whether this is still the case if we don't use keys (pure queuing system with key=null).

Cheers
Nico
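
To make the two send paths in this question concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Kafka code: the broker runs on the JVM, and the Kafka documentation notes that sendfile is not used when SSL is enabled because TLS libraries work in user space. The function names (send_zero_copy, send_with_user_space_transform, xor_mask, transfer, demo) are hypothetical, and xor_mask is only a stand-in for encryption, not real cryptography.

```python
import os
import socket
import tempfile


def send_zero_copy(path, sock):
    """Plaintext path: socket.sendfile() uses os.sendfile() where the
    platform supports it and falls back to send() otherwise."""
    with open(path, "rb") as segment:
        return sock.sendfile(segment)


def xor_mask(chunk):
    """Hypothetical placeholder for encryption. NOT real cryptography."""
    return bytes(byte ^ 0x5A for byte in chunk)


def send_with_user_space_transform(path, sock, transform, chunk_size=4096):
    """Encrypted path: every chunk is read into user space, transformed,
    and only then written to the socket."""
    sent = 0
    with open(path, "rb") as segment:
        while True:
            chunk = segment.read(chunk_size)
            if not chunk:
                break
            data = transform(chunk)
            sock.sendall(data)
            sent += len(data)
    return sent


def receive_all(sock):
    """Read from the socket until the peer closes its write side."""
    parts = []
    while True:
        part = sock.recv(65536)
        if not part:
            break
        parts.append(part)
    return b"".join(parts)


def transfer(path, sender):
    """Send the file at `path` through a local socket pair with `sender`."""
    writer, reader = socket.socketpair()
    with writer, reader:
        sender(path, writer)
        writer.shutdown(socket.SHUT_WR)
        return receive_all(reader)


def demo():
    # Small payload so it fits in the socket buffer without a reader thread.
    payload = b"kafka-log-segment-bytes" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        plain = transfer(path, send_zero_copy)
        masked = transfer(
            path,
            lambda p, s: send_with_user_space_transform(p, s, xor_mask),
        )
    finally:
        os.unlink(path)
    return plain, masked
```

In the first path the application never touches the bytes. In the second, each chunk has to pass through the process before it can be sent, which is the extra copy the question refers to. Nothing in this sketch depends on whether records carry a key.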


Re: Performance and Encryption

Posted by Hans Jespersen <ha...@confluent.io>.
You are correct that a Kafka broker is not just writing to one file. Jay Kreps wrote a great blog post with lots of links to even greater detail on the topic of Kafka and disk write performance. Still a good read many years later.

https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

-hans
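
The "one append-only file per partition" pattern discussed in this thread can be sketched as follows. This is a toy model under stated assumptions, not Kafka's actual log format: the PartitionLog class, its length-prefixed record layout, and the file names are all hypothetical. It only shows that each file is appended to independently and read back by position. How the operating system schedules the resulting writes on disk is outside its scope.

```python
import os
import struct
import tempfile


class PartitionLog:
    """Toy append-only log. Each record is a 4-byte big-endian length
    prefix followed by the payload (a hypothetical layout)."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "ab")
        # Track the end of the file ourselves so append() can return
        # the position at which each record starts.
        self._size = os.path.getsize(path)

    def append(self, payload):
        """Append one record and return the byte position where it starts."""
        position = self._size
        record = struct.pack(">I", len(payload)) + payload
        self._file.write(record)
        self._file.flush()
        self._size += len(record)
        return position

    def read_at(self, position):
        """Read back the record that starts at `position`."""
        with open(self.path, "rb") as reader:
            reader.seek(position)
            (length,) = struct.unpack(">I", reader.read(4))
            return reader.read(length)

    def close(self):
        self._file.close()


def demo(partition_count=3, message_count=9):
    with tempfile.TemporaryDirectory() as directory:
        logs = [
            PartitionLog(os.path.join(directory, f"partition-{i}.log"))
            for i in range(partition_count)
        ]
        positions = {i: [] for i in range(partition_count)}
        # Round-robin across partitions: writes interleave between files,
        # but within each file they always land at the end.
        for n in range(message_count):
            partition = n % partition_count
            payload = f"message-{n}".encode()
            positions[partition].append(logs[partition].append(payload))
        first_in_partition_1 = logs[1].read_at(positions[1][0])
        for log in logs:
            log.close()
    return positions, first_in_partition_1
```

Within every partition the returned positions only grow, even though the appends alternate between files.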


> On Mar 15, 2017, at 7:51 AM, Nicolas MOTTE <ni...@amadeus.com> wrote:
> 
> OK, that makes sense, thanks!
>
> The next question I have regarding performance is about the way Kafka writes to its data files.
> I often hear that Kafka performs very well because it writes in an append-only fashion.
> So even with a hard disk (not an SSD) we get great performance, because it writes sequentially.
>
> I could understand that if Kafka were only writing to one file.
> But in reality it's writing to N files, N being the number of partitions hosted by the broker.
> So even though it appends the data to each file, I assume that overall it is not writing sequentially on the disk.
>
> Am I wrong?
> 


RE: Performance and Encryption

Posted by Nicolas MOTTE <ni...@amadeus.com>.
OK, that makes sense, thanks!

The next question I have regarding performance is about the way Kafka writes to its data files.
I often hear that Kafka performs very well because it writes in an append-only fashion.
So even with a hard disk (not an SSD) we get great performance, because it writes sequentially.

I could understand that if Kafka were only writing to one file.
But in reality it's writing to N files, N being the number of partitions hosted by the broker.
So even though it appends the data to each file, I assume that overall it is not writing sequentially on the disk.

Am I wrong?

-----Original Message-----
From: Tauzell, Dave [mailto:Dave.Tauzell@surescripts.com] 
Sent: 08 March 2017 22:09
To: users@kafka.apache.org
Subject: RE: Performance and Encryption

I think it is because the product batches messages, which could be for different topics.

-Dave



Re: Performance and Encryption

Posted by Todd Palino <tp...@gmail.com>.
Nicolas, this appears to be a duplicate of your question from 2 days ago.
Please review that thread for the discussion of this question.

-Todd


On Wed, Mar 8, 2017 at 1:08 PM, Tauzell, Dave <Da...@surescripts.com>
wrote:

> I think it is because the product batches messages, which could be for
> different topics.
>
> -Dave


-- 
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming



linkedin.com/in/toddpalino

RE: Performance and Encryption

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
I think it is because the product batches messages, which could be for different topics.

-Dave
