Posted to users@kafka.apache.org by Bruno Rassaerts <br...@novazone.be> on 2016/01/14 23:23:11 UTC

Encryption on disk

Hello,

In our project we have a very strong requirement to protect all data, all the time. Even when the data is “at rest” on disk, it needs to be protected.
We’ve been trying to figure out how to do this with Kafka, and hit some obstacles.

One thing we’ve tried to do is to encrypt every message we hand over to Kafka. This results in the encrypted messages being written to disk on the brokers.
However, encryption has serious performance implications, both because it is a CPU-intensive operation and because the batch compression offered by Kafka is far less effective once the data is encrypted. Encrypting message by message gives us a performance penalty of about 75%, even if we compress the messages before encryption.

What we are looking for is a way to plug in our encryption in one of two possible locations:

1. As a custom compression algorithm, which would batch compress and batch encrypt, with the files stored as such.
2. As an encryption plugin specifically designed for storing the Kafka broker files.

Is there any way that this can be done using Kafka (0.9), or can somebody point us to the place where we could add this in the Kafka codebase?

Thanks,
Bruno Rassaerts 

Re: Encryption on disk

Posted by Alex Loddengaard <al...@confluent.io>.
Have you considered encrypting at the broker filesystem level, perhaps with
something like LUKS?

Alex

On Fri, Jan 15, 2016 at 8:38 AM, Jim Hoagland <ji...@symantec.com>
wrote:

> We did not look at compression and did not use it.  You'll probably get
> the best compression while having encryption by building a batch of
> messages, compressing that, then encrypting the compressed batch.
>
> Compressing across the batch will almost certainly be better space-wise
> than compressing each message separately, because there are likely to be
> similarities between the messages, and a good compression algorithm will
> pick up on them and make the batch smaller.  Even small similarities,
> such as the messages containing a lot of ASCII, can be picked up.
>
> To defy cryptanalysis, a good encryption algorithm will make the encrypted
> message appear random.  Random data will not really compress.  If it is
> reliably compressing after encryption, then your encryption is not as
> secure as it should be.  Also discussed here:
> http://security.stackexchange.com/a/19970.
>
> -- Jim


-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*

Re: Encryption on disk

Posted by Jim Hoagland <ji...@symantec.com>.
We did not look at compression and did not use it.  You'll probably get
the best compression while having encryption by building a batch of
messages, compressing that, then encrypting the compressed batch.

Compressing across the batch will almost certainly be better space-wise
than compressing each message separately, because there are likely to be
similarities between the messages, and a good compression algorithm will
pick up on them and make the batch smaller.  Even small similarities,
such as the messages containing a lot of ASCII, can be picked up.
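
As a rough illustration of the batch-versus-per-message point, here is a
minimal Python sketch using the standard zlib module. It is not taken from
the tests discussed in this thread; the sample messages and the helper names
(per_message_size, batch_size) are made up for the example, and real payloads
will give different numbers.

```python
import json
import zlib

# Hypothetical sample messages: small JSON records that share structure,
# as messages on a single topic often do.
messages = [
    json.dumps({"user_id": i, "event": "page_view", "path": "/products/%d" % i}).encode("utf-8")
    for i in range(1000)
]


def per_message_size(msgs):
    """Total bytes when each message is compressed on its own."""
    return sum(len(zlib.compress(m)) for m in msgs)


def batch_size(msgs):
    """Bytes when the whole batch is compressed as a single stream."""
    return len(zlib.compress(b"\n".join(msgs)))


raw = sum(len(m) for m in messages)
separate = per_message_size(messages)
batched = batch_size(messages)

# Compare the three sizes; the batch should be the smallest by a wide margin.
print("raw:", raw, "per-message:", separate, "batched:", batched)
```

Each tiny message carries its own compression header and gives the
compressor almost no history to exploit, which is why the per-message total
tends to stay close to the raw size.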

To defy cryptanalysis, a good encryption algorithm will make the encrypted
message appear random.  Random data will not really compress.  If it is
reliably compressing after encryption, then your encryption is not as
secure as it should be.  Also discussed here:
http://security.stackexchange.com/a/19970.
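
The effect can be seen with a small Python sketch. It uses no real cipher:
bytes from os.urandom stand in for well-encrypted output, which is an
assumption made only to keep the example self-contained, and the sample
plaintext is invented.

```python
import os
import zlib

# Highly repetitive plaintext, standing in for a batch of similar messages.
plaintext = b'{"event": "page_view", "status": "ok"}\n' * 1000

# Stand-in for ciphertext: bytes from the OS random source. This is NOT
# encryption, only data with the statistical profile that the output of a
# good cipher is expected to have.
ciphertext_like = os.urandom(len(plaintext))

# Ratio of compressed size to original size (smaller is better).
plain_ratio = len(zlib.compress(plaintext)) / len(plaintext)
random_ratio = len(zlib.compress(ciphertext_like)) / len(ciphertext_like)

print("plaintext ratio:", plain_ratio)
print("random (ciphertext-like) ratio:", random_ratio)
```

The repetitive plaintext shrinks to a small fraction of its size, while the
random bytes do not shrink at all, so any compression has to happen before
encryption.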

-- Jim

On 1/15/16, 6:39 AM, "Bruno Rassaerts" <br...@novazone.be> wrote:

>Thanks for the input Jim.
>
>We managed to reduce the encryption impact to about 25% by disabling the
>Kafka batch compression and compressing the messages ourselves before
>encrypting them one-by-one. However, we still believe we could improve by
>batch compressing + batch encrypting.
>
>Can you confirm that in your tests batch compression was disabled?
>
>Thanks,
>Bruno


Re: Encryption on disk

Posted by Bruno Rassaerts <br...@novazone.be>.
Thanks for the input Jim.

We managed to reduce the encryption impact to about 25% by disabling the Kafka batch compression and compressing the messages ourselves before encrypting them one-by-one. However, we still believe we could improve by batch compressing + batch encrypting.

Can you confirm that in your tests batch compression was disabled?

Thanks,
Bruno


> On 14 Jan 2016, at 23:47, Jim Hoagland <ji...@symantec.com> wrote:
> 
> We did a proof of concept on end-to-end encryption using an approach which
> sounds similar to what you describe.  We blogged about it here:
> 
> http://www.symantec.com/connect/blogs/end-end-encryption-though-kafka-our-proof-concept
> 
> You might want to review what is there to see how it differs from what you
> did.  In our tests, the encryption didn't add as much overhead as we
> thought it would.
> 
> -- Jim
> 
> -- 
> Jim Hoagland, Ph.D.
> Sr. Principal Software Engineer
> Big Data Analytics Team
> Cloud Platform Engineering


Re: Encryption on disk

Posted by Jim Hoagland <ji...@symantec.com>.
We did a proof of concept on end-to-end encryption using an approach which
sounds similar to what you describe.  We blogged about it here:
  
http://www.symantec.com/connect/blogs/end-end-encryption-though-kafka-our-proof-concept

You might want to review what is there to see how it differs from what you
did.  In our tests, the encryption didn't add as much overhead as we
thought it would.

-- Jim

-- 
Jim Hoagland, Ph.D.
Sr. Principal Software Engineer
Big Data Analytics Team
Cloud Platform Engineering



On 1/14/16, 2:23 PM, "Bruno Rassaerts" <br...@novazone.be> wrote:

>Hello,
>
>In our project we have a very strong requirement to protect all data, all
>the time. Even when the data is “at rest” on disk, it needs to be
>protected.
>We’ve been trying to figure out how to do this with Kafka, and hit some
>obstacles.
>
>One thing we’ve tried to do is to encrypt every message we hand over to
>Kafka. This results in the encrypted messages being written to disk on
>the brokers.
>However, encryption has serious performance implications, both because it
>is a CPU-intensive operation and because the batch compression offered by
>Kafka is far less effective once the data is encrypted. Encrypting message
>by message gives us a performance penalty of about 75%, even if we
>compress the messages before encryption.
>
>What we are looking for is a way to plug in our encryption in one of two
>possible locations:
>
>1. As a custom compression algorithm, which would batch compress and
>batch encrypt, with the files stored as such.
>2. As an encryption plugin specifically designed for storing the Kafka
>broker files.
>
>Is there any way that this can be done using Kafka (0.9), or can somebody
>point us to the place where we could add this in the Kafka codebase?
>
>Thanks,
>Bruno Rassaerts