Posted to dev@kafka.apache.org by Sönke Liebau <so...@opencore.com.INVALID> on 2019/05/21 12:15:33 UTC

Re: [DISCUSS] KIP-317: Transparent Data Encryption

Hi everybody,

I'd like to rekindle the discussion around KIP-317.
I have reworked the KIP a little bit in order to design everything as a
pluggable implementation. During the course of that work I've also decided
to rename the KIP, as encryption will only be transparent in some cases. It
is now called "Add end to end data encryption functionality to Apache
Kafka" [1].

I'd very much appreciate it if you could give the KIP a quick read. At this
point it is not a fully fleshed-out design, as I would like to first agree
on the underlying structure that I came up with, before spending time on
details.

TL/DR is:
Create three pluggable classes:
KeyManager runs on the broker and manages which keys to use, key rollover
etc
KeyProvider runs on the client and retrieves keys based on what the
KeyManager tells it
EncryptionEngine runs on the client and handles the actual encryption
First idea of control flow between these components can be seen at [2]
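A minimal sketch of how these three classes could fit together (Python is
used purely for illustration; all method names and signatures here are my
assumptions rather than the KIP's API, and the XOR "engine" is a placeholder
for a real cipher, not actual encryption):

```python
from abc import ABC, abstractmethod

class KeyManager(ABC):
    """Broker-side: decides which key each topic uses, handles rollover etc."""
    @abstractmethod
    def current_key_id(self, topic: str) -> str: ...

class KeyProvider(ABC):
    """Client-side: retrieves key material for the id the KeyManager announces."""
    @abstractmethod
    def retrieve_key(self, key_id: str) -> bytes: ...

class EncryptionEngine(ABC):
    """Client-side: performs the actual encryption and decryption."""
    @abstractmethod
    def encrypt(self, plaintext: bytes, key: bytes) -> bytes: ...
    @abstractmethod
    def decrypt(self, ciphertext: bytes, key: bytes) -> bytes: ...

# Toy implementations, purely to show the control flow between the pieces.
class StaticKeyManager(KeyManager):
    def current_key_id(self, topic):
        return "key-1"

class DictKeyProvider(KeyProvider):
    def __init__(self, keys):
        self._keys = keys
    def retrieve_key(self, key_id):
        return self._keys[key_id]

class XorEngine(EncryptionEngine):
    """Placeholder XOR 'cipher' - NOT real encryption, demo only."""
    def _xor(self, data, key):
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
    def encrypt(self, plaintext, key):
        return self._xor(plaintext, key)
    def decrypt(self, ciphertext, key):
        return self._xor(ciphertext, key)

# Producer side: ask the (broker-side) KeyManager which key to use,
# fetch its material via the KeyProvider, then encrypt before sending.
km = StaticKeyManager()
kp = DictKeyProvider({"key-1": b"0123456789abcdef"})
engine = XorEngine()

key = kp.retrieve_key(km.current_key_id("my-topic"))
ciphertext = engine.encrypt(b"hello", key)
assert engine.decrypt(ciphertext, key) == b"hello"
```

The consumer side would simply run the last two steps in reverse: look up
the key id attached to the message, fetch the key, and decrypt.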

Please let me know any thoughts or concerns that you may have!

Best regards,
Sönke

[1]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
[2]
https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1&modificationDate=1558439227551&api=v2



On Fri, 10 Aug 2018 at 14:05, Sönke Liebau <so...@opencore.com>
wrote:

> Hi Viktor,
>
> thanks for your input! We could accommodate magic headers by removing any
> known fixed bytes pre-encryption, sticking them in a header field and
> prepending them after decryption. However, I am not sure whether this is
> actually necessary, as most modern (AES for sure) algorithms are considered
> to be resistant to known-plaintext types of attack. Even if the entire
> plaintext is known to the attacker he still needs to brute-force the key -
> which may take a while.
>
> Something different to consider in this context are compression
> sidechannel attacks like CRIME or BREACH, which may be relevant depending
> on what type of data is being sent through Kafka. Both these attacks depend
> on the encrypted record containing a combination of secret and user
> controlled data.
> For example if Kafka was used to forward data that the user entered on a
> website along with a secret API key that the website adds to a back-end
> server and the user can obtain the Kafka messages, these attacks would
> become relevant. Not much we can do about that except disallow encryption
> when compression is enabled (TLS chose this approach in version 1.3)
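The compression side channel described above can be sketched in a few lines
(a toy model using zlib; the secret and the guesses are made up for
illustration):

```python
import zlib

SECRET = b"api_key=s3cr3t"  # made-up secret that rides along in each record

def compressed_len(user_data: bytes) -> int:
    # Stand-in for "record = secret + attacker-controlled data, then compress"
    return len(zlib.compress(SECRET + user_data))

# A guess that repeats bytes already present in the secret compresses
# better than one that does not - that length difference is the leak
# CRIME/BREACH exploit, refining the guess one character at a time.
matching_guess = compressed_len(b"api_key=s3cr3t")
wrong_guess = compressed_len(b"api_key=zzzzzz")
assert matching_guess < wrong_guess
```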
>
> I agree with you, that we definitely need to clearly document any risks
> and how much security can reasonably be expected in any given scenario. We
> might even consider logging a warning message when sending data that is
> compressed and encrypted.
>
> On a different note, I've started amending the KIP to make key management
> and distribution pluggable, should hopefully be able to publish sometime
> Monday.
>
> Best regards,
> Sönke
>
>
> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi <vi...@gmail.com>
> wrote:
>
>> Hi Sönke,
>>
>> Compressing before encrypting has its dangers as well. Suppose you have a
>> known compression format which adds a magic header and you're using a
>> block
>> cipher with a small enough block, then it becomes much easier to figure
>> out
>> the encryption key. For instance you can look at Snappy's stream
>> identifier:
>> https://github.com/google/snappy/blob/master/framing_format.txt
>> . Based on this you should only use block ciphers where block sizes are
>> much larger than 6 bytes. AES for instance should be good with its 128
>> bits = 16 bytes, but even this isn't entirely secure, as the first 6
>> bytes already leak some information - how much depends on the cipher.
>> Also if we suppose that an adversary accesses a broker and takes all the
>> data, they'll have a much easier job decrypting it, as they'll have many
>> more examples.
>> So overall we should make sure to define and document the compatible
>> encryptions with the supported compression methods and the level of
>> security they provide to make sure the users are fully aware of the
>> security implications.
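Viktor's magic-header concern can be illustrated with a toy deterministic
block transform (a keyed hash standing in for one ECB-style cipher block -
not real AES - with the Snappy stream identifier bytes taken from the
framing format document linked above):

```python
import hashlib

BLOCK = 16  # AES block size in bytes

def ecb_like_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Deterministic per-block 'encryption' mimicking ECB's pattern leak:
    equal plaintext blocks always yield equal ciphertext blocks."""
    out = b""
    for i in range(0, len(plaintext), BLOCK):
        block = plaintext[i:i + BLOCK].ljust(BLOCK, b"\x00")
        out += hashlib.sha256(key + block).digest()[:BLOCK]
    return out

# Snappy framing: chunk type 0xff, length 6, then "sNaPpY" (10 bytes total)
SNAPPY_STREAM_ID = b"\xff\x06\x00\x00sNaPpY"
key = b"0123456789abcdef"

# Two records sharing their first 16 bytes (header plus a common field):
r1 = SNAPPY_STREAM_ID + b"\x00" * 6 + b"record one"
r2 = SNAPPY_STREAM_ID + b"\x00" * 6 + b"record two"
c1 = ecb_like_encrypt(r1, key)
c2 = ecb_like_encrypt(r2, key)

# The identical first ciphertext blocks reveal the shared prefix to an
# observer, even though the key itself stays hidden.
assert c1[:BLOCK] == c2[:BLOCK]
assert c1[BLOCK:] != c2[BLOCK:]
```

A randomized mode (e.g. a fresh IV per record, as in CBC or GCM) avoids
this particular leak, which is one reason the choice of cipher mode needs
to be documented alongside the compression guidance.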
>>
>> Cheers,
>> Viktor
>>
>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau
>> <so...@opencore.com.invalid> wrote:
>>
>> > Hi Stephane,
>> >
>> > thanks for pointing out the broken pictures, I fixed those.
>> >
>> > Regarding encrypting before or after batching the messages, you are
>> > correct, I had not thought of compression and how this changes things.
>> > Encrypted data does not really compress well. My reasoning at the time
>> > of writing was that if we encrypt the entire batch we'd have to wait
>> > for the batch to be full before starting to encrypt. Whereas with per
>> > message encryption we can encrypt them as they come in and more or
>> > less have them ready for sending when the batch is complete.
>> > However I think the difference will probably not be that large (will
>> > do some testing) and offset by just encrypting once instead of many
>> > times, which has a certain overhead every time. Also, from a security
>> > perspective encrypting longer chunks of data is preferable - another
>> > benefit.
>> >
>> > This does however take away the ability of the broker to see the
>> > individual records inside the encrypted batch, so this would need to
>> > be stored and retrieved as a single record - just like is done for
>> > compressed batches. I am not 100% sure that this won't create issues,
>> > especially when considering transactions, I will need to look at the
>> > compression code some more. In essence though, since it works for
>> > compression I see no reason why it can't be made to work here.
>> >
>> > On a different note, going down this route might make us reconsider
>> > storing the key with the data, as this might significantly reduce
>> > storage overhead - still much higher than just storing them once
>> > though.
>> >
>> > Best regards,
>> > Sönke
>> >
>> > On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek
>> > <st...@simplemachines.com.au> wrote:
>> > > Hi Sonke
>> > >
>> > > Very much needed feature and discussion. FYI the image links seem
>> broken.
>> > >
>> > > My 2 cents (if I understood correctly): you say "This process will be
>> > > implemented after Serializer and Interceptors are done with the
>> message
>> > > right before it is added to the batch to be sent, in order to ensure
>> that
>> > > existing serializers and interceptors keep working with encryption
>> just
>> > > like without it."
>> > >
>> > > I think encryption should happen AFTER a batch is created, right
>> before
>> > it
>> > > is sent. Reason is that if we want to still keep advantage of
>> > compression,
>> > > encryption needs to happen after it (and I believe compression happens
>> > on a
>> > > batch level).
>> > > So to me for a producer: serializer / interceptors => batching =>
>> > > compression => encryption => send.
>> > > and the inverse for a consumer.
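Stephane's ordering argument can be checked directly (a sketch using zlib;
the XOR with random bytes merely simulates how statistically random real
ciphertext looks):

```python
import os
import zlib

batch = b"some very repetitive record value, " * 100  # compressible batch

# compress-then-encrypt: compression still has redundancy to work with
compressed = zlib.compress(batch)
assert len(compressed) < len(batch) // 5

# encrypt-then-compress: ciphertext looks random, so compression gains nothing
pseudo_ciphertext = bytes(a ^ b for a, b in zip(batch, os.urandom(len(batch))))
assert len(zlib.compress(pseudo_ciphertext)) > int(len(batch) * 0.9)

# Hence the proposed producer pipeline order:
# serializer/interceptors -> batching -> compression -> encryption -> send
```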
>> > >
>> > > Regards
>> > > Stephane
>> > >
>> > > On 19 June 2018 at 06:46, Sönke Liebau <soenke.liebau@opencore.com
>> > .invalid>
>> > > wrote:
>> > >
>> > >> Hi everybody,
>> > >>
>> > >> I've created a draft version of KIP-317 which describes the addition
>> > >> of transparent data encryption functionality to Kafka.
>> > >>
>> > >> Please consider this as a basis for discussion - I am aware that this
>> > >> is not at a level of detail sufficient for implementation, but I
>> > >> wanted to get some feedback from the community on the general idea
>> > >> before spending more time on this.
>> > >>
>> > >> Link to the KIP is:
>> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > >> 317%3A+Add+transparent+data+encryption+functionality
>> > >>
>> > >> Best regards,
>> > >> Sönke
>> > >>
>> >
>> >
>> >
>> > --
>> > Sönke Liebau
>> > Partner
>> > Tel. +49 179 7940878
>> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>> >
>>
>
>
>
> --
> Sönke Liebau
> Partner
> Tel. +49 179 7940878
> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>


-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany

Re: [DISCUSS] KIP-317: Transparent Data Encryption

Posted by Sönke Liebau <so...@opencore.com>.
Hi Andrew,

thanks for your feedback!
I am curious though: why are you doubtful about getting a committer to
volunteer an opinion? Shouldn't this be in their interest as well?

I'll just continue along for now and start building a very rough PoC
implementation based on what's in the KIP so far to flesh out more details
and add them to the KIP as I go along.

Best regards,
Sönke

On Wed, 7 Aug 2019 at 18:18, Andrew Schofield <an...@live.com>
wrote:

> Hi,
> I think this is a useful KIP and it looks good in principle. While it can
> all be done using
> interceptors, if the brokers do not know anything about it, you need to
> maintain the
> mapping from topics to key ids somewhere external. I'd prefer the way
> you've done it.
>
> I'm not sure whether you'll manage to interest any committers in
> volunteering an
> opinion, and you'll need that before you can get the KIP accepted into
> Kafka.
>
> Thanks,
> Andrew Schofield (IBM)
>
> On 06/08/2019, 15:46, "Sönke Liebau" <so...@opencore.com.INVALID>
> wrote:
>
>     Hi,
>
>     I have so far received pretty much no comments on the technical details
>     outlined in the KIP. While I am happy to continue with my own ideas of
> how
>     to implement this, I would much prefer to at least get a very broad
> "looks
>     good in principle, but still lots to flesh out" from a few people
> before I
>     put more work into this.
>
>     Best regards,
>     Sönke
>
>
>
>
>     On Tue, 21 May 2019 at 14:15, Sönke Liebau <soenke.liebau@opencore.com
> >
>     wrote:
>
>     > Hi everybody,
>     >
>     > I'd like to rekindle the discussion around KIP-317.
>     > I have reworked the KIP a little bit in order to design everything
> as a
>     > pluggable implementation. During the course of that work I've also
> decided
>     > to rename the KIP, as encryption will only be transparent in some
> cases. It
>     > is now called "Add end to end data encryption functionality to Apache
>     > Kafka" [1].
>     >
>     > I'd very much appreciate it if you could give the KIP a quick read.
> This
>     > is not at this point a fully fleshed out design, as I would like to
> agree
>     > on the underlying structure that I came up with first, before
> spending time
>     > on details.
>     >
>     > TL/DR is:
>     > Create three pluggable classes:
>     > KeyManager runs on the broker and manages which keys to use, key
> rollover
>     > etc
>     > KeyProvider runs on the client and retrieves keys based on what the
>     > KeyManager tells it
>     > EncryptionEngine runs on the client and handles the actual encryption
>     > First idea of control flow between these components can be seen at
> [2]
>     >
>     > Please let me know any thoughts or concerns that you may have!
>     >
>     > Best regards,
>     > Sönke
>     >
>     > [1]
>     >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
>     > [2]
>     >
> https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1&modificationDate=1558439227551&api=v2
>     >
>     >
>     >
>     > On Fri, 10 Aug 2018 at 14:05, Sönke Liebau <
> soenke.liebau@opencore.com>
>     > wrote:
>     >
>     >> Hi Viktor,
>     >>
>     >> thanks for your input! We could accommodate magic headers by
> removing any
>     >> known fixed bytes pre-encryption, sticking them in a header field
> and
>     >> prepending them after decryption. However, I am not sure whether
> this is
>     >> actually necessary, as most modern (AES for sure) algorithms are
> considered
>     >> to be resistant to known-plaintext types of attack. Even if the
> entire
>     >> plaintext is known to the attacker he still needs to brute-force
> the key -
>     >> which may take a while.
>     >>
>     >> Something different to consider in this context are compression
>     >> sidechannel attacks like CRIME or BREACH, which may be relevant
> depending
>     >> on what type of data is being sent through Kafka. Both these
> attacks depend
>     >> on the encrypted record containing a combination of secret and user
>     >> controlled data.
>     >> For example if Kafka was used to forward data that the user entered
> on a
>     >> website along with a secret API key that the website adds to a
> back-end
>     >> server and the user can obtain the Kafka messages, these attacks
> would
>     >> become relevant. Not much we can do about that except disallow
> encryption
>     >> when compression is enabled (TLS chose this approach in version 1.3)
>     >>
>     >> I agree with you, that we definitely need to clearly document any
> risks
>     >> and how much security can reasonably be expected in any given
> scenario. We
>     >> might even consider logging a warning message when sending data
> that is
>     >> compressed and encrypted.
>     >>
>     >> On a different note, I've started amending the KIP to make key
> management
>     >> and distribution pluggable, should hopefully be able to publish
> sometime
>     >> Monday.
>     >>
>     >> Best regards,
>     >> Sönke
>     >>
>     >>
>     >> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi <
> viktorsomogyi@gmail.com
>     >> > wrote:
>     >>
>     >>> Hi Sönke,
>     >>>
>     >>> Compressing before encrypting has its dangers as well. Suppose you
> have a
>     >>> known compression format which adds a magic header and you're
> using a
>     >>> block
>     >>> cipher with a small enough block, then it becomes much easier to
> figure
>     >>> out
>     >>> the encryption key. For instance you can look at Snappy's stream
>     >>> identifier:
>     >>>
> https://github.com/google/snappy/blob/master/framing_format.txt
>     >>> . Based on this you should only use block ciphers where block
> sizes are
>     >>> much larger than 6 bytes. AES for instance should be good with its
>     >>> 128 bits = 16 bytes, but even this isn't entirely secure, as the
>     >>> first 6 bytes already leak some information - how much depends on
>     >>> the cipher.
>     >>> Also if we suppose that an adversary accesses a broker and takes
> all the
>     >>> data, they'll have a much easier job decrypting it, as they'll
>     >>> have many more examples.
>     >>> So overall we should make sure to define and document the
> compatible
>     >>> encryptions with the supported compression methods and the level of
>     >>> security they provide to make sure the users are fully aware of the
>     >>> security implications.
>     >>>
>     >>> Cheers,
>     >>> Viktor
>     >>>
>     >>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau
>     >>> <so...@opencore.com.invalid> wrote:
>     >>>
>     >>> > Hi Stephane,
>     >>> >
>     >>> > thanks for pointing out the broken pictures, I fixed those.
>     >>> >
>     >>> > Regarding encrypting before or after batching the messages, you
> are
>     >>> > correct, I had not thought of compression and how this changes
> things.
>     >>> > Encrypted data does not really compress well. My reasoning at the
> time
>     >>> > of writing was that if we encrypt the entire batch we'd have to
> wait
>     >>> > for the batch to be full before starting to encrypt. Whereas
> with per
>     >>> > message encryption we can encrypt them as they come in and more
> or
>     >>> > less have them ready for sending when the batch is complete.
>     >>> > However I think the difference will probably not be that large
> (will
>     >>> > do some testing) and offset by just encrypting once instead of
> many
>     >>> > times, which has a certain overhead every time. Also, from a
> security
>     >>> > perspective encrypting longer chunks of data is preferable -
> another
>     >>> > benefit.
>     >>> >
>     >>> > This does however take away the ability of the broker to see the
>     >>> > individual records inside the encrypted batch, so this would
> need to
>     >>> > be stored and retrieved as a single record - just like is done
> for
>     >>> > compressed batches. I am not 100% sure that this won't create
> issues,
>     >>> > especially when considering transactions, I will need to look at
> the
>     >>> > compression code some more. In essence though, since it works for
>     >>> > compression I see no reason why it can't be made to work here.
>     >>> >
>     >>> > On a different note, going down this route might make us
> reconsider
>     >>> > storing the key with the data, as this might significantly reduce
>     >>> > storage overhead - still much higher than just storing them once
>     >>> > though.
>     >>> >
>     >>> > Best regards,
>     >>> > Sönke
>     >>> >
>     >>> > On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek
>     >>> > <st...@simplemachines.com.au> wrote:
>     >>> > > Hi Sonke
>     >>> > >
>     >>> > > Very much needed feature and discussion. FYI the image links
> seem
>     >>> broken.
>     >>> > >
>     >>> > > My 2 cents (if I understood correctly): you say "This process
> will be
>     >>> > > implemented after Serializer and Interceptors are done with the
>     >>> message
>     >>> > > right before it is added to the batch to be sent, in order to
> ensure
>     >>> that
>     >>> > > existing serializers and interceptors keep working with
> encryption
>     >>> just
>     >>> > > like without it."
>     >>> > >
>     >>> > > I think encryption should happen AFTER a batch is created,
> right
>     >>> before
>     >>> > it
>     >>> > > is sent. Reason is that if we want to still keep advantage of
>     >>> > compression,
>     >>> > > encryption needs to happen after it (and I believe compression
>     >>> happens
>     >>> > on a
>     >>> > > batch level).
>     >>> > > So to me for a producer: serializer / interceptors => batching
> =>
>     >>> > > compression => encryption => send.
>     >>> > > and the inverse for a consumer.
>     >>> > >
>     >>> > > Regards
>     >>> > > Stephane
>     >>> > >
>     >>> > > On 19 June 2018 at 06:46, Sönke Liebau <
> soenke.liebau@opencore.com
>     >>> > .invalid>
>     >>> > > wrote:
>     >>> > >
>     >>> > >> Hi everybody,
>     >>> > >>
>     >>> > >> I've created a draft version of KIP-317 which describes the
> addition
>     >>> > >> of transparent data encryption functionality to Kafka.
>     >>> > >>
>     >>> > >> Please consider this as a basis for discussion - I am aware
> that
>     >>> this
>     >>> > >> is not at a level of detail sufficient for implementation,
> but I
>     >>> > >> wanted to get some feedback from the community on the general
> idea
>     >>> > >> before spending more time on this.
>     >>> > >>
>     >>> > >> Link to the KIP is:
>     >>> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>     >>> > >> 317%3A+Add+transparent+data+encryption+functionality
>     >>> > >>
>     >>> > >> Best regards,
>     >>> > >> Sönke
>     >>> > >>
>     >>> >
>     >>> >
>     >>> >
>     >>> > --
>     >>> > Sönke Liebau
>     >>> > Partner
>     >>> > Tel. +49 179 7940878
>     >>> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel -
> Germany
>     >>> >
>     >>>
>     >>
>     >>
>     >>
>     >> --
>     >> Sönke Liebau
>     >> Partner
>     >> Tel. +49 179 7940878
>     >> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel -
> Germany
>     >>
>     >
>     >
>     > --
>     > Sönke Liebau
>     > Partner
>     > Tel. +49 179 7940878
>     > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>     >
>
>
>     --
>     Sönke Liebau
>     Partner
>     Tel. +49 179 7940878
>     OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>
>
>

-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany

Re: [DISCUSS] KIP-317: Transparent Data Encryption

Posted by Andrew Schofield <an...@live.com>.
Hi,
I think this is a useful KIP and it looks good in principle. While it can all be done using
interceptors, if the brokers do not know anything about it, you need to maintain the
mapping from topics to key ids somewhere external. I'd prefer the way you've done it.

I'm not sure whether you'll manage to interest any committers in volunteering an
opinion, and you'll need that before you can get the KIP accepted into Kafka.

Thanks,
Andrew Schofield (IBM)

On 06/08/2019, 15:46, "Sönke Liebau" <so...@opencore.com.INVALID> wrote:

    Hi,
    
    I have so far received pretty much no comments on the technical details
    outlined in the KIP. While I am happy to continue with my own ideas of how
    to implement this, I would much prefer to at least get a very broad "looks
    good in principle, but still lots to flesh out" from a few people before I
    put more work into this.
    
    Best regards,
    Sönke
    
    
    
    
    On Tue, 21 May 2019 at 14:15, Sönke Liebau <so...@opencore.com>
    wrote:
    
    > Hi everybody,
    >
    > I'd like to rekindle the discussion around KIP-317.
    > I have reworked the KIP a little bit in order to design everything as a
    > pluggable implementation. During the course of that work I've also decided
    > to rename the KIP, as encryption will only be transparent in some cases. It
    > is now called "Add end to end data encryption functionality to Apache
    > Kafka" [1].
    >
    > I'd very much appreciate it if you could give the KIP a quick read. This
    > is not at this point a fully fleshed out design, as I would like to agree
    > on the underlying structure that I came up with first, before spending time
    > on details.
    >
    > TL/DR is:
    > Create three pluggable classes:
    > KeyManager runs on the broker and manages which keys to use, key rollover
    > etc
    > KeyProvider runs on the client and retrieves keys based on what the
    > KeyManager tells it
    > EncryptionEngine runs on the client and handles the actual encryption
    > First idea of control flow between these components can be seen at [2]
    >
    > Please let me know any thoughts or concerns that you may have!
    >
    > Best regards,
    > Sönke
    >
    > [1]
    > https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
    > [2]
    > https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1&modificationDate=1558439227551&api=v2
    >
    >
    >
    > On Fri, 10 Aug 2018 at 14:05, Sönke Liebau <so...@opencore.com>
    > wrote:
    >
    >> Hi Viktor,
    >>
    >> thanks for your input! We could accommodate magic headers by removing any
    >> known fixed bytes pre-encryption, sticking them in a header field and
    >> prepending them after decryption. However, I am not sure whether this is
    >> actually necessary, as most modern (AES for sure) algorithms are considered
    >> to be resistant to known-plaintext types of attack. Even if the entire
    >> plaintext is known to the attacker he still needs to brute-force the key -
    >> which may take a while.
    >>
    >> Something different to consider in this context are compression
    >> sidechannel attacks like CRIME or BREACH, which may be relevant depending
    >> on what type of data is being sent through Kafka. Both these attacks depend
    >> on the encrypted record containing a combination of secret and user
    >> controlled data.
    >> For example if Kafka was used to forward data that the user entered on a
    >> website along with a secret API key that the website adds to a back-end
    >> server and the user can obtain the Kafka messages, these attacks would
    >> become relevant. Not much we can do about that except disallow encryption
    >> when compression is enabled (TLS chose this approach in version 1.3)
    >>
    >> I agree with you, that we definitely need to clearly document any risks
    >> and how much security can reasonably be expected in any given scenario. We
    >> might even consider logging a warning message when sending data that is
    >> compressed and encrypted.
    >>
    >> On a different note, I've started amending the KIP to make key management
    >> and distribution pluggable, should hopefully be able to publish sometime
    >> Monday.
    >>
    >> Best regards,
    >> Sönke
    >>
    >>
    >> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi <viktorsomogyi@gmail.com
    >> > wrote:
    >>
    >>> Hi Sönke,
    >>>
    >>> Compressing before encrypting has its dangers as well. Suppose you have a
    >>> known compression format which adds a magic header and you're using a
    >>> block
    >>> cipher with a small enough block, then it becomes much easier to figure
    >>> out
    >>> the encryption key. For instance you can look at Snappy's stream
    >>> identifier:
    >>> https://github.com/google/snappy/blob/master/framing_format.txt
    >>> . Based on this you should only use block ciphers where block sizes are
    >>> much larger than 6 bytes. AES for instance should be good with its 128
    >>> bits = 16 bytes, but even this isn't entirely secure, as the first 6
    >>> bytes already leak some information - how much depends on the cipher.
    >>> Also if we suppose that an adversary accesses a broker and takes all the
    >>> data, they'll have a much easier job decrypting it, as they'll have many
    >>> more examples.
    >>> So overall we should make sure to define and document the compatible
    >>> encryptions with the supported compression methods and the level of
    >>> security they provide to make sure the users are fully aware of the
    >>> security implications.
    >>>
    >>> Cheers,
    >>> Viktor
    >>>
    >>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau
    >>> <so...@opencore.com.invalid> wrote:
    >>>
    >>> > Hi Stephane,
    >>> >
    >>> > thanks for pointing out the broken pictures, I fixed those.
    >>> >
    >>> > Regarding encrypting before or after batching the messages, you are
    >>> > correct, I had not thought of compression and how this changes things.
    >>> > Encrypted data does not really compress well. My reasoning at the time
    >>> > of writing was that if we encrypt the entire batch we'd have to wait
    >>> > for the batch to be full before starting to encrypt. Whereas with per
    >>> > message encryption we can encrypt them as they come in and more or
    >>> > less have them ready for sending when the batch is complete.
    >>> > However I think the difference will probably not be that large (will
    >>> > do some testing) and offset by just encrypting once instead of many
    >>> > times, which has a certain overhead every time. Also, from a security
    >>> > perspective encrypting longer chunks of data is preferable - another
    >>> > benefit.
    >>> >
    >>> > This does however take away the ability of the broker to see the
    >>> > individual records inside the encrypted batch, so this would need to
    >>> > be stored and retrieved as a single record - just like is done for
    >>> > compressed batches. I am not 100% sure that this won't create issues,
    >>> > especially when considering transactions, I will need to look at the
    >>> > compression code some more. In essence though, since it works for
    >>> > compression I see no reason why it can't be made to work here.
    >>> >
    >>> > On a different note, going down this route might make us reconsider
    >>> > storing the key with the data, as this might significantly reduce
    >>> > storage overhead - still much higher than just storing them once
    >>> > though.
    >>> >
    >>> > Best regards,
    >>> > Sönke
    >>> >
    >>> > On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek
    >>> > <st...@simplemachines.com.au> wrote:
    >>> > > Hi Sonke
    >>> > >
    >>> > > Very much needed feature and discussion. FYI the image links seem
    >>> broken.
    >>> > >
    >>> > > My 2 cents (if I understood correctly): you say "This process will be
    >>> > > implemented after Serializer and Interceptors are done with the
    >>> message
    >>> > > right before it is added to the batch to be sent, in order to ensure
    >>> that
    >>> > > existing serializers and interceptors keep working with encryption
    >>> just
    >>> > > like without it."
    >>> > >
    >>> > > I think encryption should happen AFTER a batch is created, right
    >>> before
    >>> > it
    >>> > > is sent. Reason is that if we want to still keep advantage of
    >>> > compression,
    >>> > > encryption needs to happen after it (and I believe compression
    >>> happens
    >>> > on a
    >>> > > batch level).
    >>> > > So to me for a producer: serializer / interceptors => batching =>
    >>> > > compression => encryption => send.
    >>> > > and the inverse for a consumer.
    >>> > >
    >>> > > Regards
    >>> > > Stephane
    >>> > >
    >>> > > On 19 June 2018 at 06:46, Sönke Liebau <soenke.liebau@opencore.com
    >>> > .invalid>
    >>> > > wrote:
    >>> > >
    >>> > >> Hi everybody,
    >>> > >>
    >>> > >> I've created a draft version of KIP-317 which describes the addition
    >>> > >> of transparent data encryption functionality to Kafka.
    >>> > >>
    >>> > >> Please consider this as a basis for discussion - I am aware that
    >>> this
    >>> > >> is not at a level of detail sufficient for implementation, but I
    >>> > >> wanted to get some feedback from the community on the general idea
    >>> > >> before spending more time on this.
    >>> > >>
    >>> > >> Link to the KIP is:
    >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
    >>> > >> 317%3A+Add+transparent+data+encryption+functionality
    >>> > >>
    >>> > >> Best regards,
    >>> > >> Sönke
    >>> > >>
    >>> >
    >>> >
    >>> >
    >>> > --
    >>> > Sönke Liebau
    >>> > Partner
    >>> > Tel. +49 179 7940878
    >>> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
    >>> >
    >>>
    >>
    >>
    >>
    >> --
    >> Sönke Liebau
    >> Partner
    >> Tel. +49 179 7940878
    >> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
    >>
    >
    >
    > --
    > Sönke Liebau
    > Partner
    > Tel. +49 179 7940878
    > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
    >
    
    
    -- 
    Sönke Liebau
    Partner
    Tel. +49 179 7940878
    OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
    


Re: [DISCUSS] KIP-317: Transparent Data Encryption

Posted by Sönke Liebau <so...@opencore.com.INVALID>.
Hi,

I have so far received pretty much no comments on the technical details
outlined in the KIP. While I am happy to continue with my own ideas of how
to implement this, I would much prefer to at least get a very broad "looks
good in principle, but still lots to flesh out" from a few people before I
put more work into this.

Best regards,
Sönke




On Tue, 21 May 2019 at 14:15, Sönke Liebau <so...@opencore.com>
wrote:

> Hi everybody,
>
> I'd like to rekindle the discussion around KIP-317.
> I have reworked the KIP a little bit in order to design everything as a
> pluggable implementation. During the course of that work I've also decided
> to rename the KIP, as encryption will only be transparent in some cases. It
> is now called "Add end to end data encryption functionality to Apache
> Kafka" [1].
>
> I'd very much appreciate it if you could give the KIP a quick read. This
> is not at this point a fully fleshed out design, as I would like to agree
> on the underlying structure that I came up with first, before spending time
> on details.
>
> TL/DR is:
> Create three pluggable classes:
> KeyManager runs on the broker and manages which keys to use, key rollover
> etc
> KeyProvider runs on the client and retrieves keys based on what the
> KeyManager tells it
> EncryptionEngine runs on the client and handles the actual encryption
> First idea of control flow between these components can be seen at [2]
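[Editorial illustration] To make the proposed split more concrete, here is a minimal sketch of what the three pluggable classes could look like. This is a hypothetical shape written in Python purely for illustration - the KIP does not define these method names, and the actual implementation would be Java interfaces inside Kafka.

```python
from abc import ABC, abstractmethod

class KeyManager(ABC):
    """Broker-side: decides which key a topic uses, handles rollover."""
    @abstractmethod
    def current_key_id(self, topic: str) -> str: ...

class KeyProvider(ABC):
    """Client-side: retrieves key material for the id the KeyManager announced."""
    @abstractmethod
    def key_for(self, key_id: str) -> bytes: ...

class EncryptionEngine(ABC):
    """Client-side: performs the actual encryption and decryption."""
    @abstractmethod
    def encrypt(self, key: bytes, plaintext: bytes) -> bytes: ...
    @abstractmethod
    def decrypt(self, key: bytes, ciphertext: bytes) -> bytes: ...
```

The point of the split is that each concern (key policy, key retrieval, cipher choice) can be swapped independently without touching the others.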
>
> Please let me know any thoughts or concerns that you may have!
>
> Best regards,
> Sönke
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
> [2]
> https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1&modificationDate=1558439227551&api=v2
>
>
>
> On Fri, 10 Aug 2018 at 14:05, Sönke Liebau <so...@opencore.com>
> wrote:
>
>> Hi Viktor,
>>
>> thanks for your input! We could accommodate magic headers by removing any
>> known fixed bytes pre-encryption, sticking them in a header field and
>> prepending them after decryption. However, I am not sure whether this is
>> actually necessary, as most modern (AES for sure) algorithms are considered
>> to be resistant to known-plaintext types of attack. Even if the entire
>> plaintext is known to the attacker he still needs to brute-force the key -
>> which may take a while.
>>
>> Something different to consider in this context are compression
>> sidechannel attacks like CRIME or BREACH, which may be relevant depending
>> on what type of data is being sent through Kafka. Both these attacks depend
>> on the encrypted record containing a combination of secret and user
>> controlled data.
>> For example if Kafka was used to forward data that the user entered on a
>> website along with a secret API key that the website adds to a back-end
>> server and the user can obtain the Kafka messages, these attacks would
>> become relevant. Not much we can do about that except disallow encryption
>> when compression is enabled (TLS chose this approach in version 1.3).
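[Editorial illustration] The length side channel behind CRIME/BREACH can be shown in a few lines. The record layout below (attacker-controlled input concatenated with a secret, then compressed) is a made-up example of the vulnerable pattern described above; encryption is omitted because it hides content, not length.

```python
import zlib

# Secret the application appends to every record, e.g. an API key.
SECRET = b"secret_api_key=8f2a9c0d1e3b"

def observable_length(user_input: bytes) -> int:
    # The attacker can observe only the length of the compressed
    # record; encrypting it afterwards barely changes that length.
    return len(zlib.compress(user_input + SECRET))

# A guess that matches the secret compresses better, because DEFLATE
# encodes the repetition as a short back-reference; a non-matching
# guess of the same length must be stored as literals.
matching = observable_length(b"secret_api_key=8f2a9c0d1e3b")
non_matching = observable_length(b"qw1er2ty3ui4op5as6df7gh8jkz")

print(matching, non_matching)
```

By iterating guesses and watching lengths shrink, the attacker recovers the secret byte by byte - without ever breaking the cipher.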
>>
>> I agree with you, that we definitely need to clearly document any risks
>> and how much security can reasonably be expected in any given scenario. We
>> might even consider logging a warning message when sending data that is
>> compressed and encrypted.
>>
>> On a different note, I've started amending the KIP to make key management
>> and distribution pluggable, should hopefully be able to publish sometime
>> Monday.
>>
>> Best regards,
>> Sönke
>>
>>
>> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi <viktorsomogyi@gmail.com
>> > wrote:
>>
>>> Hi Sönke,
>>>
>>> Compressing before encrypting has its dangers as well. Suppose you have a
>>> known compression format which adds a magic header and you're using a
>>> block
>>> cipher with a small enough block, then it becomes much easier to figure
>>> out
>>> the encryption key. For instance you can look at Snappy's stream
>>> identifier:
>>> https://github.com/google/snappy/blob/master/framing_format.txt
>>> . Based on this you should only use block ciphers where block sizes are
>>> much larger than 6 bytes. AES for instance should be good with its 128
>>> bits = 16 bytes, but even this isn't entirely secure, as the first 6
>>> bytes already leak some information - how much depends on the cipher.
>>> Also, if we suppose that an adversary accesses a broker and takes all
>>> the data, they'll have a much easier job decrypting it, as they'll have
>>> many more examples to work with.
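[Editorial illustration] A toy demonstration of Viktor's point: with a deterministic per-block cipher mode (ECB-style), a fixed magic header produces an identical, recognizable ciphertext block in every message. The "cipher" below is a hash-based stand-in, not a real cipher, and the header bytes are made up (not Snappy's actual stream identifier); only the structural leak matters here.

```python
import hashlib

def toy_ecb_encrypt(key: bytes, data: bytes, block: int = 16) -> bytes:
    # Toy stand-in for ECB mode: each block is transformed independently
    # and deterministically, so equal plaintext blocks always yield
    # equal ciphertext blocks.
    out = b""
    for i in range(0, len(data), block):
        out += hashlib.sha256(key + data[i:i + block]).digest()[:block]
    return out

# Two messages from the same compressor share a fixed 16-byte header:
m1 = b"sNaPpY_magic____" + b"payload one....."
m2 = b"sNaPpY_magic____" + b"payload two....."
c1 = toy_ecb_encrypt(b"k", m1)
c2 = toy_ecb_encrypt(b"k", m2)

# The first ciphertext block is identical across messages, marking
# every record and aiding frequency analysis of the rest.
print(c1[:16] == c2[:16], c1[16:] == c2[16:])
```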
>>> So overall we should make sure to define and document the compatible
>>> encryptions with the supported compression methods and the level of
>>> security they provide to make sure the users are fully aware of the
>>> security implications.
>>>
>>> Cheers,
>>> Viktor
>>>
>>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau
>>> <so...@opencore.com.invalid> wrote:
>>>
>>> > Hi Stephane,
>>> >
>>> > thanks for pointing out the broken pictures, I fixed those.
>>> >
>>> > Regarding encrypting before or after batching the messages, you are
>>> > correct, I had not thought of compression and how this changes things.
>>> > Encrypted data does not really compress well. My reasoning at the time
>>> > of writing was that if we encrypt the entire batch we'd have to wait
>>> > for the batch to be full before starting to encrypt. Whereas with per
>>> > message encryption we can encrypt them as they come in and more or
>>> > less have them ready for sending when the batch is complete.
>>> > However I think the difference will probably not be that large (I will
>>> > do some testing) and will be offset by encrypting once instead of many
>>> > times, since each encryption call has a certain overhead. Also, from a security
>>> > perspective encrypting longer chunks of data is preferable - another
>>> > benefit.
>>> >
>>> > This does however take away the ability of the broker to see the
>>> > individual records inside the encrypted batch, so this would need to
>>> > be stored and retrieved as a single record - just like is done for
>>> > compressed batches. I am not 100% sure that this won't create issues,
>>> > especially when considering transactions, I will need to look at the
>>> > compression code some more. In essence though, since it works for
>>> > compression I see no reason why it can't be made to work here.
>>> >
>>> > On a different note, going down this route might make us reconsider
>>> > storing the key with the data, as this might significantly reduce
>>> > storage overhead - still much higher than just storing them once
>>> > though.
>>> >
>>> > Best regards,
>>> > Sönke
>>> >
>>> > On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek
>>> > <st...@simplemachines.com.au> wrote:
>>> > > Hi Sonke
>>> > >
>>> > > Very much needed feature and discussion. FYI the image links seem
>>> broken.
>>> > >
>>> > > My 2 cents (if I understood correctly): you say "This process will be
>>> > > implemented after Serializer and Interceptors are done with the
>>> message
>>> > > right before it is added to the batch to be sent, in order to ensure
>>> that
>>> > > existing serializers and interceptors keep working with encryption
>>> just
>>> > > like without it."
>>> > >
>>> > > I think encryption should happen AFTER a batch is created, right
>>> before
>>> > it
>>> > > is sent. Reason is that if we want to still keep advantage of
>>> > compression,
>>> > > encryption needs to happen after it (and I believe compression
>>> happens
>>> > on a
>>> > > batch level).
>>> > > So to me for a producer: serializer / interceptors => batching =>
>>> > > compression => encryption => send.
>>> > > and the inverse for a consumer.
>>> > >
>>> > > Regards
>>> > > Stephane
>>> > >
>>> > > On 19 June 2018 at 06:46, Sönke Liebau <soenke.liebau@opencore.com
>>> > .invalid>
>>> > > wrote:
>>> > >
>>> > >> Hi everybody,
>>> > >>
>>> > >> I've created a draft version of KIP-317 which describes the addition
>>> > >> of transparent data encryption functionality to Kafka.
>>> > >>
>>> > >> Please consider this as a basis for discussion - I am aware that
>>> this
>>> > >> is not at a level of detail sufficient for implementation, but I
>>> > >> wanted to get some feedback from the community on the general idea
>>> > >> before spending more time on this.
>>> > >>
>>> > >> Link to the KIP is:
>>> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> > >> 317%3A+Add+transparent+data+encryption+functionality
>>> > >>
>>> > >> Best regards,
>>> > >> Sönke
>>> > >>
>>> >
>>> >
>>> >
>>> > --
>>> > Sönke Liebau
>>> > Partner
>>> > Tel. +49 179 7940878
>>> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>>> >
>>>
>>
>>
>>
>> --
>> Sönke Liebau
>> Partner
>> Tel. +49 179 7940878
>> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>>
>
>
> --
> Sönke Liebau
> Partner
> Tel. +49 179 7940878
> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
>


-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany