Posted to dev@iceberg.apache.org by Gidon Gershinsky <gg...@gmail.com> on 2021/03/25 15:24:25 UTC

Key rotation in Iceberg data encryption

Hi all,

We're working with Jack on a design for encryption of Iceberg data tables,
and have a question / decision point we'd like to bring to the community's
attention. It might be a bit exotic, but it is important, so we have to raise
it. Any input on this subject, or pointers to relevant contacts / sources,
will be appreciated.

A rather long text below; I tried to make it as short as possible to
explain the question.

We use the standard envelope encryption approach, where the data is
encrypted with a "data encryption key" (DEK). There are lots of DEKs in a
table, because we must generate a key per file/column (this is related to
NIST requirements for cipher usage). Envelope encryption means that the
many DEKs are encrypted with a few MEKs ("master encryption keys"). There
could be just one MEK for the whole table, or for many tables; or a MEK per
sensitive column. The MEKs are managed in a KMS ("key management service")
- which stores them, and handles their access control.

DEKs encrypted with MEKs are stored close to the data; currently, in the
"key_metadata" field of the Iceberg manifest files.

Envelope encryption practice requires "key rotation", where the MEKs are
replaced from time to time (every few weeks or months, as a precaution, or to
limit the number of crypto operations as required by NIST; or after being
compromised). The MEK ID stays the same, but the key contents (and version)
change.

This means we have to delete all manifest files after the key rotation,
because they keep DEKs encrypted with the previous version of a MEK, which
is not safe anymore.
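
A minimal sketch of the single envelope scheme and of why a MEK rotation
touches every manifest entry. Everything here is an assumption made for
illustration: ToyKms, toy_wrap, the "table-mek" ID and the dictionary layout
of key_metadata are hypothetical, and the XOR-based wrap is only a stand-in
for a real cipher such as AES-GCM. It is not secure and is not the Iceberg or
Parquet implementation.

```python
import hashlib
import os
from typing import Dict, List, Tuple


def toy_wrap(wrapping_key: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real key-wrap cipher such as AES-GCM. NOT secure."""
    nonce = os.urandom(12)
    stream = hashlib.sha256(wrapping_key + nonce).digest()[: len(key)]
    return nonce + bytes(a ^ b for a, b in zip(key, stream))


def toy_unwrap(wrapping_key: bytes, wrapped: bytes) -> bytes:
    """Inverse of toy_wrap."""
    nonce, body = wrapped[:12], wrapped[12:]
    stream = hashlib.sha256(wrapping_key + nonce).digest()[: len(body)]
    return bytes(a ^ b for a, b in zip(body, stream))


class ToyKms:
    """Hypothetical in-memory KMS: the MEK ID is stable, its versions change."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[bytes]] = {}

    def create_mek(self, mek_id: str) -> None:
        self._versions[mek_id] = [os.urandom(16)]

    def rotate_mek(self, mek_id: str) -> None:
        self._versions[mek_id].append(os.urandom(16))

    def current_version(self, mek_id: str) -> int:
        return len(self._versions[mek_id]) - 1

    def wrap(self, mek_id: str, key: bytes) -> Tuple[int, bytes]:
        version = self.current_version(mek_id)
        return version, toy_wrap(self._versions[mek_id][version], key)

    def unwrap(self, mek_id: str, version: int, wrapped: bytes) -> bytes:
        return toy_unwrap(self._versions[mek_id][version], wrapped)


kms = ToyKms()
kms.create_mek("table-mek")

# One DEK per data file, wrapped directly with the MEK (single envelope).
deks = {f"data-{i}.parquet": os.urandom(16) for i in range(3)}
manifest = {}
for path, dek in deks.items():
    version, wrapped = kms.wrap("table-mek", dek)
    manifest[path] = {
        "mek_id": "table-mek",
        "mek_version": version,
        "wrapped_dek": wrapped,
    }

# After rotation, every entry still depends on the previous MEK version.
kms.rotate_mek("table-mek")
current = kms.current_version("table-mek")
stale = [p for p, km in manifest.items() if km["mek_version"] != current]
print(f"{len(stale)} of {len(manifest)} key_metadata entries are stale")
```

In this model every entry is stale after a single rotation, so making the
table depend only on the new MEK version means one KMS round trip and one
rewritten manifest entry per data file.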

To avoid this - and to minimize Iceberg-KMS interactions (KMSs can be
slow) - we add a double envelope encryption mode (already in use in
Parquet), where DEKs are encrypted with an intermediate KEK ("key
encryption key"), which in turn is encrypted with a MEK. There are fewer
KEKs than DEKs (e.g. one KEK per writer process lifetime; or per day; or
per N DEKs; or per partition; or per table; etc.), but there are more KEKs
than MEKs.

The question is: upon MEK rotation, do we have to replace the KEKs?
If not, then we can keep DEKs encrypted with KEKs in the manifest files,
which do not have to be deleted / replaced upon MEK rotation. KEKs
encrypted with MEKs will be kept elsewhere, in a mutable/replaceable medium
(which is easier, because there are far fewer KEKs than DEKs).
If yes, then we either have to replace all manifest files in a table (once
every few weeks or months), or keep "key_metadata" outside the manifests,
e.g. in new file types (as with the bloom filters). The size of the
key_metadata entry - per data file - ranges from a few dozen bytes to a few
dozen kilobytes.
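
The "if not" branch can be sketched as follows, assuming a hypothetical
kek_store dictionary as the mutable medium and a toy XOR-based wrap in place
of a real cipher (not secure, illustration only). The names and layout are
assumptions. The sketch only shows that the manifest entries stay
byte-identical when the KEK is rewrapped; it says nothing about whether
keeping the KEK is safe, which is the open question.

```python
import hashlib
import os


def toy_wrap(wrapping_key: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real key-wrap cipher such as AES-GCM. NOT secure."""
    nonce = os.urandom(12)
    stream = hashlib.sha256(wrapping_key + nonce).digest()[: len(key)]
    return nonce + bytes(a ^ b for a, b in zip(key, stream))


def toy_unwrap(wrapping_key: bytes, wrapped: bytes) -> bytes:
    """Inverse of toy_wrap."""
    nonce, body = wrapped[:12], wrapped[12:]
    stream = hashlib.sha256(wrapping_key + nonce).digest()[: len(body)]
    return bytes(a ^ b for a, b in zip(body, stream))


mek_v1 = os.urandom(16)  # held by the KMS in a real deployment
kek = os.urandom(16)     # e.g. one KEK per writer, per day, or per partition

# Immutable side: manifest key_metadata holds DEKs wrapped with the KEK.
deks = [os.urandom(16) for _ in range(4)]
manifest_key_metadata = [toy_wrap(kek, dek) for dek in deks]
manifest_before = list(manifest_key_metadata)

# Mutable side: a small replaceable store holds the KEK wrapped with the MEK.
kek_store = {"kek-0": {"mek_version": 1, "wrapped_kek": toy_wrap(mek_v1, kek)}}

# MEK rotation without replacing the KEK: unwrap with the old MEK version,
# rewrap with the new one, and overwrite only the kek_store entry.
mek_v2 = os.urandom(16)
recovered_kek = toy_unwrap(mek_v1, kek_store["kek-0"]["wrapped_kek"])
kek_store["kek-0"] = {
    "mek_version": 2,
    "wrapped_kek": toy_wrap(mek_v2, recovered_kek),
}

# Read path after rotation: new MEK -> KEK -> DEKs, manifests untouched.
kek_after = toy_unwrap(mek_v2, kek_store["kek-0"]["wrapped_kek"])
recovered = [toy_unwrap(kek_after, wrapped) for wrapped in manifest_key_metadata]
print("manifest unchanged:", manifest_key_metadata == manifest_before)
print("all DEKs recovered:", recovered == deks)
```

The rotation costs one unwrap and one wrap per KEK rather than per DEK, which
is where the reduction in KMS interactions comes from.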

Cheers, Gidon

Re: Key rotation in Iceberg data encryption

Posted by Gidon Gershinsky <gg...@gmail.com>.
Sounds good. Giving users a tool, and leaving them the decision on whether
to rotate a KEK and replace the manifest files, is a flexible way to address
this for now. As we gather more information on the safety of unrotated
KEKs, and on the consequences of replacing the manifest files, we can
either document the recommendations, or update the mechanism to enforce
certain policies.

Cheers, Gidon


On Fri, Mar 26, 2021 at 6:10 AM Ye, Jack <yz...@amazon.com.invalid>
wrote:


Re: Key rotation in Iceberg data encryption

Posted by "Ye, Jack" <yz...@amazon.com.INVALID>.
Yes, I totally agree with Russell that key rotation should be treated as something like a rewrite manifest action, and when the rewrite completes, the old files with old keys can be expired in a separate snapshot expiration action. Because of requirements like GDPR, this expiration would happen even if there is no one manually executing that expiration step. A flag can be added to the procedure to force expiring that snapshot during key rotation if needed. In the doc, I have designed this action as a call procedure to reflect its similarity to rewrite manifest.

I also agree that we should try not to use external files. For bloom filters, it might be fine because (1) the filter might be too large to fit in the manifest, and (2) once written we do not need to update it anymore. The encryption key is the complete opposite situation. As the stored procedure is executed in a distributed fashion anyway, I would not expect key rotation to put too much burden on manifest rewriting. I have also worked on a system that stores encryption info in the manifest, and I did not see a scalability issue as long as the manifests are maintained properly to be of reasonable size. (If a manifest is too large, all sorts of scalability issues arise, so it’s not just about the encryption key, and that’s why we have the rewrite manifest procedure in the first place.)

Based on the discussions so far, I provided the following updates to the doc:

  1.  Added the concept of an Iceberg Encryption Key (IEK) as a generalized master key which can be either single-wrap (KEK ID) or double-wrap (MEK ID + encrypted KEK), just to make the rest of the doc cleaner without repeated references to single and double wrap.
  2.  Added the concept of a KeyResolver to tackle the problem of needing many KEKs for per-partition or per-day use cases, so that storing those keys in table metadata is scalable, and a mutable/replaceable medium can be plugged in if necessary.
  3.  In the key rotation procedure, added an option to allow users to (1) rotate the KEK in single wrap, (2) rotate the KEK in double wrap, (3) rotate the MEK only in double wrap, so that the system is flexible enough to handle all of those cases.
  4.  In the key rotation procedure, added an option to force expiring the old table version.

With these changes, I think we can avoid the question of whether we need to replace the KEK for double wrapping, because the amount of work for 3.1 and 3.2 is mostly the same; 3.2 just needs to also rewrap the KEKs if the MEK is rotated, but that only changes a single table metadata file. People who do not want to rotate the KEK can just use option 3.3.
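
One possible reading of options 3.1 to 3.3 and the force-expire flag, sketched as a small planner. RotationMode, RotationPlan and plan_rotation are hypothetical names, not part of the design doc or of any Iceberg API, and the mapping from mode to work is an assumption inferred from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class RotationMode(Enum):
    KEK_SINGLE_WRAP = "3.1"       # rotate the KEK; DEKs are wrapped by it directly
    KEK_DOUBLE_WRAP = "3.2"       # rotate the KEK, which is itself wrapped by a MEK
    MEK_ONLY_DOUBLE_WRAP = "3.3"  # rotate the MEK only; keep the KEK


@dataclass(frozen=True)
class RotationPlan:
    rewrite_manifests: bool         # DEKs must be rewrapped in key_metadata
    rewrap_keks_in_metadata: bool   # wrapped KEKs in table metadata change
    expire_old_snapshots: bool      # drop the old table version right away


def plan_rotation(mode: RotationMode, force_expire: bool = False) -> RotationPlan:
    """Return the work implied by a rotation mode (assumed mapping)."""
    if mode is RotationMode.KEK_SINGLE_WRAP:
        # A new KEK means every DEK is rewrapped, so manifests are rewritten.
        return RotationPlan(True, False, force_expire)
    if mode is RotationMode.KEK_DOUBLE_WRAP:
        # Same manifest rewrite, plus the new KEK is wrapped with the MEK.
        return RotationPlan(True, True, force_expire)
    # MEK only: DEKs stay wrapped with the same KEK; only the KEKs are rewrapped.
    return RotationPlan(False, True, force_expire)


for mode in RotationMode:
    print(mode.value, plan_rotation(mode, force_expire=True))
```

Under this reading, only option 3.3 avoids the manifest rewrite, at the cost of leaving the KEK unrotated.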

Best,
Jack Ye

From: Russell Spitzer <ru...@gmail.com>
Reply-To: "dev@iceberg.apache.org" <de...@iceberg.apache.org>
Date: Thursday, March 25, 2021 at 08:33
To: Iceberg Dev List <de...@iceberg.apache.org>
Subject: RE: [EXTERNAL] Key rotation in Iceberg data encryption



Re: Key rotation in Iceberg data encryption

Posted by Russell Spitzer <ru...@gmail.com>.
I think you can treat the key rotation as a Spark action like
"RewriteManifestsAction" or something like that, which creates a new
Snapshot and a new set of manifest files. If we want to be secure we would
follow this up by immediately exporting and deleting previous snapshots and
manifests. One problem with this approach, though, is that we basically lose
the history of the table when we do this process.

I'm not as big a fan of the new external file approach, mainly because I
think we have a new object to keep track of and unlike bloom filters, this
is extremely sensitive information. I think we could go this route but we
should be very careful. Although this would probably let us update keys
without losing history ...?
