Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/14 14:49:05 UTC

[GitHub] [arrow] pitrou commented on a change in pull request #10450: ARROW-9947: [Python] High-level Python API for Parquet encryption of files.

pitrou commented on a change in pull request #10450:
URL: https://github.com/apache/arrow/pull/10450#discussion_r768693687



##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption

Review comment:
       ```suggestion
   Reading and writing encrypted Parquet files involves passing file encryption
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:

Review comment:
       ```suggestion
   Writing an encrypted Parquet file:
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+            raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()

Review comment:
       I'm not sure this snippet is useful, since `MyKmsClient` shows the API to implement below.

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``

Review comment:
       Can you add the additional classes to `docs/source/python/api/formats.rst` (in the Parquet section), then use the proper reference markup here (probably ```:class:`~pyarrow.parquet.CryptoFactory` ```)?

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+            raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.

Review comment:
       What is this, a list of tuples?
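
Judging from the `column_keys` setter in `python/pyarrow/_parquet.pyx` quoted later in this message, the value appears to be a dict mapping each master key ID to a list of column names, which is then serialized to the `masterKeyID:colName,colName;masterKeyID:colName...` string form. A minimal pure-Python sketch of that round trip follows. The helper names `column_keys_to_str` and `column_keys_from_str` are hypothetical and not part of PyArrow, and the empty-string handling is an assumption added for illustration:

```python
def column_keys_to_str(column_keys):
    """Serialize {master_key_id: [column, ...]} to 'key1: a, b; key2: c'."""
    return "; ".join(
        "{}: {}".format(key, ", ".join(columns))
        for key, columns in column_keys.items()
    )


def column_keys_from_str(column_keys_str):
    """Parse 'key1: a, b; key2: c' back into {master_key_id: [column, ...]}."""
    result = {}
    for part in column_keys_str.replace(" ", "").split(";"):
        if not part:
            continue  # assumption: an empty string means no column keys
        key, columns = part.split(":")
        result[key] = columns.split(",")
    return result


# Hypothetical key IDs and column names, for illustration only.
example = {"key1": ["col1", "col2"], "key2": ["col3"]}
serialized = column_keys_to_str(example)
print(serialized)
print(column_keys_from_str(serialized))
```

If that reading is right, the docs entry could say "dict of master key ID to list of column names" instead of "list of columns".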

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+            raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.

Review comment:
       ```suggestion
   * ``plaintext_footer``, whether to write the file footer in plain text (otherwise it is encrypted).
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS

Review comment:
       ```suggestion
   The master encryption keys should be kept and managed in a production-grade KMS
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+            raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is
+  stored in separate files in the same folder, which enables key rotation for
+  immutable Parquet files.
+* ``data_key_length_bits``, length of data encryption keys (DEKs), randomly
+  generated by parquet key management tools. Can be 128, 192 or 256 bits.

Review comment:
       ```suggestion
     generated by Parquet key management tools. Can be 128, 192 or 256 bits.
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)

Review comment:
       Thanks a lot for the detailed doc here!

##########
File path: python/pyarrow/_parquet.pyx
##########
@@ -1352,6 +1366,42 @@ cdef shared_ptr[ArrowWriterProperties] _create_arrow_writer_properties(
     return arrow_properties
 
 
+cdef ParquetCipher cipher_from_name(name):
+    name = name.upper()
+    if name == 'AES_GCM_V1':
+        return ParquetCipher_AES_GCM_V1
+    elif name == 'AES_GCM_CTR_V1':
+        return ParquetCipher_AES_GCM_CTR_V1
+    else:
+        raise ValueError('Invalid value for algorithm: {0}'.format(name))

Review comment:
       ```suggestion
           raise ValueError(f'Invalid cipher name: {name!r}')
   ```
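
For context, the `!r` conversion in the suggested f-string prints the rejected value with its quotes, which makes stray whitespace or an unexpected type visible in the message. A plain-Python sketch of the effect follows. Strings stand in for the Cython `ParquetCipher` enum values and the `_CIPHERS` set is hypothetical, so this is not the PyArrow implementation. Unlike the patch, it keeps the caller's original spelling in the message rather than the upper-cased one:

```python
# Stand-ins for the ParquetCipher enum values (assumption for illustration).
_CIPHERS = {"AES_GCM_V1", "AES_GCM_CTR_V1"}


def cipher_from_name(name):
    """Return the normalized cipher name, or raise ValueError."""
    upper = name.upper()
    if upper in _CIPHERS:
        return upper
    # {name!r} shows repr(name), so quotes and trailing spaces are visible.
    raise ValueError(f"Invalid cipher name: {name!r}")


try:
    cipher_from_name("aes_gcm ")
except ValueError as exc:
    message = str(exc)
    print(message)
```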

##########
File path: python/examples/minimal_build/build_venv.sh
##########
@@ -54,6 +54,7 @@ cmake -GNinja \
       -DARROW_WITH_SNAPPY=ON \
       -DARROW_WITH_BROTLI=ON \
       -DARROW_PARQUET=ON \
+      -DPARQUET_REQUIRE_ENCRYPTION=ON \

Review comment:
       @andersonm-ibm This is right. Ideally this would be optional, though.

##########
File path: python/pyarrow/_parquet.pyx
##########
@@ -1464,3 +1517,429 @@ cdef class ParquetWriter(_Weakrefable):
             return result
         raise RuntimeError(
             'file metadata is only available after writer close')
+
+cdef class EncryptionConfiguration(_Weakrefable):
+    """Configuration of the encryption, such as which columns to encrypt"""
+    cdef:
+        shared_ptr[CEncryptionConfiguration] configuration
+
+    # Avoid mistakingly creating attributes
+    __slots__ = ()
+
+    def __init__(self, footer_key, *, column_keys=None,
+                 uniform_encryption=None, encryption_algorithm=None,
+                 plaintext_footer=None, double_wrapping=None,
+                 cache_lifetime=None, internal_key_material=None,
+                 data_key_length_bits=None):
+        self.configuration.reset(
+            new CEncryptionConfiguration(tobytes(footer_key)))
+        if column_keys is not None:
+            self.column_keys = column_keys
+        if uniform_encryption is not None:
+            self.uniform_encryption = uniform_encryption
+        if encryption_algorithm is not None:
+            self.encryption_algorithm = encryption_algorithm
+        if plaintext_footer is not None:
+            self.plaintext_footer = plaintext_footer
+        if double_wrapping is not None:
+            self.double_wrapping = double_wrapping
+        if cache_lifetime is not None:
+            self.cache_lifetime = cache_lifetime
+        if internal_key_material is not None:
+            self.internal_key_material = internal_key_material
+        if data_key_length_bits is not None:
+            self.data_key_length_bits = data_key_length_bits
+
+    @property
+    def footer_key(self):
+        """ID of the master key for footer encryption/signing"""
+        return frombytes(self.configuration.get().footer_key)
+
+    @property
+    def column_keys(self):
+        """
+        List of columns to encrypt, with master key IDs.
+        """
+        column_keys_str = frombytes(self.configuration.get().column_keys)
+        # Convert from "masterKeyID:colName,colName;masterKeyID:colName..."
+        # (see HIVE-21848) to dictionary of master key ID to column name lists
+        column_keys_to_key_list_str = dict(subString.replace(" ", "").split(
+            ":") for subString in column_keys_str.split(";"))
+        column_keys_dict = {k: v.split(
+            ",") for k, v in column_keys_to_key_list_str.items()}
+        return column_keys_dict
+
+    @column_keys.setter
+    def column_keys(self, dict value):
+        if value is not None:
+            # convert a dictionary such as
+            # '{"key1": ["col1 ", "col2"], "key2": ["col3 ", "col4"]}''
+            # to the string defined by the spec
+            # 'key1: col1 , col2; key2: col3 , col4'
+            column_keys = "; ".join(
+                ["{}: {}".format(k, ", ".join(v)) for k, v in value.items()])
+            self.configuration.get().column_keys = tobytes(column_keys)
+
+    @property
+    def uniform_encryption(self):
+        """Encrypt footer and all columns with the same encryption key."""
+        return self.configuration.get().uniform_encryption
+
+    @uniform_encryption.setter
+    def uniform_encryption(self, value):
+        self.configuration.get().uniform_encryption = value
+
+    @property
+    def encryption_algorithm(self):
+        """Parquet encryption algorithm.
+        Can be "AES_GCM_V1" (default), or "AES_GCM_CTR_V1"."""
+        return cipher_to_name(self.configuration.get().encryption_algorithm)
+
+    @encryption_algorithm.setter
+    def encryption_algorithm(self, value):
+        cipher = cipher_from_name(value)
+        self.configuration.get().encryption_algorithm = cipher
+
+    @property
+    def plaintext_footer(self):
+        """Write files with plaintext footer."""
+        return self.configuration.get().plaintext_footer
+
+    @plaintext_footer.setter
+    def plaintext_footer(self, value):
+        self.configuration.get().plaintext_footer = value
+
+    @property
+    def double_wrapping(self):
+        """Use double wrapping - where data encryption keys (DEKs) are
+        encrypted with key encryption keys (KEKs), which in turn are
+        encrypted with master keys.
+        If set to false, use single wrapping - where DEKs are
+        encrypted directly with master keys."""
+        return self.configuration.get().double_wrapping
+
+    @double_wrapping.setter
+    def double_wrapping(self, value):
+        self.configuration.get().double_wrapping = value
+
+    @property
+    def cache_lifetime(self):
+        """Lifetime of cached entities (key encryption keys,
+        local wrapping keys, KMS client objects)."""
+        return timedelta(
+            seconds=self.configuration.get().cache_lifetime_seconds)
+
+    @cache_lifetime.setter
+    def cache_lifetime(self, value):
+        if not isinstance(value, timedelta):
+            raise TypeError("cache_lifetime should be a timedelta")
+        self.configuration.get().cache_lifetime_seconds = value.total_seconds()
+
+    @property
+    def internal_key_material(self):
+        """Store key material inside Parquet file footers; this mode doesn’t
+        produce additional files. If set to false, key material is stored in
+        separate files in the same folder, which enables key rotation for
+        immutable Parquet files."""
+        return self.configuration.get().internal_key_material
+
+    @internal_key_material.setter
+    def internal_key_material(self, value):
+        self.configuration.get().internal_key_material = value
+
+    @property
+    def data_key_length_bits(self):
+        """Length of data encryption keys (DEKs), randomly generated by parquet key
+        management tools. Can be 128, 192 or 256 bits."""
+        return self.configuration.get().data_key_length_bits
+
+    @data_key_length_bits.setter
+    def data_key_length_bits(self, value):
+        self.configuration.get().data_key_length_bits = value
+
+    cdef inline shared_ptr[CEncryptionConfiguration] unwrap(self) nogil:
+        return self.configuration
+
+cdef class DecryptionConfiguration(_Weakrefable):
+    """Configuration of the decryption, such as cache timeout."""
+    cdef:
+        shared_ptr[CDecryptionConfiguration] configuration
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, *, cache_lifetime=None):
+        self.configuration.reset(new CDecryptionConfiguration())
+        if cache_lifetime is not None:
+            self.cache_lifetime = cache_lifetime
+
+    @property
+    def cache_lifetime(self):
+        """Lifetime of cached entities (key encryption keys,
+        local wrapping keys, KMS client objects)."""
+        return timedelta(
+            seconds=self.configuration.get().cache_lifetime_seconds)
+
+    @cache_lifetime.setter
+    def cache_lifetime(self, value):
+        if not isinstance(value, timedelta):
+            raise TypeError("cache_lifetime should be a timedelta")
+        self.configuration.get().cache_lifetime_seconds = value.total_seconds()
+
+    cdef inline shared_ptr[CDecryptionConfiguration] unwrap(self) nogil:
+        return self.configuration
+
+
+cdef class KmsConnectionConfig(_Weakrefable):
+    """Configuration of the connection to the Key Management Service (KMS)"""
+    cdef:
+        shared_ptr[CKmsConnectionConfig] configuration
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, *, kms_instance_id=None, kms_instance_url=None,
+                 key_access_token=None, custom_kms_conf=None):
+        self.configuration.reset(new CKmsConnectionConfig())
+        if kms_instance_id is not None:
+            self.kms_instance_id = kms_instance_id
+        if kms_instance_url is not None:
+            self.kms_instance_url = kms_instance_url
+        if key_access_token is None:
+            self.key_access_token = b'DEFAULT'
+        else:
+            self.key_access_token = key_access_token
+        if custom_kms_conf is not None:
+            self.custom_kms_conf = custom_kms_conf
+
+    @property
+    def kms_instance_id(self):
+        """ID of the KMS instance that will be used for encryption
+        (if multiple KMS instances are available)."""
+        return frombytes(self.configuration.get().kms_instance_id)
+
+    @kms_instance_id.setter
+    def kms_instance_id(self, value):
+        self.configuration.get().kms_instance_id = tobytes(value)
+
+    @property
+    def kms_instance_url(self):
+        """URL of the KMS instance."""
+        return frombytes(self.configuration.get().kms_instance_url)
+
+    @kms_instance_url.setter
+    def kms_instance_url(self, value):
+        self.configuration.get().kms_instance_url = tobytes(value)
+
+    @property
+    def key_access_token(self):
+        """Authorization token that will be passed to KMS."""
+        return frombytes(self.configuration.get()
+                         .refreshable_key_access_token.get().value())
+
+    @key_access_token.setter
+    def key_access_token(self, value):
+        self.refresh_key_access_token(value)
+
+    @property
+    def custom_kms_conf(self):
+        """A dictionary with KMS-type-specific configuration"""
+        custom_kms_conf = {
+            frombytes(k): frombytes(v)
+            for k, v in self.configuration.get().custom_kms_conf
+        }
+        return custom_kms_conf
+
+    @custom_kms_conf.setter
+    def custom_kms_conf(self, dict value):
+        if value is not None:
+            for k, v in value.items():
+                if isinstance(k, str) and isinstance(v, str):
+                    self.configuration.get().custom_kms_conf[tobytes(k)] = \
+                        tobytes(v)
+                else:
+                    raise TypeError("Expected custom_kms_conf to be " +
+                                    "a dictionary of strings")
+
+    def refresh_key_access_token(self, value):
+        cdef:
+            shared_ptr[CKeyAccessToken] c_key_access_token = \
+                self.configuration.get().refreshable_key_access_token
+
+        c_key_access_token.get().Refresh(tobytes(value))
+
+    cdef inline shared_ptr[CKmsConnectionConfig] unwrap(self) nogil:
+        return self.configuration
+
+    @staticmethod
+    cdef wrap(const CKmsConnectionConfig& config):
+        result = KmsConnectionConfig()
+        result.configuration = make_shared[CKmsConnectionConfig](move(config))
+        return result
+
+# Callback definitions for CPyKmsClientVtable
+cdef void _cb_wrap_key(
+        handler, const c_string& key_bytes,
+        const c_string& master_key_identifier, c_string* out) except *:
+    mkid_str = frombytes(master_key_identifier)
+    wrapped_key = handler.wrap_key(key_bytes, mkid_str)
+    out[0] = tobytes(wrapped_key)
+
+cdef void _cb_unwrap_key(
+        handler, const c_string& wrapped_key,
+        const c_string& master_key_identifier, c_string* out) except *:
+    mkid_str = frombytes(master_key_identifier)
+    wk_str = frombytes(wrapped_key)
+    key = handler.unwrap_key(wk_str, mkid_str)
+    out[0] = tobytes(key)
+
+cdef class KmsClient(_Weakrefable):
+    """The abstract base class for KmsClient implementations."""
+    cdef:
+        shared_ptr[CKmsClient] client
+
+    def __init__(self):
+        self.init()
+
+    cdef init(self):
+        cdef:
+            CPyKmsClientVtable vtable = CPyKmsClientVtable()
+
+        vtable.wrap_key = _cb_wrap_key
+        vtable.unwrap_key = _cb_unwrap_key
+
+        self.client.reset(new CPyKmsClient(self, vtable))
+
+    def wrap_key(self, key_bytes, master_key_identifier):
+        """Wrap a key - encrypt it with the master key."""
+        raise NotImplementedError()
+
+    def unwrap_key(self, wrapped_key, master_key_identifier):
+        """Unwrap a key - decrypt it with the master key."""
+        raise NotImplementedError()
+
+    cdef inline shared_ptr[CKmsClient] unwrap(self) nogil:
+        return self.client
+
+
+# Callback definition for CPyKmsClientFactoryVtable
+cdef void _cb_create_kms_client(
+        handler,
+        const CKmsConnectionConfig& kms_connection_config,
+        shared_ptr[CKmsClient]* out) except *:
+    connection_config = KmsConnectionConfig.wrap(kms_connection_config)
+
+    result = handler(connection_config)
+    if not isinstance(result, KmsClient):
+        raise TypeError(
+            "callable must return KmsClient instances, but got {}".format(
+                type(result)))
+
+    out[0] = (<KmsClient> result).unwrap()
+
+cdef class CryptoFactory(_Weakrefable):
+    """A factory that produces the low-level FileEncryptionProperties and
+    FileDecryptionProperties objects, from the high-level parameters."""
+    cdef:
+        unique_ptr[CPyCryptoFactory] factory
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, kms_client_factory):
+        """Create CryptoFactory.
+
+        Parameters
+        ----------
+        kms_client_factory : a callable that accepts KmsConnectionConfig
+            and returns a KmsClient
+        """
+        self.factory.reset(new CPyCryptoFactory())
+
+        if callable(kms_client_factory):
+            self.init(kms_client_factory)
+        else:
+            raise TypeError("Parameter kms_client_factory must be a callable")
+
+    cdef init(self, callable_client_factory):
+        cdef:
+            CPyKmsClientFactoryVtable vtable
+            shared_ptr[CPyKmsClientFactory] kms_client_factory
+
+        vtable.create_kms_client = _cb_create_kms_client
+        kms_client_factory.reset(
+            new CPyKmsClientFactory(callable_client_factory, vtable))
+        # A KmsClientFactory object must be registered
+        # via this method before calling any of
+        # file_encryption_properties()/file_decryption_properties() methods.
+        self.factory.get().RegisterKmsClientFactory(
+            static_pointer_cast[CKmsClientFactory, CPyKmsClientFactory](
+                kms_client_factory))
+
+    def file_encryption_properties(self,
+                                   KmsConnectionConfig kms_connection_config,
+                                   EncryptionConfiguration encryption_config):
+        """Create file encryption properties.
+
+        Parameters
+        ----------
+        kms_connection_config : KmsConnectionConfig
+            Configuration of connection to KMS
+
+        encryption_config : EncryptionConfiguration
+            Configuration of the encryption, such as which columns to encrypt
+
+        Returns
+        -------
+        file_encryption_properties : FileEncryptionProperties
+            File encryption properties.
+        """
+        cdef:
+            CResult[shared_ptr[CFileEncryptionProperties]] \
+                file_encryption_properties_result
+
+        file_encryption_properties_result = \
+            self.factory.get().SafeGetFileEncryptionProperties(

Review comment:
       Can this call perform arbitrary IO and/or wait for an external resource? If so, it should probably be enclosed in a `with nogil:` block.
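
The `column_keys` getter and setter in the hunk above convert between a Python dictionary and the `masterKeyID:colName,colName;masterKeyID:colName` string. A minimal plain-Python sketch of that round trip follows; the function names `column_keys_to_str` and `column_keys_from_str` are hypothetical and not part of the PR. Unlike the getter as quoted, the sketch skips empty segments, so an empty string parses to an empty dictionary.

```python
def column_keys_to_str(column_keys):
    """Serialize {master_key_id: [column, ...]} to 'id: col, col; id: col'."""
    return "; ".join(
        "{}: {}".format(key_id, ", ".join(columns))
        for key_id, columns in column_keys.items()
    )


def column_keys_from_str(column_keys_str):
    """Parse 'id:col,col;id:col' (spaces ignored) back into a dictionary."""
    result = {}
    for part in column_keys_str.replace(" ", "").split(";"):
        if not part:
            continue  # tolerate an empty string or a trailing ';'
        key_id, columns = part.split(":", 1)
        result[key_id] = columns.split(",")
    return result


keys = {"key1": ["col1", "col2"], "key2": ["col3"]}
encoded = column_keys_to_str(keys)
print(encoded)
print(column_keys_from_str(encoded) == keys)
```

Because spaces are stripped while parsing, column names containing spaces would not survive the round trip in this sketch.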

##########
File path: cpp/src/parquet/encryption/key_toolkit_internal.cc
##########
@@ -31,7 +31,8 @@ static constexpr const int32_t kAcceptableDataKeyLengths[] = {128, 192, 256};
 std::string EncryptKeyLocally(const std::string& key_bytes, const std::string& master_key,
                               const std::string& aad) {
   AesEncryptor key_encryptor(ParquetCipher::AES_GCM_V1,
-                             static_cast<int>(master_key.size()), false);
+                             static_cast<int>(master_key.size()), false,
+                             false /*write_length*/);

Review comment:
       Does this fix an error in the implementation?

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires

Review comment:
       Can you spell out "KMS" in full at least once? For example "Key Management System (KMS)".

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is
+  stored in separate files in the same folder, which enables key rotation for
+  immutable Parquet files.
+* ``data_key_length_bits``, length of data encryption keys (DEKs), randomly
+  generated by parquet key management tools. Can be 128, 192 or 256 bits.
+
+.. note::
+   By default, Parquet implements a "double envelope encryption" mode, that
+   minimizes the interaction of the program with a KMS server. In this mode,
+   the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated
+   by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the
+   KEK itself are cached in the process memory. Users interested in regular
+   envelope encryption, can switch to it by setting the double_wrapping

Review comment:
       ```suggestion
      envelope encryption, can switch to it by setting the ``double_wrapping``
   ```
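
The `KmsClient` interface quoted in the documentation hunk above can be exercised without a real KMS. The following is a toy sketch only: `KmsClient` here is a plain-Python stand-in for the interface (not the pyarrow class), `InMemoryKmsClient` and `create_client` are hypothetical names, and the "wrapping" is base64 encoding rather than encryption. The argument and return types (bytes in and a string out for `wrap_key`, a string in and bytes out for `unwrap_key`) follow the `_cb_wrap_key` and `_cb_unwrap_key` callbacks quoted earlier in this message.

```python
import base64


class KmsClient:
    """Stand-in for the interface quoted above (not the pyarrow class)."""

    def wrap_key(self, key_bytes, master_key_identifier):
        raise NotImplementedError()

    def unwrap_key(self, wrapped_key, master_key_identifier):
        raise NotImplementedError()


class InMemoryKmsClient(KmsClient):
    """Toy client: master keys live in a dict, 'wrapping' is only base64."""

    def __init__(self, master_keys):
        self.master_keys = master_keys  # {master_key_identifier: bytes}

    def wrap_key(self, key_bytes, master_key_identifier):
        master_key = self.master_keys[master_key_identifier]
        # NOT encryption: prefix with the master key and base64-encode.
        return base64.b64encode(master_key + key_bytes).decode("ascii")

    def unwrap_key(self, wrapped_key, master_key_identifier):
        master_key = self.master_keys[master_key_identifier]
        decoded = base64.b64decode(wrapped_key)
        if not decoded.startswith(master_key):
            raise ValueError("wrapped key does not match the master key")
        return decoded[len(master_key):]


def create_client(factory, connection_config):
    """Mirror the type check applied to the factory's return value."""
    client = factory(connection_config)
    if not isinstance(client, KmsClient):
        raise TypeError(
            "callable must return KmsClient instances, but got {}".format(
                type(client)))
    return client


client = create_client(
    lambda config: InMemoryKmsClient({"footer_key": b"0123456789012345"}),
    None,  # a KmsConnectionConfig in the real API
)
wrapped = client.wrap_key(b"data-key-bytes!!", "footer_key")
print(client.unwrap_key(wrapped, "footer_key") == b"data-key-bytes!!")
```

A production client would delegate both methods to the KMS server instead of holding master keys in process memory.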

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:

Review comment:
       ```suggestion
   Reading an encrypted Parquet file:
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is
+  stored in separate files in the same folder, which enables key rotation for
+  immutable Parquet files.
+* ``data_key_length_bits``, length of data encryption keys (DEKs), randomly
+  generated by parquet key management tools. Can be 128, 192 or 256 bits.
+
+.. note::
+   By default, Parquet implements a "double envelope encryption" mode, that
+   minimizes the interaction of the program with a KMS server. In this mode,
+   the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated
+   by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the
+   KEK itself are cached in the process memory. Users interested in regular
+   envelope encryption, can switch to it by setting the double_wrapping
+   parameter of EncryptionConfiguration to false.

Review comment:
       Can you use the proper Sphinx markup to link to the EncryptionConfiguration API here?
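
The note quoted above says that double wrapping minimizes interaction with the KMS server. A minimal sketch of why, under loud assumptions: `toy_wrap` is an XOR with a SHA-256 keystream and is not secure (it only stands in for real key wrapping), and `CountingKms` and `wrap_deks` are hypothetical names used to count KMS round trips, not part of Parquet or PyArrow.

```python
import hashlib
import secrets


def toy_wrap(key, wrapping_key):
    """Toy reversible 'wrap': XOR with a SHA-256 keystream. NOT secure."""
    stream = hashlib.sha256(wrapping_key).digest()
    return bytes(a ^ b for a, b in zip(key, stream))


toy_unwrap = toy_wrap  # XOR with the same keystream is its own inverse


class CountingKms:
    """Hypothetical KMS stand-in that counts round trips."""

    def __init__(self, master_keys):
        self.master_keys = master_keys  # {master_key_id: bytes}
        self.calls = 0

    def wrap(self, key, master_key_id):
        self.calls += 1
        return toy_wrap(key, self.master_keys[master_key_id])

    def unwrap(self, wrapped, master_key_id):
        self.calls += 1
        return toy_unwrap(wrapped, self.master_keys[master_key_id])


def wrap_deks(deks, master_key_id, kms, double_wrapping=True):
    """Return the wrapped DEKs and the wrapped KEK (None if single wrapping)."""
    if not double_wrapping:
        # Single wrapping: one KMS call per DEK.
        return [kms.wrap(dek, master_key_id) for dek in deks], None
    # Double wrapping: DEKs are wrapped locally with a random KEK,
    # and only the KEK is sent to the KMS.
    kek = secrets.token_bytes(16)
    wrapped_kek = kms.wrap(kek, master_key_id)
    return [toy_wrap(dek, kek) for dek in deks], wrapped_kek


deks = [secrets.token_bytes(16) for _ in range(5)]

kms = CountingKms({"mek1": secrets.token_bytes(16)})
wrapped, wrapped_kek = wrap_deks(deks, "mek1", kms, double_wrapping=True)
print("KMS calls with double wrapping:", kms.calls)

kms_single = CountingKms({"mek1": secrets.token_bytes(16)})
wrap_deks(deks, "mek1", kms_single, double_wrapping=False)
print("KMS calls with single wrapping:", kms_single.calls)
```

With five DEKs, the double-wrapping path makes one KMS call and the single-wrapping path makes five; in both cases the DEKs are recoverable only through the master key held by the KMS.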

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same

Review comment:
       ```suggestion
   * ``uniform_encryption``, whether to encrypt the footer and all columns with the same
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:

Review comment:
       Similar question here.

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following

Review comment:
       This is not very clear: what does "include" mean? Is ``kms_connection_config`` a dict, or an instance of a particular type?
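
To illustrate the second possibility: if ``kms_connection_config`` is meant to be an instance of a dedicated type, the docs could say so and show how one is built. The dataclass below is a hypothetical stand-in written for this comment, not the PyArrow class; its field names are copied from the option list in the quoted text, while the defaults and the example values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KmsConnectionConfigSketch:
    """Hypothetical stand-in holding the four documented options."""

    kms_instance_url: str = ""
    kms_instance_id: str = ""
    key_access_token: str = ""
    # "String dictionary" is read here as str -> str (an assumption).
    custom_kms_conf: Dict[str, str] = field(default_factory=dict)


config = KmsConnectionConfigSketch(
    kms_instance_url="https://kms.example.invalid",
    custom_kms_conf={"namespace": "analytics"},
)
print(config.kms_instance_url)
```

Stating the type explicitly in the prose, along with a constructor call like the one above, would answer the question for readers.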

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is
+  stored in separate files in the same folder, which enables key rotation for
+  immutable Parquet files.
+* ``data_key_length_bits``, length of data encryption keys (DEKs), randomly
+  generated by parquet key management tools. Can be 128, 192 or 256 bits.
+
+.. note::
+   By default, Parquet implements a "double envelope encryption" mode, that
+   minimizes the interaction of the program with a KMS server. In this mode,
+   the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated
+   by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the
+   KEK itself are cached in the process memory. Users interested in regular
+   envelope encryption, can switch to it by setting the double_wrapping
+   parameter of EncryptionConfiguration to false.
+
+An example encryption configuration:
+
+.. code-block:: python
+
+   encryption_config = pq.EncryptionConfiguration(
+      footer_key="footer_key_name",
+      column_keys={
+         "column_key_name": ["Column1", "Column2"],
+      },
+   )
+
+Decryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+   
+Decryption configuration (``decryption_config`` used when creating file
+decryption properties) is optional and it includes the following options:

Review comment:
       Similar question as for encryption configuration.

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)

Review comment:
       In which unit is this expressed?
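
One way to make the unit unambiguous would be to take a ``datetime.timedelta`` instead of a bare number. A minimal standard-library sketch follows; ``normalize_cache_lifetime`` is a hypothetical helper, not part of PyArrow, and treating bare numbers as seconds is an assumption made only for this illustration.

```python
from datetime import timedelta


def normalize_cache_lifetime(value):
    """Return the cache lifetime in seconds as a float."""
    if isinstance(value, timedelta):
        return value.total_seconds()
    # Assumption for this sketch: a bare number is a count of seconds.
    return float(value)


print(normalize_cache_lifetime(timedelta(minutes=10)))  # 600.0
```

Whichever representation is chosen, the option description should name the unit or the expected type.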

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)

Review comment:
       ```suggestion
   * ``double_wrapping``, whether to use double wrapping - where data encryption keys (DEKs)
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is
+  stored in separate files in the same folder, which enables key rotation for
+  immutable Parquet files.
+* ``data_key_length_bits``, length of data encryption keys (DEKs), randomly
+  generated by parquet key management tools. Can be 128, 192 or 256 bits.
+
+.. note::
+   By default, Parquet implements a "double envelope encryption" mode, that
+   minimizes the interaction of the program with a KMS server. In this mode,
+   the DEKs are encrypted with "key encryption keys" (KEKs, randomly generated
+   by Parquet). The KEKs are encrypted with MEKs in KMS; the result and the
+   KEK itself are cached in the process memory. Users interested in regular
+   envelope encryption, can switch to it by setting the double_wrapping
+   parameter of EncryptionConfiguration to false.
+
+An example encryption configuration:
+
+.. code-block:: python
+
+   encryption_config = pq.EncryptionConfiguration(
+      footer_key="footer_key_name",
+      column_keys={
+         "column_key_name": ["Column1", "Column2"],
+      },
+   )
+
+Decryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+   
+Decryption configuration (``decryption_config`` used when creating file
+decryption properties) is optional and it includes the following options:
+
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys, local
+  wrapping keys, KMS client objects).

Review comment:
       In which unit is the lifetime expressed?
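
One possible way to remove the ambiguity would be to accept a `datetime.timedelta`, so the unit is explicit at the call site. A minimal sketch, assuming a hypothetical helper `lifetime_seconds` (not part of this PR) and assuming that a bare number means seconds:

```python
from datetime import timedelta


def lifetime_seconds(cache_lifetime):
    """Return the cache lifetime as a float number of seconds.

    A timedelta carries its own unit; a bare number is assumed
    (in this sketch only) to already be in seconds.
    """
    if isinstance(cache_lifetime, timedelta):
        return cache_lifetime.total_seconds()
    return float(cache_lifetime)
```

With such a helper, `lifetime_seconds(timedelta(minutes=10))` and `lifetime_seconds(600)` would describe the same lifetime.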

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;

Review comment:
       ```suggestion
   * ``internal_key_material``, whether to store key material inside Parquet file footers;
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are

Review comment:
       ```suggestion
     with master keys. If set to ``false``, single wrapping is used - where DEKs are
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.

Review comment:
       ```suggestion
   * ``encryption_algorithm``, the Parquet encryption algorithm.
   ```

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is
+  stored in separate files in the same folder, which enables key rotation for
+  immutable Parquet files.

Review comment:
       For the record, by "key material", do we mean the MEK-encrypted DEKs and KEKs? The DEKs themselves don't change, but their encrypted storage might, due to rotation of KEKs and/or MEKs?
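
To make the question concrete, here is a toy sketch of one possible reading of it, not the Parquet implementation. The XOR-based `toy_wrap` helper, the fixed key values, and the `key_material` dictionary layout are all assumptions made for illustration; a real deployment wraps keys through the KMS and may re-wrap differently on rotation.

```python
import hashlib


def toy_wrap(key: bytes, wrapping_key: bytes) -> bytes:
    """Toy stand-in for key wrapping: XOR with a SHA-256-derived pad.

    NOT secure; it only illustrates which values change on rotation.
    """
    pad = hashlib.sha256(wrapping_key).digest()[:len(key)]
    return bytes(a ^ b for a, b in zip(key, pad))


toy_unwrap = toy_wrap  # XOR is its own inverse

# Hypothetical fixed keys so the example is deterministic.
mek_v1 = b"master-key-v1"   # would live in the KMS
mek_v2 = b"master-key-v2"   # the rotated master key
kek = bytes(range(16))      # key encryption key
dek = bytes(range(16, 32))  # data encryption key, used to encrypt file parts

# Double wrapping: the DEK is wrapped with the KEK, the KEK with the MEK.
key_material = {
    "wrapped_dek": toy_wrap(dek, kek),
    "wrapped_kek": toy_wrap(kek, mek_v1),
}

# Rotation in this sketch: unwrap the KEK with the old MEK, re-wrap with the new one.
recovered_kek = toy_unwrap(key_material["wrapped_kek"], mek_v1)
rotated_material = {
    "wrapped_dek": key_material["wrapped_dek"],
    "wrapped_kek": toy_wrap(recovered_kek, mek_v2),
}

# The stored key material changed, but the DEK recovered from it did not,
# so the encrypted Parquet data itself never has to be rewritten.
recovered_dek = toy_unwrap(
    rotated_material["wrapped_dek"],
    toy_unwrap(rotated_material["wrapped_kek"], mek_v2),
)
```

Under this reading, only the stored key material is rewritten during rotation, which is why keeping it in separate files suits immutable Parquet files.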

##########
File path: docs/source/python/parquet.rst
##########
@@ -595,3 +595,172 @@ One example is Azure Blob storage, which can be interfaced through the
 
     abfs = AzureBlobFileSystem(account_name="XXXX", account_key="XXXX", container_name="XXXX")
     table = pq.read_table("file.parquet", filesystem=abfs)
+
+Parquet Modular Encryption (Columnar Encryption)
+------------------------------------------------
+
+Columnar encryption is supported for Parquet files in C++ starting from
+Apache Arrow 4.0.0 and in PyArrow starting from Apache Arrow 6.0.0.
+
+Parquet uses the envelope encryption practice, where file parts are encrypted
+with "data encryption keys" (DEKs), and the DEKs are encrypted with "master
+encryption keys" (MEKs). The DEKs are randomly generated by Parquet for each
+encrypted file/column. The MEKs are generated, stored and managed in a Key
+Management Service (KMS) of user’s choice.
+
+Reading and writing encrypted parquet files involves passing file encryption
+and decryption properties to :class:`~pyarrow.parquet.ParquetWriter` and to
+:class:`~.ParquetFile`, respectively.
+
+Writing an encrypted parquet:
+
+.. code-block:: python
+
+   encryption_properties = crypto_factory.file_encryption_properties(
+                                    kms_connection_config, encryption_config)
+   with pq.ParquetWriter(filename, schema,
+                        encryption_properties=encryption_properties) as writer:
+      writer.write_table(table)
+
+Reading an encrypted parquet:
+
+.. code-block:: python
+
+   decryption_properties = crypto_factory.file_decryption_properties(
+                                                    kms_connection_config)
+   parquet_file = pq.ParquetFile(filename,
+                                 decryption_properties=decryption_properties)
+
+
+In order to create the encryption and decryption properties, a ``CryptoFactory``
+should be created and initialized with KMS Client details, as described below.
+
+
+KMS Client
+~~~~~~~~~~
+
+The master encryption keys must be kept and managed in a production-grade KMS
+system, deployed in user's organization. Using Parquet encryption requires
+implementation of a client class for the KMS server.
+Any KmsClient implementation should implement the following informal interface:
+
+.. code-block:: python
+
+   class KmsClient:
+      def wrap_key(self, key_bytes, master_key_identifier):
+         """Wrap a key - encrypt it with the master key."""
+         raise NotImplementedError()
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         """Unwrap a key - decrypt it with the master key."""
+         raise NotImplementedError()
+
+
+
+   class MyKmsClient(pq.KmsClient):
+      """An example KmsClient implementation skeleton"""
+      def __init__(self, kms_connection_configuration):
+         pq.KmsClient.__init__(self)
+         # Any KMS-specific initialization based on
+         # kms_connection_configuration comes here
+
+      def wrap_key(self, key_bytes, master_key_identifier):
+         wrapped_key = ... # call KMS to wrap key_bytes with key specified by
+                           # master_key_identifier
+         return wrapped_key
+
+      def unwrap_key(self, wrapped_key, master_key_identifier):
+         key_bytes = ... # call KMS to unwrap wrapped_key with key specified by
+                         # master_key_identifier
+         return key_bytes
+
+The concrete implementation will be loaded at runtime by a factory method
+provided by the user. This factory method will be used to initialize the
+``CryptoFactory`` for creating file encryption and decryption properties.
+For example, in order to use the ``MyKmsClient`` defined above:
+
+.. code-block:: python
+
+   def kms_client_factory(kms_connection_configuration):
+      return MyKmsClient(kms_connection_configuration)
+
+   crypto_factory = CryptoFactory(kms_client_factory)
+
+An :download:`example <../../../python/examples/parquet_encryption/sample_vault_kms_client.py>`
+of such a class for an open source
+`KMS <https://www.vaultproject.io/api/secret/transit>`_ can be found in the Apache
+Arrow GitHub repository. The production KMS client should be designed in
+cooperation with an organization's security administrators, and built by
+developers with experience in access control management. Once such a class is
+created, it can be passed to applications via a factory method and leveraged
+by general PyArrow users as shown in the encrypted parquet write/read sample
+above.
+
+KMS connection configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configuration of connection to KMS (``kms_connection_config`` used when
+creating file encryption and decryption properties) includes the following
+options:
+
+* ``kms_instance_url``, URL of the KMS instance.
+* ``kms_instance_id``, ID of the KMS instance that will be used for encryption
+  (if multiple KMS instances are available).
+* ``key_access_token``, authorization token that will be passed to KMS.
+* ``custom_kms_conf``, a string dictionary with KMS-type-specific configuration.
+
+Encryption configuration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Encryption configuration (``encryption_config`` used when creating file
+encryption properties) includes the following options:
+
+* ``footer_key``, ID of the master key for footer encryption/signing.
+* ``column_keys``, list of columns to encrypt, with master key IDs.
+* ``uniform_encryption``, encrypt footer and all columns with the same
+  encryption key.
+* ``encryption_algorithm``, parquet encryption algorithm.
+  Can be ``AES_GCM_V1`` (default), or ``AES_GCM_CTR_V1``.
+* ``plaintext_footer``, write files with plaintext footer.
+* ``double_wrapping``, use double wrapping - where data encryption keys (DEKs)
+  are encrypted with key encryption keys (KEKs), which in turn are encrypted
+  with master keys. If set to false, use single wrapping - where DEKs are
+  encrypted directly with master keys.
+* ``cache_lifetime``, lifetime of cached entities (key encryption keys,
+  local wrapping keys, KMS client objects)
+* ``internal_key_material``, store key material inside Parquet file footers;
+  this mode doesn’t produce additional files. If set to false, key material is

Review comment:
       ```suggestion
     this mode doesn’t produce additional files. If set to ``false``, key material is
   ```

##########
File path: python/pyarrow/_parquet.pyx
##########
@@ -1187,7 +1195,6 @@ cdef class ParquetReader(_Weakrefable):
                          .ReadSchemaField(field_index, &out))
         return pyarrow_wrap_chunked_array(out)
 
-

Review comment:
       Style nit, but can you keep two blank lines between classes and between individual functions?

##########
File path: python/pyarrow/_parquet.pyx
##########
@@ -1464,3 +1517,429 @@ cdef class ParquetWriter(_Weakrefable):
             return result
         raise RuntimeError(
             'file metadata is only available after writer close')
+
+cdef class EncryptionConfiguration(_Weakrefable):
+    """Configuration of the encryption, such as which columns to encrypt"""
+    cdef:
+        shared_ptr[CEncryptionConfiguration] configuration
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, footer_key, *, column_keys=None,
+                 uniform_encryption=None, encryption_algorithm=None,
+                 plaintext_footer=None, double_wrapping=None,
+                 cache_lifetime=None, internal_key_material=None,
+                 data_key_length_bits=None):
+        self.configuration.reset(
+            new CEncryptionConfiguration(tobytes(footer_key)))
+        if column_keys is not None:
+            self.column_keys = column_keys
+        if uniform_encryption is not None:
+            self.uniform_encryption = uniform_encryption
+        if encryption_algorithm is not None:
+            self.encryption_algorithm = encryption_algorithm
+        if plaintext_footer is not None:
+            self.plaintext_footer = plaintext_footer
+        if double_wrapping is not None:
+            self.double_wrapping = double_wrapping
+        if cache_lifetime is not None:
+            self.cache_lifetime = cache_lifetime
+        if internal_key_material is not None:
+            self.internal_key_material = internal_key_material
+        if data_key_length_bits is not None:
+            self.data_key_length_bits = data_key_length_bits
+
+    @property
+    def footer_key(self):
+        """ID of the master key for footer encryption/signing"""
+        return frombytes(self.configuration.get().footer_key)
+
+    @property
+    def column_keys(self):
+        """
+        List of columns to encrypt, with master key IDs.
+        """
+        column_keys_str = frombytes(self.configuration.get().column_keys)
+        # Convert from "masterKeyID:colName,colName;masterKeyID:colName..."
+        # (see HIVE-21848) to dictionary of master key ID to column name lists
+        column_keys_to_key_list_str = dict(subString.replace(" ", "").split(
+            ":") for subString in column_keys_str.split(";"))
+        column_keys_dict = {k: v.split(
+            ",") for k, v in column_keys_to_key_list_str.items()}
+        return column_keys_dict
+
+    @column_keys.setter
+    def column_keys(self, dict value):
+        if value is not None:
+            # convert a dictionary such as
+            # '{"key1": ["col1 ", "col2"], "key2": ["col3 ", "col4"]}''
+            # to the string defined by the spec
+            # 'key1: col1 , col2; key2: col3 , col4'
+            column_keys = "; ".join(
+                ["{}: {}".format(k, ", ".join(v)) for k, v in value.items()])
+            self.configuration.get().column_keys = tobytes(column_keys)
+
+    @property
+    def uniform_encryption(self):
+        """Encrypt footer and all columns with the same encryption key."""
+        return self.configuration.get().uniform_encryption
+
+    @uniform_encryption.setter
+    def uniform_encryption(self, value):
+        self.configuration.get().uniform_encryption = value
+
+    @property
+    def encryption_algorithm(self):
+        """Parquet encryption algorithm.
+        Can be "AES_GCM_V1" (default), or "AES_GCM_CTR_V1"."""
+        return cipher_to_name(self.configuration.get().encryption_algorithm)
+
+    @encryption_algorithm.setter
+    def encryption_algorithm(self, value):
+        cipher = cipher_from_name(value)
+        self.configuration.get().encryption_algorithm = cipher
+
+    @property
+    def plaintext_footer(self):
+        """Write files with plaintext footer."""
+        return self.configuration.get().plaintext_footer
+
+    @plaintext_footer.setter
+    def plaintext_footer(self, value):
+        self.configuration.get().plaintext_footer = value
+
+    @property
+    def double_wrapping(self):
+        """Use double wrapping - where data encryption keys (DEKs) are
+        encrypted with key encryption keys (KEKs), which in turn are
+        encrypted with master keys.
+        If set to false, use single wrapping - where DEKs are
+        encrypted directly with master keys."""
+        return self.configuration.get().double_wrapping
+
+    @double_wrapping.setter
+    def double_wrapping(self, value):
+        self.configuration.get().double_wrapping = value
+
+    @property
+    def cache_lifetime(self):
+        """Lifetime of cached entities (key encryption keys,
+        local wrapping keys, KMS client objects)."""
+        return timedelta(
+            seconds=self.configuration.get().cache_lifetime_seconds)
+
+    @cache_lifetime.setter
+    def cache_lifetime(self, value):
+        if not isinstance(value, timedelta):
+            raise TypeError("cache_lifetime should be a timedelta")
+        self.configuration.get().cache_lifetime_seconds = value.total_seconds()
+
+    @property
+    def internal_key_material(self):
+        """Store key material inside Parquet file footers; this mode doesn’t
+        produce additional files. If set to false, key material is stored in
+        separate files in the same folder, which enables key rotation for
+        immutable Parquet files."""
+        return self.configuration.get().internal_key_material
+
+    @internal_key_material.setter
+    def internal_key_material(self, value):
+        self.configuration.get().internal_key_material = value
+
+    @property
+    def data_key_length_bits(self):
+        """Length of data encryption keys (DEKs), randomly generated by parquet key
+        management tools. Can be 128, 192 or 256 bits."""
+        return self.configuration.get().data_key_length_bits
+
+    @data_key_length_bits.setter
+    def data_key_length_bits(self, value):
+        self.configuration.get().data_key_length_bits = value
+
+    cdef inline shared_ptr[CEncryptionConfiguration] unwrap(self) nogil:
+        return self.configuration
+
+cdef class DecryptionConfiguration(_Weakrefable):
+    """Configuration of the decryption, such as cache timeout."""
+    cdef:
+        shared_ptr[CDecryptionConfiguration] configuration
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, *, cache_lifetime=None):
+        self.configuration.reset(new CDecryptionConfiguration())
+
+    @property
+    def cache_lifetime(self):
+        """Lifetime of cached entities (key encryption keys,
+        local wrapping keys, KMS client objects)."""
+        return timedelta(
+            seconds=self.configuration.get().cache_lifetime_seconds)
+
+    @cache_lifetime.setter
+    def cache_lifetime(self, value):
+        self.configuration.get().cache_lifetime_seconds = value.total_seconds()
+
+    cdef inline shared_ptr[CDecryptionConfiguration] unwrap(self) nogil:
+        return self.configuration
+
+
+cdef class KmsConnectionConfig(_Weakrefable):
+    """Configuration of the connection to the Key Management Service (KMS)"""
+    cdef:
+        shared_ptr[CKmsConnectionConfig] configuration
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, *, kms_instance_id=None, kms_instance_url=None,
+                 key_access_token=None, custom_kms_conf=None):
+        self.configuration.reset(new CKmsConnectionConfig())
+        if kms_instance_id is not None:
+            self.kms_instance_id = kms_instance_id
+        if kms_instance_url is not None:
+            self.kms_instance_url = kms_instance_url
+        if key_access_token is None:
+            self.key_access_token = b'DEFAULT'
+        else:
+            self.key_access_token = key_access_token
+        if custom_kms_conf is not None:
+            self.custom_kms_conf = custom_kms_conf
+
+    @property
+    def kms_instance_id(self):
+        """ID of the KMS instance that will be used for encryption
+        (if multiple KMS instances are available)."""
+        return frombytes(self.configuration.get().kms_instance_id)
+
+    @kms_instance_id.setter
+    def kms_instance_id(self, value):
+        self.configuration.get().kms_instance_id = tobytes(value)
+
+    @property
+    def kms_instance_url(self):
+        """URL of the KMS instance."""
+        return frombytes(self.configuration.get().kms_instance_url)
+
+    @kms_instance_url.setter
+    def kms_instance_url(self, value):
+        self.configuration.get().kms_instance_url = tobytes(value)
+
+    @property
+    def key_access_token(self):
+        """Authorization token that will be passed to KMS."""
+        return frombytes(self.configuration.get()
+                         .refreshable_key_access_token.get().value())
+
+    @key_access_token.setter
+    def key_access_token(self, value):
+        self.refresh_key_access_token(value)
+
+    @property
+    def custom_kms_conf(self):
+        """A dictionary with KMS-type-specific configuration"""
+        custom_kms_conf = {
+            frombytes(k): frombytes(v)
+            for k, v in self.configuration.get().custom_kms_conf
+        }
+        return custom_kms_conf
+
+    @custom_kms_conf.setter
+    def custom_kms_conf(self, dict value):
+        if value is not None:
+            for k, v in value.items():
+                if isinstance(k, str) and isinstance(v, str):
+                    self.configuration.get().custom_kms_conf[tobytes(k)] = \
+                        tobytes(v)
+                else:
+                    raise TypeError("Expected custom_kms_conf to be " +
+                                    "a dictionary of strings")
+
+    def refresh_key_access_token(self, value):
+        cdef:
+            shared_ptr[CKeyAccessToken] c_key_access_token = \
+                self.configuration.get().refreshable_key_access_token
+
+        c_key_access_token.get().Refresh(tobytes(value))
+
+    cdef inline shared_ptr[CKmsConnectionConfig] unwrap(self) nogil:
+        return self.configuration
+
+    @staticmethod
+    cdef wrap(const CKmsConnectionConfig& config):
+        result = KmsConnectionConfig()
+        result.configuration = make_shared[CKmsConnectionConfig](move(config))
+        return result
+
+# Callback definitions for CPyKmsClientVtable
+cdef void _cb_wrap_key(
+        handler, const c_string& key_bytes,
+        const c_string& master_key_identifier, c_string* out) except *:
+    mkid_str = frombytes(master_key_identifier)
+    wrapped_key = handler.wrap_key(key_bytes, mkid_str)
+    out[0] = tobytes(wrapped_key)
+
+cdef void _cb_unwrap_key(
+        handler, const c_string& wrapped_key,
+        const c_string& master_key_identifier, c_string* out) except *:
+    mkid_str = frombytes(master_key_identifier)
+    wk_str = frombytes(wrapped_key)
+    key = handler.unwrap_key(wk_str, mkid_str)
+    out[0] = tobytes(key)
+
+cdef class KmsClient(_Weakrefable):
+    """The abstract base class for KmsClient implementations."""
+    cdef:
+        shared_ptr[CKmsClient] client
+
+    def __init__(self):
+        self.init()
+
+    cdef init(self):
+        cdef:
+            CPyKmsClientVtable vtable = CPyKmsClientVtable()
+
+        vtable.wrap_key = _cb_wrap_key
+        vtable.unwrap_key = _cb_unwrap_key
+
+        self.client.reset(new CPyKmsClient(self, vtable))
+
+    def wrap_key(self, key_bytes, master_key_identifier):
+        """Wrap a key - encrypt it with the master key."""
+        raise NotImplementedError()
+
+    def unwrap_key(self, wrapped_key, master_key_identifier):
+        """Unwrap a key - decrypt it with the master key."""
+        raise NotImplementedError()
+
+    cdef inline shared_ptr[CKmsClient] unwrap(self) nogil:
+        return self.client
+
+
+# Callback definition for CPyKmsClientFactoryVtable
+cdef void _cb_create_kms_client(
+        handler,
+        const CKmsConnectionConfig& kms_connection_config,
+        shared_ptr[CKmsClient]* out) except *:
+    connection_config = KmsConnectionConfig.wrap(kms_connection_config)
+
+    result = handler(connection_config)
+    if not isinstance(result, KmsClient):
+        raise TypeError(
+            "callable must return KmsClient instances, but got {}".format(
+                type(result)))
+
+    out[0] = (<KmsClient> result).unwrap()
+
+cdef class CryptoFactory(_Weakrefable):
+    """ A factory that produces the low-level FileEncryptionProperties and
+    FileDecryptionProperties objects, from the high-level parameters."""
+    cdef:
+        unique_ptr[CPyCryptoFactory] factory
+
+    # Avoid mistakenly creating attributes
+    __slots__ = ()
+
+    def __init__(self, kms_client_factory):
+        """Create CryptoFactory.
+
+        Parameters
+        ----------
+        kms_client_factory : a callable that accepts KmsConnectionConfig
+            and returns a KmsClient
+        """
+        self.factory.reset(new CPyCryptoFactory())
+
+        if callable(kms_client_factory):
+            self.init(kms_client_factory)
+        else:
+            raise TypeError("Parameter kms_client_factory must be a callable")
+
+    cdef init(self, callable_client_factory):
+        cdef:
+            CPyKmsClientFactoryVtable vtable
+            shared_ptr[CPyKmsClientFactory] kms_client_factory
+
+        vtable.create_kms_client = _cb_create_kms_client
+        kms_client_factory.reset(
+            new CPyKmsClientFactory(callable_client_factory, vtable))
+        # A KmsClientFactory object must be registered
+        # via this method before calling any of
+        # file_encryption_properties()/file_decryption_properties() methods.
+        self.factory.get().RegisterKmsClientFactory(
+            static_pointer_cast[CKmsClientFactory, CPyKmsClientFactory](
+                kms_client_factory))
+
+    def file_encryption_properties(self,
+                                   KmsConnectionConfig kms_connection_config,
+                                   EncryptionConfiguration encryption_config):
+        """Create file encryption properties.
+
+        Parameters
+        ----------
+        kms_connection_config : KmsConnectionConfig
+            Configuration of connection to KMS
+
+        encryption_config : EncryptionConfiguration
+            Configuration of the encryption, such as which columns to encrypt
+
+        Returns
+        -------
+        file_encryption_properties : FileEncryptionProperties
+            File encryption properties.
+        """
+        cdef:
+            CResult[shared_ptr[CFileEncryptionProperties]] \
+                file_encryption_properties_result
+
+        file_encryption_properties_result = \
+            self.factory.get().SafeGetFileEncryptionProperties(
+                deref(kms_connection_config.unwrap().get()),
+                deref(encryption_config.unwrap().get()))
+        file_encryption_properties = GetResultValue(
+            file_encryption_properties_result)
+        return FileEncryptionProperties.wrap(file_encryption_properties)
+
+    def file_decryption_properties(
+            self,
+            KmsConnectionConfig kms_connection_config,
+            DecryptionConfiguration decryption_config=None):
+        """Create file decryption properties.
+
+        Parameters
+        ----------
+        kms_connection_config : KmsConnectionConfig
+            Configuration of connection to KMS
+
+        decryption_config : DecryptionConfiguration, default None
+            Configuration of the decryption, such as cache timeout.
+            Can be None.
+
+        Returns
+        -------
+        file_decryption_properties : FileDecryptionProperties
+            File decryption properties.
+        """
+        cdef:
+            CDecryptionConfiguration c_decryption_config
+            CResult[shared_ptr[CFileDecryptionProperties]] \
+                c_file_decryption_properties
+        if decryption_config is None:
+            c_decryption_config = CDecryptionConfiguration()
+        else:
+            c_decryption_config = deref(decryption_config.unwrap().get())
+        c_file_decryption_properties = \
+            self.factory.get().SafeGetFileDecryptionProperties(

Review comment:
       Same question here with respect to the Python GIL.



