Posted to dev@parquet.apache.org by nicolas paris <ni...@riseup.net> on 2022/11/02 14:17:43 UTC

Re: Modular encryption to support arrays and nested arrays

Thanks for your help.
> 
> The goal is to make the exception print something like:
> *Caused by: org.apache.parquet.crypto.ParquetCryptoRuntimeException:
> Encrypted column [rider] not in file schema column list: [foo] ,
> [rider.list.element.foo] , [rider.list.element.bar] , [ts] , [uuid]*
> 


This sounds good. I also struggled to apply encryption to map fields.
I finally found some pointers in the source and made it work. Also,
specifying either key_value.value or key_value.key apparently leads to
the whole map being encrypted.

```
spark.sparkContext.hadoopConfiguration.set(
  "parquet.encryption.column.keys", "k2:ma.key_value.value")
val df = spark.sql("select map('foo',2, 'bar',3) as ma")
```
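
For context, here is a fuller configuration sketch around that property.
The property names come from Parquet's PropertiesDrivenCryptoFactory;
the InMemoryKMS client is a mock shipped for testing, and the base64
key values and output path below are placeholders, not anything from my
actual setup:

```scala
// Sketch only: key material and the KMS client are test placeholders.
val conf = spark.sparkContext.hadoopConfiguration

// Enable Parquet modular encryption via the properties-driven factory.
conf.set("parquet.crypto.factory.class",
  "org.apache.parquet.crypto.keytools.PropertiesDrivenCryptoFactory")

// InMemoryKMS is a mock KMS for testing; use a real KMS client in production.
conf.set("parquet.encryption.kms.client.class",
  "org.apache.parquet.crypto.keytools.mocks.InMemoryKMS")

// Two 128-bit master keys, base64-encoded (placeholder values).
conf.set("parquet.encryption.key.list",
  "k1:AAECAwQFBgcICQoLDA0ODw==, k2:AAECAAECAAECAAECAAECAA==")
conf.set("parquet.encryption.footer.key", "k1")

// Encrypt the map column through its key_value.value path, as above.
conf.set("parquet.encryption.column.keys", "k2:ma.key_value.value")

val df = spark.sql("select map('foo',2, 'bar',3) as ma")
df.write.parquet("/tmp/encrypted_ma")  // placeholder path
```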

> - Configuring a key for all children of a nested schema node (eg "
> *k2:rider.*"*). This had been discussed in the past, but not followed
> up..
> Is this something you'd be interested to build?
> > 

I'm afraid this is not something useful in my context for now. The
columns to be encrypted are a carefully hand-crafted list - there is no
use for * right now.