Posted to dev@avro.apache.org by "enrico agnoli (Jira)" <ji...@apache.org> on 2020/06/22 07:53:00 UTC

[jira] [Created] (AVRO-2868) Introduce serialization finalizer hook

enrico agnoli created AVRO-2868:
-----------------------------------

             Summary: Introduce serialization finalizer hook
                 Key: AVRO-2868
                 URL: https://issues.apache.org/jira/browse/AVRO-2868
             Project: Apache Avro
          Issue Type: Improvement
          Components: java
            Reporter: enrico agnoli
             Fix For: 1.8.2


I would like to propose a change to Avro that lets services plug in custom logic after serialization and before deserialization.
We use Avro for data serialization in our streaming infrastructure, and we extended it so that data can be encrypted using information carried by the data itself: its owner.
The change set is small, and I would like to hear whether it makes sense to contribute it back to the project.
 
== The problem is:
Multi-tenant applications need to encrypt data with the keys of the owner/tenant that generated it every time it is serialized, to avoid commingling data from different tenants. To do this transparently to the application, the ideal place to implement the encryption is the serialization library (Avro).
 
== Proposal:
We modified the Avro code to add afterSerialization and beforeDeserialization hooks that can use object-defined values (the tenant/owner of the data) to implement encryption.
In the code we propose to submit we implemented a new interface: `SerializeFinalizationDelegate.java`
```
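import java.io.ByteArrayOutputStream;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.Encoder;

public interface SerializeFinalizationDelegate {
  void afterSerialization(ByteArrayOutputStream serializedData, Encoder finalEncoder);
  Decoder beforeDeserialization(Decoder dataToDecode);
}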
```
This interface must be implemented by any Avro-serializable class that wants to define post-serialization or pre-deserialization logic.
`GenericDatumWriter` and `GenericDatumReader` are modified to delegate to the object implementation of the methods above.
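
To illustrate the delegation pattern described above, here is a minimal sketch. It is not the proposed patch: so that it runs without the Avro jar, it works on raw byte arrays instead of Avro's `Encoder`/`Decoder`, and every name in it (`FinalizerHookSketch`, `ByteFinalizer`, `XorFinalizer`, `write`, `read`) is hypothetical. The XOR transform is only a placeholder for tenant-keyed encryption and offers no real security.

```java
import java.nio.charset.StandardCharsets;

public class FinalizerHookSketch {

    /** Hypothetical byte-oriented stand-in for the proposed delegate interface. */
    interface ByteFinalizer {
        byte[] afterSerialization(byte[] serialized);
        byte[] beforeDeserialization(byte[] stored);
    }

    /**
     * Placeholder transform keyed per tenant. XOR is NOT real encryption;
     * a real implementation would call an authenticated cipher here.
     */
    static final class XorFinalizer implements ByteFinalizer {
        private final byte[] tenantKey;

        XorFinalizer(byte[] tenantKey) {
            if (tenantKey.length == 0) {
                throw new IllegalArgumentException("tenant key must not be empty");
            }
            this.tenantKey = tenantKey.clone();
        }

        // XOR each byte with the repeating tenant key; applying it twice restores the input.
        private byte[] xor(byte[] input) {
            byte[] out = new byte[input.length];
            for (int i = 0; i < input.length; i++) {
                out[i] = (byte) (input[i] ^ tenantKey[i % tenantKey.length]);
            }
            return out;
        }

        @Override
        public byte[] afterSerialization(byte[] serialized) {
            return xor(serialized);
        }

        @Override
        public byte[] beforeDeserialization(byte[] stored) {
            return xor(stored);
        }
    }

    /** Stand-in for the writer: serialize first, then hand the bytes to the hook. */
    static byte[] write(String datum, ByteFinalizer hook) {
        byte[] serialized = datum.getBytes(StandardCharsets.UTF_8); // assumed stand-in for Avro binary encoding
        return hook.afterSerialization(serialized);
    }

    /** Stand-in for the reader: let the hook undo its transform, then deserialize. */
    static String read(byte[] stored, ByteFinalizer hook) {
        byte[] serialized = hook.beforeDeserialization(stored);
        return new String(serialized, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteFinalizer tenantA = new XorFinalizer("a-key".getBytes(StandardCharsets.UTF_8));
        byte[] stored = write("hello", tenantA);
        System.out.println(read(stored, tenantA)); // prints "hello"
    }
}
```

The point of the sketch is the ordering: the hook runs strictly after the bytes are produced and strictly before they are decoded, so the application code that builds and consumes records does not change.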
 
More info can be found at [https://www.slideshare.net/FlinkForward/multi-tenanted-streams-workday-enrico-agnoli-leire-fernandez-de-retana-roitegui-workday-185815223] from slide 21
 


