Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/08/27 22:05:18 UTC

[GitHub] [pulsar] lanwen opened a new issue #5059: [Feature request] Per key cleanup for the GDPR requirements

lanwen opened a new issue #5059: [Feature request] Per key cleanup for the GDPR requirements
URL: https://github.com/apache/pulsar/issues/5059
 
 
   **Is your feature request related to a problem? Please describe.**
   We use Pulsar for event sourcing, as the source of truth, with an indefinite number of events. Since we store some personal information in the events (such as name or email) and operate in Europe, we have to comply with GDPR and be able to remove all data for a specific key.
   
   **Describe the solution you'd like**
   A good approach would be to follow the same model as topic compaction, with the only difference that only a limited set of keys is compacted. An admin tool and/or admin API would allow running this compaction for a specific, limited set of keys.
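
   To make the intended semantics concrete, here is a minimal in-memory sketch of per-key cleanup. It is not Pulsar code and uses no Pulsar API: `Message`, `erase_keys`, and the sample keys are hypothetical names chosen only for illustration. The assumption is that an erased key leaves nothing behind (not even its latest value), while messages for all other keys stay untouched and in order.

```python
from typing import Iterable, List, NamedTuple, Set


class Message(NamedTuple):
    """Hypothetical stand-in for a keyed topic message."""
    key: str
    payload: str


def erase_keys(log: Iterable[Message], keys_to_erase: Set[str]) -> List[Message]:
    """Return a new log without any message whose key must be erased.

    Unlike regular topic compaction, nothing is kept for an erased key,
    and messages for every other key are preserved in their original order.
    """
    return [m for m in log if m.key not in keys_to_erase]


# Example topic content (made-up data).
log = [
    Message("user-1", "registered"),
    Message("user-2", "registered"),
    Message("user-1", "email-changed"),
    Message("user-3", "registered"),
]

# A deletion request for "user-1" removes both of its events.
cleaned = erase_keys(log, {"user-1"})
```

   Regular compaction would instead keep the latest message per key, which is why the request asks for a separate, key-limited variant.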
   
   Another approach would be a way to clean up tiered storage: for example, offload the topic to S3 after some time and clean it up there. One possible way is a filtered offload, which must also handle a deletion request for data that has already been offloaded: that data should be loaded, cleaned, and re-offloaded so that no traces of the old keys remain.
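
   The re-offload idea can be sketched in the same spirit. This is only an illustration under stated assumptions: the offloaded data is modelled as a plain dictionary of segments, and `rewrite_offloaded`, the segment ids, and the `(key, payload)` layout are all hypothetical rather than anything Pulsar's tiered storage exposes today.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: one offloaded segment is a list of (key, payload) pairs.
Segment = List[Tuple[str, str]]


def rewrite_offloaded(segments: Dict[str, Segment], keys_to_erase: Set[str]) -> List[str]:
    """Rewrite every segment that contains an erased key; return the ids rewritten.

    Segments without any erased key are left alone, so only the affected
    part of the offloaded data has to be loaded and re-offloaded.
    """
    rewritten = []
    for segment_id, entries in segments.items():
        kept = [(key, payload) for key, payload in entries if key not in keys_to_erase]
        if len(kept) != len(entries):
            # Replacing the value does not change the dict size, so this is safe
            # while iterating over items().
            segments[segment_id] = kept
            rewritten.append(segment_id)
    return rewritten


# Made-up offloaded data: two segments, one of which holds the key to erase.
store = {
    "segment-1": [("user-1", "registered"), ("user-2", "registered")],
    "segment-2": [("user-3", "registered")],
}
touched = rewrite_offloaded(store, {"user-1"})
```

   In a real object store the old copy of each rewritten segment would also have to be deleted, otherwise traces of the erased keys would remain.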
   
   **Describe alternatives you've considered**
   Right now we regularly migrate the whole cluster to clean up all the events that should be removed, which is not fun at all :)
   
   **Additional context**
   Kafka doesn't provide anything like that. Also, the recommendation to encrypt the data in Pulsar and later just throw away the encryption key doesn't really work, since some EU governments (for example, Germany) do not consider it compliant.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services