Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2021/06/10 08:30:33 UTC

[GitHub] [bookkeeper] lhotari commented on a change in pull request #2730: BP-45: a pluggable way to modify payload sent to the ledger

lhotari commented on a change in pull request #2730:
URL: https://github.com/apache/bookkeeper/pull/2730#discussion_r648968043



##########
File path: site/bps/BP-45-LedgerPayloadInterceptor.md
##########
@@ -0,0 +1,183 @@
+---
+title: "BP-45: Add interceptor interface allowing modifications of the payload"
+issue: https://github.com/apache/bookkeeper/issues/2731
+state: Under Discussion
+release: "4.15.0"
+---
+
+### Motivation
+
+The proposed change targets addition of a pluggable way to modify payload sent to the ledger. 
+Specific use cases include GDPR compliance, encryption, and compression. 
+While these can be implemented at the application level, having this as a BK client level 
+extension unlocks such functionality to any application built on top of Bookkeeper.
+
+Let’s dig deeper into specific use cases: 
+
+#### GDPR
+
+Ledger's data on the bookie servers is stored in the immutable EntryLog files. The files, 
+in the most common case, mix the data from multiple ledgers. There is no guarantee of immediate 
+erasure from the disk upon the ledger deletion, the entry logs are "compacted" with some delay. 
+The amount of the delay is not guaranteed and, in an extreme case, can be infinite if the data 
+from a deleted small short-lived ledger gets mixed in the entry log with data from the long-lived 
+ledgers. 
+
+Such behavior of compaction is a trade-off between performance and disk space utilization. 
+
+The data is also stored in the journal file and, in an extreme case 
+(low TTL for deletion and low traffic volume), can be recoverable from the journal after the 
+ledger deletion.
+
+Modern global businesses are affected by privacy laws [1] and obligations, the most notable 
+of which is "Art. 17 GDPR: Right to be forgotten" [2]. Privacy policies require setting deadlines 
+after which the data cannot be used.
+
+"Forgotten" encryption keys are an acceptable alternative to the actual deletion of the data. 

Review comment:
       Isn't this commonly called [crypto-shredding](https://en.wikipedia.org/wiki/Crypto-shredding)?



