Posted to dev@beam.apache.org by Artur Khanin <ar...@akvelon.com> on 2020/11/23 17:19:41 UTC
Proposal: Beam Template-like Example to protect sensitive data
Hi Community!
Some users may want to protect their sensitive data using tokenization.
We propose creating a Beam example template that demonstrates a Beam transform for protecting sensitive data through tokenization. In our example, an external service will perform the tokenization.
At a high level, the pipeline will:
* support batch (GCS) and streaming (Pub/Sub) input sources
* tokenize sensitive data via an external REST service - we plan to use Protegrity
* output tokenized data into BigQuery or Bigtable
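To make the tokenization step above more concrete, here is a rough sketch in plain Python of how records might be batched and sent to a tokenization service. It is only an illustration under assumptions, not the proposed design: the names `tokenize_records`, `fake_tokenizer`, and the `batch_size` parameter are hypothetical, and `fake_tokenizer` is a local stand-in for the external REST call (it is not the Protegrity API and does not perform real tokenization). In an actual Beam pipeline this logic would most likely live inside a transform rather than a standalone generator.

```python
import hashlib
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
# Hypothetical contract: takes plaintext values, returns tokens in the same order.
Tokenizer = Callable[[List[str]], List[str]]


def fake_tokenizer(values: List[str]) -> List[str]:
    """Local stand-in for the external REST service (assumption, not a real API)."""
    return ["tok_" + hashlib.sha256(v.encode("utf-8")).hexdigest()[:12] for v in values]


def _tokenize_batch(batch: List[Record], sensitive_fields: List[str],
                    tokenizer: Tokenizer) -> List[Record]:
    # Remember where each sensitive value came from so tokens can be put back.
    positions = [(i, f) for i, rec in enumerate(batch)
                 for f in sensitive_fields if f in rec]
    out = [dict(rec) for rec in batch]  # copy, so the input records stay untouched
    if not positions:
        return out  # nothing sensitive in this batch, skip the service call
    tokens = tokenizer([batch[i][f] for i, f in positions])
    for (i, f), token in zip(positions, tokens):
        out[i][f] = token
    return out


def tokenize_records(records: Iterable[Record], sensitive_fields: List[str],
                     tokenizer: Tokenizer, batch_size: int = 2) -> Iterator[Record]:
    """Group records into batches and make one tokenizer call per batch."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size:
            yield from _tokenize_batch(batch, sensitive_fields, tokenizer)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield from _tokenize_batch(batch, sensitive_fields, tokenizer)
```

For example, `list(tokenize_records(rows, ["ssn"], fake_tokenizer))` would return copies of `rows` with only the `ssn` field replaced. Batching is shown because it keeps the number of calls to an external service down, but the right batch size would depend on the service's limits.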
I created JIRA ticket BEAM-11322<https://issues.apache.org/jira/browse/BEAM-11322> to describe this proposal and capture feedback. More details and the proposed design are available in the design doc<https://docs.google.com/document/d/1fnsUfGpCx8A_MBchPRvlm4gU0Ai5EQNSiZS1mg_A_zg/edit?usp=sharing>.
I welcome community feedback and comments on this Beam data tokenization template proposal.
Thanks,
Artur Khanin
Akvelon, Inc