Posted to dev@beam.apache.org by Kirill Kozlov <ki...@google.com> on 2020/01/08 19:28:14 UTC

[Design Proposal] DataStore SQL Connector

Hello everyone!

I have written up a proposal [1] for a DataStore SQL connector. I would
love to hear comments and suggestions from the Beam dev community!

A quick summary:
DataStore [2] is a NoSQL database with a dynamic schema, where entities
(documents) are stored in Kinds (databases). Each entity has a key (unique
identifier) [3], which consists of a partition id and a path (can be used
to link to other entities).
The proposal is to implement *PTransforms* that convert between
DataStore data types and Beam types: *EntityToRow* and *RowToEntity*.
The PTransforms can be used independently or via a *SQL Table* (which will
use them implicitly).
The SQL Table should allow users to specify the name of the row schema
field in which the key is stored.
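
To make the two conversions concrete, here is a minimal sketch in plain
Python. It does not use the Beam or DataStore APIs: entities and rows are
modelled as dicts, and the names entity_to_row, row_to_entity and key_field
(as well as the "__key__" default) are illustrative assumptions, not the
API from the proposal.

```python
from typing import Any, Dict, List

# Assumed stand-in for an entity: {"key": <key>, "properties": {name: value}}.
# Assumed stand-in for a row: a dict holding one value per schema field.


def entity_to_row(entity: Dict[str, Any],
                  schema: List[str],
                  key_field: str = "__key__") -> Dict[str, Any]:
    """Project an entity onto a fixed list of field names.

    The entity key goes into the field named by key_field. Because the
    source schema is dynamic, properties absent from the entity become None.
    """
    row: Dict[str, Any] = {}
    for name in schema:
        if name == key_field:
            row[name] = entity["key"]
        else:
            row[name] = entity["properties"].get(name)
    return row


def row_to_entity(row: Dict[str, Any],
                  key_field: str = "__key__") -> Dict[str, Any]:
    """Rebuild an entity: key_field supplies the key, the rest are properties."""
    properties = {name: value for name, value in row.items()
                  if name != key_field}
    return {"key": row[key_field], "properties": properties}


if __name__ == "__main__":
    # Hypothetical key: a partition id plus a path of (kind, id) pairs.
    person = {
        "key": ("my-project", (("Person", 1),)),
        "properties": {"name": "Ann", "age": 30},
    }
    print(entity_to_row(person, ["__key__", "name", "age", "email"]))
```

In this sketch the round trip is lossless only when the row schema lists
exactly the properties the entity has; a property missing from the entity
comes back as an explicit None.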

[1]
https://docs.google.com/document/d/1FxuEGewJ3GPDl0IKglfOYf1edwa2m_wryFZYRMpRNbA/edit?usp=sharing
[2] https://cloud.google.com/datastore/
[3]
https://cloud.google.com/datastore/docs/concepts/entities#kinds_and_identifiers

Re: [Design Proposal] DataStore SQL Connector

Posted by Kenneth Knowles <ke...@apache.org>.
Seems useful. Nice to see SQL's capabilities expanding to more data
sources. I commented on the doc. This may also be a useful example for
adding more NoSQL data sources, and it is helpful for non-SQL
schema-driven transforms.

Kenn
