Posted to issues@beam.apache.org by "Marcin Kuthan (Jira)" <ji...@apache.org> on 2021/11/10 07:52:00 UTC

[jira] [Commented] (BEAM-11648) Implement new BigQuery sink (Vortex)

    [ https://issues.apache.org/jira/browse/BEAM-11648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441559#comment-17441559 ] 

Marcin Kuthan commented on BEAM-11648:
--------------------------------------

The existing Streaming API is marked as deprecated: [https://cloud.google.com/bigquery/streaming-data-into-bigquery]. The BigQuery documentation recommends the new Storage Write API, but Beam seems to be far, far behind. It looks like an implementation already exists: [https://beam.apache.org/releases/javadoc/2.33.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.Method.html#STORAGE_WRITE_API], but:
 # Documentation is missing (the JavaDoc "Use the new, experimental Storage Write API." is not enough, really).
 # I don't know which BigQueryIO features are supported with STORAGE_WRITE_API (e.g. dynamic destinations, retry policies, insert IDs, invalid-row handling, Beam schemas, auto-sharding, clustering, partitioning, etc.).
 # I don't know which Beam version supports the new API (the Jira "fix version" is not set).

The existing Streaming Insert method has major limitations (low throughput, high costs). We really need decent support for the Storage Write API.

> Implement new BigQuery sink (Vortex)
> ------------------------------------
>
>                 Key: BEAM-11648
>                 URL: https://issues.apache.org/jira/browse/BEAM-11648
>             Project: Beam
>          Issue Type: New Feature
>          Components: extensions-java-gcp
>            Reporter: Reuven Lax
>            Priority: P3
>          Time Spent: 49h
>  Remaining Estimate: 0h
>



