Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/11/27 18:36:00 UTC

[jira] [Commented] (BEAM-2870) BQ Partitioned Table Write Fails When Destination has Partition Decorator

    [ https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16267213#comment-16267213 ] 

ASF GitHub Bot commented on BEAM-2870:
--------------------------------------

jkff opened a new pull request #4177: [BEAM-2870] Strips partition decorators when creating/patching tables in batch
URL: https://github.com/apache/beam/pull/4177
 
 
   Addresses https://stackoverflow.com/questions/47351578/create-dynamic-side-outputs-in-apache-beam-dataflow?noredirect=1#comment81973314_47351578
   
   R: @chamikaramj 
   CC: @reuvenlax 



> BQ Partitioned Table Write Fails When Destination has Partition Decorator
> -------------------------------------------------------------------------
>
>                 Key: BEAM-2870
>                 URL: https://issues.apache.org/jira/browse/BEAM-2870
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.2.0
>         Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500 GB SSD)
>            Reporter: Steven Jon Anderson
>            Assignee: Reuven Lax
>              Labels: bigquery, dataflow, google, google-cloud-bigquery, google-dataflow
>             Fix For: 2.2.0, 2.3.0
>
>
> Dataflow Job ID: https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816
> Tagging [~reuvenlax] as I believe he built the time partitioning integration that was merged into master.
> *Background*
> Our production pipeline ingests millions of events per day and routes them into our clients' numerous tables. To keep costs down, all of our tables are partitioned. However, this requires us to create the tables before we allow events to process, because creating partitioned tables isn't supported in 2.1.0. We've been looking forward to [~reuvenlax]'s partitioned table write feature ([#3663|https://github.com/apache/beam/pull/3663]) being merged into master for some time, as it will let us launch our client platforms much faster. Today we tested the 2.2.0 nightly and discovered this bug.
> *Issue*
> Our pipeline writes to a table with a decorator. Writing with a decorator to an existing partitioned table succeeds. Writing without a decorator to a partitioned table that doesn't exist yet also succeeds. *However, writing with a decorator to a partitioned table that doesn't exist yet fails*.
> *Example Implementation*
> {code:java}
> BigQueryIO.writeTableRows()
>   .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>   .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>   .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
>   .to(new DynamicDestinations<TableRow, String>() {
>     @Override
>     public String getDestination(ValueInSingleWindow<TableRow> element) {
>       return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
>     }
>     @Override
>     public TableDestination getTable(String destination) {
>       TimePartitioning DAY_PARTITION = new TimePartitioning().setType("DAY");
>       return new TableDestination(destination, null, DAY_PARTITION);
>     }
>     @Override
>     public TableSchema getSchema(String destination) {
>       return TABLE_SCHEMA;
>     }
>   })
> {code}
> *Relevant Logs & Errors in StackDriver*
> {code:none}
> 23:06:26.790 
> Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
> 23:06:26.873 
> Invalid table ID "TABLE_ID$20170902". Table IDs must be alphanumeric (plus underscores) and must be at most 1024 characters long. Also, Table decorators cannot be used.
> {code}
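
The error above and the pull request title ("Strips partition decorators when creating/patching tables") suggest the general shape of the fix: remove the "$partition" suffix before asking BigQuery to create the table, while still loading into the decorated destination. The following is only a minimal sketch of that idea, not the code from the pull request. The class and method names are hypothetical, and the validity check simply mirrors the rule quoted in the error message (alphanumeric plus underscores, at most 1024 characters).

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not Beam's actual implementation. */
class PartitionDecoratorSketch {
  // Assumption: mirrors the rule quoted in the error message above.
  private static final Pattern VALID_TABLE_ID = Pattern.compile("[A-Za-z0-9_]{1,1024}");

  /** Returns the table spec without a trailing "$partition" decorator, if present. */
  static String stripPartitionDecorator(String tableSpec) {
    int idx = tableSpec.lastIndexOf('$');
    return idx == -1 ? tableSpec : tableSpec.substring(0, idx);
  }

  /** Extracts the table ID from "project:dataset.table" or "dataset.table". */
  static String tableId(String tableSpec) {
    return tableSpec.substring(tableSpec.lastIndexOf('.') + 1);
  }

  /** Checks a table ID against the rule quoted in the error message. */
  static boolean isValidTableId(String tableId) {
    return VALID_TABLE_ID.matcher(tableId).matches();
  }

  public static void main(String[] args) {
    String spec = "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
    // The decorated ID is rejected, matching the logged error.
    System.out.println(isValidTableId(tableId(spec)));
    // The stripped spec is what a create-table call would use in this sketch.
    String stripped = stripPartitionDecorator(spec);
    System.out.println(stripped);
    System.out.println(isValidTableId(tableId(stripped)));
  }
}
```

In this sketch only the table-creation step would use the stripped spec; the write itself would keep the original decorated destination so that rows still land in the intended partition.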


