Posted to commits@beam.apache.org by "Benjamin BENOIST (JIRA)" <ji...@apache.org> on 2018/03/02 17:09:00 UTC
[jira] [Created] (BEAM-3772) BigQueryIO - Can't use DynamicDestination with CREATE_IF_NEEDED for unbounded PCollection and FILE_LOADS
Benjamin BENOIST created BEAM-3772:
--------------------------------------
Summary: BigQueryIO - Can't use DynamicDestination with CREATE_IF_NEEDED for unbounded PCollection and FILE_LOADS
Key: BEAM-3772
URL: https://issues.apache.org/jira/browse/BEAM-3772
Project: Beam
Issue Type: Bug
Components: io-java-gcp
Affects Versions: 2.3.0, 2.2.0
Environment: Dataflow streaming pipeline
Reporter: Benjamin BENOIST
Assignee: Chamikara Jayalath
My workflow: Kafka -> Dataflow streaming -> BigQuery
Because low latency isn't important in my case, I use FILE_LOADS to reduce costs. I'm using _BigQueryIO.Write_ with a _DynamicDestination_ that routes data to a table whose name has the current hour as a suffix.
This _BigQueryIO.Write_ is configured as follows:
{code:java}
.withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
.withMethod(Method.FILE_LOADS)
.withTriggeringFrequency(triggeringFrequency)
.withNumFileShards(100)
{code}
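For illustration, the hourly table naming described above might look like the following minimal sketch. The class name HourlyTableName, the helper tableIdFor, and the choice of UTC are assumptions made for this example and are not taken from the actual pipeline. Only the yyyyMMdd_HH suffix pattern is inferred, from the table ID mytable_20180302_16 that appears in the failed load job in this report.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: derives an hourly-suffixed table ID from a timestamp.
public class HourlyTableName {
    // Suffix pattern inferred from "mytable_20180302_16"; UTC is an assumption.
    private static final DateTimeFormatter HOUR_SUFFIX =
        DateTimeFormatter.ofPattern("yyyyMMdd_HH").withZone(ZoneOffset.UTC);

    // Returns baseName + "_" + the date and hour of the given timestamp.
    public static String tableIdFor(String baseName, Instant timestamp) {
        return baseName + "_" + HOUR_SUFFIX.format(timestamp);
    }

    public static void main(String[] args) {
        // Prints the table ID for a timestamp inside the 16:00 UTC hour.
        System.out.println(tableIdFor("mytable", Instant.parse("2018-03-02T16:45:00Z")));
    }
}
```

With naming like this, every new hour produces a table ID that does not exist yet, so the pipeline relies on CREATE_IF_NEEDED being honored for each new destination and not only for the first one.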
The first table is successfully created and written to, but the subsequent tables are never created, and I get exceptions like this one:
{code:java}
(99e5cd8c66414e7a): java.lang.RuntimeException: Failed to create load job with id prefix 5047f71312a94bf3a42ee5d67feede75_5295fbf25e1a7534f85e25dcaa9f4986_00001_00023, reached max retries: 3, last failed load job: {
"configuration" : {
"load" : {
"createDisposition" : "CREATE_NEVER",
"destinationTable" : {
"datasetId" : "dev_mydataset",
"projectId" : "myproject-id",
"tableId" : "mytable_20180302_16"
},
{code}
The _CreateDisposition_ used in the load job is _CREATE_NEVER_, not _CREATE_IF_NEEDED_ as specified.