You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by "AJ (JIRA)" <ji...@apache.org> on 2017/11/16 07:12:01 UTC

[jira] [Created] (BEAM-3200) Streaming Pipeline throws RuntimeException when using DynamicDestinations and Method.FILE_LOADS

AJ created BEAM-3200:
------------------------

             Summary: Streaming Pipeline throws RuntimeException when using DynamicDestinations and Method.FILE_LOADS
                 Key: BEAM-3200
                 URL: https://issues.apache.org/jira/browse/BEAM-3200
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-gcp
    Affects Versions: 2.2.0
            Reporter: AJ
            Assignee: Chamikara Jayalath
            Priority: Critical


I am trying to use Method.FILE_LOADS for loading data into BQ in my streaming pipeline using RC3 release of 2.2.0. I am writing to around 500 tables using DynamicDestinations and I am also using withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED). Everything works fine when the first time bigquery load jobs get triggered. But on subsequent triggers pipeline throws a RuntimeException about table not found even though I created the pipeline with CreateDisposition.CREATE_IF_NEEDED. The exact exception is:
{code}
java.lang.RuntimeException: Failed to create load job with id prefix 717aed9ed1ef4aa7a616e1132f8b7f6d_a0928cae3d670b32f01ab2d9fe5cc0ee_00001_00001, reached max retries: 3, last failed load job: {
  "configuration" : {
    "load" : {
      "createDisposition" : "CREATE_NEVER",
      "destinationTable" : {
        "datasetId" : ...,
        "projectId" : ...,
        "tableId" : ....
      },
    "errors" : [ }
      "message" : "Not found: Table ....,
      "reason" : "notFound"
    } ],
{code}

My theory is all the subsequent load jobs get trigged using CREATE_NEVER disposition and 
this might be due to https://github.com/apache/beam/blob/release-2.2.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L140

When using DynamicDestinations all the destination tables might not be known during the first trigger and hence the pipeline's create disposition should be respected.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)