Posted to github@beam.apache.org by "ahmedabu98 (via GitHub)" <gi...@apache.org> on 2023/09/05 16:40:39 UTC

[GitHub] [beam] ahmedabu98 commented on issue #28309: [Bug]: [Java BQ FILE_LOADS] When streaming to dynamic destinations with copy jobs and CREATE_IF_NEEDED, only the first table is created

ahmedabu98 commented on issue #28309:
URL: https://github.com/apache/beam/issues/28309#issuecomment-1706958224

   This behavior is likely due to these lines: https://github.com/apache/beam/blob/4c66866aa9544d1796c7c3880192cb57d2a8dcc0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteRename.java#L193-L198
   
   The general idea is that after the first pane, we set the create and write dispositions so that subsequent jobs don't overwrite previous data. In streaming, however, `c.pane().isFirst()` is `true` only for the first copy job. Subsequent copy jobs seem to arrive in different panes ([maybe because of this GBK](https://github.com/apache/beam/blob/4c66866aa9544d1796c7c3880192cb57d2a8dcc0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java#L404)). As a result, Beam sets the `CREATE_NEVER` disposition on every copy job after the first, even if it's the first job for a particular destination.
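
To make the failure mode concrete, here is a minimal, self-contained sketch in plain Java. It is not the code in `WriteRename.java`: the class `DispositionSketch`, its enum, and both helper methods are hypothetical stand-ins. `paneBased` models the pane-based choice described above. `destinationBased` is only an illustrative contrast that keys the decision on the destination; it is not a proposed patch, and a real pipeline could not rely on an in-memory set like this.

```java
import java.util.HashSet;
import java.util.Set;

public class DispositionSketch {
  // Hypothetical stand-in for BigQueryIO's create disposition enum.
  enum CreateDisposition { CREATE_IF_NEEDED, CREATE_NEVER }

  // Models the behavior described above: only the first pane keeps the
  // user's disposition; every later pane is forced to CREATE_NEVER.
  static CreateDisposition paneBased(boolean isFirstPane, CreateDisposition userDisposition) {
    return isFirstPane ? userDisposition : CreateDisposition.CREATE_NEVER;
  }

  // Illustrative contrast (an assumption, not Beam code): keep the user's
  // disposition the first time each destination is seen.
  static CreateDisposition destinationBased(
      Set<String> seenDestinations, String destination, CreateDisposition userDisposition) {
    // Set.add returns true only if the destination was not already present.
    return seenDestinations.add(destination) ? userDisposition : CreateDisposition.CREATE_NEVER;
  }

  public static void main(String[] args) {
    // Three copy jobs arriving in three separate panes; table_b is new in the second pane.
    String[] destinations = {"table_a", "table_b", "table_a"};
    Set<String> seen = new HashSet<>();
    for (int i = 0; i < destinations.length; i++) {
      boolean isFirstPane = (i == 0);
      System.out.println(
          destinations[i]
              + " pane-based=" + paneBased(isFirstPane, CreateDisposition.CREATE_IF_NEEDED)
              + " per-destination="
              + destinationBased(seen, destinations[i], CreateDisposition.CREATE_IF_NEEDED));
    }
  }
}
```

In this model, the copy job for `table_b` arrives in the second pane, so the pane-based choice yields `CREATE_NEVER` even though the table has never been created, which matches the symptom reported in the issue.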

