Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/08/10 17:08:20 UTC

[jira] [Updated] (BEAM-6684) BigQueryIO: Unable to create dataset "Location unknown is not yet publicly available

     [ https://issues.apache.org/jira/browse/BEAM-6684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Beam JIRA Bot updated BEAM-6684:
--------------------------------
    Labels: stale-P2  (was: )

> BigQueryIO: Unable to create dataset "Location unknown is not yet publicly available
> ------------------------------------------------------------------------------------
>
>                 Key: BEAM-6684
>                 URL: https://issues.apache.org/jira/browse/BEAM-6684
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>    Affects Versions: 2.10.0
>            Reporter: Pablo Estrada
>            Priority: P2
>              Labels: stale-P2
>
> My understanding is that BigQueryIO runs the query, writes the output to a temp dataset, and then extracts the temp dataset to GCS. This means the location of the temp dataset (if not manually set) is determined by the tables referenced in the query. This is confirmed in the source code for BigQueryIO: https://github.com/apache/beam/blob/v2.6.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java#L111
> So I would expect the temp dataset to be created in the same location as those tables (US in this case), or to default to US. Instead, its location appears to default to "unknown" (at least some of the time), which causes the whole Dataflow job to fail.
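
The fallback behaviour expected in the description can be sketched as follows. This is a minimal, self-contained illustration and not Beam's actual implementation: the class `TempDatasetLocation`, the method `resolve`, the `isKnown` check, and the "US" default are all hypothetical names and assumptions introduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch (not Beam code): choose a location for the temp dataset.
public class TempDatasetLocation {
    // Assumption: fall back to "US" when nothing better is known.
    public static final String DEFAULT_LOCATION = "US";

    // Order of preference: an explicitly configured location, then the first
    // usable location among the tables referenced by the query, then the default.
    public static String resolve(Optional<String> explicitLocation,
                                 List<String> referencedTableLocations) {
        if (explicitLocation.isPresent() && isKnown(explicitLocation.get())) {
            return explicitLocation.get();
        }
        for (String location : referencedTableLocations) {
            if (isKnown(location)) {
                return location;
            }
        }
        return DEFAULT_LOCATION;
    }

    // Treat null, empty, and the literal "unknown" as "no usable location".
    public static boolean isKnown(String location) {
        return location != null
            && !location.isEmpty()
            && !location.equalsIgnoreCase("unknown");
    }

    public static void main(String[] args) {
        // No explicit location; the referenced table reports "unknown".
        System.out.println(resolve(Optional.empty(), Arrays.asList("unknown")));
        // No explicit location; a referenced table lives in "EU".
        System.out.println(resolve(Optional.empty(), Arrays.asList("unknown", "EU")));
        // An explicit location takes precedence over the referenced tables.
        System.out.println(resolve(Optional.of("asia-northeast1"), Arrays.asList("EU")));
    }
}
```

With this ordering, an "unknown" location is never passed on to dataset creation; whether a silent default to US is desirable for queries over non-US tables is a separate design question.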



--
This message was sent by Atlassian Jira
(v8.3.4#803005)