Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2022/04/22 17:28:00 UTC

[jira] [Updated] (BEAM-13618) Java BigQuery IO: DirectRead does not work with Beam Schema support.

     [ https://issues.apache.org/jira/browse/BEAM-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Beam JIRA Bot updated BEAM-13618:
---------------------------------
    Labels: stale-P2  (was: )

> Java BigQuery IO: DirectRead does not work with Beam Schema support.
> --------------------------------------------------------------------
>
>                 Key: BEAM-13618
>                 URL: https://issues.apache.org/jira/browse/BEAM-13618
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.35.0
>            Reporter: Daniel Oliveira
>            Priority: P2
>              Labels: stale-P2
>
> Currently in BigQueryIO, reads with Beam Schema support (for example, using [readTableRowsWithSchema|https://github.com/apache/beam/blob/v2.35.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L553]) do not actually have Schema support when DirectRead is used as the read method. This appears to be because the expansion logic for DirectRead takes [a different path|https://github.com/apache/beam/blob/v2.35.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1060] that does not take Beam schemas into account at all ([example of the code handling Beam schemas in the default path|https://github.com/apache/beam/blob/v2.35.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1204]).
> Part of the reason for this is likely that the current approach to Beam Schema support is to get a description of the BQ table's schema and then convert it to a Beam schema. With DirectRead, however, specific columns can be excluded while reading, so the Beam schema that is needed does not correspond directly to the table's schema; it would need to be constructed from the specific fields selected for the read.
> (As a side note, this is currently not documented anywhere, leading me to believe this is an oversight or potential bug. I will add some documentation indicating that schema support currently does not work with DirectRead.)
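
To illustrate the point about selected fields, here is a minimal sketch in plain Java of deriving a projected schema from a table schema and a list of selected columns. It does not use Beam or BigQuery classes; the class name `ProjectedSchemaSketch`, the method `projectSchema`, and the representation of a schema as a map from field name to type name are all hypothetical and chosen only for illustration. Emitting fields in selection order and treating an empty selection as "all columns" are likewise assumptions of this sketch, not confirmed behavior of BigQueryIO.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectedSchemaSketch {

    /**
     * Hypothetical helper: restricts a table schema (field name -> type name)
     * to the fields selected for the read. Fields are emitted in selection
     * order; an empty selection is treated as "read every column".
     */
    public static Map<String, String> projectSchema(
            Map<String, String> tableSchema, List<String> selectedFields) {
        if (selectedFields.isEmpty()) {
            // No projection requested: the read schema equals the table schema.
            return new LinkedHashMap<>(tableSchema);
        }
        Map<String, String> projected = new LinkedHashMap<>();
        for (String name : selectedFields) {
            String type = tableSchema.get(name);
            if (type == null) {
                throw new IllegalArgumentException(
                        "Selected field not in table schema: " + name);
            }
            projected.put(name, type);
        }
        return projected;
    }

    public static void main(String[] args) {
        Map<String, String> tableSchema = new LinkedHashMap<>();
        tableSchema.put("id", "INT64");
        tableSchema.put("name", "STRING");
        tableSchema.put("created", "TIMESTAMP");

        // Only two of the three columns are selected for the read.
        System.out.println(projectSchema(tableSchema, Arrays.asList("name", "id")));
        // prints {name=STRING, id=INT64}
    }
}
```

The schema attached to the read output would then have to come from the projected result rather than from a direct conversion of the full table schema, which is the gap described in this issue.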



--
This message was sent by Atlassian Jira
(v8.20.7#820007)