You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 17:13:39 UTC
[GitHub] [beam] damccorm opened a new issue, #20399: Select Types not equal with nested schema
damccorm opened a new issue, #20399:
URL: https://github.com/apache/beam/issues/20399
When using SQL transform to join a large nested schema to a flat table getting an error about "Types not equal" from Select [1]
We are not able the test of our use of SqlTransform to pass with direct runner. All code is checked into CSR [2].
Things of note:
Calcite Query Planner
Query (the real business logic was much more complex but this is sufficient to reproduce issue in our test)
```sql
SELECT
t1.DeviceName AS DeviceName,
t1.LinkName AS LinkName,
t1.HostName AS HostName,
t1.MeasuredAt AS MeasuredAt,
t2.b_dBm AS b_dBm
FROM
RealtimeRows AS t1
INNER JOIN
\--BigQuery Dimension Side Input
TxPowerSideInput AS t2
ON
t1.DeviceName = t2.DeviceName
```
Tables created like so (though in real tive )
```java
// This table has the same schema to the real incoming Pub/Sub messages
// in the real world use case.
PCollection<Row\> realtimeTestData = pipeline
.apply("Read 1Hz staging",
BigQueryIO
.readTableRowsWithSchema()
.fromQuery(
"SELECT * FROM `taara-db.jake_views.staging_sample_float`")
.usingStandardSql())
.apply(Convert.toRows());
PCollection<Row\> txPowerCalcRows = pipeline
.apply("Read Tx Power Calc Side Input",
BigQueryIO
.readTableRowsWithSchema()
.fromQuery(
"SELECT * FROM `taara-db`.MANUFACTURING.tx_power_timeinvariant_calculations")
.usingStandardSql())
.apply(Convert.toRows());
```
Relevant java snippet
```java
PCollection<Row\> out = tables
.apply(
"Join to dimension Data",
SqlTransform
.query(sql)
.registerUdf("POW", Pow.class)
.registerUdf("SQRT", Sqrt.class)
.registerUdf("LOG10", Log10.class)
.registerUdf("GREATEST", Greatest.class)
.registerUdf("EXTRACT_OFFSET", ExtractArrayOffset.class)
.registerUdf("PARSE_TIMESTAMP", ParseTimestamp.class)
.registerUdf("UNIX_SECONDS", UnixSeconds.class)
);
```
[1] https://github.com/apache/beam/blob/b564239081e9351c56fb0e7d263495b95dd3f8f3/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Select.java#L203
[2] https://source.cloud.google.com/taara-db/pso-taara-realtime-margin/****/master:streaming-join/streaming-join/src/test/java/com/google/x/taara/dataflow/transforms/RxTxPowersCorrFERCombinedSqlTransformIT.java
Imported from Jira [BEAM-10544](https://issues.apache.org/jira/browse/BEAM-10544). Original Jira may contain additional context.
Reported by: data-runner0.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org