Posted to issues@beam.apache.org by "Kenneth Knowles (JIRA)" <ji...@apache.org> on 2019/01/21 21:49:00 UTC
[jira] [Commented] (BEAM-6474) Cannot reference field when using
SqlTransform (need to use "EXPR$N" instead)
[ https://issues.apache.org/jira/browse/BEAM-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748235#comment-16748235 ]
Kenneth Knowles commented on BEAM-6474:
---------------------------------------
[~polleyg] I think you can solve your problem by giving the aggregate an explicit name, e.g. {{sum(views) AS view_sum}}.
I do agree that there might be a nicer default naming scheme than {{EXPR$1}}. Without an alias, I think you would access the row by index, not by field name.
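To make the naming behaviour concrete, here is a toy model in plain Java. It is not Beam or Calcite code: the class {{DefaultColumnNames}} and the method {{outputFieldNames}} are hypothetical, and the rule it encodes (an explicit alias wins, a bare column keeps its name, anything else becomes {{EXPR$N}} with N the zero-based position in the SELECT list) is an assumption inferred from the schema printed in the error message in the issue below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DefaultColumnNames {

    // Toy model (hypothetical, not the real planner): derive an output field
    // name for each SELECT item.
    public static List<String> outputFieldNames(List<String> selectItems) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < selectItems.size(); i++) {
            String item = selectItems.get(i).trim();
            int as = item.toLowerCase().lastIndexOf(" as ");
            if (as >= 0) {
                // An explicit alias becomes the field name.
                names.add(item.substring(as + 4).trim());
            } else if (item.matches("[A-Za-z_][A-Za-z0-9_]*")) {
                // A bare column reference keeps its own name.
                names.add(item);
            } else {
                // Assumed default: EXPR$ plus the zero-based position.
                names.add("EXPR$" + i);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Query from the issue: the aggregate has no alias.
        System.out.println(outputFieldNames(
            Arrays.asList("wikimedia_project", "sum(views)")));
        // Same query with the alias suggested above.
        System.out.println(outputFieldNames(
            Arrays.asList("wikimedia_project", "sum(views) AS view_sum")));
    }
}
```

Under that assumption the first call yields the field names {{wikimedia_project}} and {{EXPR$1}}, matching the schema in the stack trace, and the second yields {{wikimedia_project}} and {{view_sum}}. In the real pipeline the aliased query should then allow {{row.getInt32("view_sum")}}.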
> Cannot reference field when using SqlTransform (need to use "EXPR$N" instead)
> -----------------------------------------------------------------------------
>
> Key: BEAM-6474
> URL: https://issues.apache.org/jira/browse/BEAM-6474
> Project: Beam
> Issue Type: Bug
> Components: beam-model, dsl-sql, runner-dataflow
> Affects Versions: 2.9.0
> Environment: MacOS
> Reporter: Graham Polley
> Assignee: Kenneth Knowles
> Priority: Major
>
> Maybe I've done something wrong, but when you try to access a field that has been generated in a SqlTransform, it throws an exception:
>
> {code:java}
> java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.IllegalArgumentException: Cannot find field views in schema Fields: Field{name=wikimedia_project, description=, type=FieldType{typeName=STRING, collectionElementType=null, collectionElementTypeNullable=null, mapKeyType=null, mapValueType=null, mapValueTypeNullable=null, rowSchema=null, metadata=null}, nullable=false} Field{name=EXPR$1, description=, type=FieldType{typeName=INT32, collectionElementType=null, collectionElementTypeNullable=null, mapKeyType=null, mapValueType=null, mapValueTypeNullable=null, rowSchema=null, metadata=null}, nullable=false}{code}
> Instead of being able to access the `views` field, it has been named `EXPR$1` by Beam/Dataflow. So, to get the value of the field I need to do this:
> {code:java}
> bqRow.set("views", row.getInt32("EXPR$1"));{code}
> instead of:
> {code:java}
> bqRow.set("views", row.getInt32("views"));{code}
>
> {code:java}
> PCollection<Row> outputStream =
>     sqlRows.setRowSchema(SCHEMA)
>         .apply("sql_transform",
>             SqlTransform.query(
>                 "select wikimedia_project, sum(views) " +
>                 "from PCOLLECTION " +
>                 "group by wikimedia_project"));{code}
>
> Pipeline is reading a file from GCS, transforming it (using SqlTransform) and writing to BigQuery. Code can be found here:
> [https://github.com/polleyg/gcp-batch-ingestion-bigquery/blob/beam_sql/src/main/java/org/polleyg/TemplatePipeline.java]
>