Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 13:50:05 UTC

[GitHub] [beam] damccorm opened a new issue, #19866: BigQueryAvroUtils unable to convert field with record

damccorm opened a new issue, #19866:
URL: https://github.com/apache/beam/issues/19866

   I'm trying to create a simple test pipeline that exports BigQuery data as Parquet using BigQueryAvroUtils.java from Beam's code.
   
   When I read the BigQuery data as Avro GenericRecord, the code fails with this exception:
   ```
   
   org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"record","namespace":"Translated Avro Schema for record","doc":"org.apache.beam.sdk.io.gcp.bigquery","fields":[{"name":"key_2","type":["null","string"]},{"name":"key_1","type":["null","double"]}]}]:
   {"key_2": "asdasd", "key_1": 123123.123}
   
   ```
   
   I have checked the Avro schema, and it matches its BigQuery schema counterpart.
   
   Then I exported the BigQuery table as Avro from the BigQuery console and compared its schema with the one generated by BigQueryAvroUtils.java. It turns out the Avro namespace differs between the schema from BigQueryAvroUtils.java and the one from the BigQuery export.
   
   After I patched BigQueryAvroUtils.java so that its generated schema matches the schema from the BigQuery export, the exception went away.
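   
   For illustration, here is a minimal sketch of why a namespace difference alone could plausibly produce "Not in union". It assumes that a record is matched to a union branch by its full name (namespace plus name), which is how Avro's generic data model identifies named types. The class `UnionNameSketch`, its methods, and the namespaces `writer.ns` and `reader.ns` are hypothetical stand-ins, not part of Avro or Beam, and this has not been confirmed as the actual cause here.
   
   ```java
   import java.util.List;

   // Hypothetical model of name-based union resolution; this is NOT the Avro API.
   public class UnionNameSketch {

       // A named type's full name is namespace + "." + name,
       // or just the name when there is no namespace.
       static String fullName(String namespace, String name) {
           return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
       }

       // Returns the index of the union branch whose full name equals the
       // datum's full name, or -1 when no branch matches (the situation in
       // which Avro would report UnresolvedUnionException).
       static int resolveUnion(List<String> branchFullNames, String datumFullName) {
           return branchFullNames.indexOf(datumFullName);
       }

       public static void main(String[] args) {
           // Namespaces below are made up for illustration only.
           List<String> union = List.of("null", fullName("writer.ns", "record"));

           // Same fields, same record name, same namespace: branch 1 matches.
           System.out.println(resolveUnion(union, fullName("writer.ns", "record"))); // prints 1

           // Same fields and record name, different namespace: no branch matches.
           System.out.println(resolveUnion(union, fullName("reader.ns", "record"))); // prints -1
       }
   }
   ```
   
   Under that assumption, two schemas with identical fields still fail to match when only the namespace differs, which would be consistent with the exception disappearing once the namespaces were aligned.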
   
   So I want to confirm whether there is a problem in my implementation, or whether BigQuery creates a slightly different Avro schema.
   
   I've put a small code sample, along with the patch and sample data, here: [https://github.com/zakazai/bq-to-parquet](https://github.com/zakazai/bq-to-parquet)
   
    
   
   Imported from Jira [BEAM-8177](https://issues.apache.org/jira/browse/BEAM-8177). Original Jira may contain additional context.
   Reported by: zakazai.

