Posted to github@beam.apache.org by "jwzh222 (via GitHub)" <gi...@apache.org> on 2023/03/03 04:08:57 UTC

[GitHub] [beam] jwzh222 opened a new issue, #25704: [Bug]: beam.io.WriteToBigQuery failed when given schema with space

jwzh222 opened a new issue, #25704:
URL: https://github.com/apache/beam/issues/25704

   ### What happened?
   
   There is an issue in the Python SDK's beam.io.WriteToBigQuery():
   when the schema string contains a space, as in schema="name: STRING", the write fails.
   
   **error message:**
   "message":  "Invalid value for type:  STRING is not a valid value"
   
   
   **example code:**
   
   
   `import apache_beam as beam
   from apache_beam.options.pipeline_options import PipelineOptions
   
   
   def run(pipeline_args):
       # Build the options from the arguments assembled in __main__.
       pipeline_options = PipelineOptions(pipeline_args)
   
       table_ref = 'project_id:dataset_id.table_id'
       schema_with_space = "name: STRING"  # triggers the error above
       schema_without_space = "name:STRING"  # same schema without the space, for comparison
   
       with beam.Pipeline(options=pipeline_options) as p:
           records = p | 'load records' >> beam.Create([{"name": "bob"}, {"name": "alice"}])
           records | 'write to bigquery' >> beam.io.WriteToBigQuery(
               table=table_ref,
               schema=schema_with_space,
               create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
           )
   
   
   if __name__ == '__main__':
       pipeline_args = [
           '--runner',
           'DirectRunner',
           '--project',
           'YOUR_PROJECT_ID',
       ]
   
       # Pass the arguments through; previously they were built but never used.
       run(pipeline_args)`
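
Until the parser handles the space, one possible workaround is to normalize the schema string before passing it to WriteToBigQuery. The following is only a minimal sketch: `normalize_schema_string` is a hypothetical helper introduced here for illustration, not part of Beam, and it assumes the simple 'field:type,field:type' string form used in the example above.

```python
def normalize_schema_string(schema: str) -> str:
    """Strip whitespace around the ',' and ':' separators of a schema string.

    Hypothetical helper, not a Beam API. Raises ValueError if a field
    does not have exactly one ':' separator.
    """
    fields = []
    for field_and_type in schema.split(','):
        name, field_type = (part.strip() for part in field_and_type.split(':'))
        fields.append(f"{name}:{field_type}")
    return ','.join(fields)


print(normalize_schema_string("name: STRING, age : INTEGER"))
# prints: name:STRING,age:INTEGER
```

The normalized string could then be passed as the `schema` argument in place of `schema_with_space`.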
   
   
   
   
   ### Issue Priority
   
   Priority: 3 (minor)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [X] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on issue #25704: [Bug]: beam.io.WriteToBigQuery failed when given schema with space

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #25704:
URL: https://github.com/apache/beam/issues/25704#issuecomment-1452940201

   Label DirectRunner cannot be managed because it does not exist in the repo. Please check your spelling.




[GitHub] [beam] jwzh222 commented on issue #25704: [Bug]: beam.io.WriteToBigQuery failed when given schema with space

Posted by "jwzh222 (via GitHub)" <gi...@apache.org>.
jwzh222 commented on issue #25704:
URL: https://github.com/apache/beam/issues/25704#issuecomment-1452940027

   Environment:
   Windows 10
   Python 3.9.0
   apache-beam 2.45.0
   
   It fails with both DirectRunner and DataflowRunner.
   
   .add-labels DirectRunner




[GitHub] [beam] Abacn commented on issue #25704: [Bug]: beam.io.WriteToBigQuery failed when given schema with space

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #25704:
URL: https://github.com/apache/beam/issues/25704#issuecomment-1456989142

   Thanks for reporting this issue. It does look like a bug: the space is not handled here:
   
   https://github.com/apache/beam/blob/6452dc7982240819a763aaf9ff3efc4a01fc1d2b/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L1536
   
   Using (s.strip() for s in field_and_type.split(':')) at L1534 should fix the problem.
   
   Would you be interested in contributing a fix?
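
The effect of the suggested change can be illustrated outside Beam. The sketch below is a simplified stand-in, not the actual code in bigquery_tools.py: `split_fields` and its `strip_parts` flag are hypothetical names used only to contrast splitting with and without `strip()`.

```python
def split_fields(schema, strip_parts):
    """Split a 'field:type,...' schema string into (name, type) tuples.

    Hypothetical illustration only; strip_parts toggles the suggested
    s.strip() call so both behaviours can be compared.
    """
    parsed = []
    for field_and_type in schema.split(','):
        parts = field_and_type.split(':')
        if strip_parts:
            parts = [s.strip() for s in parts]
        parsed.append(tuple(parts))
    return parsed


# Without strip(), the type keeps its leading space, which matches the
# " STRING is not a valid value" text in the reported error.
print(split_fields("name: STRING", strip_parts=False))  # [('name', ' STRING')]

# With strip(), the type is the clean value.
print(split_fields("name: STRING", strip_parts=True))   # [('name', 'STRING')]
```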

