Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2021/05/15 18:00:02 UTC

[jira] [Updated] (BEAM-7382) Bigquery IO: schema autodetection failing

     [ https://issues.apache.org/jira/browse/BEAM-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-7382:
----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Resolved)

Hello! Due to a bug in our Jira configuration, this issue had status:Resolved but resolution:Unresolved.

I am bulk editing these issues to have resolution:Fixed

If a different resolution is appropriate, please change it. To do this, click the "Resolve" button (you can do this even for closed issues) and set the Resolution field to the right value.

> Bigquery IO: schema autodetection failing
> -----------------------------------------
>
>                 Key: BEAM-7382
>                 URL: https://issues.apache.org/jira/browse/BEAM-7382
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Juta Staes
>            Priority: P2
>              Labels: stale-P2
>
> I am working on writing integration tests for BigQuery IO on the DataflowRunner.
> When testing schema autodetection, I get:
> {code:java}
> ERROR: test_big_query_write_schema_autodetect (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py", line 156, in test_big_query_write_schema_autodetect
>     write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", line 419, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", line 64, in run_pipeline
>     self.result.wait_until_finish(duration=wait_duration)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 1322, in wait_until_finish
>     (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
> Workflow failed. Causes: S01:create/Read+write/WriteToBigQuery/NativeWrite failed., BigQuery import job "dataflow_job_18059625072014532771-B" failed., BigQuery job "dataflow_job_18059625072014532771-B" in project "apache-beam-testing" finished with error(s): errorResult: No schema specified on job or table., error: No schema specified on job or table.
> {code}
> test code:
> {code:java}
> import apache_beam as beam
>
> input_data = [
>     {'number': 1, 'str': 'abc'},
>     {'number': 2, 'str': 'def'},
> ]
> with beam.Pipeline(argv=args) as p:
>   (p | 'create' >> beam.Create(input_data)
>    | 'write' >> beam.io.WriteToBigQuery(
>        output_table,
>        schema=beam.io.gcp.bigquery.SCHEMA_AUTODETECT,
>        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
>        write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> {code}
> Is there something wrong with my test or is this a bug?
> Link to PR: [https://github.com/apache/beam/pull/8621]
> cc: [~tvalentyn] 
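
For readers hitting the same "No schema specified on job or table" error, one possible workaround is to pass an explicit schema instead of relying on autodetection. The sketch below is not the fix from the linked PR. It is a minimal, hypothetical helper (`infer_schema_string` is an illustrative name, not part of Beam) that derives a `name:TYPE` schema string from sample rows, assuming every row is a flat dict of bool, int, float, or str values.

```python
def infer_schema_string(rows):
    """Build a 'name:TYPE,name:TYPE' schema string from a list of flat dict rows.

    Hypothetical helper for illustration only; it handles just four scalar types.
    """
    type_names = {bool: 'BOOLEAN', int: 'INTEGER', float: 'FLOAT', str: 'STRING'}
    fields = {}  # field name -> BigQuery type name, in first-seen order
    for row in rows:
        for name, value in row.items():
            bq_type = type_names.get(type(value))
            if bq_type is None:
                raise TypeError('unsupported type for field %r: %r' % (name, type(value)))
            # Reject rows that disagree on the type of a field.
            if fields.setdefault(name, bq_type) != bq_type:
                raise ValueError('conflicting types for field %r' % name)
    return ','.join('%s:%s' % (name, bq_type) for name, bq_type in fields.items())


# Same sample rows as in the test above.
input_data = [
    {'number': 1, 'str': 'abc'},
    {'number': 2, 'str': 'def'},
]
schema = infer_schema_string(input_data)
print(schema)  # number:INTEGER,str:STRING
```

The resulting string could then be passed as `schema=schema` to `beam.io.WriteToBigQuery` in place of `SCHEMA_AUTODETECT`. This sidesteps autodetection rather than testing it, so it is not a substitute for the integration test described in the issue.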



--
This message was sent by Atlassian Jira
(v8.3.4#803005)