Posted to commits@airflow.apache.org by "Michael (JIRA)" <ji...@apache.org> on 2019/08/15 07:36:00 UTC

[jira] [Commented] (AIRFLOW-3601) update operators to BigQuery to support location

    [ https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907891#comment-16907891 ] 

Michael commented on AIRFLOW-3601:
----------------------------------

I noticed that, as part of this issue, BigQueryCheckOperator was called out as out of scope, with the reason _"does not require location since it does not use location internally"_.

 

When I try to use BigQueryCheckOperator or BigQueryValueCheckOperator on a dataset that is not in the 'US' location, my task fails with the following error:
{code:java}
[2019-08-15 07:26:19,378] {__init__.py:1580} ERROR - BigQuery job status check failed. Final error was: 404
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 1241, in run_with_configuration
    jobId=self.running_job_id).execute()
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 855, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/anz-data-cde-airflow/jobs/job_ISDpiVtd7U1p-6N9wT378LfwoFHc?alt=json returned "Not found: Job anz-data-cde-airflow:job_ISDpiVtd7U1p-6N9wT378LfwoFHc">

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/models/__init__.py", line 1441, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.6/site-packages/airflow/operators/check_operator.py", line 81, in execute
    records = self.get_db_hook().get_first(self.sql)
  File "/usr/local/lib/python3.6/site-packages/airflow/hooks/dbapi_hook.py", line 138, in get_first
    cur.execute(sql)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 1821, in execute
    self.job_id = self.run_query(sql)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 849, in run_query
    return self.run_with_configuration(configuration)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 1263, in run_with_configuration
    format(err.resp.status))
Exception: BigQuery job status check failed. Final error was: 404
[2019-08-15 07:26:19,388] {__init__.py:1611} INFO - Marking task as FAILED.
{code}
This is the same error I get when I run the BigQuery operator without specifying a location, so I believe it is caused by the location not being passed to the BigQueryHook.
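
A minimal sketch of the suspected cause, not the hook's actual code: the job status URL in the traceback above carries no location parameter, and the BigQuery jobs.get endpoint accepts an optional location query parameter that is needed for jobs running outside the US and EU multi-regions. The helper name build_job_status_url and the region "australia-southeast1" are assumptions made up for illustration; only the project and job IDs come from the log.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/bigquery/v2"


def build_job_status_url(project_id, job_id, location=None):
    """Build a jobs.get URL (hypothetical helper, for illustration only)."""
    params = {"alt": "json"}
    if location is not None:
        # jobs.get accepts an optional `location` query parameter.
        params["location"] = location
    return "{}/projects/{}/jobs/{}?{}".format(
        BASE_URL, project_id, job_id, urlencode(params)
    )


# Without a location: the same URL shape that appears in the traceback.
url_without_location = build_job_status_url(
    "anz-data-cde-airflow", "job_ISDpiVtd7U1p-6N9wT378LfwoFHc"
)

# With a location ("australia-southeast1" is an assumed example region).
url_with_location = build_job_status_url(
    "anz-data-cde-airflow",
    "job_ISDpiVtd7U1p-6N9wT378LfwoFHc",
    location="australia-southeast1",
)

print(url_without_location)
print(url_with_location)
```

If this reading is right, the check operators would need a way to hand a location to the BigQueryHook so that the second form of the request is issued.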

 

Should fixing this problem with the BigQueryCheckOperator be part of this Task or should a separate one be created?

 

> update operators to BigQuery to support location
> ------------------------------------------------
>
>                 Key: AIRFLOW-3601
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
>             Project: Apache Airflow
>          Issue Type: Task
>          Components: gcp
>    Affects Versions: 1.10.1
>            Reporter: Yohei Onishi
>            Assignee: Yohei Onishi
>            Priority: Major
>              Labels: bigquery
>
> location support for BigQueryHook was merged by the PR 4324 [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py
> ** BigQueryGetDataOperator (fix in https://issues.apache.org/jira/browse/AIRFLOW-4287)
> * bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> * bigquery_to_bigquery.py
> ** BigQueryToBigQueryOperator (fix in https://issues.apache.org/jira/browse/AIRFLOW-4288)
> * bigquery_to_gcs.py
> ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
> ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
> ** BigQueryTableSensor
> The following operators does not require location since it does not use location internally
> * bigquery_check_operator.py
> ** BigQueryCheckOperator
> * bigquery_table_delete_operator.py
> ** BigQueryDeleteDatasetOperator https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete
