Posted to commits@airflow.apache.org by "Yohei Onishi (JIRA)" <ji...@apache.org> on 2018/12/27 07:27:00 UTC
[jira] [Assigned] (AIRFLOW-3571)
GoogleCloudStorageToBigQueryOperator succeeds in uploading a CSV file from
GCS to BigQuery but the task fails
[ https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yohei Onishi reassigned AIRFLOW-3571:
-------------------------------------
Assignee: Yohei Onishi
> GoogleCloudStorageToBigQueryOperator succeeds in uploading a CSV file from GCS to BigQuery but the task fails
> -------------------------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib
> Affects Versions: 1.10.0
> Reporter: Yohei Onishi
> Assignee: Yohei Onishi
> Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task, created by GoogleCloudStorageToBigQueryOperator, succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task itself failed with the following error.
>
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
> File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
> jobId=self.running_job_id).execute()
> File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
> return wrapped(*args, **kwargs)
> File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
> raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
> File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
> result = task_copy.execute(context=context)
> File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
> time_partitioning=self.time_partitioning)
> File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
> return self.run_with_configuration(configuration)
> File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
> err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find a job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: this is just an example, not an actual id.)
> I suppose the operator does not handle the job location (zone) properly when polling the job status.
>
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
> ---------- --------- ----------------- ---------- -------------------------------------------------------------- ----------------- -------------- -------------- --------
> load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
> {code}
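> As a sketch of the suspected fix: the BigQuery jobs.get REST API accepts a location parameter, and for jobs outside the US/EU multi-regions (e.g. asia-northeast1) the lookup returns 404 unless that location is supplied. A minimal illustration below, with a hypothetical helper name (job_get_kwargs is not part of the actual Airflow hook), of how the status-check call in run_with_configuration could include the location:
> {code:python}
# Hypothetical helper (not actual Airflow code): build the keyword
# arguments passed to bigquery.jobs().get() so that the job location
# is included when known.
def job_get_kwargs(project_id, job_id, location=None):
    """Return kwargs for the jobs.get call; add location if supplied."""
    kwargs = {"projectId": project_id, "jobId": job_id}
    if location:
        # Without this, jobs.get looks only in the default US/EU
        # locations and returns 404 for regional jobs.
        kwargs["location"] = location
    return kwargs

print(job_get_kwargs("my-project", "job_abc123", "asia-northeast1"))
> {code}
> With location="asia-northeast1" included, the polling request would address the regional job (my-project:asia-northeast1:job_abc123 in bq notation) instead of the nonexistent my-project:job_abc123.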
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)