Posted to commits@airflow.apache.org by "Mark Secada (JIRA)" <ji...@apache.org> on 2017/10/23 21:01:04 UTC
[jira] [Created] (AIRFLOW-1750) GoogleCloudStorageToBigQueryOperator 404 HttpError
Mark Secada created AIRFLOW-1750:
------------------------------------
Summary: GoogleCloudStorageToBigQueryOperator 404 HttpError
Key: AIRFLOW-1750
URL: https://issues.apache.org/jira/browse/AIRFLOW-1750
Project: Apache Airflow
Issue Type: Bug
Components: gcp
Affects Versions: Airflow 1.8
Environment: Python 2.7.13
Reporter: Mark Secada
Fix For: Airflow 1.8
I'm trying to write a DAG that uploads JSON files to Google Cloud Storage and then loads them into BigQuery. The upload to Google Cloud Storage works, but when I run the second task (the load into BigQuery), I get a 404 HttpError. The error looks like this:
{noformat}
ERROR - <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects//jobs?alt=json returned "Not Found">
Traceback (most recent call last):
  File "/Users/marksecada/anaconda/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
    result = task_copy.execute(context=context)
  File "/Users/marksecada/anaconda/lib/python2.7/site-packages/airflow/contrib/operators/gcs_to_bq.py", line 153, in execute
    schema_update_options=self.schema_update_options)
  File "/Users/marksecada/anaconda/lib/python2.7/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 476, in run_load
    return self.run_with_configuration(configuration)
  File "/Users/marksecada/anaconda/lib/python2.7/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 498, in run_with_configuration
    .insert(projectId=self.project_id, body=job_data) \
  File "/Users/marksecada/anaconda/lib/python2.7/site-packages/oauth2client/util.py", line 135, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/Users/marksecada/anaconda/lib/python2.7/site-packages/googleapiclient/http.py", line 838, in execute
    raise HttpError(resp, content, uri=self.uri)
{noformat}
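One detail in the error above may be relevant: the request URL contains `projects//jobs`, so the project ID segment of the path appears to be empty. The following is a minimal sketch, assuming only the URL shape shown in the error message, that extracts that segment to confirm it. The helper name `project_id_from_jobs_url` is hypothetical and not part of Airflow, and the sketch does not establish where the empty value comes from.

```python
import re

# URL shape taken from the error message above; the group captures the
# path segment between "projects/" and "/jobs".
JOBS_URL = re.compile(
    r"^https://www\.googleapis\.com/bigquery/v2/projects/([^/]*)/jobs"
)


def project_id_from_jobs_url(url):
    """Return the project ID segment of a BigQuery jobs URL.

    Hypothetical helper for illustration only. Returns None when the URL
    does not have the expected shape.
    """
    match = JOBS_URL.match(url)
    return match.group(1) if match else None


# The URL reported in the HttpError.
failing_url = "https://www.googleapis.com/bigquery/v2/projects//jobs?alt=json"

if project_id_from_jobs_url(failing_url) == "":
    print("project ID segment is empty in the request URL")
```

If the segment is empty, the request was built without a project ID, which would be consistent with the "Not Found" response.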
My code's here:
{code:python}
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator

# Some comments here
t3 = GoogleCloudStorageToBigQueryOperator(
    task_id='move_'+source+'_from_gcs_to_bq',
    bucket='mybucket',
    source_objects=['news/latest_headline_'+source+'.json'],
    destination_project_dataset_table='mydataset.latest_news_headlines',
    schema_object='news/latest_headline_'+source+'.json',
    source_format='NEWLINE_DELIMITED_JSON',
    write_disposition='WRITE_APPEND',
    dag=dag)
{code}