Posted to commits@airflow.apache.org by "Kamil Bregula (JIRA)" <ji...@apache.org> on 2019/06/12 13:22:00 UTC

[jira] [Commented] (AIRFLOW-3503) GoogleCloudStorageHook delete return success when nothing was done

    [ https://issues.apache.org/jira/browse/AIRFLOW-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862093#comment-16862093 ] 

Kamil Bregula commented on AIRFLOW-3503:
----------------------------------------

[~yohei]

Hello

Do you plan to continue working on this change? I am organising tickets in JIRA related to GCP and I would like to know the current status of this change.

This functionality seems interesting. The community would be happy if you completed this change.

Can I help you with this?

Regards,

> GoogleCloudStorageHook  delete return success when nothing was done
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-3503
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3503
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: gcp
>    Affects Versions: 1.10.1
>            Reporter: lot
>            Assignee: Yohei Onishi
>            Priority: Major
>              Labels: bigquery, gcp, hooks
>
> I'm loading files to BigQuery from Storage using:
>  
> {{gcs_export_uri = BQ_TABLE_NAME + '/' + EXEC_TIMESTAMP_PATH + '/*'}}
> {{gcs_to_bigquery_op = GoogleCloudStorageToBigQueryOperator(}}
> {{    dag=dag,}}
> {{    task_id='load_products_to_BigQuery',}}
> {{    bucket=GCS_BUCKET_ID,}}
> {{    destination_project_dataset_table=table_name_template,}}
> {{    source_format='NEWLINE_DELIMITED_JSON',}}
> {{    source_objects=[gcs_export_uri],}}
> {{    src_fmt_configs={'ignoreUnknownValues': True},}}
> {{    create_disposition='CREATE_IF_NEEDED',}}
> {{    write_disposition='WRITE_TRUNCATE',}}
> {{    skip_leading_rows=1,}}
> {{    google_cloud_storage_conn_id=CONNECTION_ID,}}
> {{    bigquery_conn_id=CONNECTION_ID)}}
>  
> After that I want to delete the files so I do:
> {{def delete_folder():}}
> {{    """}}
> {{    Delete files from Google Cloud Storage}}
> {{    """}}
> {{    hook = GoogleCloudStorageHook(}}
> {{            google_cloud_storage_conn_id=CONNECTION_ID)}}
> {{    hook.delete(}}
> {{        bucket=GCS_BUCKET_ID,}}
> {{        object=gcs_export_uri)}}
>  
>  
> This runs with PythonOperator.
> The task is marked as Success even though nothing was deleted.
> Log:
> [2018-12-12 11:31:29,247] {base_task_runner.py:98} INFO - Subtask: [2018-12-12 11:31:29,247] {transport.py:151} INFO - Attempting refresh to obtain initial access_token
> [2018-12-12 11:31:29,249] {base_task_runner.py:98} INFO - Subtask: [2018-12-12 11:31:29,249] {client.py:795} INFO - Refreshing access_token
> [2018-12-12 11:31:29,584] {base_task_runner.py:98} INFO - Subtask: [2018-12-12 11:31:29,583] {python_operator.py:90} INFO - Done. Returned value was: None
>  
>  
> I expect the function to fail and report something like "file was not found" if there is nothing to delete, or to let the user decide with a specific flag whether the function should fail or succeed when no files are found.
>  
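The behaviour requested in the description could look roughly like the sketch below. It is only an illustration, not the hook's actual implementation: delete_by_prefix, ObjectNotFoundError and the fail_if_missing flag are hypothetical names, and FakeHook is an in-memory stand-in so the example runs without GCP credentials. It assumes the hook exposes list(bucket, prefix=...) and a delete(bucket, object) that returns a falsy value when the object does not exist, and that delete treats the object name literally rather than expanding a trailing '/*'.

```python
class ObjectNotFoundError(Exception):
    """Hypothetical error raised when a delete request matches no objects."""


def delete_by_prefix(hook, bucket, prefix, fail_if_missing=True):
    """Delete every object whose name starts with `prefix`.

    Returns the list of deleted object names. If nothing matches and
    `fail_if_missing` is true, raises ObjectNotFoundError so that the
    calling task fails instead of silently succeeding.
    """
    # Assumption: hook.list returns a list of object names (or None).
    names = hook.list(bucket, prefix=prefix) or []
    if not names and fail_if_missing:
        raise ObjectNotFoundError(
            "no objects found under gs://{}/{}".format(bucket, prefix))

    deleted = []
    for name in names:
        # Assumption: hook.delete returns a truthy value on success.
        if hook.delete(bucket, name):
            deleted.append(name)
        elif fail_if_missing:
            raise ObjectNotFoundError(
                "object disappeared before delete: gs://{}/{}".format(
                    bucket, name))
    return deleted


class FakeHook(object):
    """In-memory stand-in for a storage hook, used only for illustration."""

    def __init__(self, buckets):
        # Map of bucket name -> set of object names.
        self._buckets = {b: set(names) for b, names in buckets.items()}

    def list(self, bucket, prefix=None):
        names = self._buckets.get(bucket, set())
        return sorted(n for n in names if n.startswith(prefix or ""))

    def delete(self, bucket, object):
        names = self._buckets.get(bucket, set())
        if object in names:
            names.remove(object)
            return True
        return False
```

With this shape, the callable run by PythonOperator would pass the folder prefix (without the trailing '*') and let the exception propagate, so the task is marked as failed when nothing was deleted. Passing fail_if_missing=False keeps the current lenient behaviour.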


