Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/03 13:11:04 UTC

[GitHub] [airflow] turbaszek edited a comment on pull request #9590: Improve idempotency of BigQueryInsertJobOperator

turbaszek edited a comment on pull request #9590:
URL: https://github.com/apache/airflow/pull/9590#issuecomment-668013078


   @edejong @jaketf @potiuk @nathadfield I added the changes: ``force_rerun``, ``reattach_states`` and a job_id derived from the configuration hash. The operator now works in the following way:
   
   - it calculates a unique hash of the job from the job's configuration, or uses a uuid if force_rerun is True
   - creates a job_id in the form of
   `[provide_job_id | airflow_{dag_id}_{task_id}_{exec_date}]_{uniqueness_suffix}`
   - submits a BigQuery job using the job_id
   - if a job with the given id already exists, then it tries to reattach to the job if it's not done and its
   state is in reattach_states. If the job is done, the operator will raise AirflowException.
   
   Using force_rerun will submit a new job every time without attaching to already existing ones.
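
The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `build_job_id` and `resolve_existing_job` are hypothetical helper names, SHA-256 over sorted JSON is an assumed choice of hash, the character sanitising is an assumption about what a job id may contain, and `RuntimeError` stands in for `AirflowException` so the snippet runs without Airflow installed. The comment does not say what happens when a job is not done but its state is outside `reattach_states`; the sketch assumes that case also raises.

```python
import hashlib
import json
import re
import uuid
from typing import Optional, Sequence


def build_job_id(
    configuration: dict,
    dag_id: str,
    task_id: str,
    exec_date: str,
    provided_job_id: Optional[str] = None,
    force_rerun: bool = False,
) -> str:
    """Hypothetical helper: assemble a job_id in the form described above."""
    if force_rerun:
        # A random suffix means every run gets a brand-new job_id.
        uniqueness_suffix = uuid.uuid4().hex
    else:
        # Same configuration -> same suffix; sort_keys keeps the JSON stable.
        # Assumption: SHA-256 is used here purely for illustration.
        payload = json.dumps(configuration, sort_keys=True).encode("utf-8")
        uniqueness_suffix = hashlib.sha256(payload).hexdigest()

    prefix = provided_job_id or f"airflow_{dag_id}_{task_id}_{exec_date}"
    # Assumption: keep only letters, digits, underscores and dashes.
    prefix = re.sub(r"[^0-9A-Za-z_-]", "_", prefix)
    return f"{prefix}_{uniqueness_suffix}"


def resolve_existing_job(
    job_done: bool, job_state: str, reattach_states: Sequence[str]
) -> str:
    """Hypothetical decision taken when a job with the same id already exists."""
    if not job_done and job_state in reattach_states:
        return "reattach"
    # RuntimeError stands in for AirflowException in this sketch.
    raise RuntimeError(
        f"Job already exists (done={job_done}, state={job_state}); "
        "set force_rerun=True to submit a new one."
    )
```

With this sketch, two runs of the same task instance with an unchanged configuration produce the same job_id, so the second submission collides with the first and goes through the reattach check, while `force_rerun=True` always yields a fresh id.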

