You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/27 14:40:27 UTC
[GitHub] [airflow] lidalei opened a new pull request, #25342: Fix BigQueryInsertJobOperator cancel_on_kill
lidalei opened a new pull request, #25342:
URL: https://github.com/apache/airflow/pull/25342
<!--
Thank you for contributing! Please make sure that your code changes
are covered with tests. And in case of new features or big changes
remember to adjust the documentation.
Feel free to ping committers for the review!
In case of an existing issue, reference it using one of the following:
closes: #ISSUE
related: #ISSUE
How to write a good git commit message:
http://chris.beams.io/posts/git-commit/
-->
This PR fixes an issue with BigQueryInsertJobOperator. If the task reaches timeout set by task `execution_timeout`, `on_kill` will be called but the `self.job_id` is None. This is because the function `_submit_job` is a blocking call but `self.job_id` is only set after it. This PR is hugely inspired by https://github.com/apache/airflow/pull/22955.
<!--
**^ Add meaningful description above**
Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
--->
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #25342: Fix BigQueryInsertJobOperator cancel_on_kill
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25342:
URL: https://github.com/apache/airflow/pull/25342#issuecomment-1199203518
Some errors though
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] lidalei commented on pull request #25342: Fix BigQueryInsertJobOperator cancel_on_kill
Posted by GitBox <gi...@apache.org>.
lidalei commented on PR #25342:
URL: https://github.com/apache/airflow/pull/25342#issuecomment-1199455724
> Some errors though
Fixed the failed test case. The `Conflict` exception will be raised when we call `_begin`.
```
Traceback (most recent call last):
File "/Users/dalei/.pyenv/versions/3.8.10/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "/Users/dalei/go/src/github.com/lidalei/airflow/airflow/providers/google/common/hooks/base_google.py", line 463, in inner_wrapper
return func(self, *args, **kwargs)
File "/Users/dalei/go/src/github.com/lidalei/airflow/airflow/providers/google/cloud/hooks/bigquery.py", line 1542, in insert_job
job._begin()
File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1298, in _begin
super(QueryJob, self)._begin(client=client, retry=retry, timeout=timeout)
File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 510, in _begin
api_response = client._call_api(
File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 759, in _call_api
return call()
File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
return retry_target(
File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/api_core/retry.py", line 190, in retry_target
return target()
File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 494, in api_request
raise exceptions.from_http_response(response)
google.api_core.exceptions.Conflict: 409 POST https://bigquery.googleapis.com/bigquery/v2/projects/xxx/jobs?prettyPrint=false: Already Exists: Job xxx:EU.abc_test
Location: EU
Job ID: abc_test
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #25342: Fix BigQueryInsertJobOperator cancel_on_kill
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #25342:
URL: https://github.com/apache/airflow/pull/25342#issuecomment-1205328475
Awesome work, congrats on your first merged pull request!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk merged pull request #25342: Fix BigQueryInsertJobOperator cancel_on_kill
Posted by GitBox <gi...@apache.org>.
potiuk merged PR #25342:
URL: https://github.com/apache/airflow/pull/25342
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] josh-fell commented on a diff in pull request #25342: Fix BigQueryInsertJobOperator cancel_on_kill
Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #25342:
URL: https://github.com/apache/airflow/pull/25342#discussion_r933313569
##########
airflow/providers/google/cloud/operators/bigquery.py:
##########
@@ -2221,10 +2217,16 @@ def execute(self, context: Any):
BigQueryTableLink.persist(**persist_kwargs)
self.job_id = job.job_id
- return job.job_id
+ # Wait for the job to complete
+ job.result(timeout=self.result_timeout, retry=self.result_retry)
+ self._handle_job_error(job)
+
+ return self.job_id
def on_kill(self) -> None:
if self.job_id and self.cancel_on_kill:
self.hook.cancel_job( # type: ignore[union-attr]
job_id=self.job_id, project_id=self.project_id, location=self.location
)
+ else:
+ self.log.info(f'Skipping to cancel job: {self.project_id}:{self.location}.{self.job_id}')
Review Comment:
```suggestion
self.log.info('Skipping to cancel job: %s:%s.%s', self.project_id, self.location, self.job_id)
```
We should [try to avoid f-strings in logging statements](https://blog.pilosus.org/posts/2020/01/24/python-f-strings-in-logging/) and use %-format style instead.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org