You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/10/21 16:16:26 UTC
[GitHub] [airflow] nicholashendrickx-tomtom opened a new pull request, #27185: Handle AzureBatch task result correctly
nicholashendrickx-tomtom opened a new pull request, #27185:
URL: https://github.com/apache/airflow/pull/27185
closes: https://github.com/apache/airflow/issues/25635
This change looks at the specific task in scope of the Operator and only treats the Operator as failed when its own task failed.
<!--
Thank you for contributing! Please make sure that your code changes
are covered with tests. And in case of new features or big changes
remember to adjust the documentation.
Feel free to ping committers for the review!
In case of an existing issue, reference it using one of the following:
closes: #ISSUE
related: #ISSUE
How to write a good git commit message:
http://chris.beams.io/posts/git-commit/
-->
---
**^ Add meaningful description above**
Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] o-nikolas commented on a diff in pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
o-nikolas commented on code in PR #27185:
URL: https://github.com/apache/airflow/pull/27185#discussion_r1008320041
##########
airflow/providers/microsoft/azure/hooks/batch.py:
##########
@@ -359,18 +359,44 @@ def wait_for_job_tasks_to_complete(self, job_id: str, timeout: int) -> list[batc
incomplete_tasks = [task for task in tasks if task.state != batch_models.TaskState.completed]
if not incomplete_tasks:
# detect if any task in job has failed
- fail_tasks = [
+ failed_tasks = [
task
for task in tasks
- if task.executionInfo.result
- == batch_models.TaskExecutionInformation.TaskExecutionResult.failure
+ if (task.execution_info.result == batch_models.TaskExecutionResult.failure) or (
+ task.execution_info.exit_code != 0)
]
- return fail_tasks
+ return failed_tasks
for task in incomplete_tasks:
self.log.info("Waiting for %s to complete, currently on %s state", task.id, task.state)
time.sleep(15)
raise TimeoutError("Timed out waiting for tasks to complete")
+ def wait_for_single_job_task_to_complete(self, job_id: str, task_id: str,
+ timeout: int) -> batch_models.CloudTask | None:
+ """
+ Wait for a single task in a particular job to complete
+
+ :param job_id: A string that identifies the job
+ :param task_id: A string that identifies the task
+ :param timeout: The amount of time to wait before timing out in minutes
Review Comment:
Something about the return value, name and doc string don't quite match up with the behaviour of the code itself. A task is only returned in a failure case, and nothing (well none) is returned if it is successful. The code calling this is really just using the returned task as a boolean to detect whether it was a pass or fail. What about:
```suggestion
def wait_for_single_job_task_to_complete(self, job_id: str, task_id: str,
timeout: int) -> bool:
"""
Wait for a single task in a particular job to complete, return False if it ultimately fails or True if it succeeds.
:param job_id: A string that identifies the job
:param task_id: A string that identifies the task
:param timeout: The amount of time to wait before timing out in minutes
:return: A bool that represents whether or not the task failed (false) or succeeded (true)
```
Just a minor nit though
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nicholashendrickx-tomtom commented on pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
nicholashendrickx-tomtom commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1335210038
I finally found some time to finish this @o-nikolas. Please review once more.
I rebased on the updated main branch since I was running a lot of commits behind and there was a conflict which could not be auto-resolved.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #27185: Handle AzureBatch task result correctly
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1475449672
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nicholashendrickx-tomtom commented on a diff in pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
nicholashendrickx-tomtom commented on code in PR #27185:
URL: https://github.com/apache/airflow/pull/27185#discussion_r1006545878
##########
airflow/providers/microsoft/azure/operators/batch.py:
##########
@@ -293,16 +293,17 @@ def execute(self, context: Context) -> None:
# Add task to job
self.hook.add_single_task_to_job(job_id=self.batch_job_id, task=task)
# Wait for tasks to complete
- fail_tasks = self.hook.wait_for_job_tasks_to_complete(job_id=self.batch_job_id, timeout=self.timeout)
+ failed_tasks = self.hook.wait_for_job_tasks_to_complete(job_id=self.batch_job_id,
+ timeout=self.timeout)
# Clean up
if self.should_delete_job:
# delete job first
self.clean_up(job_id=self.batch_job_id)
if self.should_delete_pool:
self.clean_up(self.batch_pool_id)
# raise exception if any task fail
Review Comment:
Good point. I wrote an improvement locally, let me test it and then push it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nicholashendrickx-tomtom commented on pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
nicholashendrickx-tomtom commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1375466429
> > > I finally found some time to finish this @o-nikolas. Please review once more. I rebased on the updated main branch since I was running a lot of commits behind and there was a conflict which could not be auto-resolved.
> >
> >
> > Hey @nicholashendrickx-tomtom!
> > It looks like some test cases are still failing. Do you mind sorting those out before final review? Fixing them may result in changes to the code.
>
> Hey @nicholashendrickx-tomtom, any update on this one?
Sorry no update yet. I should check what failed. One this is formatting but I haven't looked into the database related test errors.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] o-nikolas commented on pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
o-nikolas commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1335577776
> I finally found some time to finish this @o-nikolas. Please review once more. I rebased on the updated main branch since I was running a lot of commits behind and there was a conflict which could not be auto-resolved.
Hey @nicholashendrickx-tomtom!
It looks like some test cases are still failing. Do you mind sorting those out before final review? Fixing them may result in changes to the code.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1287175204
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:
- Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
- Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] closed pull request #27185: Handle AzureBatch task result correctly
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed pull request #27185: Handle AzureBatch task result correctly
URL: https://github.com/apache/airflow/pull/27185
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] o-nikolas commented on a diff in pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
o-nikolas commented on code in PR #27185:
URL: https://github.com/apache/airflow/pull/27185#discussion_r1005174990
##########
airflow/providers/microsoft/azure/operators/batch.py:
##########
@@ -293,16 +293,17 @@ def execute(self, context: Context) -> None:
# Add task to job
self.hook.add_single_task_to_job(job_id=self.batch_job_id, task=task)
# Wait for tasks to complete
- fail_tasks = self.hook.wait_for_job_tasks_to_complete(job_id=self.batch_job_id, timeout=self.timeout)
+ failed_tasks = self.hook.wait_for_job_tasks_to_complete(job_id=self.batch_job_id,
+ timeout=self.timeout)
# Clean up
if self.should_delete_job:
# delete job first
self.clean_up(job_id=self.batch_job_id)
if self.should_delete_pool:
self.clean_up(self.batch_pool_id)
# raise exception if any task fail
Review Comment:
It's odd that you're waiting on several tasks even though only one was submitted, but then again, I'm not very familiar with this Azure API.
Either way, this comment should be updated to reflect the new behaviour of only failing on a specific task.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] o-nikolas commented on a diff in pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
o-nikolas commented on code in PR #27185:
URL: https://github.com/apache/airflow/pull/27185#discussion_r1026066774
##########
airflow/providers/microsoft/azure/hooks/batch.py:
##########
@@ -359,18 +359,44 @@ def wait_for_job_tasks_to_complete(self, job_id: str, timeout: int) -> list[batc
incomplete_tasks = [task for task in tasks if task.state != batch_models.TaskState.completed]
if not incomplete_tasks:
# detect if any task in job has failed
- fail_tasks = [
+ failed_tasks = [
task
for task in tasks
- if task.executionInfo.result
- == batch_models.TaskExecutionInformation.TaskExecutionResult.failure
+ if (task.execution_info.result == batch_models.TaskExecutionResult.failure) or (
+ task.execution_info.exit_code != 0)
]
- return fail_tasks
+ return failed_tasks
for task in incomplete_tasks:
self.log.info("Waiting for %s to complete, currently on %s state", task.id, task.state)
time.sleep(15)
raise TimeoutError("Timed out waiting for tasks to complete")
+ def wait_for_single_job_task_to_complete(self, job_id: str, task_id: str,
+ timeout: int) -> batch_models.CloudTask | None:
+ """
+ Wait for a single task in a particular job to complete
+
+ :param job_id: A string that identifies the job
+ :param task_id: A string that identifies the task
+ :param timeout: The amount of time to wait before timing out in minutes
Review Comment:
Hey @nicholashendrickx-tomtom have you had a time to revisit this PR?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nicholashendrickx-tomtom commented on a diff in pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
nicholashendrickx-tomtom commented on code in PR #27185:
URL: https://github.com/apache/airflow/pull/27185#discussion_r1008003897
##########
airflow/providers/microsoft/azure/operators/batch.py:
##########
@@ -293,16 +293,17 @@ def execute(self, context: Context) -> None:
# Add task to job
self.hook.add_single_task_to_job(job_id=self.batch_job_id, task=task)
# Wait for tasks to complete
- fail_tasks = self.hook.wait_for_job_tasks_to_complete(job_id=self.batch_job_id, timeout=self.timeout)
+ failed_tasks = self.hook.wait_for_job_tasks_to_complete(job_id=self.batch_job_id,
+ timeout=self.timeout)
# Clean up
if self.should_delete_job:
# delete job first
self.clean_up(job_id=self.batch_job_id)
if self.should_delete_pool:
self.clean_up(self.batch_pool_id)
# raise exception if any task fail
Review Comment:
I've refactored it @o-nikolas feel free to take a look.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] o-nikolas commented on pull request #27185: Handle AzureBatch task result correctly
Posted by "o-nikolas (via GitHub)" <gi...@apache.org>.
o-nikolas commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1414442210
Looks like both static checks and tests are failing.
To catch and fix static checks early, you can enable pre-commit on your laptop. See steps on how to do that [here](https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#installing-pre-commit-hooks)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] o-nikolas commented on pull request #27185: Handle AzureBatch task result correctly
Posted by GitBox <gi...@apache.org>.
o-nikolas commented on PR #27185:
URL: https://github.com/apache/airflow/pull/27185#issuecomment-1372992380
> > I finally found some time to finish this @o-nikolas. Please review once more. I rebased on the updated main branch since I was running a lot of commits behind and there was a conflict which could not be auto-resolved.
>
> Hey @nicholashendrickx-tomtom!
>
> It looks like some test cases are still failing. Do you mind sorting those out before final review? Fixing them may result in changes to the code.
Hey @nicholashendrickx-tomtom, any update on this one?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org