You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/29 03:28:35 UTC

[GitHub] [airflow] YingboWang opened a new pull request #8624: Profile hostname for celery executor (#8619)

YingboWang opened a new pull request #8624:
URL: https://github.com/apache/airflow/pull/8624


   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   Address github issue #8619 
   When there is an inconsistency between executor event return state and DB state. It is hard to debug the root cause and for celery executor we need to query logs from all workers. The reason is that at this time DB hostname was not updated and we don't know which worker have picked up the task and caused the failure. If we have the hostname with the executor event, we will be able to identify the host and debug root cause on that host. It could reduce time and operations to find out problematic host.
   
   Implementation:
   Change value type of event_buffer from str state to tuple of state and info. Add hostname to the celery command failed exception information and pass celery info into event_buffer.
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] YingboWang commented on a change in pull request #8624: Profile hostname for celery executor (#8619)

Posted by GitBox <gi...@apache.org>.
YingboWang commented on a change in pull request #8624:
URL: https://github.com/apache/airflow/pull/8624#discussion_r418249567



##########
File path: airflow/executors/base_executor.py
##########
@@ -58,7 +58,7 @@ def __init__(self, parallelism: int = PARALLELISM):
         self.queued_tasks: OrderedDict[TaskInstanceKeyType, QueuedTaskInstanceType] \
             = OrderedDict()
         self.running: Set[TaskInstanceKeyType] = set()
-        self.event_buffer: Dict[TaskInstanceKeyType, Optional[str]] = {}
+        self.event_buffer: Dict[TaskInstanceKeyType, Tuple[Optional[str], Any]] = {}

Review comment:
       Yes. I would like to create a custom type for this case. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] KevinYang21 merged pull request #8624: Profile hostname for celery executor (#8619)

Posted by GitBox <gi...@apache.org>.
KevinYang21 merged pull request #8624:
URL: https://github.com/apache/airflow/pull/8624


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] YingboWang commented on pull request #8624: Profile hostname for celery executor (#8619)

Posted by GitBox <gi...@apache.org>.
YingboWang commented on pull request #8624:
URL: https://github.com/apache/airflow/pull/8624#issuecomment-622084656


   @ashb Thank you for your comments. 
   
   Although the issue was raised from celery executor. It could raise question for all executor with distributed workers: "where did these unknown executor failure come from?" The blocker is that base executor event_buffer doesn't hold any information other than "state". Adding a field to base executor event_buffer do need update multiple executors but it enables all executors transfer information back to scheduler as needed and that is very helpful for debugging failures that are not reflected in airflow task logs. What do you think.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8624: Profile hostname for celery executor (#8619)

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8624:
URL: https://github.com/apache/airflow/pull/8624#discussion_r417860289



##########
File path: airflow/executors/base_executor.py
##########
@@ -58,7 +58,7 @@ def __init__(self, parallelism: int = PARALLELISM):
         self.queued_tasks: OrderedDict[TaskInstanceKeyType, QueuedTaskInstanceType] \
             = OrderedDict()
         self.running: Set[TaskInstanceKeyType] = set()
-        self.event_buffer: Dict[TaskInstanceKeyType, Optional[str]] = {}
+        self.event_buffer: Dict[TaskInstanceKeyType, Tuple[Optional[str], Any]] = {}

Review comment:
       `Any` doesn't feel like the right type.
   
   Additionally rather than using a tuple it would be "clearer" if you created a custom type like we did for `TaskInstanceKeyType`

##########
File path: airflow/executors/base_executor.py
##########
@@ -58,7 +58,7 @@ def __init__(self, parallelism: int = PARALLELISM):
         self.queued_tasks: OrderedDict[TaskInstanceKeyType, QueuedTaskInstanceType] \
             = OrderedDict()
         self.running: Set[TaskInstanceKeyType] = set()
-        self.event_buffer: Dict[TaskInstanceKeyType, Optional[str]] = {}
+        self.event_buffer: Dict[TaskInstanceKeyType, Tuple[Optional[str], Any]] = {}

Review comment:
       Oh, because this is in the BaseExecutor. Hmmmmmm.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org