Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/12/05 12:59:42 UTC

[GitHub] [airflow] hterik opened a new issue, #28116: Present task-errors caused by scheduler in UI

hterik opened a new issue, #28116:
URL: https://github.com/apache/airflow/issues/28116

   ### Description
   
   Sometimes tasks are marked as failed by the scheduler or other internal Airflow components, and the reason is known to the scheduler but not surfaced to users. The task is simply marked as failed, with no indication of why in either the Logs tab or under Task Instance Details. The only way to find the cause is for admins to go through the scheduler logs.
   
   Example: `{scheduler_job.py:1526} ERROR - Detected zombie job: .....`, which marks the task as failed.
   In this case the UI looks like this, with the Logs tab showing only an empty attempt to fetch log files from the worker:
   ![image](https://user-images.githubusercontent.com/89977373/205640822-6c3f05e3-3c15-4dbd-b46b-5f415bbf1962.png)
   
   If there is a need to hide potential scheduler secrets from users, there should at least be a generic error message saying that the task did not run to completion because of an internal scheduler error, so that users know whether to debug the task logic themselves or involve admins in debugging the infrastructure.
   
   Looking at the `taskinstance.handle_failure` function, it seems that most of the code for handling such errors is already centralized. Is there a way to attach the error there to the task instance itself, for use in the UI? Or is it possible to merge the relevant log streams so they show in the Logs tab?
   
   ### Use case/motivation
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on issue #28116: Present task-errors caused by scheduler in UI

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #28116:
URL: https://github.com/apache/airflow/issues/28116#issuecomment-1342540552

   Yeah, this is also why cluster policy overrides don't appear in the logs or in the task instance.
   I think this is part of a greater challenge that we need to figure out how to handle.




[GitHub] [airflow] potiuk commented on issue #28116: Present task-errors caused by scheduler in UI

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #28116:
URL: https://github.com/apache/airflow/issues/28116#issuecomment-1338382467

   This is not as straightforward as you think. You should take a look at Airflow's architecture: scheduler logs are logs from the scheduler, not from the task. They are stored on a different instance that the UI has no access to (the UI only accesses task logs from the worker, where workers are involved, and not from the scheduler). The scheduler is really a control plane for the tasks.
   
   Implementing such a feature would likely require something similar to "import errors", where errors are stored in the DB and then displayed in the UI, but those errors would have to be stored in the task instance model.
   
   But yeah, it might be an interesting feature. Maybe someone will pick it up. I have marked it as a good first issue.
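
A minimal sketch of the approach suggested above, under stated assumptions: the component that fails a task writes the reason to a database table keyed by task instance, and the UI reads it back, optionally replacing scheduler-originated details with a generic message for non-admin users. The table `task_failure_reason` and the helpers `record_failure` and `failure_reason_for_ui` are hypothetical names used only for illustration. None of them exist in Airflow, and an in-memory SQLite database stands in for the metadata DB.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema for illustration only; this is not an Airflow table.
SCHEMA = """
CREATE TABLE task_failure_reason (
    dag_id      TEXT NOT NULL,
    task_id     TEXT NOT NULL,
    run_id      TEXT NOT NULL,
    try_number  INTEGER NOT NULL,
    source      TEXT NOT NULL,   -- which component failed the task
    message     TEXT NOT NULL,   -- the reason known to that component
    recorded_at TEXT NOT NULL,
    PRIMARY KEY (dag_id, task_id, run_id, try_number)
)
"""

GENERIC_MESSAGE = (
    "Task did not run to completion because of an internal scheduler error."
)


def record_failure(conn, key, source, message):
    """Store the failure reason; would be called by whichever component fails the task."""
    conn.execute(
        "INSERT OR REPLACE INTO task_failure_reason VALUES (?, ?, ?, ?, ?, ?, ?)",
        (*key, source, message, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def failure_reason_for_ui(conn, key, is_admin=False):
    """Return the text the UI would show for one task instance attempt, or None."""
    row = conn.execute(
        "SELECT source, message FROM task_failure_reason "
        "WHERE dag_id = ? AND task_id = ? AND run_id = ? AND try_number = ?",
        key,
    ).fetchone()
    if row is None:
        return None
    source, message = row
    # Hide scheduler-internal details from regular users, but still tell them
    # that the failure was not caused by their task logic.
    if source == "scheduler" and not is_admin:
        return GENERIC_MESSAGE
    return f"[{source}] {message}"
```

With a key such as `("my_dag", "my_task", "manual_run", 1)`, the scheduler side would call `record_failure(conn, key, "scheduler", "Detected zombie job")`, and the UI side would call `failure_reason_for_ui(conn, key)` when rendering the task instance details. Keying on the try number keeps reasons from different attempts separate.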

