Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/18 23:55:18 UTC

[GitHub] [airflow] ReadytoRocc opened a new issue #20940: Deferrable Tasks do not Respect Task Pools

ReadytoRocc opened a new issue #20940:
URL: https://github.com/apache/airflow/issues/20940


   ### Apache Airflow version
   
   2.2.3 (latest released)
   
   ### What happened
   
   When using a deferrable operator, the pool slot is released as soon as the task enters the `deferred` state.
   
   ### What you expected to happen
   
   I would expect the task to retain the pool slot for the entire execution across the worker and triggerer resources, and only release the pool once it reaches a terminal state.
   
   ### How to reproduce
   
   Create the DAG below and a Pool named `async` with a slot count of `1`. Let this DAG run, and notice that the first 16 tasks enter `queued` -> `running` -> `deferred`, even though the pool has only one slot.
   
   ```
   from datetime import datetime
   from airflow import DAG
   from airflow.sensors.date_time import DateTimeSensorAsync
   
   with DAG(
       "async_dag",
       start_date=datetime(2022, 1, 18),
       schedule_interval="* * * * *",
       catchup=True,
   ) as dag:
   
       async_sensor = DateTimeSensorAsync(
           task_id="async_task",
           target_time="""{{ macros.datetime.utcnow() + macros.timedelta(minutes=20) }}""",
           pool="async",
       )
   ```
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ReadytoRocc commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
ReadytoRocc commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1026192205


   @andrewgodwin & @potiuk thank you for the detailed responses and for confirming this behavior is expected. Are we able to convert this `issue` into a `discussion`? I believe there are some use cases for using pools with deferrable operators.
   
   An example would be if you want to limit the use of an external service that Airflow is orchestrating. Another would be to limit certain sets of tasks from using all of the triggerer’s resources.
   
   Please let me know your thoughts, thanks!





[GitHub] [airflow] potiuk commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1016366442


   I think this is an intended feature, not a bug. 
   
   It's a matter of assumption/definition whether a deferred task takes a pool slot or not. I think (but I would love to hear what others think) it is quite a reasonable assumption that a deferred task does not take a slot.
   
   Taking into account that a deferred task, by definition, takes almost no resources, that any external communication is asynchronous, and that the pool feature is designed to limit resources and potentially decrease the load on (or throttle) external, mostly synchronous services, I think a deferred task not taking a pool slot is quite reasonable.
   
   Does the task take the slot back when it returns from the deferred state, @ReadytoRocc?





[GitHub] [airflow] ReadytoRocc commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
ReadytoRocc commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1016649552


   > Does the task take the slot back when it returns from the deferred state?
   
   @potiuk yes. Once the `trigger` is finished, the task will be marked as `SCHEDULED`. The task will then re-queue once a `pool` slot is available.
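
The lifecycle described above can be sketched with a toy model in plain Python. This is only an illustration under stated assumptions: `ToyPool` and its methods are hypothetical names, not Airflow's API, and the model assumes that only running tasks occupy a slot.

```python
from dataclasses import dataclass, field


# Toy model only: ToyPool and its methods are hypothetical, not Airflow's API.
@dataclass
class ToyPool:
    slots: int
    states: dict = field(default_factory=dict)  # task_id -> state name

    def open_slots(self) -> int:
        # Assumption of this sketch: only "running" tasks occupy a slot.
        used = sum(1 for state in self.states.values() if state == "running")
        return self.slots - used

    def try_start(self, task_id: str) -> bool:
        # A task starts only if a slot is free; otherwise it stays scheduled.
        if self.open_slots() <= 0:
            self.states[task_id] = "scheduled"
            return False
        self.states[task_id] = "running"
        return True

    def defer(self, task_id: str) -> None:
        # Deferring hands the wait to the triggerer and frees the slot.
        self.states[task_id] = "deferred"

    def trigger_fired(self, task_id: str) -> None:
        # When the trigger finishes, the task must compete for a slot again.
        self.states[task_id] = "scheduled"

    def finish(self, task_id: str) -> None:
        # A finished task no longer counts against the pool.
        self.states[task_id] = "success"


pool = ToyPool(slots=1)
pool.try_start("a")          # "a" takes the only slot
pool.defer("a")              # the slot is released while "a" waits
pool.try_start("b")          # "b" can start although "a" is unfinished
pool.trigger_fired("a")      # "a" is scheduled again
print(pool.try_start("a"))   # False: "b" still holds the slot
pool.finish("b")
print(pool.try_start("a"))   # True: the slot is free again
```

With a single slot, task "a" gives up its slot on deferral and has to wait for "b" to finish before it can resume, which matches the re-queue behaviour reported in this comment.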





[GitHub] [airflow] potiuk commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1016662122


   So I'd say it's pretty expected behaviour. I wonder what others think.





[GitHub] [airflow] potiuk edited a comment on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1016366442


   I think this is an intended feature, not a bug. 
   
   It's a matter of assumption/definition whether a deferred task takes a pool slot or not. I think (but I would love to hear what others think) it is quite a reasonable assumption that a deferred task does not take a slot.
   
   Taking into account that a deferred task, by definition, takes almost no resources, that any external communication is asynchronous, and that the pool feature is designed to limit resources and potentially decrease the load on (or throttle) external, mostly synchronous services, I think a deferred task not taking a pool slot is quite reasonable.
   
   Does the task take the slot back when it returns from the deferred state, @ReadytoRocc?





[GitHub] [airflow] potiuk commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1019540737


   In short:
   
   ![image](https://user-images.githubusercontent.com/595491/150692540-618e1ad5-04a2-45fc-bd79-b80766a7fdf8.png)
   





[GitHub] [airflow] andrewgodwin commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
andrewgodwin commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1019533532


   This was indeed originally by design, as I saw pools as a "worker/active task slot" system, designed to concurrency-limit things that were running - and of course, deferred tasks are not really running.
   
   I think either option here is reasonable, and we could change it to have deferred tasks use up a pool slot if that's the consensus, but I went with this behaviour in the initial feature as it seemed most "sensible". It's probably worth documenting it explicitly, though?
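
The two options discussed here differ only in which task states count against the pool. A minimal sketch of that difference, assuming a hypothetical `occupied_slots` helper and a hypothetical `count_deferred` switch (neither is taken from Airflow's code base, and treating queued tasks as occupying a slot is also an assumption of this sketch):

```python
from collections import Counter


def occupied_slots(task_states, count_deferred=False):
    """Count pool slots in use for a list of task state names.

    Hypothetical helper for illustration; not Airflow's implementation.
    """
    counts = Counter(task_states)
    # Assumption: running and queued tasks always occupy a slot.
    occupied = counts["running"] + counts["queued"]
    if count_deferred:
        # Alternative policy: deferred tasks keep their slot while waiting.
        occupied += counts["deferred"]
    return occupied


# The situation from the reproduction: many deferred tasks, one running.
states = ["deferred"] * 16 + ["running"]
print(occupied_slots(states))                       # 1
print(occupied_slots(states, count_deferred=True))  # 17
```

Under the first policy a one-slot pool looks full only because of the running task; under the second, the 16 deferred tasks alone would exceed the pool, so they could never have been admitted together.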





[GitHub] [airflow] potiuk commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1019540510


   I agree with @andrewgodwin. @ReadytoRocc - is that explanation/behaviour plausible for you?





[GitHub] [airflow] potiuk commented on issue #20940: Deferrable Tasks do not Respect Task Pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20940:
URL: https://github.com/apache/airflow/issues/20940#issuecomment-1019531224


   @andrewgodwin @uranusjr - WDYT? I think the behaviour is intended and maybe just needs a bit of clarification in the docs.

