You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/07 09:43:43 UTC

[GitHub] [airflow] bperson opened a new pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

bperson opened a new pull request #15247:
URL: https://github.com/apache/airflow/pull/15247


   Should fix #14515.
   
   I saw 3 ways of fixing it:
   1. do a DB migration to represent an infinite pool with `maxint` slots instead of `-1`.
   2. do the same thing but at runtime in `slots_stats`.
   3. get `_executable_task_instances_to_queued` to understand `-1` pools, changing all code operating on `pool_slots_free` and `open_slots ` [#](https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L908) [#](https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L975) [#](https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L992) [#](https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L1045) [#](https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L1060)
   
   As to why I chose 2:
   1. could be a decent alternative but it's a DB migration with all the complexities inherent to it.
   2. is relatively non-intrusive: it's almost free to do it at runtime and it keeps the change completely in the core Python part of Airflow ( in case someone out there directly creates pools by inserting them in his DB 😨 )
   3. has the benefit of truly handling infinite pools but I doubt anyone has reached a scale where it matters ( we're talking about a pool with `9223372036854775807` slots here on my system ), plus it increases the overall complexity of that function which is close to the core of the scheduler, we might as well try to keep it as lean as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-859664108


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb edited a comment on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-859683536


   @bperson We've got a test failure in `tests/models/test_pool.py::TestPool::test_infinite_slots`
   
   ```   E       AssertionError: assert {'default_poo... 'total': -1}} == {'default_poo...'total': inf}}
     E         Omitting 1 identical items, use -vv to show
     E         Differing items:
     E         {'test_pool': {'open': -1, 'queued': 1, 'running': 1, 'total': -1}} != {'test_pool': {'open': inf, 'queued': 1, 'running': 1, 'total': inf}}
     E         Use -v to get the full diff
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-860613536


   I have rebased to latest main, hopefully that should fix the 404


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#discussion_r640686927



##########
File path: airflow/models/pool.py
##########
@@ -106,6 +107,8 @@ def slots_stats(
 
         pool_rows: Iterable[Tuple[str, int]] = query.all()
         for (pool_name, total_slots) in pool_rows:
+            if total_slots == -1:
+                total_slots = maxsize

Review comment:
       The one downside to this is that it would then be logged as `9223372036854775807 slots free` etc.
   
   It breaks/is lying about the type, but how about:
   
   ```suggestion
                   total_slots = float('inf')  # type: ignore
   ```
   
   Because this has all the properties we want:
   
   - Anything subtracted from it is still infinity
   - It is greater than any actual number.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-859683536


   @bperson We've got a test failure:
   
   ```   E       AssertionError: assert {'default_poo... 'total': -1}} == {'default_poo...'total': inf}}
     E         Omitting 1 identical items, use -vv to show
     E         Differing items:
     E         {'test_pool': {'open': -1, 'queued': 1, 'running': 1, 'total': -1}} != {'test_pool': {'open': inf, 'queued': 1, 'running': 1, 'total': inf}}
     E         Use -v to get the full diff
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-859578935


   Eh, what is the bot doing?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-814966757


   If typing is applied well to the related code path, declaring `total` (and `open` since it’s calculated from it) to `Optional[int]` instead would be a better variant to option 3.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-859591112


   Strange


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] closed pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #15247:
URL: https://github.com/apache/airflow/pull/15247


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb merged pull request #15247: Fix tasks in an infinite slots pool were never scheduled

Posted by GitBox <gi...@apache.org>.
ashb merged pull request #15247:
URL: https://github.com/apache/airflow/pull/15247


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bperson commented on a change in pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
bperson commented on a change in pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#discussion_r645578585



##########
File path: airflow/models/pool.py
##########
@@ -106,6 +107,8 @@ def slots_stats(
 
         pool_rows: Iterable[Tuple[str, int]] = query.all()
         for (pool_name, total_slots) in pool_rows:
+            if total_slots == -1:
+                total_slots = maxsize

Review comment:
       Great idea, and existing log end up pretty solid too:
   ```
   [2021-06-04 13:37:28,963] {scheduler_job.py:993} INFO - Figuring out tasks to run in Pool(name=infinite_pool) with inf open slots and 1 task instances ready to be queued
   [2021-06-04 13:37:28,963] {scheduler_job.py:1021} INFO - DAG SchedulerJobTest.test_infinite_pool has 0/16 running and queued tasks
   [2021-06-04 13:37:28,963] {scheduler_job.py:1086} INFO - Setting the following tasks to queued state:
   	<TaskInstance: SchedulerJobTest.test_infinite_pool.dummy 2016-01-01 00:00:00+00:00 [scheduled]>
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-846479249


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-860680288


   Damnit pymsql:
   
   > TypeError: unsupported operand type(s) for -: 'float' and 'decimal.Decimal'


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bperson commented on pull request #15247: Infinite pools: Make their `total_slots` be `maxint`-like instead of `-1`

Posted by GitBox <gi...@apache.org>.
bperson commented on pull request #15247:
URL: https://github.com/apache/airflow/pull/15247#issuecomment-860016031


   @ashb sorry about that, it should be better now ( though Ci seems to be stuck in an endless 404-ing loop right now πŸ˜… )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org