Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 22:08:10 UTC

[GitHub] [airflow] lgwacker opened a new issue #13799: Scheduler crashes when turning some dags on with TypeError: '>' not supported between instances of 'NoneType' and 'int'

lgwacker opened a new issue #13799:
URL: https://github.com/apache/airflow/issues/13799


   **Apache Airflow version**:
   2.0.0
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   1.15
   **Environment**:
   
   - **Cloud provider or hardware configuration**: 
   GKE
   - **OS** (e.g. from /etc/os-release):
   Ubuntu 18.04
   
   **What happened**:
   I just migrated from 1.10.14 to 2.0.0. When I turn on some random dags, the scheduler crashes with the following error:
   
   ```python
   Traceback (most recent call last):
     File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/scheduler_job.py", line 1533, in _do_scheduling
       num_queued_tis = self._critical_section_execute_task_instances(session=session)
     File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/scheduler_job.py", line 1132, in _critical_section_execute_task_instances
       queued_tis = self._executable_task_instances_to_queued(max_tis, session=session)
     File "/usr/local/lib/python3.6/dist-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.6/dist-packages/airflow/jobs/scheduler_job.py", line 1034, in _executable_task_instances_to_queued
       if task_instance.pool_slots > open_slots:
   TypeError: '>' not supported between instances of 'NoneType' and 'int'
   ```
   **What you expected to happen**:
   
   I expected those dags would have their tasks scheduled without problems.
   
   **How to reproduce it**:
   
   Can't reproduce it yet. Still trying to figure out if this happens only with specific dags or not.
   
   **Anything else we need to know**:
   
   I couldn't find any context in which `task_instance.pool_slots` could be None.
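
The failure in the traceback can be reproduced outside Airflow. The following is a minimal sketch, not the scheduler's actual code: `fits_in_pool` is a hypothetical helper, and treating a missing value as 1 slot is an assumption taken from the workaround suggested later in this thread.

```python
from typing import Optional


def fits_in_pool(pool_slots: Optional[int], open_slots: int) -> bool:
    """Hypothetical guard: treat a missing pool_slots value as 1 slot."""
    # Assumption: 1 is an acceptable fallback when the stored value is NULL.
    effective_slots = 1 if pool_slots is None else pool_slots
    return effective_slots <= open_slots


# Mirror the unguarded comparison `task_instance.pool_slots > open_slots`
# with a value that was loaded as None from the database.
pool_slots = None
open_slots = 128
try:
    pool_slots > open_slots
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(fits_in_pool(pool_slots, open_slots))
```

The point of the sketch is only that a `None` loaded from the database reaches an integer comparison; where that `None` comes from is still the open question in this report.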
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771679440


   > Experiencing this same issue. We upgraded lower environments without problems, but production is throwing this error.
   
   @scrawfor Can you please provide some logs with the stack trace (including the logs before and after the error)?





[GitHub] [airflow] lgwacker commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771763410


   @scrawfor Another thing I did was delete the pool, restart the scheduler, turn the DAG on, and create the pool again. Can you try that?





[GitHub] [airflow] lgwacker edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771763410


   @scrawfor Another thing I did was delete the pool, restart the scheduler, turn the DAG on, and create the pool again. Can you try that?
   
   Besides that, do you have any previously deleted DAGs still showing up on the webserver (since it now serializes them)? If so, delete them. This caused problems for me too.





[GitHub] [airflow] zachliu edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
zachliu edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-784565641


   Same situation with the `operator` column in the `task_instance` table.
   I had to run:
   ```SQL
   UPDATE task_instance SET operator = 'NoOperator' WHERE operator IS NULL;
   ```
   Should I open a new issue?
   
   ```python
   >>> import json
   >>> import requests
   >>> from requests.auth import HTTPBasicAuth
   >>> payload = {"dag_ids": ["{my_dag_id}"]}
   >>> r = requests.post("https://localhost:8080/api/v1/dags/~/dagRuns/~/taskInstances/list", auth=HTTPBasicAuth('username', 'password'), data=json.dumps(payload), headers={'Content-Type': 'application/json'})
   >>> r.status_code
   500
   >>> print(r.text)
   {
     "detail": "None is not of type 'string'\n\nFailed validating 'type' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['operator']:\n    {'type': 'string'}\n\nOn instance['task_instances'][5]['operator']:\n    None",
     "status": 500,
     "title": "Response body does not conform to specification",
     "type": "https://airflow.apache.org/docs/2.0.1/stable-rest-api-ref.html#section/Errors/Unknown"
   }
   None is not of type 'string'
   
   Failed validating 'type' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['operator']:
       {'type': 'string'}
   
   On instance['task_instances'][5]['operator']:
       None
   ```
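
When a response like the one above comes back, the offending row and field can be pulled out of the `detail` message. This is a rough sketch that assumes the message keeps the `On instance[...]` wording shown in this comment; that wording is not a documented contract, and `locate_invalid_field` is a hypothetical helper.

```python
import json
import re

# Error body rebuilt from the response quoted in this comment.
response_text = json.dumps(
    {
        "detail": (
            "None is not of type 'string'\n\n"
            "Failed validating 'type' in schema['allOf'][0]['properties']"
            "['task_instances']['items']['properties']['operator']:\n"
            "    {'type': 'string'}\n\n"
            "On instance['task_instances'][5]['operator']:\n"
            "    None"
        ),
        "status": 500,
        "title": "Response body does not conform to specification",
    }
)


def locate_invalid_field(body: str):
    """Return (collection, index, field) named in the error detail, or None."""
    detail = json.loads(body)["detail"]
    # Assumption: the detail ends with "On instance['<list>'][<n>]['<field>']".
    match = re.search(r"On instance\['(\w+)'\]\[(\d+)\]\['(\w+)'\]", detail)
    if match is None:
        return None
    collection, index, field = match.groups()
    return collection, int(index), field


print(locate_invalid_field(response_text))
```

Here the result points at the item with index 5 in `task_instances` and its `operator` field, which matches the NULL `operator` values described above.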





[GitHub] [airflow] kaxil commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-772963317


   Another workaround is to run the following query in your metadata db:
   
   ```sql
   UPDATE task_instance SET pool_slots = 1 WHERE pool_slots IS NULL;
   ```
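
Before running this against a real metadata database, it may be worth counting the affected rows first. The sketch below rehearses the same statement on an in-memory SQLite table. The two-column table is a toy stand-in (the real `task_instance` table has many more columns), and `count_null_pool_slots` is a hypothetical helper.

```python
import sqlite3

# Toy stand-in for the metadata DB; not the real Airflow schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT, pool_slots INTEGER)")
conn.executemany(
    "INSERT INTO task_instance (task_id, pool_slots) VALUES (?, ?)",
    [("a", 1), ("b", None), ("c", None)],
)


def count_null_pool_slots(connection: sqlite3.Connection) -> int:
    """Count rows that would trip the scheduler's integer comparison."""
    row = connection.execute(
        "SELECT COUNT(*) FROM task_instance WHERE pool_slots IS NULL"
    ).fetchone()
    return row[0]


before = count_null_pool_slots(conn)

# The workaround statement from this comment.
cursor = conn.execute(
    "UPDATE task_instance SET pool_slots = 1 WHERE pool_slots IS NULL"
)
updated = cursor.rowcount
conn.commit()

after = count_null_pool_slots(conn)
print(before, updated, after)
```

Comparing the count before the update with the number of rows the update reports is a cheap check that the statement touched only the rows you expected.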





[GitHub] [airflow] scrawfor commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
scrawfor commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771671779


   Experiencing this same issue. We upgraded lower environments without problems, but production is throwing this error.





[GitHub] [airflow] kaxil commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771679440


   > Experiencing this same issue. We upgraded lower environments without problems, but production is throwing this error.
   
   Can you please provide some logs with the stack trace (including the logs before and after the error)?





[GitHub] [airflow] gdevanla commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
gdevanla commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-776337291


   Thanks, @kaxil. That `update` statement resolved the problem for me.





[GitHub] [airflow] kaxil commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-764696829


   Please let us know if this is reproducible








[GitHub] [airflow] scrawfor commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
scrawfor commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771765817


   @lgwacker In a rush to get back online, I just renamed the DAG, which did work after a couple of restarts. But I appreciate the help!





[GitHub] [airflow] zachliu commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
zachliu commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-785154394


   > That's a separate issue, can you create a Github issue for that with steps to reproduce
   
   done https://github.com/apache/airflow/issues/14421
   thanks :+1: 





[GitHub] [airflow] scrawfor edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
scrawfor edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771753922


   @kaxil Sure. 
   @lgwacker - Unfortunately that did not work for me.  ~~Even renaming the dag doesn't seem to have solved the issue.~~ Renaming the dag did fix my issue, although I had to restart the service twice.
   
   ```sh
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2021-02-02 15:03:30,310] {scheduler_job.py:1241} INFO - Starting the scheduler
   [2021-02-02 15:03:30,310] {scheduler_job.py:1246} INFO - Processing each file at most -1 times
   [2021-02-02 15:03:30,413] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 118
   [2021-02-02 15:03:30,414] {scheduler_job.py:1751} INFO - Resetting orphaned tasks for active dag runs
   [2021-02-02 15:03:30,463] {settings.py:52} INFO - Configured default timezone Timezone('America/New_York')
   [2021-02-02 15:03:30,643] {scheduler_job.py:938} INFO - 32 tasks up for execution:
          <TASK LIST WAS HERE>
   [2021-02-02 15:03:30,652] {scheduler_job.py:967} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 32 task instances ready to be queued
   [2021-02-02 15:03:30,652] {scheduler_job.py:995} INFO - DAG <dag1> has 0/16 running and queued tasks
   [2021-02-02 15:03:30,652] {scheduler_job.py:995} INFO - DAG <dag1> has 1/16 running and queued tasks
   [2021-02-02 15:03:30,652] {scheduler_job.py:995} INFO - DAG <dag1> has 2/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag1> has 3/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag1> has 4/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag1> has 5/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag2> has 0/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag2> has 1/16 running and queued tasks
   [2021-02-02 15:03:30,654] {scheduler_job.py:995} INFO - DAG <dag2> has 2/16 running and queued tasks
   [2021-02-02 15:03:30,654] {scheduler_job.py:995} INFO - DAG <dag2> has 3/16 running and queued tasks
   [2021-02-02 15:03:30,654] {scheduler_job.py:995} INFO - DAG <dag2> has 4/16 running and queued tasks
   [2021-02-02 15:03:30,661] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1533, in _do_scheduling
       num_queued_tis = self._critical_section_execute_task_instances(session=session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1132, in _critical_section_execute_task_instances
       queued_tis = self._executable_task_instances_to_queued(max_tis, session=session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1034, in _executable_task_instances_to_queued
       if task_instance.pool_slots > open_slots:
   TypeError: '>' not supported between instances of 'NoneType' and 'int'
   [2021-02-02 15:03:31,668] {process_utils.py:95} INFO - Sending Signals.SIGTERM to GPID 118
   [2021-02-02 15:03:32,157] {process_utils.py:61} INFO - Process psutil.Process(pid=164, status='terminated', started='15:03:30') (164) terminated with exit code None
   [2021-02-02 15:03:32,175] {process_utils.py:61} INFO - Process psutil.Process(pid=172, status='terminated', started='15:03:30') (172) terminated with exit code None
   [2021-02-02 15:03:32,177] {process_utils.py:201} INFO - Waiting up to 5 seconds for processes to exit...
   [2021-02-02 15:03:32,188] {process_utils.py:61} INFO - Process psutil.Process(pid=118, status='terminated', exitcode=0, started='15:03:29') (118) terminated with exit code 0
   [2021-02-02 15:03:32,189] {scheduler_job.py:1296} INFO - Exited execute loop
   Process QueuedLocalWorker-29:
   Process QueuedLocalWorker-31:
   Process QueuedLocalWorker-33:
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 69, in run
       return super().run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 176, in do_work
       key, command = self.task_queue.get()
     File "<string>", line 2, in get
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
       kind, result = conn.recv()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 250, in recv
       buf = self._recv_bytes()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
       buf = self._recv(4)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
       raise EOFError
   EOFError
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 69, in run
       return super().run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 176, in do_work
       key, command = self.task_queue.get()
     File "<string>", line 2, in get
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
       kind, result = conn.recv()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 250, in recv
       buf = self._recv_bytes()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
       buf = self._recv(4)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
       raise EOFError
   EOFError
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 69, in run
       return super().run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 176, in do_work
       key, command = self.task_queue.get()
     File "<string>", line 2, in get
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
       kind, result = conn.recv()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 250, in recv
       buf = self._recv_bytes()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
       buf = self._recv(4)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
       raise EOFError
   EOFError
   Process QueuedLocalWorker-32:
   Process QueuedLocalWorker-26:
   Process QueuedLocalWorker-30:
   ```
   
   *Env Details:*
   * Docker Image: apache/airflow:2.0.0-python3.8
   * Postgres Metadata DB
   * Local Executor





[GitHub] [airflow] lgwacker edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771763410


   @scrawfor Another thing I did was delete the pool, restart the scheduler, turn the DAG on, and create the pool again. Can you try that?
   
   Also, do you have any previously deleted DAGs still showing up on the webserver (since it now serializes them)? If so, delete them. This caused problems for me too.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #13799: Scheduler crashes when turning some dags on with TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-763981290


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] lgwacker commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771681876









[GitHub] [airflow] zachliu commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
zachliu commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-784565641


   Same situation with the `operator` column in the `task_instance` table.
   I had to run:
   ```SQL
   UPDATE task_instance SET operator = 'NoOperator' WHERE operator IS NULL;
   ```
   Should I open a new issue?
   
   ```python
   >>> import requests
   >>> from requests.auth import HTTPBasicAuth
   >>> r = requests.get("https://localhost:8080/api/v1/dags/~/dagRuns/~/taskInstances/list", auth=HTTPBasicAuth('username', 'password'))
   >>> r.status_code
   500
   >>> print(r.text)
   {
     "detail": "None is not of type 'string'\n\nFailed validating 'type' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['operator']:\n    {'type': 'string'}\n\nOn instance['task_instances'][5]['operator']:\n    None",
     "status": 500,
     "title": "Response body does not conform to specification",
     "type": "https://airflow.apache.org/docs/2.0.1/stable-rest-api-ref.html#section/Errors/Unknown"
   }
   None is not of type 'string'
   
   Failed validating 'type' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['operator']:
       {'type': 'string'}
   
   On instance['task_instances'][5]['operator']:
       None
   ```





[GitHub] [airflow] kaxil closed issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #13799:
URL: https://github.com/apache/airflow/issues/13799


   





[GitHub] [airflow] lgwacker edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771681876









[GitHub] [airflow] scrawfor edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
scrawfor edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771753922


   @kaxil Sure. 
   @lgwacker - Unfortunately that did not work for me.  ~~Even renaming the dag doesn't seem to have solved the issue.~~ Renaming the dag did fix my issue, although I had to restart the service twice.
   
   ```sh
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2021-02-02 15:03:30,310] {scheduler_job.py:1241} INFO - Starting the scheduler
   [2021-02-02 15:03:30,310] {scheduler_job.py:1246} INFO - Processing each file at most -1 times
   [2021-02-02 15:03:30,413] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 118
   [2021-02-02 15:03:30,414] {scheduler_job.py:1751} INFO - Resetting orphaned tasks for active dag runs
   [2021-02-02 15:03:30,463] {settings.py:52} INFO - Configured default timezone Timezone('America/New_York')
   [2021-02-02 15:03:30,643] {scheduler_job.py:938} INFO - 32 tasks up for execution:
          <TASK LIST WAS HERE>
   [2021-02-02 15:03:30,652] {scheduler_job.py:967} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 32 task instances ready to be queued
   [2021-02-02 15:03:30,652] {scheduler_job.py:995} INFO - DAG <dag1> has 0/16 running and queued tasks
   [2021-02-02 15:03:30,652] {scheduler_job.py:995} INFO - DAG <dag1> has 1/16 running and queued tasks
   [2021-02-02 15:03:30,652] {scheduler_job.py:995} INFO - DAG <dag1> has 2/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag1> has 3/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag1> has 4/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag1> has 5/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag2> has 0/16 running and queued tasks
   [2021-02-02 15:03:30,653] {scheduler_job.py:995} INFO - DAG <dag2> has 1/16 running and queued tasks
   [2021-02-02 15:03:30,654] {scheduler_job.py:995} INFO - DAG <dag2> has 2/16 running and queued tasks
   [2021-02-02 15:03:30,654] {scheduler_job.py:995} INFO - DAG <dag2> has 3/16 running and queued tasks
   [2021-02-02 15:03:30,654] {scheduler_job.py:995} INFO - DAG <dag2> has 4/16 running and queued tasks
   [2021-02-02 15:03:30,661] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1533, in _do_scheduling
       num_queued_tis = self._critical_section_execute_task_instances(session=session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1132, in _critical_section_execute_task_instances
       queued_tis = self._executable_task_instances_to_queued(max_tis, session=session)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1034, in _executable_task_instances_to_queued
       if task_instance.pool_slots > open_slots:
   TypeError: '>' not supported between instances of 'NoneType' and 'int'
   [2021-02-02 15:03:31,668] {process_utils.py:95} INFO - Sending Signals.SIGTERM to GPID 118
   [2021-02-02 15:03:32,157] {process_utils.py:61} INFO - Process psutil.Process(pid=164, status='terminated', started='15:03:30') (164) terminated with exit code None
   [2021-02-02 15:03:32,175] {process_utils.py:61} INFO - Process psutil.Process(pid=172, status='terminated', started='15:03:30') (172) terminated with exit code None
   [2021-02-02 15:03:32,177] {process_utils.py:201} INFO - Waiting up to 5 seconds for processes to exit...
   [2021-02-02 15:03:32,188] {process_utils.py:61} INFO - Process psutil.Process(pid=118, status='terminated', exitcode=0, started='15:03:29') (118) terminated with exit code 0
   [2021-02-02 15:03:32,189] {scheduler_job.py:1296} INFO - Exited execute loop
   Process QueuedLocalWorker-29:
   Process QueuedLocalWorker-31:
   Process QueuedLocalWorker-33:
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 69, in run
       return super().run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 176, in do_work
       key, command = self.task_queue.get()
     File "<string>", line 2, in get
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
       kind, result = conn.recv()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 250, in recv
       buf = self._recv_bytes()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
       buf = self._recv(4)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
       raise EOFError
   EOFError
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 69, in run
       return super().run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 176, in do_work
       key, command = self.task_queue.get()
     File "<string>", line 2, in get
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
       kind, result = conn.recv()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 250, in recv
       buf = self._recv_bytes()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
       buf = self._recv(4)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
       raise EOFError
   EOFError
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
       self.run()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 69, in run
       return super().run()
     File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
       self._target(*self._args, **self._kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 176, in do_work
       key, command = self.task_queue.get()
     File "<string>", line 2, in get
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 835, in _callmethod
       kind, result = conn.recv()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 250, in recv
       buf = self._recv_bytes()
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
       buf = self._recv(4)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
       raise EOFError
   EOFError
   Process QueuedLocalWorker-32:
   Process QueuedLocalWorker-26:
   Process QueuedLocalWorker-30:
   ```
   
   *Env Details:*
   * Docker Image: apache/airflow:2.0.0-python3.8
   * Postgres Metadata DB
   * Local Executor


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-784632092









[GitHub] [airflow] boring-cyborg[bot] commented on issue #13799: Scheduler crashes when turning some dags on with TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-763981290


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] ashb commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-776661797


   @kaxil Do we have any idea how this might have possibly happened? Bad/race-y previous migration?





[GitHub] [airflow] lgwacker commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771681876


   I think I found out what was causing this. If a DAG had tasks in the running state before the upgrade, this error occurs when you unpause that DAG after upgrading.
   
   Try clearing the whole dag run before unpausing.
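
For readers following along, the failing comparison from the traceback can be reproduced outside Airflow. This is a minimal sketch, not Airflow's scheduler code: `FakeTaskInstance` and `effective_pool_slots` are hypothetical names, and the fallback of 1 slot is an assumption made for illustration.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance row loaded from the metadata DB."""

    def __init__(self, pool_slots):
        self.pool_slots = pool_slots


def effective_pool_slots(task_instance, default=1):
    """Return pool_slots, falling back to `default` (an assumption) when it is None."""
    if task_instance.pool_slots is None:
        return default
    return task_instance.pool_slots


open_slots = 128
legacy_ti = FakeTaskInstance(pool_slots=None)  # row that has no pool_slots value

try:
    legacy_ti.pool_slots > open_slots  # same comparison as in the traceback
    comparison_failed = False
except TypeError:
    comparison_failed = True

# With the fallback applied, the comparison is well defined.
fits_in_pool = effective_pool_slots(legacy_ti) <= open_slots
```

The guard only hides the symptom; rows with a missing value would still need to be cleaned up in the database.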





[GitHub] [airflow] zachliu commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
zachliu commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-784517097


   i'm glad i found this before opening another issue :joy: 
   
   **Apache Airflow version**: 2.0.1
   
   **What happened**:
   
   ```python
   >>> import requests
   >>> from requests.auth import HTTPBasicAuth
   >>> r = requests.get("https://localhost:8080/api/v1/dags/{my_dag_id}/dagRuns/scheduled__2020-04-13T00%3A00%3A00%2B00%3A00/taskInstances", auth=HTTPBasicAuth('username', 'password'))
   >>> r.status_code
   500
   >>> print(r.text)
   {
     "detail": "None is not of type 'integer'\n\nFailed validating 'type' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['pool_slots']:\n    {'type': 'integer'}\n\nOn instance['task_instances'][0]['pool_slots']:\n    None",
     "status": 500,
     "title": "Response body does not conform to specification",
     "type": "https://airflow.apache.org/docs/2.0.1/stable-rest-api-ref.html#section/Errors/Unknown"
   }
   
   >>> print(r.json()["detail"])
   None is not of type 'integer'
   
   Failed validating 'type' in schema['allOf'][0]['properties']['task_instances']['items']['properties']['pool_slots']:
       {'type': 'integer'}
   
   On instance['task_instances'][0]['pool_slots']:
       None
   ```
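
The 500 above is raised while validating the response body: the schema declares `pool_slots` as `{'type': 'integer'}`, and the value on the first task instance is `None`. The check can be sketched without any library. This is a hand-rolled illustration, not the validator the API server uses; `is_json_integer` and `first_invalid_pool_slots` are hypothetical helpers, and the sample payload is invented.

```python
def is_json_integer(value):
    """Approximate a JSON Schema {'type': 'integer'} check for Python values."""
    # bool is a subclass of int in Python, but JSON booleans are not integers.
    return isinstance(value, int) and not isinstance(value, bool)


def first_invalid_pool_slots(task_instances):
    """Return the index of the first entry whose pool_slots is not an integer, else None."""
    for index, task_instance in enumerate(task_instances):
        if not is_json_integer(task_instance.get("pool_slots")):
            return index
    return None


# Invented sample payload shaped like the path in the error message.
payload = {"task_instances": [{"task_id": "legacy_task", "pool_slots": None}]}
bad_index = first_invalid_pool_slots(payload["task_instances"])
```

Here `bad_index` points at the same position the error message reports, `instance['task_instances'][0]['pool_slots']`.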





[GitHub] [airflow] kaxil commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-776664953


   > @kaxil Do we have any idea how this might have possibly happened? Bad/race-y previous migration?
   
   Yeah, I do. This is caused by the PR that added the feature to use more than one pool slot for a task. That PR didn't include a migration. I will fix that for 2.0.2.
   
   Creating a PR shortly
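
One way a column added without a data migration can leave `NULL` values behind is sketched below. It uses a toy table in an in-memory SQLite database, not Airflow's schema or its actual migration; the table name `toy_task_instance` and the backfill value of 1 are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A row written before the column existed.
conn.execute("CREATE TABLE toy_task_instance (task_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO toy_task_instance (task_id) VALUES ('old_task')")

# Add the column with no default and no backfill: existing rows get NULL.
conn.execute("ALTER TABLE toy_task_instance ADD COLUMN pool_slots INTEGER")
conn.execute(
    "INSERT INTO toy_task_instance (task_id, pool_slots) VALUES ('new_task', 1)"
)
before = dict(conn.execute("SELECT task_id, pool_slots FROM toy_task_instance"))

# Backfill step that a data migration could perform (value 1 is an assumption).
conn.execute("UPDATE toy_task_instance SET pool_slots = 1 WHERE pool_slots IS NULL")
after = dict(conn.execute("SELECT task_id, pool_slots FROM toy_task_instance"))

conn.close()
```

In this toy setup only the row written before the column was added ends up with a missing value, which would be consistent with the report that only tasks from before the upgrade trigger the crash.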





[GitHub] [airflow] kaxil commented on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771679440


   > Experiencing this same issue. We upgraded lower environments without problems, but production is throwing this error.
   
   Can you please provide some logs with the stack trace (including the logs before and after the error)?





[GitHub] [airflow] lgwacker edited a comment on issue #13799: Scheduler crashes when unpausing some dags with: TypeError: '>' not supported between instances of 'NoneType' and 'int'

Posted by GitBox <gi...@apache.org>.
lgwacker edited a comment on issue #13799:
URL: https://github.com/apache/airflow/issues/13799#issuecomment-771681876


   I forgot to post an update here. I think I found out what was causing this. If a DAG had tasks in the running state before the upgrade, this error occurs when you unpause that DAG after upgrading.
   
   Try clearing the whole dag run before unpausing.

