Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/16 04:08:42 UTC

[GitHub] [airflow] JCoder01 opened a new issue #13099: Unable to start scheduler after stopped

JCoder01 opened a new issue #13099:
URL: https://github.com/apache/airflow/issues/13099


   **Apache Airflow version**: 2.0.0rc3
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: Linux
   - **OS** (e.g. from /etc/os-release): Ubuntu
   - **Kernel** (e.g. `uname -a`): 
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   After shutting down the scheduler while tasks were in a running state, trying to restart the scheduler results in primary key (pk) violations.
   
   ```
   [2020-12-15 22:43:29,673] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(example_task_group, scheduled__2020-12-14T04:31:00+00:00) already exists.
   ```
   
   
   **What you expected to happen**:
   
   Scheduler restarts and picks up where it left off.
   
   **How to reproduce it**:
   
   1. Set an example DAG (I used the task_group example) to `schedule_interval` `* * * * *`, start the scheduler, and let it run for a few minutes.
   2. Shut down the scheduler.
   3. Attempt to restart the scheduler.
   
   **Anything else we need to know**:
   I came across this while testing with the LocalExecutor in a virtualenv. If no one else is able to reproduce it, I'll try again in a clean virtualenv.
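The failure mode above can be sketched with plain sqlite3 (standing in for Postgres; illustrative code, not Airflow's): a second insert of the same `(dag_id, run_id)` pair trips the unique constraint, just like the `psycopg2.errors.UniqueViolation` in the log.

```python
import sqlite3

# Toy stand-in for the dag_run table and its unique constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dag_run (
           id INTEGER PRIMARY KEY,
           dag_id TEXT,
           run_id TEXT,
           UNIQUE (dag_id, run_id)
       )"""
)

row = ("example_task_group", "scheduled__2020-12-14T04:31:00+00:00")
conn.execute("INSERT INTO dag_run (dag_id, run_id) VALUES (?, ?)", row)

try:
    # The restarted scheduler tries to create the same run again.
    conn.execute("INSERT INTO dag_run (dag_id, run_id) VALUES (?, ?)", row)
except sqlite3.IntegrityError as err:
    print("duplicate key:", err)
```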
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org






[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748034037


   I'm still trying to work through this and am slowly moving DAGs over from the original environment, where the error occurred, to the clean environment. While I can't seem to get the database into the troubled state on its own, I can force it into that state by stopping the scheduler and running the SQL below. Then, when you start the scheduler, you get the pk violation.
   ```
   update dag set next_dagrun = (select max(execution_date) from dag_run where dag_id = 'example_task_group')
   where dag_id = 'example_task_group2'	
   ```
   I think what is happening is: if you have a very small number of DAGs, then in the time it takes for the scheduler to throw the error, one of the parser processes updates the backend db with the correct `next_dagrun`, and starting the scheduler again works fine. As the number of DAGs grows, the chance that the problematic DAG gets updated before the scheduler shuts down due to the pk violation decreases, so the error persists until you are lucky enough to get the problematic DAG parsed.
   So while it's not clear how the database gets _into_ this state, would it make sense to add some "self healing" on scheduler startup to reparse all the DAGs? Or, rather than bailing out, could the scheduler handle the error so that if a pk violation does arise, the DAG is reparsed and scheduling is retried?
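The "handle the violation rather than bailing" idea can be sketched as follows (plain sqlite3 stand-in; `try_create_dag_run` is a hypothetical name, not Airflow's API):

```python
import sqlite3

def try_create_dag_run(conn, dag_id, run_id):
    """Attempt to insert a dag_run; return False instead of crashing when
    the (dag_id, run_id) pair already exists."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO dag_run (dag_id, run_id) VALUES (?, ?)",
                (dag_id, run_id))
        return True
    except sqlite3.IntegrityError:
        # Run already exists, e.g. because next_dagrun was stale; skip it
        # and let the DAG parser resync instead of aborting the whole loop.
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, UNIQUE (dag_id, run_id))")
print(try_create_dag_run(conn, "example_task_group", "scheduled__2020-12-14T04:31:00+00:00"))  # True
print(try_create_dag_run(conn, "example_task_group", "scheduled__2020-12-14T04:31:00+00:00"))  # False
```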





[GitHub] [airflow] JCoder01 edited a comment on issue #13099: Unable to start scheduler after stopped

JCoder01 edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746173327


   Here's the whole thing.
   ```
   ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(example_task_group, scheduled__2020-12-14T04:32:00+00:00) already exists.
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
       self._create_dag_runs(query.all(), session)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1567, in _create_dag_runs
       creating_job_id=self.id,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/models/dag.py", line 1776, in create_dagrun
       session.flush()
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2536, in flush
       self._flush(objects)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2678, in _flush
       transaction.rollback(_capture_exception=True)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
       with_traceback=exc_tb,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2638, in _flush
       flush_context.execute()
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
       rec.execute(self)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
       uow,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
       insert,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
       statement, params
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
       distilled_params,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
       e, statement, parameters, cursor, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
       sqlalchemy_exception, with_traceback=exc_info[2], from_=e
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(example_task_group, scheduled__2020-12-14T04:32:00+00:00) already exists.
   
   [SQL: INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, last_scheduling_decision, dag_hash) VALUES (%(dag_id)s, %(execution_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(creat
   ing_job_id)s, %(external_trigger)s, %(run_type)s, %(conf)s, %(last_scheduling_decision)s, %(dag_hash)s) RETURNING dag_run.id]
   [parameters: {'dag_id': 'example_task_group', 'execution_date': datetime.datetime(2020, 12, 14, 4, 32, tzinfo=Timezone('UTC')), 'start_date': datetime.datetime(2020, 12, 16, 11, 5, 22, 127201, tzinfo=Timezone('UTC')), 'end_date': None, 'state': 'running', 'run_id': 'schedule
   d__2020-12-14T04:32:00+00:00', 'creating_job_id': 701, 'external_trigger': False, 'run_type': <DagRunType.SCHEDULED: 'scheduled'>, 'conf': <psycopg2.extensions.Binary object at 0x7fc17b8ee4e0>, 'last_scheduling_decision': None, 'dag_hash': '1628bc132c765d9b68a06841eedfbc7d'}
   ]
   (Background on this error at: http://sqlalche.me/e/13/gkpj)
   ```





[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-768315621


   Looks like this is where the Scheduler runs Plugin code:
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/jobs/scheduler_job.py#L1275
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/jobs/scheduler_job.py#L1377
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/jobs/scheduler_job.py#L1439
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/jobs/scheduler_job.py#L1474
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/jobs/scheduler_job.py#L1568
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/models/dagbag.py#L171
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/models/dagbag.py#L229
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/models/serialized_dag.py#L167
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/serialization/serialized_objects.py#L658
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/serialization/serialized_objects.py#L412
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/plugins_manager.py#L346
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/plugins_manager.py#L276
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/plugins_manager.py#L222
   
   





[GitHub] [airflow] satishcb commented on issue #13099: Unable to start scheduler after stopped

satishcb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747758471


    We are using Airflow 1.10.9. We are calling the DAG REST endpoint to trigger runs. Intermittently we see the call fail. In the Postgres logs we see:
   ```
   ERROR:  duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(ext_trigger_create_cbid_profile_for_duns, manual__2020-12-17T22:39:06.726412+00:00) already exists.
   STATEMENT:  INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, external_trigger, conf) VALUES ('ext_trigger_create_cbid_profile_for_duns', '2020-12-17T22:39:06.726412+00:00'::timestamptz, '2020-12-17T22:39:06.962088+00:00'::timestamptz, NULL, 'running', 'manual__2020-12-17T22:39:06.726412+00:00', true, ....
   ```
   
   Since it seems related to this error, I'm reporting it here.





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748065707


   Yeah, I think at the very least we should catch the unique constraint violation and ignore it/carry on.
   
   That said, I'd like to work out _how_ this is possible, as I thought every update to the DAG model was behind a row-level lock, so this sort of failure wasn't meant to be possible.
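The invariant being described can be sketched like this (plain sqlite3, illustrative only; the real scheduler additionally holds a row-level lock on the `dag` row, which sqlite3 cannot express): creating the run and advancing `next_dagrun` happen in one transaction, so a duplicate should only be possible if something outside that transaction rewinds `next_dagrun`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dag (dag_id TEXT PRIMARY KEY, next_dagrun TEXT);
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, UNIQUE (dag_id, run_id));
    INSERT INTO dag VALUES ('example_task_group', '2020-12-14T04:31:00+00:00');
""")

def schedule_once(conn):
    # One transaction: read next_dagrun, create the run, advance next_dagrun.
    with conn:
        (nd,) = conn.execute(
            "SELECT next_dagrun FROM dag WHERE dag_id = 'example_task_group'"
        ).fetchone()
        conn.execute("INSERT INTO dag_run VALUES (?, ?)",
                     ("example_task_group", "scheduled__" + nd))
        conn.execute(
            "UPDATE dag SET next_dagrun = ? WHERE dag_id = 'example_task_group'",
            (nd + " (next interval)",))  # placeholder for the real next date

schedule_once(conn)  # fine: run created, next_dagrun advanced

with conn:  # an outside writer rewinds next_dagrun between scheduler runs
    conn.execute("UPDATE dag SET next_dagrun = '2020-12-14T04:31:00+00:00'")

try:
    schedule_once(conn)  # now collides, as in the reported traceback
except sqlite3.IntegrityError as err:
    print("duplicate:", err)
```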





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748273956


   Oh interesting. Just by putting that in a plugin things break?





[GitHub] [airflow] kaxil removed a comment on issue #13099: Unable to start scheduler after stopped

kaxil removed a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163680


   > ```
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with Postgres 9.6 and 12





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-754589174


   Happy New Year @ashb 
   There is a little bit of both. There is an outstanding issue where use of Variables (or probably anything that interacts with the backend db at global scope) raises the `UNEXPECTED COMMIT` RuntimeError.
   The problem I reported with being unable to restart the scheduler seems to have been caused by wrapping my `Variable.get` in an overly generous try/except, hiding the runtime error and causing things in the db to get out of sync.
   Running the below should recreate the issue; I did this on Python 3.7.
   
   ```
   mkdir airflow_scheduler_test
   cd airflow_scheduler_test
   export AIRFLOW_HOME=$(pwd)
   virtualenv pyenv 
   source pyenv/bin/activate
   pip install apache-airflow==2.0.0
   
   airflow db init
   airflow variables set test TEST
   mkdir plugins
   cat << EOF > plugins/test.py
   from airflow.models.variable import Variable
   
   print(Variable.get('test'))
   EOF
   
   airflow dags unpause example_bash_operator
   airflow scheduler
   ```
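For context, the "overly generous try/except" described above would look roughly like this hypothetical plugin module (names assumed): the broad `except` swallows the `UNEXPECTED COMMIT` RuntimeError, so the scheduler carries on with its session state out of sync instead of failing loudly at plugin import time.

```python
# Hypothetical plugin sketch, NOT the reporter's actual code: a
# module-level Variable.get wrapped in a catch-all except clause.
try:
    from airflow.models.variable import Variable
    TEST_SETTING = Variable.get("test")
except Exception:  # too broad: also hides the "UNEXPECTED COMMIT" RuntimeError
    TEST_SETTING = "fallback"
```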








[GitHub] [airflow] potiuk edited a comment on issue #13099: Unable to start scheduler after stopped

potiuk edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746129511


   @JCoder01 A few more questions. Please provide as much information as you can about the deployment you have.
   
   1) Which version of Postgres are you using?
   2) What kind of deployment do you have? Is the database managed, on-premise, or remote?
   3) Do you use any database HA? Active/passive?
   4) Was it a single-scheduler or multi-scheduler setup?
   5) Did you use our image or another container, or did you install Airflow manually in a host/virtualenv?
   6) Is there a possibility that, for example, another (old?) version of Airflow was still accessing the database at the same time?
   7) Did you migrate from Airflow 1.10, or was it a fresh installation?
   
   Generally, as much information as possible about your case would be tremendously useful for investigating this!
   
   





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748067032


   > We are using Airflow 1.10.9. We are calling the dag REST endpoint to trigger. Intermittently we see the call fails. In the postgres logs we see
   > `ERROR: duplicate key value violates unique constraint "dag_run_dag_id_run_id_key" DETAIL: Key (dag_id, run_id)=(ext_trigger_create_cbid_profile_for_duns, manual__2020-12-17T22:39:06.726412+00:00) already exists. STATEMENT: INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, external_trigger, conf) VALUES ('ext_trigger_create_cbid_profile_for_duns', '2020-12-17T22:39:06.726412+00:00'::timestamptz, '2020-12-17T22:39:06.962088+00:00'::timestamptz, NULL, 'running', 'manual__2020-12-17T22:39:06.726412+00:00', true, ....`
   > 
   > Since it is related to error reporting it here.
   
   @satishcb This is something else -- that error is saying that the pair `(ext_trigger_create_cbid_profile_for_duns, manual__2020-12-17T22:39:06.726412+00:00)` already exists in the DB. Since the insert came from the API, that likely means you have already triggered that exact pair before.





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746168673


   Do you happen to have the full stack trace (there might be more above or below the part you shared) so I can see where in the scheduler this is happening?





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746276730


   Of course, sorry about that.
   ```
   [2020-12-15 22:43:15,642] {scheduler_job.py:181} INFO - Started process (PID=46978) to work on /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:15,644] {scheduler_job.py:629} INFO - Processing file /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py for tasks to queue
   [2020-12-15 22:43:15,645] {logging_mixin.py:103} INFO - [2020-12-15 22:43:15,644] {dagbag.py:440} INFO - Filling up the DagBag from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:15,656] {scheduler_job.py:639} INFO - DAG(s) dict_keys(['example_task_group']) retrieved from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:15,658] {logging_mixin.py:103} INFO - [2020-12-15 22:43:15,658] {dag.py:1813} INFO - Sync 1 DAGs
   [2020-12-15 22:43:15,690] {logging_mixin.py:103} INFO - [2020-12-15 22:43:15,690] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:21:00+00:00
   [2020-12-15 22:43:15,708] {scheduler_job.py:189} INFO - Processing /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py took 0.070 seconds
   [2020-12-15 22:43:16,449] {scheduler_job.py:181} INFO - Started process (PID=47049) to work on /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:16,450] {scheduler_job.py:629} INFO - Processing file /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py for tasks to queue
   [2020-12-15 22:43:16,451] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,451] {dagbag.py:440} INFO - Filling up the DagBag from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:16,460] {scheduler_job.py:639} INFO - DAG(s) dict_keys(['example_task_group']) retrieved from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:16,481] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,481] {dag.py:1813} INFO - Sync 1 DAGs
   [2020-12-15 22:43:16,501] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,501] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:21:00+00:00
   [2020-12-15 22:43:16,522] {scheduler_job.py:189} INFO - Processing /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py took 0.078 seconds
   [2020-12-15 22:43:16,524] {scheduler_job.py:181} INFO - Started process (PID=47075) to work on /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:16,526] {scheduler_job.py:629} INFO - Processing file /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py for tasks to queue
   [2020-12-15 22:43:16,527] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,527] {dagbag.py:440} INFO - Filling up the DagBag from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:16,538] {scheduler_job.py:639} INFO - DAG(s) dict_keys(['example_task_group']) retrieved from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:16,580] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,580] {dag.py:1813} INFO - Sync 1 DAGs
   [2020-12-15 22:43:16,606] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,606] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:21:00+00:00
   [2020-12-15 22:43:16,628] {scheduler_job.py:189} INFO - Processing /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py took 0.108 seconds
   [2020-12-15 22:43:18,346] {scheduler_job.py:181} INFO - Started process (PID=47308) to work on /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:18,348] {scheduler_job.py:629} INFO - Processing file /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py for tasks to queue
   [2020-12-15 22:43:18,348] {logging_mixin.py:103} INFO - [2020-12-15 22:43:18,348] {dagbag.py:440} INFO - Filling up the DagBag from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:18,363] {scheduler_job.py:639} INFO - DAG(s) dict_keys(['example_task_group']) retrieved from /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py
   [2020-12-15 22:43:18,365] {logging_mixin.py:103} INFO - [2020-12-15 22:43:18,365] {dag.py:1813} INFO - Sync 1 DAGs
   [2020-12-15 22:43:18,402] {logging_mixin.py:103} INFO - [2020-12-15 22:43:18,401] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:30:00+00:00
   [2020-12-15 22:43:18,427] {scheduler_job.py:189} INFO - Processing /home/jcoder/git/airflow_2.0/macs-dags/macs-dags/dags/example_task_group.py took 0.085 seconds
   ```





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748295089


   But both happen when the scheduler loads plugins. 





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748294931


   In this simple example, it gets called when the scheduler is initializing and loads plugins.
   In my actual plugin, the call to Variables is made when the view gets initialized, like this:
   
   ```
   v_appbuilder_view = TestAppBuilderBaseView()
   ```
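One way to avoid the import-time DB call (a hedged sketch; the class and method names are stand-ins, not Airflow's actual plugin API) is to defer the `Variable.get` until the view method runs, so loading and instantiating the plugin touches no database:

```python
class TestAppBuilderBaseView:  # stand-in for the real AppBuilder base view
    def my_view(self):
        # Imported and queried only when the view actually renders,
        # not when the scheduler loads the plugin.
        from airflow.models.variable import Variable
        return Variable.get("test")

# Module import / plugin load is now side-effect free:
v_appbuilder_view = TestAppBuilderBaseView()
```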





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746182448


   What I can see in the database is this. In `dag_run`:
   
   |id|dag_id|execution_date|state|
   |--|------|--------------|-----|
   |353|example_task_group|2020-12-13 23:32:00|running|
   
   and in `dag`, it looks like the scheduler didn't pick up that run:
   |dag_id|next_dagrun|next_dagrun_create_after|
   |------|-----------|------------------------|
   |example_task_group|2020-12-13 23:32:00|2020-12-13 23:33:00|
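   The collision on restart can be sketched with sqlite3 and the same unique constraint (a toy schema, not Airflow's actual one): the run above already exists, but because `next_dagrun` was never advanced, the scheduler tries to insert the same run again:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE dag_run (
        dag_id TEXT,
        run_id TEXT,
        state  TEXT,
        UNIQUE (dag_id, run_id)   -- mirrors dag_run_dag_id_run_id_key
    )
""")
# The run from before the restart is still present and 'running'.
con.execute("INSERT INTO dag_run VALUES (?, ?, ?)",
            ("example_task_group", "scheduled__2020-12-13T23:32:00+00:00", "running"))

# Because dag.next_dagrun still points at that execution date, the
# restarted scheduler attempts to create the very same run again.
try:
    con.execute("INSERT INTO dag_run VALUES (?, ?, ?)",
                ("example_task_group", "scheduled__2020-12-13T23:32:00+00:00", "running"))
except sqlite3.IntegrityError as e:
    print(e)   # UNIQUE constraint failed: dag_run.dag_id, dag_run.run_id
```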
   





[GitHub] [airflow] potiuk commented on issue #13099: Unable to start scheduler after stopped

potiuk commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-745932599


   Looks serious. If we can confirm that one and it is easily triggerable, I am afraid it might lead to an RC4 @kaxil @ashb.





[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163927


   > Happy New Year @ashb
   > There is a little bit of both. There is an outstanding issue where use of Variables (or probably anything that interacts with the backend db in the global scope) in a plugin raises the `UNEXPECTED COMMIT` RuntimeError.
   > The problem I reported with being unable to restart the scheduler seems to have been caused by wrapping my `Variable.get` in an overly generous try/except, hiding the runtime error and causing things in the db to get out of sync.
   > Running the below should recreate the issue, I did this on python 3.7
   > 
   > ```
   > mkdir airflow_scheduler_test
   > cd airflow_scheduler_test
   > export AIRFLOW_HOME=$(pwd)
   > virtualenv pyenv 
   > source pyenv/bin/activate
   > pip install apache-airflow==2.0.0
   > 
   > airflow db init
   > airflow variables set test TEST
   > mkdir plugins
   > cat << EOF > plugins/test.py
   > from airflow.models.variable import Variable
   > 
   > print(Variable.get('test'))
   > EOF
   > 
   > airflow dags unpause example_bash_operator
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with Postgres 9.6 and 12; I haven't tried it with 13.1.
   
   





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747083535


   I spent a bit more time on this today, and while I can occasionally reproduce it in the original environment, I can't reproduce it in a fresh install. 





[GitHub] [airflow] SamWheating commented on issue #13099: Unable to start scheduler after stopped

SamWheating commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767714797


   This looks like the same issue as https://github.com/apache/airflow/issues/13685





[GitHub] [airflow] kaxil closed issue #13099: Unable to start scheduler after stopped

kaxil closed issue #13099:
URL: https://github.com/apache/airflow/issues/13099


   





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746143343


   1.) looks like PostgreSQL 13.1
   2.) on prem, in docker, used only for personal testing
   3.) No
   4.) single-scheduler
   5.) manual install in virtual env
   6.) as far as I know, no, there were no other versions accessing the database
   7.) Fresh install
   





[GitHub] [airflow] kaxil removed a comment on issue #13099: Unable to start scheduler after stopped

kaxil removed a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163927


   I was not able to reproduce it with Postgres 9.6 and 12; haven't tried it with 13.1.
   
   **EDIT**: I was able to reproduce it with 2.0.0 and SQLite
   
   
   ```
   ❯ airflow scheduler
   
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2021-01-26 12:46:09,634] {scheduler_job.py:1241} INFO - Starting the scheduler
   [2021-01-26 12:46:09,634] {scheduler_job.py:1246} INFO - Processing each file at most -1 times
   [2021-01-26 12:46:09,639] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 4646
   [2021-01-26 12:46:09,640] {scheduler_job.py:1751} INFO - Resetting orphaned tasks for active dag runs
   [2021-01-26 12:46:09,645] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
   [2021-01-26 12:46:09,657] {scheduler_job.py:1805} INFO - Reset the following 3 orphaned TaskInstances:
   	<TaskInstance: scenario1_case2_01.tasks__1_of_10 2021-01-25 00:00:00+00:00 [queued]>
   	<TaskInstance: scenario4_case2_1_40.tasks__1_of_10 2021-01-25 00:00:00+00:00 [queued]>
   	<TaskInstance: scenario1_case2_02.tasks__2_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   [2021-01-26 12:46:09,661] {dag_processing.py:518} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2 ) when using sqlite. So we set parallelism to 1.
   [2021-01-26 12:46:09,845] {plugins_manager.py:231} ERROR - UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/plugins_manager.py", line 222, in load_plugins_from_plugin_directory
       loader.exec_module(mod)
     File "<frozen importlib._bootstrap_external>", line 728, in exec_module
     File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
     File "/Users/kaxilnaik/airflow/plugins/test.py", line 3, in <module>
       print(Variable.get('test'))
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/variable.py", line 123, in get
       var_val = Variable.get_variable_from_secrets(key=key)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/variable.py", line 181, in get_variable_from_secrets
       var_val = secrets_backend.get_variable(key=key)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 65, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/contextlib.py", line 119, in __exit__
       next(self.gen)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 32, in create_session
       session.commit()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
       self.transaction.commit()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 504, in commit
       self._prepare_impl()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 472, in _prepare_impl
       self.session.dispatch.before_commit(self.session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
       fn(*args, **kw)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/sqlalchemy.py", line 217, in _validate_commit
       raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
   RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   [2021-01-26 12:46:09,853] {plugins_manager.py:232} ERROR - Failed to import plugin /Users/kaxilnaik/airflow/plugins/test.py
   
   
   ... followed by
   
   [2021-01-26 13:07:54,518] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlite3.IntegrityError: UNIQUE constraint failed: dag_run.dag_id, dag_run.run_id
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
       self._create_dag_runs(query.all(), session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1582, in _create_dag_runs
       creating_job_id=self.id,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/dag.py", line 1781, in create_dagrun
       session.flush()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2523, in flush
       self._flush(objects)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2664, in _flush
       transaction.rollback(_capture_exception=True)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
       exc_value, with_traceback=exc_tb,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2624, in _flush
       flush_context.execute()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
       rec.execute(self)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
       uow,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
       insert,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
       statement, params
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
       distilled_params,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
       e, statement, parameters, cursor, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
       sqlalchemy_exception, with_traceback=exc_info[2], from_=e
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: dag_run.dag_id, dag_run.run_id
   [SQL: INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, last_scheduling_decision, dag_hash) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
   [parameters: ('scenario1_case2_01', '2021-01-25 00:00:00.000000', '2021-01-26 13:07:54.516957', None, 'running', 'scheduled__2021-01-25T00:00:00+00:00', 1, 0, <DagRunType.SCHEDULED: 'scheduled'>, <memory at 0x7fc388665530>, None, '09f9507aca47e7751a40cc98e546ac0d')]
   (Background on this error at: http://sqlalche.me/e/13/gkpj)
   ```





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748263101


   @ashb I think I narrowed this down to an issue with one of my plugins (a FAB view) that relies on an Airflow Variable. I think this is the same one I pinged you about on Slack a week or two ago; we thought it was likely due to me running the scheduler in a debugger, but maybe that wasn't (entirely) the problem. I found that if I commented out the lines that made the call to `Variable`, the pk violations stopped happening. If I put them back in, stopped the scheduler long enough for a dag to need to be triggered right away (with one scheduled to run every minute, that was one minute), and then started the scheduler again, it would trigger the pk violation.
   
   In trying to recreate a minimal example, I added the following to a file in plugins/
   
   ```
   from airflow.models.variable import Variable
   v = Variable.get("VAR")
   ```
   and upon starting the scheduler, it triggered this error:
   `RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!`
   which was the same error I pinged you about.
   This made me go back and look at the code in the plugin, where I had:
   ```
   try:
        models = Variable.get("MODELS")
   except:
       models = []
   ```
   If I took out the generic except, then rather than kicking out a pk violation, it raised the UNEXPECTED COMMIT error. So the pk violation was happening because of the overly broad except in the plugin code.
   
   So this brings up the question: how does one use Variables in plugins without breaking the HA code?
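   One option, sketched below with a stand-in `fetch_models` instead of the real `Variable.get`: catch only the expected "missing variable" error (exactly which exception `Variable.get` raises for an unset key is an assumption here), so that unrelated failures like the UNEXPECTED COMMIT RuntimeError still propagate instead of being swallowed:

```python
# Hypothetical sketch: narrow the except clause from the snippet above.
# `fetch_models` stands in for the plugin's Variable lookup.

def fetch_models(variables):
    # stand-in for Variable.get("MODELS"); raises KeyError when unset
    return variables["MODELS"]

def load_models(variables):
    try:
        return fetch_models(variables).split(",")
    except KeyError:          # only the "missing variable" case is expected
        return []             # a bare `except:` here hid the RuntimeError

print(load_models({"MODELS": "a,b"}))   # ['a', 'b']
print(load_models({}))                  # []
```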
   
   





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746264176


   @JCoder01 Could we have some more logs surrounding those times? It would help us tell in which places this is getting updated.





[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163680


   > ```
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with breeze and Postgres 9.6 and 12





[GitHub] [airflow] potiuk commented on issue #13099: Unable to start scheduler after stopped

potiuk commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747113218


   Is it possible that you have different DAGs with the same dag_id (educated guess here)?





[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767531551


   I was able to reproduce this.
   
   > mkdir airflow_scheduler_test
   > cd airflow_scheduler_test
   > export AIRFLOW_HOME=$(pwd)
   > virtualenv pyenv 
   > source pyenv/bin/activate
   > pip install apache-airflow==2.0.0
   > 
   > airflow db init
   > airflow variables set test TEST
   > mkdir plugins
   > cat << EOF > plugins/test.py
   > from airflow.models.variable import Variable
   > 
   > print(Variable.get('test'))
   > EOF
   > 
   > airflow dags unpause example_bash_operator
   > airflow scheduler
   
   Stacktrace:
   
   <details><summary>CLICK ME</summary>
   <p>
   
   ```
   ❯ airflow scheduler
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2021-01-26 13:09:59,150] {scheduler_job.py:1242} INFO - Starting the scheduler
   [2021-01-26 13:09:59,151] {scheduler_job.py:1247} INFO - Processing each file at most -1 times
   [2021-01-26 13:09:59,154] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 5278
   [2021-01-26 13:09:59,156] {scheduler_job.py:1772} INFO - Resetting orphaned tasks for active dag runs
   [2021-01-26 13:09:59,161] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
   [2021-01-26 13:09:59,194] {plugins_manager.py:231} ERROR - UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/plugins_manager.py", line 222, in load_plugins_from_plugin_directory
       loader.exec_module(mod)
     File "<frozen importlib._bootstrap_external>", line 728, in exec_module
     File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
     File "/Users/kaxilnaik/airflow/plugins/test.py", line 3, in <module>
       print(Variable.get('test'))
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/variable.py", line 127, in get
       var_val = Variable.get_variable_from_secrets(key=key)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/variable.py", line 193, in get_variable_from_secrets
       var_val = secrets_backend.get_variable(key=key)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 65, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/contextlib.py", line 119, in __exit__
       next(self.gen)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 32, in create_session
       session.commit()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
       self.transaction.commit()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 504, in commit
       self._prepare_impl()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 472, in _prepare_impl
       self.session.dispatch.before_commit(self.session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
       fn(*args, **kw)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/sqlalchemy.py", line 217, in _validate_commit
       raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
   RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   [2021-01-26 13:09:59,201] {plugins_manager.py:232} ERROR - Failed to import plugin /Users/kaxilnaik/airflow/plugins/test.py
   [2021-01-26 13:09:59,386] {scheduler_job.py:1623} INFO - DAG scenario1_case2_02 is at (or above) max_active_runs (1 of 1), not creating any more runs
   [2021-01-26 13:09:59,387] {scheduler_job.py:1623} INFO - DAG scenario1_case2_01 is at (or above) max_active_runs (1 of 1), not creating any more runs
   [2021-01-26 13:09:59,435] {scheduler_job.py:936} INFO - 3 tasks up for execution:
   	<TaskInstance: scenario1_case2_02.tasks__1_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   	<TaskInstance: scenario1_case2_01.tasks__1_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   	<TaskInstance: scenario4_case2_1_40.tasks__1_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   [2021-01-26 13:09:59,436] {scheduler_job.py:970} INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 3 task instances ready to be queued
   [2021-01-26 13:09:59,436] {scheduler_job.py:997} INFO - DAG scenario1_case2_02 has 0/1000000 running and queued tasks
   [2021-01-26 13:09:59,436] {scheduler_job.py:997} INFO - DAG scenario1_case2_01 has 0/1000000 running and queued tasks
   [2021-01-26 13:09:59,436] {scheduler_job.py:997} INFO - DAG scenario4_case2_1_40 has 0/16 running and queued tasks
   [2021-01-26 13:09:59,437] {scheduler_job.py:1058} INFO - Setting the following tasks to queued state:
   	<TaskInstance: scenario1_case2_02.tasks__1_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   	<TaskInstance: scenario1_case2_01.tasks__1_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   	<TaskInstance: scenario4_case2_1_40.tasks__1_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   [2021-01-26 13:09:59,438] {scheduler_job.py:1100} INFO - Sending TaskInstanceKey(dag_id='scenario1_case2_02', task_id='tasks__1_of_10', execution_date=datetime.datetime(2021, 1, 25, 0, 0, tzinfo=Timezone('UTC')), try_number=1) to executor with priority 10 and queue default
   [2021-01-26 13:09:59,439] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'scenario1_case2_02', 'tasks__1_of_10', '2021-01-25T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/Users/kaxilnaik/airflow/dags/scenario1_case2_02.py']
   [2021-01-26 13:09:59,439] {scheduler_job.py:1100} INFO - Sending TaskInstanceKey(dag_id='scenario1_case2_01', task_id='tasks__1_of_10', execution_date=datetime.datetime(2021, 1, 25, 0, 0, tzinfo=Timezone('UTC')), try_number=1) to executor with priority 10 and queue default
   [2021-01-26 13:09:59,439] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'scenario1_case2_01', 'tasks__1_of_10', '2021-01-25T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/Users/kaxilnaik/airflow/dags/scenario1_case2_01.py']
   [2021-01-26 13:09:59,439] {scheduler_job.py:1100} INFO - Sending TaskInstanceKey(dag_id='scenario4_case2_1_40', task_id='tasks__1_of_10', execution_date=datetime.datetime(2021, 1, 25, 0, 0, tzinfo=Timezone('UTC')), try_number=1) to executor with priority 10 and queue default
   [2021-01-26 13:09:59,439] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'scenario4_case2_1_40', 'tasks__1_of_10', '2021-01-25T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/Users/kaxilnaik/airflow/dags/scenario4_case2_1.py']
   [2021-01-26 13:09:59,440] {sequential_executor.py:59} INFO - Executing command: ['airflow', 'tasks', 'run', 'scenario1_case2_02', 'tasks__1_of_10', '2021-01-25T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/Users/kaxilnaik/airflow/dags/scenario1_case2_02.py']
   [2021-01-26 13:10:00,926] {dagbag.py:440} INFO - Filling up the DagBag from /Users/kaxilnaik/airflow/dags/scenario1_case2_02.py
   TEST
   Running <TaskInstance: scenario1_case2_02.tasks__1_of_10 2021-01-25T00:00:00+00:00 [queued]> on host 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa
   [2021-01-26 13:10:01,570] {sequential_executor.py:59} INFO - Executing command: ['airflow', 'tasks', 'run', 'scenario1_case2_01', 'tasks__1_of_10', '2021-01-25T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/Users/kaxilnaik/airflow/dags/scenario1_case2_01.py']
   [2021-01-26 13:10:02,992] {dagbag.py:440} INFO - Filling up the DagBag from /Users/kaxilnaik/airflow/dags/scenario1_case2_01.py
   TEST
   Running <TaskInstance: scenario1_case2_01.tasks__1_of_10 2021-01-25T00:00:00+00:00 [queued]> on host 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa
   [2021-01-26 13:10:03,628] {sequential_executor.py:59} INFO - Executing command: ['airflow', 'tasks', 'run', 'scenario4_case2_1_40', 'tasks__1_of_10', '2021-01-25T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/Users/kaxilnaik/airflow/dags/scenario4_case2_1.py']
   [2021-01-26 13:10:05,054] {dagbag.py:440} INFO - Filling up the DagBag from /Users/kaxilnaik/airflow/dags/scenario4_case2_1.py
   TEST
   Running <TaskInstance: scenario4_case2_1_40.tasks__1_of_10 2021-01-25T00:00:00+00:00 [queued]> on host 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa
   [2021-01-26 13:10:05,750] {scheduler_job.py:1201} INFO - Executor reports execution of scenario1_case2_02.tasks__1_of_10 execution_date=2021-01-25 00:00:00+00:00 exited with status success for try_number 1
   [2021-01-26 13:10:05,750] {scheduler_job.py:1201} INFO - Executor reports execution of scenario1_case2_01.tasks__1_of_10 execution_date=2021-01-25 00:00:00+00:00 exited with status success for try_number 1
   [2021-01-26 13:10:05,750] {scheduler_job.py:1201} INFO - Executor reports execution of scenario4_case2_1_40.tasks__1_of_10 execution_date=2021-01-25 00:00:00+00:00 exited with status success for try_number 1
   [2021-01-26 13:10:05,792] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(scenario1_case2_02, scheduled__2021-01-25T00:00:00+00:00) already exists.
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
       self._create_dag_runs(query.all(), session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1582, in _create_dag_runs
       creating_job_id=self.id,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/dag.py", line 1781, in create_dagrun
       session.flush()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2523, in flush
       self._flush(objects)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2664, in _flush
       transaction.rollback(_capture_exception=True)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
       exc_value, with_traceback=exc_tb,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2624, in _flush
       flush_context.execute()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
       rec.execute(self)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
       uow,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
       insert,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
       statement, params
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
       distilled_params,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
       e, statement, parameters, cursor, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
       sqlalchemy_exception, with_traceback=exc_info[2], from_=e
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(scenario1_case2_02, scheduled__2021-01-25T00:00:00+00:00) already exists.
   
   [SQL: INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, last_scheduling_decision, dag_hash) VALUES (%(dag_id)s, %(execution_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(creating_job_id)s, %(external_trigger)s, %(run_type)s, %(conf)s, %(last_scheduling_decision)s, %(dag_hash)s) RETURNING dag_run.id]
   [parameters: {'dag_id': 'scenario1_case2_02', 'execution_date': datetime.datetime(2021, 1, 25, 0, 0, tzinfo=Timezone('UTC')), 'start_date': datetime.datetime(2021, 1, 26, 13, 10, 5, 790285, tzinfo=Timezone('UTC')), 'end_date': None, 'state': 'running', 'run_id': 'scheduled__2021-01-25T00:00:00+00:00', 'creating_job_id': 1, 'external_trigger': False, 'run_type': <DagRunType.SCHEDULED: 'scheduled'>, 'conf': <psycopg2.extensions.Binary object at 0x7f95c0b11db0>, 'last_scheduling_decision': None, 'dag_hash': '3bb8093d821e43ddfc6ab56cd9eccfd2'}]
   (Background on this error at: http://sqlalche.me/e/13/gkpj)
   [2021-01-26 13:10:06,814] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 5278
   [2021-01-26 13:10:07,005] {process_utils.py:206} INFO - Waiting up to 5 seconds for processes to exit...
   [2021-01-26 13:10:07,041] {process_utils.py:66} INFO - Process psutil.Process(pid=5388, status='terminated', started='13:10:05') (5388) terminated with exit code None
   [2021-01-26 13:10:07,043] {process_utils.py:66} INFO - Process psutil.Process(pid=5385, status='terminated', started='13:10:05') (5385) terminated with exit code None
   [2021-01-26 13:10:07,043] {process_utils.py:66} INFO - Process psutil.Process(pid=5278, status='terminated', exitcode=0, started='13:09:59') (5278) terminated with exit code 0
   [2021-01-26 13:10:07,043] {scheduler_job.py:1296} INFO - Exited execute loop
   ```
   
   </p>
   </details>
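For reference, the `run_id` in the colliding key is derived from the run's execution date. A small sketch of the `scheduled__<iso timestamp>` format seen in the DETAIL line above (the helper name here is illustrative, not Airflow's API):

```python
from datetime import datetime, timezone

def scheduled_run_id(execution_date: datetime) -> str:
    # Illustrative: builds the "scheduled__<iso timestamp>" run_id string
    # that appears in the duplicate-key error above.
    return f"scheduled__{execution_date.isoformat()}"

print(scheduled_run_id(datetime(2021, 1, 25, tzinfo=timezone.utc)))
# scheduled__2021-01-25T00:00:00+00:00
```

Because (dag_id, run_id) is unique in the `dag_run` table, any second attempt to create a run for the same execution date trips the constraint.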
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747100104


   Different virtual envs and databases. Both virtual envs and databases are on the same host. The only difference is which dags and provider packages are in each virtual env. I do have a dag where I was testing out overriding the `following_schedule` DAG method with a custom function to allow running at specific times. I have that dag in both envs, thinking it was the culprit, but even with the dag disabled I see the error in the original environment, for a different dag.





[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767531551


   I was able to reproduce this





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747402210


   I'll keep trying to narrow it down. It seems an `airflow db init` does not reliably fix it.
   It does seem to point to a race condition: after the error occurs, repeatedly trying to restart the scheduler will sometimes sort things out.
   The only reliable way to restart the scheduler after the error occurs seems to be to manually update `next_dag_run_create_after` to a datetime in the future.
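For anyone trying the same workaround, a minimal sketch of that manual update, shown here against a throwaway SQLite table. In a real installation the change would target the `dag` table in the metadata database, and the column name (taken from the comment above) should be checked against your actual schema:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Throwaway stand-in for the metadata DB; table and column names mirror the
# comment above and may differ from your Airflow schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY, next_dag_run_create_after TEXT)")
conn.execute("INSERT INTO dag VALUES ('example_task_group', '2020-12-14T04:32:00+00:00')")

# Push the watermark into the future so the scheduler stops trying to
# re-create a DagRun that already exists.
future = (datetime.now(timezone.utc) + timedelta(days=1)).isoformat()
conn.execute(
    "UPDATE dag SET next_dag_run_create_after = ? WHERE dag_id = ?",
    (future, "example_task_group"),
)
```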





[GitHub] [airflow] ashb edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748273956


   Oh interesting. Just by putting that in a plugin, things break?
   
   It should be possible to do it. When is this plugin code being run?





[GitHub] [airflow] potiuk commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746133738


   Also, is it possible to get a database dump in the "failing" state? It would be super useful if you could send us a dump of the database so that we can import it and investigate further!






[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163680


   > ```
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with Postgres 9.6 and 12





[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-768532309


   Will be fixed by one of https://github.com/apache/airflow/pull/13931 and https://github.com/apache/airflow/pull/13932
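Without speaking for those PRs, the usual way to make run creation tolerant of an already-existing row is a conflict-ignoring insert, sketched here with SQLite's `INSERT OR IGNORE` (Postgres would use `ON CONFLICT DO NOTHING`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, UNIQUE (dag_id, run_id))"
)

def create_dag_run_if_missing(conn, dag_id, run_id):
    # A second attempt becomes a no-op instead of raising the
    # IntegrityError that killed the scheduler loop above.
    conn.execute(
        "INSERT OR IGNORE INTO dag_run (dag_id, run_id) VALUES (?, ?)",
        (dag_id, run_id),
    )

key = ("scenario1_case2_02", "scheduled__2021-01-25T00:00:00+00:00")
create_dag_run_if_missing(conn, *key)
create_dag_run_if_missing(conn, *key)  # duplicate attempt, silently ignored
```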






[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746173327


   Here's the whole thing.
   ` ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(example_task_group, scheduled__2020-12-14T04:32:00+00:00) already exists.
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
       self._create_dag_runs(query.all(), session)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1567, in _create_dag_runs
       creating_job_id=self.id,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/airflow/models/dag.py", line 1776, in create_dagrun
       session.flush()
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2536, in flush
       self._flush(objects)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2678, in _flush
       transaction.rollback(_capture_exception=True)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
       with_traceback=exc_tb,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2638, in _flush
       flush_context.execute()
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
       rec.execute(self)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
       uow,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
       insert,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
       statement, params
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
       distilled_params,
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
       e, statement, parameters, cursor, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
       sqlalchemy_exception, with_traceback=exc_info[2], from_=e
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/home/jcoder/git/airflow_2.0/pyenv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_run_id_key"
   DETAIL:  Key (dag_id, run_id)=(example_task_group, scheduled__2020-12-14T04:32:00+00:00) already exists.
   
   [SQL: INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, last_scheduling_decision, dag_hash) VALUES (%(dag_id)s, %(execution_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(creating_job_id)s, %(external_trigger)s, %(run_type)s, %(conf)s, %(last_scheduling_decision)s, %(dag_hash)s) RETURNING dag_run.id]
   [parameters: {'dag_id': 'example_task_group', 'execution_date': datetime.datetime(2020, 12, 14, 4, 32, tzinfo=Timezone('UTC')), 'start_date': datetime.datetime(2020, 12, 16, 11, 5, 22, 127201, tzinfo=Timezone('UTC')), 'end_date': None, 'state': 'running', 'run_id': 'scheduled__2020-12-14T04:32:00+00:00', 'creating_job_id': 701, 'external_trigger': False, 'run_type': <DagRunType.SCHEDULED: 'scheduled'>, 'conf': <psycopg2.extensions.Binary object at 0x7fc17b8ee4e0>, 'last_scheduling_decision': None, 'dag_hash': '1628bc132c765d9b68a06841eedfbc7d'}]
   (Background on this error at: http://sqlalche.me/e/13/gkpj)`





[GitHub] [airflow] stelukutla edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
stelukutla edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-774539054


   The issue was fixed after running the command below
   **_airflow dags pause <dag_id>_**
   and then starting the scheduler.





[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767892380


   #13920 should fix the issue with Duplicate Dag Run.
   
   I still need to figure out the fix to allow retrieving Variable / Connections in the Plugin
   
   







[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-768315621


   Looks like this is where the Scheduler runs Plugin code:
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/jobs/scheduler_job.py#L1568
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/models/dagbag.py#L171
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/models/dagbag.py#L229
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/models/serialized_dag.py#L167
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/serialization/serialized_objects.py#L658
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/serialization/serialized_objects.py#L412
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/plugins_manager.py#L346
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/plugins_manager.py#L276
   
   https://github.com/apache/airflow/blob/65e49fc56f32b3e815fdf4a17be6b4e1c1e43c11/airflow/plugins_manager.py#L222
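Until a fix lands, a common workaround for the plugin failure shown above is to defer database access out of module import time, so nothing opens a session while the scheduler holds its HA-lock transaction. A minimal sketch, with no Airflow imports — the `getter` callable stands in for something like `lambda: Variable.get('test')`:

```python
# plugins/my_plugin.py (sketch; names are illustrative, not Airflow's API)

_cache = {}

def get_setting(key, getter):
    """Fetch a value lazily via `getter` (e.g. lambda: Variable.get(key)).

    Nothing runs at import time, so importing this plugin never touches
    the database inside the scheduler's guarded session.
    """
    if key not in _cache:
        _cache[key] = getter()
    return _cache[key]
```

Contrast this with the failing plugin earlier in the thread, which called `Variable.get('test')` at module top level and triggered the "UNEXPECTED COMMIT" guard during plugin import.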
   
   





[GitHub] [airflow] potiuk commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747095140


   What's the difference between the environments? Just the airflow installation? Database? Virtualenv? Anything else?





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746152929


   I have created another virtualenv/database and am trying to recreate it. I'm not sure about dumping the whole database; I don't want to accidentally share sensitive info.





[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163927


   I was not able to reproduce it with Postgres 9.6 and 12, haven't tried it with 13.1
   
   **EDIT**: I was able to reproduce it with 2.0.0 and Sqlite
   
   
   ```
   ❯ airflow scheduler
   
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2021-01-26 12:46:09,634] {scheduler_job.py:1241} INFO - Starting the scheduler
   [2021-01-26 12:46:09,634] {scheduler_job.py:1246} INFO - Processing each file at most -1 times
   [2021-01-26 12:46:09,639] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 4646
   [2021-01-26 12:46:09,640] {scheduler_job.py:1751} INFO - Resetting orphaned tasks for active dag runs
   [2021-01-26 12:46:09,645] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
   [2021-01-26 12:46:09,657] {scheduler_job.py:1805} INFO - Reset the following 3 orphaned TaskInstances:
   	<TaskInstance: scenario1_case2_01.tasks__1_of_10 2021-01-25 00:00:00+00:00 [queued]>
   	<TaskInstance: scenario4_case2_1_40.tasks__1_of_10 2021-01-25 00:00:00+00:00 [queued]>
   	<TaskInstance: scenario1_case2_02.tasks__2_of_10 2021-01-25 00:00:00+00:00 [scheduled]>
   [2021-01-26 12:46:09,661] {dag_processing.py:518} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2 ) when using sqlite. So we set parallelism to 1.
   [2021-01-26 12:46:09,845] {plugins_manager.py:231} ERROR - UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/plugins_manager.py", line 222, in load_plugins_from_plugin_directory
       loader.exec_module(mod)
     File "<frozen importlib._bootstrap_external>", line 728, in exec_module
     File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
     File "/Users/kaxilnaik/airflow/plugins/test.py", line 3, in <module>
       print(Variable.get('test'))
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/variable.py", line 123, in get
       var_val = Variable.get_variable_from_secrets(key=key)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/variable.py", line 181, in get_variable_from_secrets
       var_val = secrets_backend.get_variable(key=key)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 65, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/contextlib.py", line 119, in __exit__
       next(self.gen)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 32, in create_session
       session.commit()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
       self.transaction.commit()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 504, in commit
       self._prepare_impl()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 472, in _prepare_impl
       self.session.dispatch.before_commit(self.session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
       fn(*args, **kw)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/sqlalchemy.py", line 217, in _validate_commit
       raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
   RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   [2021-01-26 12:46:09,853] {plugins_manager.py:232} ERROR - Failed to import plugin /Users/kaxilnaik/airflow/plugins/test.py
   
   
   ... followed by
   
   [2021-01-26 13:07:54,518] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlite3.IntegrityError: UNIQUE constraint failed: dag_run.dag_id, dag_run.run_id
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
       self._create_dag_runs(query.all(), session)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1582, in _create_dag_runs
       creating_job_id=self.id,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
       return func(*args, **kwargs)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/airflow/models/dag.py", line 1781, in create_dagrun
       session.flush()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2523, in flush
       self._flush(objects)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2664, in _flush
       transaction.rollback(_capture_exception=True)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
       exc_value, with_traceback=exc_tb,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2624, in _flush
       flush_context.execute()
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
       rec.execute(self)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
       uow,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
       insert,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
       statement, params
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
       distilled_params,
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
       e, statement, parameters, cursor, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
       sqlalchemy_exception, with_traceback=exc_info[2], from_=e
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
       cursor, statement, parameters, context
     File "/Users/kaxilnaik/opt/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: dag_run.dag_id, dag_run.run_id
   [SQL: INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, last_scheduling_decision, dag_hash) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
   [parameters: ('scenario1_case2_01', '2021-01-25 00:00:00.000000', '2021-01-26 13:07:54.516957', None, 'running', 'scheduled__2021-01-25T00:00:00+00:00', 1, 0, <DagRunType.SCHEDULED: 'scheduled'>, <memory at 0x7fc388665530>, None, '09f9507aca47e7751a40cc98e546ac0d')]
   (Background on this error at: http://sqlalche.me/e/13/gkpj)
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746287259


   And last question for now (you may not have this one anymore): Did you get any of the "Setting next_dagrun for example_task_group" messages in the main scheduler output?





[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163927


   > Happy New Year @ashb
   > There is a little bit of both. There is an outstanding issue where use of Variables (or probably anything that interacts with the backend db at global scope) in a plugin raises the `UNEXPECTED COMMIT` RuntimeError.
   > The problem I reported with being unable to restart the scheduler seems to have been caused by wrapping my `Variable.get` in an overly generous try/except, hiding the runtime error and causing things in the db to get out of sync.
   > Running the below should reproduce the issue; I did this on Python 3.7
   > 
   > ```
   > mkdir airflow_scheduler_test
   > cd airflow_scheduler_test
   > export AIRFLOW_HOME=$(pwd)
   > virtualenv pyenv 
   > source pyenv/bin/activate
   > pip install apache-airflow==2.0.0
   > 
   > airflow db init
   > airflow variables set test TEST
   > mkdir plugins
   > cat << EOF > plugins/test.py
   > from airflow.models.variable import Variable
   > 
   > print(Variable.get('test'))
   > EOF
   > 
   > airflow dags unpause example_bash_operator
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with Postgres 9.6 and 12
   
   





[GitHub] [airflow] kaxil removed a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil removed a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163680


   > ```
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with Postgres 9.6 and 12





[GitHub] [airflow] JCoder01 edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-754589174


   Happy New Year @ashb 
   There is a little bit of both. There is an outstanding issue where use of Variables (or probably anything that interacts with the backend db at global scope) in a plugin raises the `UNEXPECTED COMMIT` RuntimeError.
   The problem I reported with being unable to restart the scheduler seems to have been caused by wrapping my `Variable.get` in an overly generous try/except, hiding the runtime error and causing things in the db to get out of sync.
   Running the below should reproduce the issue; I did this on Python 3.7
   ```
   mkdir airflow_scheduler_test
   cd airflow_scheduler_test
   export AIRFLOW_HOME=$(pwd)
   virtualenv pyenv 
   source pyenv/bin/activate
   pip install apache-airflow==2.0.0
   
   airflow db init
   airflow variables set test TEST
   mkdir plugins
   cat << EOF > plugins/test.py
   from airflow.models.variable import Variable
   
   print(Variable.get('test'))
   EOF
   
   airflow dags unpause example_bash_operator
   airflow scheduler
   ```
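   The root cause in the repro above is the module-level `Variable.get`: it opens and commits a DB session at plugin-import time, inside the scheduler's HA-locked critical section. A minimal, self-contained sketch of the usual workaround, deferring the lookup until first use (the Airflow call is stubbed out so the snippet runs without an Airflow install; `get_test_var` is a hypothetical helper name):

   ```python
   import functools


   @functools.lru_cache(maxsize=None)
   def get_test_var():
       """Fetch the variable lazily, on first call, instead of at import time.

       In a real plugin this body would be roughly:
           from airflow.models.variable import Variable
           return Variable.get('test')
       It is stubbed here so the sketch is runnable anywhere.
       """
       return "TEST"


   # Nothing touches the database when the plugin module is imported;
   # the lookup (and its session commit) happens only when something
   # actually calls get_test_var() at execution time.
   print(get_test_var())
   ```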





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746232980


   @ashb as requested:
   ```
   [2020-12-15 22:43:15,690] {logging_mixin.py:103} INFO - [2020-12-15 22:43:15,690] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:21:00+00:00
   [2020-12-15 22:43:16,501] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,501] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:21:00+00:00
   [2020-12-15 22:43:16,606] {logging_mixin.py:103} INFO - [2020-12-15 22:43:16,606] {dag.py:2266} INFO - Setting next_dagrun for example_task_group to 2020-12-14T04:21:00+00:00
   ```





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-748333536


   Log and stack trace:
   ```
   
   (pyenv) jcoder@bos-rndab01 ~/airflow_scheduler_test $ airflow scheduler
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2020-12-18 16:41:19,262] {scheduler_job.py:1241} INFO - Starting the scheduler
   [2020-12-18 16:41:19,262] {scheduler_job.py:1246} INFO - Processing each file at most -1 times
   [2020-12-18 16:41:19,323] {dag_processing.py:250} INFO - Launched DagFileProcessorManager with pid: 79
   25
   [2020-12-18 16:41:19,324] {scheduler_job.py:1751} INFO - Resetting orphaned tasks for active dag runs
   [2020-12-18 16:41:19,331] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
   [2020-12-18 16:41:19,349] {scheduler_job.py:1805} INFO - Reset the following 2 orphaned TaskInstances:
            <TaskInstance: example_task_group2.section_2.inner_section_2.task_2 2020-12-17 21:47:00+00:00 [scheduled]>
            <TaskInstance: example_task_group2.section_2.inner_section_2.task_2 2020-12-17 21:48:00+00:00 [scheduled]>
   [2020-12-18 16:41:19,451] {plugins_manager.py:231} ERROR - UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
   Traceback (most recent call last):
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/airflow/plugins_manager.py", line 222, in load_plugins_from_plugin_directory
        loader.exec_module(mod)
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/home/jcoder/airflow_scheduler_test/plugins/Plugin.py", line 3, in <module>
        v = Variable().get("VAR")
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/airflow/models/variable.py", line 123, in get
        var_val = Variable.get_variable_from_secrets(key=key)
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/airflow/models/variable.py", line 181, in get_variable_from_secrets
        var_val = secrets_backend.get_variable(key=key)
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/airflow/utils/session.py", line 65, in wrapper
        return func(*args, session=session, **kwargs)
      File "/opt/anaconda/2020/envs/py_37_pands0.25/lib/python3.7/contextlib.py", line 119, in __exit__
        next(self.gen)
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/airflow/utils/session.py", line 32, in create_session
        session.commit()
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
        self.transaction.commit()
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 504, in commit
        self._prepare_impl()
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 472, in _prepare_impl
        self.session.dispatch.before_commit(self.session)
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
        fn(*args, **kw)
      File "/home/jcoder/airflow_scheduler_test/pyenv/lib/python3.7/site-packages/airflow/utils/sqlalchemy.py", line 217, in _validate_commit
        raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
    RuntimeError: UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!
    [2020-12-18 16:41:19,460] {plugins_manager.py:232} ERROR - Failed to import plugin /home/jcoder/airflow_scheduler_test/plugins/Plugin.py
    ```
   ```





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746190232


   If you run `airflow db init` on the database after it gets into this state, it fixes it and updates the dag table with the correct `next_dagrun`. So it almost seems like, in the unlikely event that it gets into this state, the scheduler needs to do a round of updates on start-up before it starts scheduling tasks.
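   A sketch of that "round of updates on start-up" idea (hypothetical, using sqlite and a simplified two-column schema rather than Airflow's real tables; `resync_next_dagrun` is an illustrative helper, not scheduler code): recompute `next_dagrun` from the latest recorded dag run instead of trusting a possibly stale value in the `dag` table.

   ```python
   import sqlite3
   from datetime import datetime, timedelta

   conn = sqlite3.connect(":memory:")
   conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT)")
   conn.execute("CREATE TABLE dag (dag_id TEXT, next_dagrun TEXT)")

   # Stale state: dag.next_dagrun still points at a run that already exists.
   conn.execute("INSERT INTO dag VALUES ('example_task_group', '2020-12-14T04:31:00')")
   conn.execute("INSERT INTO dag_run VALUES ('example_task_group', '2020-12-14T04:31:00')")


   def resync_next_dagrun(dag_id, interval=timedelta(minutes=1)):
       # Derive next_dagrun from the latest dag_run row, then write it
       # back so the scheduler no longer tries to recreate an old run.
       latest, = conn.execute(
           "SELECT MAX(execution_date) FROM dag_run WHERE dag_id = ?",
           (dag_id,)).fetchone()
       nxt = (datetime.fromisoformat(latest) + interval).isoformat()
       conn.execute("UPDATE dag SET next_dagrun = ? WHERE dag_id = ?",
                    (nxt, dag_id))
       return nxt


   print(resync_next_dagrun("example_task_group"))  # 2020-12-14T04:32:00
   ```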





[GitHub] [airflow] JCoder01 edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747100104


   Different virtual envs and databases; both virtual envs and databases are on the same host. The only difference is which dags and provider packages are in each virtual env. I do have a dag where I was testing overriding the DAG's `following_schedule` method with a custom function to allow running at specific times. I have that dag in both envs, thinking it was the culprit, but even with the dag disabled I see the error in the original environment (for a different dag) and not in the second virtual env.





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747341221


   Marking this as can't reproduce for now -- it may be a race condition, but... 🤷🏻 





[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-767163927


   > Happy New Year @ashb
   > There is a little bit of both. There is an outstanding issue where use of Variables (or probably anything that interacts with the backend db at global scope) in a plugin raises the `UNEXPECTED COMMIT` RuntimeError.
   > The problem I reported with being unable to restart the scheduler seems to have been caused by wrapping my `Variable.get` in an overly generous try/except, hiding the runtime error and causing things in the db to get out of sync.
   > Running the below should reproduce the issue; I did this on Python 3.7
   > 
   > ```
   > mkdir airflow_scheduler_test
   > cd airflow_scheduler_test
   > export AIRFLOW_HOME=$(pwd)
   > virtualenv pyenv 
   > source pyenv/bin/activate
   > pip install apache-airflow==2.0.0
   > 
   > airflow db init
   > airflow variables set test TEST
   > mkdir plugins
   > cat << EOF > plugins/test.py
   > from airflow.models.variable import Variable
   > 
   > print(Variable.get('test'))
   > EOF
   > 
   > airflow dags unpause example_bash_operator
   > airflow scheduler
   > ```
   
   I was not able to reproduce it with Postgres 9.6 and 12; I haven't tried it with 13.1.
   
   **EDIT**: I was able to reproduce it with 2.0.0 and SQLite








[GitHub] [airflow] potiuk commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-745934655


   Thanks for reporting @JCoder01 !





[GitHub] [airflow] kaxil edited a comment on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747114005


   I think it is most likely because you had overridden `following_schedule`.





[GitHub] [airflow] JCoder01 commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747125761


   @potiuk I generally believe anything is possible, but in this case I think it's rather unlikely. If that was the case, do you think it would matter if the dag was enabled or not? I only have two dags enabled.
   
   @kaxil If it is this, do you have any ideas where to look to try and confirm? And if it is this, if the dag is in both environments, wouldn't this happen in both environments? Note that the dag with the custom `following_schedule` is not the one throwing the pk violation.





[GitHub] [airflow] stelukutla commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
stelukutla commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-774539054


   Issue fixed after running the command below and then starting the scheduler:
   `airflow dags pause <dag_id>`





[GitHub] [airflow] kaxil commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-747114005


   I think it is most likely because you updated `following_schedule`.








[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746179364


   Okay, right - as expected, nothing really interesting there: it's `self._create_dag_runs(query.all(), session)` that is failing, and that just creates the dag runs the previous query handed it. https://github.com/apache/airflow/blob/69801f5ef016bcf21af348f4fdeb67f09db8f394/airflow/models/dag.py#L2220-L2242
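   To make the failure mode concrete: a minimal sqlite model of the `dag_run` unique constraint, plus a hypothetical guard (`create_dag_run` here is illustrative, not the real scheduler code) that skips the insert when the run already exists instead of raising an IntegrityError.

   ```python
   import sqlite3

   conn = sqlite3.connect(":memory:")
   conn.execute(
       "CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, "
       "UNIQUE (dag_id, run_id))"  # same constraint the traceback hits
   )
   # State left behind by the previous scheduler: the run already exists.
   conn.execute("INSERT INTO dag_run VALUES (?, ?)",
                ("example_task_group", "scheduled__2020-12-14T04:31:00+00:00"))


   def create_dag_run(dag_id, run_id):
       # Check for an existing (dag_id, run_id) pair before inserting,
       # so out-of-sync state degrades to a skip rather than a crash.
       exists = conn.execute(
           "SELECT 1 FROM dag_run WHERE dag_id = ? AND run_id = ?",
           (dag_id, run_id)).fetchone()
       if exists:
           return False  # run already recorded; nothing to do
       conn.execute("INSERT INTO dag_run VALUES (?, ?)", (dag_id, run_id))
       return True


   # Re-creating the existing run is now a no-op instead of an error.
   print(create_dag_run("example_task_group",
                        "scheduled__2020-12-14T04:31:00+00:00"))  # False
   ```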





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-746062483


   I've certainly never seen this, and I've done heavy testing of killing and restarting schedulers.





[GitHub] [airflow] ashb commented on issue #13099: Unable to start scheduler after stopped

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13099:
URL: https://github.com/apache/airflow/issues/13099#issuecomment-754549659


   @JCoder01 where did we get with this issue? Was it a problem with your plugin or is this issue still outstanding?

