Posted to commits@airflow.apache.org by "C-K-Loan (via GitHub)" <gi...@apache.org> on 2023/02/16 12:25:47 UTC

[GitHub] [airflow] C-K-Loan opened a new issue, #29572: Database is Locked -- Even when deleting database and using new airflow home dir

C-K-Loan opened a new issue, #29572:
URL: https://github.com/apache/airflow/issues/29572

   ### Apache Airflow version
   
   2.5.1
   
   ### What happened
   
   My Airflow started giving `Database is Locked` errors. I tried uninstalling everything and deleting `~/airflow`, but I kept getting database-locked errors.
   As a final attempt, I set everything up in a new home directory, i.e. I set
   `export AIRFLOW_HOME=~/airflow2`
   
   Then I created a new DB:
   ```
   python3 -m airflow db init
   ```
   which gives the following logs:
   
   ``` 
   DB: sqlite:////home/ckl/airflow2/airflow.db
   [2023-02-16 13:05:40,430] {migration.py:204} INFO - Context impl SQLiteImpl.
   [2023-02-16 13:05:40,430] {migration.py:207} INFO - Will assume non-transactional DDL.
   INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
   INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
   INFO  [alembic.runtime.migration] Running stamp_revision  -> 290244fb8b83
   WARNI [airflow.models.crypto] empty cryptography key - values will not be stored encrypted.
   Initialization done
   ```
   
   
   Then I start the scheduler in one shell with
   `python3 -m airflow scheduler`
   which logs:
   ```
     ____________       _____________
    ____    |__( )_________  __/__  /________      __
   ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
   ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
    _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
   [2023-02-16 13:14:05 +0100] [4170653] [INFO] Starting gunicorn 20.1.0
   [2023-02-16 13:14:05 +0100] [4170653] [INFO] Listening at: http://[::]:8793 (4170653)
   [2023-02-16 13:14:05 +0100] [4170653] [INFO] Using worker: sync
   [2023-02-16 13:14:05 +0100] [4170654] [INFO] Booting worker with pid: 4170654
   [2023-02-16 13:14:05 +0100] [4170655] [INFO] Booting worker with pid: 4170655
   [2023-02-16 13:14:07,512] {scheduler_job.py:714} INFO - Starting the scheduler
   [2023-02-16 13:14:07,512] {scheduler_job.py:719} INFO - Processing each file at most -1 times
   [2023-02-16 13:14:07,514] {executor_loader.py:107} INFO - Loaded executor: SequentialExecutor
   [2023-02-16 13:14:07,520] {manager.py:163} INFO - Launched DagFileProcessorManager with pid: 4170697
   [2023-02-16 13:14:07,521] {scheduler_job.py:1408} INFO - Resetting orphaned tasks for active dag runs
   [2023-02-16 13:14:07,577] {settings.py:58} INFO - Configured default timezone Timezone('UTC')
   [2023-02-16T13:14:07.586+0100] {manager.py:409} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2) when using sqlite. So we set parallelism to 1.
   ```
   
   
   
   In a separate shell I start the webserver with
   
   `python3 -m airflow webserver -p 6969`, which creates all the default users etc. (no exceptions, all good).
   
   Now the problem occurs: while Airflow is initializing all these users, the scheduler shell starts throwing exceptions:
   
   
   ```
   2023-02-16T13:14:07.586+0100] {manager.py:409} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2) when using sqlite. So we set parallelism to 1.                                            
   Process DagFileProcessor4-Process:                                                                                                                                                                                   
    Traceback (most recent call last):
      File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
       self.dialect.do_execute(                                                                                                                                                                                         
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute                                                                                                         
       cursor.execute(statement, parameters)                                                                                                                                                                            
   sqlite3.OperationalError: database is locked                                                                                                                                                                         
                                                                                                                                                                                                                        
   The above exception was the direct cause of the following exception:                                                                                                                                                 
                                                                                                                                                                                                                        
   Traceback (most recent call last):                                                                                                                                                                                   
     File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap                                                                                                                                     
       self.run()                                                                                                                                                                                                       
     File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run                                                                                                                                            
       self._target(*self._args, **self._kwargs)                                                                                                                                                                        
      File "/home/ckl/.local/lib/python3.10/site-packages/airflow/dag_processing/processor.py", line 174, in _run_file_processor
        _handle_dag_file_processing()
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/dag_processing/processor.py", line 155, in _handle_dag_file_processing                                                                                 
        result: tuple[int, int] = dag_file_processor.process_file(
      File "/home/ckl/.local/lib/python3.10/site-packages/airflow/utils/session.py", line 75, in wrapper
       return func(*args, session=session, **kwargs)                                                 
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/dag_processing/processor.py", line 768, in process_file
       dagbag.sync_to_db(processor_subdir=self._dag_directory, session=session)                                  
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/utils/session.py", line 72, in wrapper          
       return func(*args, **kwargs)                                                                          
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/models/dagbag.py", line 645, in sync_to_db      
       for attempt in run_with_db_retries(logger=self.log):                       
      File "/home/ckl/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 384, in __iter__
       do = self.iter(retry_state=retry_state)
     File "/home/ckl/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 362, in iter
       raise retry_exc.reraise()
     File "/home/ckl/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 195, in reraise
       raise self.last_attempt.result()
     File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
       return self.__get_result()
     File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
       raise self._exception
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/models/dagbag.py", line 657, in sync_to_db
       serialize_errors.extend(_serialize_dag_capturing_errors(dag, session))
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/models/dagbag.py", line 628, in _serialize_dag_capturing_errors
       dag_was_updated = SerializedDagModel.write_dag(
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/utils/session.py", line 72, in wrapper
       return func(*args, **kwargs)
     File "/home/ckl/.local/lib/python3.10/site-packages/airflow/models/serialized_dag.py", line 152, in write_dag
       .scalar()
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2892, in scalar
       ret = self.one()
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2869, in one
       return self._iter().one()
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 2915, in _iter
       result = self.session.execute(
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1714, in execute
       result = conn._execute_20(statement, params or {}, execution_options)
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_20                                                                                                          
       return meth(self, args_10style, kwargs_10style, execution_options)
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
       return connection._execute_clauseelement(
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement
       ret = self._execute_context(
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
       self._handle_dbapi_exception(
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
       util.raise_(
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 210, in raise_
       raise exception
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
       self.dialect.do_execute(
     File "/home/ckl/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute                                                                                                         
       cursor.execute(statement, parameters)
   sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
   [SQL: SELECT ? AS anon_1 
   FROM serialized_dag 
   WHERE serialized_dag.dag_id = ? AND serialized_dag.last_updated > ?]
   [parameters: (1, 'example_xcom_args', '2023-02-16 12:14:36.805117')]
   (Background on this error at: https://sqlalche.me/e/14/e3q8)
   
           
   ```
   The UI is accessible, but it says
   ``` 
   The scheduler does not appear to be running. Last heartbeat was received 5 minutes ago.
   
   The DAGs list may not update, and new tasks will not be scheduled.
   
   ``` 
   
   
   This essentially leaves my Airflow in an unusable state.
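   The contention itself can be reproduced outside Airflow. The following is a minimal sketch (the file path and table name are made up for illustration): two SQLite connections, standing in for the scheduler and webserver processes, try to write to the same database file at the same time.

   ```python
   import os
   import sqlite3
   import tempfile

   # Hypothetical stand-in for two Airflow processes sharing one SQLite file.
   path = os.path.join(tempfile.mkdtemp(), "demo.db")

   # "Scheduler": open a write transaction and hold it.
   writer = sqlite3.connect(path, timeout=0.1, isolation_level=None)
   writer.execute("CREATE TABLE demo (id INTEGER)")
   writer.execute("BEGIN IMMEDIATE")  # takes the write lock right away
   writer.execute("INSERT INTO demo VALUES (1)")

   # "Webserver": a second connection tries to write while the lock is held.
   other = sqlite3.connect(path, timeout=0.1, isolation_level=None)
   locked_msg = None
   try:
       other.execute("BEGIN IMMEDIATE")
   except sqlite3.OperationalError as exc:
       locked_msg = str(exc)  # "database is locked"

   writer.execute("COMMIT")
   print(locked_msg)
   ```

   SQLite allows only one writer at a time; a scheduler, DAG processor, and webserver writing concurrently hit this limit constantly, which is why the error appears even with a freshly created database.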
   
   
   ### What you think should happen instead
   
   The scheduler should not throw database-locked exceptions.
   
   ### How to reproduce
   
   See main post
   
   ### Operating System
   
   Pop!_OS 22.04 LTS
   
   ### Versions of Apache Airflow Providers
   
   ```
   apache-airflow                      2.5.1
   apache-airflow-providers-amazon     6.1.0
   apache-airflow-providers-common-sql 1.3.0
   apache-airflow-providers-ftp        3.2.0
   apache-airflow-providers-http       4.1.0
   apache-airflow-providers-imap       3.1.0
   apache-airflow-providers-sqlite     3.3.0
   ```
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   I just set it up via shell on my private server.
   
   ### Anything else
   
   It was working fine for a long time.
   Today a job started throwing OOM errors, and since then my Airflow has been in a bad state.
   Maybe some rogue process is still around?
   But to rule that out, I set everything up in a new directory.
   
   I am not sure what else to do; I have no physical access to the machine right now, and restarting it is not an option.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis closed issue #29572: Database is Locked -- Even when deleting database and using new airflow home dir

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis closed issue #29572: Database is Locked -- Even when deleting database and using new airflow home dir
URL: https://github.com/apache/airflow/issues/29572




[GitHub] [airflow] Taragolis commented on issue #29572: Database is Locked -- Even when deleting database and using new airflow home dir

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #29572:
URL: https://github.com/apache/airflow/issues/29572#issuecomment-1433284718

   Airflow supports SQLite only for development purposes.
   
   I would recommend using the [`airflow standalone`](https://airflow.apache.org/docs/apache-airflow/stable/start.html) command if you need to set up Airflow quickly for developing DAGs.
   If you feel confident with Docker or Kubernetes, you could run Airflow there instead; some docs:
   - [Running Airflow in Docker](https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html)
   - [Helm Chart for Apache Airflow ](https://airflow.apache.org/docs/helm-chart/stable/index.html)
   
   I would also recommend using another [supported backend](https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#choosing-database-backend). Please note that MS SQL Server support is an [experimental feature](https://airflow.apache.org/docs/apache-airflow/stable/release-process.html#experimental-features), so PostgreSQL or MySQL 8 is a better choice.
   
   Last but not least, with any database backend other than SQLite you can use other executors. For example, if you run on a single machine and your tasks are not resource-intensive, you could select the [LocalExecutor](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/local.html). The reason to pick another executor is simple: the [SequentialExecutor](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/sequential.html) is not recommended for production use.
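   As a sketch of that advice, both the backend and the executor can be switched through Airflow's `AIRFLOW__<SECTION>__<KEY>` configuration environment variables before running `airflow db init`. The connection string below is a placeholder; substitute your own PostgreSQL user, password, host, and database name.

   ```python
   import os

   # Hypothetical values: replace user/password/host/dbname with your own.
   # Airflow reads AIRFLOW__<SECTION>__<KEY> variables at process startup.
   os.environ["AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"] = (
       "postgresql+psycopg2://airflow:airflow@localhost:5432/airflow"
   )
   os.environ["AIRFLOW__CORE__EXECUTOR"] = "LocalExecutor"

   print(os.environ["AIRFLOW__CORE__EXECUTOR"])  # LocalExecutor
   ```

   In practice these are usually exported in the shell (e.g. `export AIRFLOW__CORE__EXECUTOR=LocalExecutor`) so that the scheduler and webserver processes both pick them up.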
   
   




[GitHub] [airflow] boring-cyborg[bot] commented on issue #29572: Database is Locked -- Even when deleting database and using new airflow home dir

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #29572:
URL: https://github.com/apache/airflow/issues/29572#issuecomment-1433010636

   Thanks for opening your first issue here! Be sure to follow the issue template!
   

