Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/04 11:22:07 UTC

[GitHub] [airflow] dreamca4er opened a new issue, #24176: Can't disable sqlalchemy pool

dreamca4er opened a new issue, #24176:
URL: https://github.com/apache/airflow/issues/24176

   ### Apache Airflow version
   
   2.3.1 (latest released)
   
   ### What happened
   
   After upgrading from Airflow 2.2.3 to 2.3.1, we get the following error when using the Airflow web interface:
   ```
   Python version: 3.7.13
   Airflow version: 2.3.1
   Node: %HOST%
   -------------------------------------------------------------------------------
   Traceback (most recent call last):
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/util/_collections.py", line 1008, in __call__
       return self.registry[key]
   KeyError: <greenlet.greenlet object at 0x7f18c5080a10 (otid=0x7f18c62669b0) current active started main>
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 2448, in wsgi_app
       response = self.full_dispatch_request()
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 1953, in full_dispatch_request
       return self.finalize_request(rv)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 1970, in finalize_request
       response = self.process_response(response)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask/app.py", line 2269, in process_response
       self.session_interface.save_session(self, ctx.session, response)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/airflow/www/session.py", line 33, in save_session
       return super().save_session(*args, **kwargs)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_session/sessions.py", line 554, in save_session
       saved_session = self.sql_session_model.query.filter_by(
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 552, in __get__
       return type.query_class(mapper, session=self.sa.session())
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/orm/scoping.py", line 129, in __call__
       return self.registry()
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/util/_collections.py", line 1010, in __call__
       return self.registry.setdefault(key, self.createfunc())
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 4058, in __call__
       return self.class_(**local_kw)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 176, in __init__
       bind = options.pop('bind', None) or db.engine
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 1000, in engine
       return self.get_engine()
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 1019, in get_engine
       return connector.get_engine()
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 596, in get_engine
       self._engine = rv = self._sa.create_engine(sa_url, options)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/flask_sqlalchemy/__init__.py", line 1029, in create_engine
       return sqlalchemy.create_engine(sa_url, **engine_opts)
     File "<string>", line 2, in create_engine
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/util/deprecations.py", line 298, in warned
       return fn(*args, **kwargs)
     File "/home/%WORK_DIR%/venv/lib/python3.7/site-packages/sqlalchemy/engine/create.py", line 646, in create_engine
       engineclass.__name__,
   TypeError: Invalid argument(s) 'pool_size' sent to create_engine(), using configuration MySQLDialect_mysqldb/NullPool/Engine.  Please check that the keyword arguments are appropriate for this combination of components.
   ```
   Changing the environment variable **AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED** (or the **sql_alchemy_pool_enabled** parameter in airflow.cfg, if the configuration is not set via environment variables) from **False** to **True** resolves the error.
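
For readers unfamiliar with the naming, Airflow maps an `airflow.cfg` option to an environment variable of the form `AIRFLOW__{SECTION}__{KEY}`. The helper below is a hypothetical illustration of that mapping (the function name `airflow_env_var` is not part of Airflow); it only builds the name and does not change any configuration.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name that overrides `key` in `[section]` of airflow.cfg.

    Hypothetical helper for illustration; Airflow itself does not expose this function.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# The option discussed in this issue lives in the [database] section.
name = airflow_env_var("database", "sql_alchemy_pool_enabled")
print(name)  # AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED
```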
   
   ### What you think should happen instead
   
   I think we should still be able to disable SQLAlchemy pooling, since the option exists and it worked in Airflow 2.2.3. Somehow the **pool_size** option is passed to SQLAlchemy's **create_engine** function even when pooling is disabled via the environment variable.
   
   ### How to reproduce
   
   Here is our airflow.cfg (without sensitive info)
   [airflow.cfg.zip](https://github.com/apache/airflow/files/8837511/airflow.cfg.zip)
   Here is our `pip freeze` output
   ```
   aiohttp==3.8.1
   aiosignal==1.2.0
   alembic==1.8.0
   amqp==5.1.1
   anyio==3.6.1
   apache-airflow==2.3.1
   apache-airflow-providers-cncf-kubernetes==4.0.2
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-sqlite==2.1.3
   apispec==3.3.2
   argcomplete==1.12.3
   async-timeout==4.0.2
   asynctest==0.13.0
   attrs==20.3.0
   Babel==2.10.1
   backports.zoneinfo==0.2.1
   bcrypt==3.2.2
   billiard==3.6.4.0
   bingads==13.0.11.1
   bleach==5.0.0
   blinker==1.4
   boto==2.49.0
   cached-property==1.5.2
   cachelib==0.7.0
   cachetools==4.2.4
   cattrs==1.5.0
   celery==5.2.1
   certifi==2022.5.18.1
   cffi==1.15.0
   charset-normalizer==2.0.12
   click==8.1.3
   click-didyoumean==0.3.0
   click-plugins==1.1.1
   click-repl==0.2.0
   clickclick==20.10.2
   clickhouse-driver==0.2.3
   colorama==0.4.4
   colorlog==4.8.0
   commonmark==0.9.1
   connexion==2.13.1
   console-menu==0.6.0
   cron-descriptor==1.2.24
   croniter==1.0.15
   cryptography==37.0.2
   curlify==2.2.1
   defusedxml==0.7.1
   Deprecated==1.2.13
   dill==0.3.5.1
   dnspython==2.2.1
   docutils==0.16
   email-validator==1.2.1
   facebook-business==13.0.0
   Flask==1.1.2
   Flask-AppBuilder==3.4.5
   Flask-Babel==2.0.0
   Flask-Bcrypt==1.0.1
   Flask-Caching==1.11.1
   Flask-JWT-Extended==3.25.1
   Flask-Login==0.4.1
   Flask-OpenID==1.3.0
   Flask-Session==0.4.0
   Flask-SQLAlchemy==2.5.1
   Flask-WTF==0.14.3
   flower==1.0.0
   frozenlist==1.3.0
   future==0.18.2
   gcs-oauth2-boto-plugin==2.5
   google-ads==16.0.0
   google-api-core==2.7.1
   google-api-python-client==2.40.0
   google-auth==1.35.0
   google-auth-httplib2==0.1.0
   google-auth-oauthlib==0.5.1
   google-cloud-bigquery==2.31.0
   google-cloud-core==2.2.3
   google-cloud-storage==1.43.0
   google-crc32c==1.3.0
   google-reauth==0.1.1
   google-resumable-media==2.3.3
   googleads==22.0.0
   googleapis-common-protos==1.56.2
   graphviz==0.20
   greenlet==1.1.2
   grpcio==1.46.3
   grpcio-status==1.46.3
   gunicorn==20.1.0
   guppy3==3.1.2
   h11==0.12.0
   httpcore==0.13.7
   httplib2==0.20.4
   httpx==0.19.0
   humanize==3.13.1
   idna==3.3
   importlib-metadata==4.11.4
   importlib-resources==5.7.1
   inflection==0.5.1
   iso8601==1.0.2
   isodate==0.6.1
   itsdangerous==1.1.0
   Jinja2==3.0.3
   joblib==1.1.0
   jsonschema==3.2.0
   kombu==5.2.4
   kubernetes==23.6.0
   lazy-object-proxy==1.7.1
   lockfile==0.12.2
   lxml==4.9.0
   Mako==1.2.0
   Markdown==3.3.4
   MarkupSafe==2.0.1
   marshmallow==3.16.0
   marshmallow-enum==1.5.1
   marshmallow-oneofschema==3.0.1
   marshmallow-sqlalchemy==0.26.1
   mixpanel==4.9.0
   multidict==6.0.2
   mysql-connector==2.2.9
   mysqlclient==2.1.0
   numpy==1.21.6
   oauth2client==4.1.3
   oauthlib==3.2.0
   openapi-schema-validator==0.2.3
   openapi-spec-validator==0.4.0
   oyaml==1.0
   packaging==21.3
   pandas==1.3.5
   pathspec==0.9.0
   pendulum==2.1.2
   pika==1.1.0
   platformdirs==2.5.2
   pluggy==1.0.0
   prison==0.2.1
   prometheus-client==0.14.1
   prompt-toolkit==3.0.8
   proto-plus==1.19.6
   protobuf==3.20.0
   psutil==5.9.1
   psycopg2-binary==2.9.3
   pyarrow==6.0.1
   pyasn1==0.4.8
   pyasn1-modules==0.2.8
   pycountry==22.3.5
   pycparser==2.21
   pycryptodome==3.12.0
   pyflakes==2.4.0
   Pygments==2.12.0
   PyJWT==1.7.1
   pyOpenSSL==22.0.0
   pyparsing==3.0.9
   pyperclip==1.8.2
   pyrsistent==0.18.1
   python-daemon==2.3.0
   python-dateutil==2.8.2
   python-nvd3==0.15.0
   python-slugify==6.1.2
   python3-openid==3.2.0
   pytz==2022.1
   pytz-deprecation-shim==0.1.0.post0
   pytzdata==2020.1
   pyu2f==0.1.5
   PyYAML==5.4.1
   requests==2.27.1
   requests-file==1.5.1
   requests-oauthlib==1.3.1
   requests-toolbelt==0.9.1
   retry-decorator==1.1.1
   rfc3986==1.5.0
   rich==12.4.4
   rsa==4.8
   scikit-learn==0.24.1
   scipy==1.7.3
   selenium==3.141.0
   setproctitle==1.2.3
   six==1.16.0
   slackclient==2.5.0
   sniffio==1.2.0
   SocksiPy-branch==1.1
   SQLAlchemy==1.4.9
   SQLAlchemy-JSONField==1.0.0
   SQLAlchemy-Utils==0.38.2
   suds-community==1.1.1
   swagger-ui-bundle==0.0.9
   tableau-api-lib==0.1.14
   tableauhyperapi==0.0.13129
   tabulate==0.8.9
   tenacity==8.0.1
   termcolor==1.1.0
   text-unidecode==1.3
   threadpoolctl==3.1.0
   tornado==6.1
   typeguard==2.13.3
   typing_extensions==4.2.0
   tzdata==2022.1
   tzlocal==4.2
   ua-parser==0.10.0
   unicodecsv==0.14.1
   Unidecode==1.3.2
   uritemplate==4.1.1
   urllib3==1.26.9
   user-agents==2.0
   vine==5.0.0
   wcwidth==0.2.5
   webencodings==0.5.1
   websocket-client==1.3.2
   Werkzeug==1.0.1
   wrapt==1.14.1
   WTForms==2.3.3
   xgboost==1.4.0
   xmltodict==0.13.0
   yarl==1.7.2
   zeep==4.1.0
   zenpy==2.0.22
   zipp==3.8.0
   ```
   To reproduce:
   1. Change %PLACEHOLDERS% in the attached **airflow.cfg** to valid db connect details and paths. Place it in ~/airflow/.
   2. Create python3.7 venv (or equivalent) with the attached pip requirements.
   3. Run `airflow webserver`.
   4. Go to the Airflow web interface; the error appears before the credentials prompt.
   
   To fix:
   1. Change **sql_alchemy_pool_enabled** parameter in airflow.cfg to True.
   
   ### Operating System
   
   Ubuntu 20.04.4 LTS
   
   ### Versions of Apache Airflow Providers
   
   ```
   apache-airflow-providers-cncf-kubernetes==4.0.2
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-sqlite==2.1.3
   ```
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   Deployment type doesn't matter for this error: it reproduces both on a fully deployed stage (Airflow components running in containers built from our own Dockerfile) and on a developer's laptop, with a venv and the `airflow webserver` command.
   
   ### Anything else
   
   After further investigation we found that **create_engine** receives the following options:
   ```
   {
     'pool_size': 10, 
     'pool_recycle': 7200, 
     'poolclass': <class 'sqlalchemy.pool.impl.NullPool'>, 
     'isolation_level': 'READ COMMITTED', 
     'encoding': 'utf-8'
   }
   ```
   These **pool_size** and **pool_recycle** values weren't set by us, so they must have come from defaults.
   It seems that the error occurs in the **create_app** function (`airflow/www/app.py:71`), and that the `pool_size` parameter comes from the **apply_driver_hacks** method of the SQLAlchemy class (`flask_sqlalchemy/__init__.py:937`).
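
The conflict in the options above can be sketched as follows: when `poolclass` is `NullPool`, arguments that size a queue-style pool have nothing to apply to, and `create_engine` rejects them. The snippet is only a sketch of one possible workaround, not the fix adopted by Airflow or Flask-SQLAlchemy. The helper `strip_pool_sizing`, the `POOL_SIZING_KEYS` set, and the stand-in `NullPool` class are assumptions introduced for illustration; `pool_recycle` is left in place because the error message names only `pool_size`.

```python
class NullPool:
    """Stand-in for sqlalchemy.pool.NullPool so the sketch runs without SQLAlchemy installed."""


# Assumption: these are the sizing arguments that only make sense for a queue-style pool.
POOL_SIZING_KEYS = frozenset({"pool_size", "max_overflow", "pool_timeout"})


def strip_pool_sizing(engine_opts):
    """Return a copy of engine_opts without pool-sizing keys when poolclass is NullPool.

    Hypothetical helper; options for any other pool class are returned unchanged.
    """
    poolclass = engine_opts.get("poolclass")
    if getattr(poolclass, "__name__", "") != "NullPool":
        return dict(engine_opts)
    return {k: v for k, v in engine_opts.items() if k not in POOL_SIZING_KEYS}


# The options reported in this issue, with the stand-in class in place of the real one.
opts = {
    "pool_size": 10,
    "pool_recycle": 7200,
    "poolclass": NullPool,
    "isolation_level": "READ COMMITTED",
    "encoding": "utf-8",
}
cleaned = strip_pool_sizing(opts)
print(sorted(cleaned))  # ['encoding', 'isolation_level', 'pool_recycle', 'poolclass']
```

The helper matches on the class name so that it would also recognise the real `sqlalchemy.pool.NullPool` without importing it; a real patch would more likely compare against the imported class directly.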
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #24176: Can't disable sqlalchemy pool

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #24176:
URL: https://github.com/apache/airflow/issues/24176#issuecomment-1146636026

   Please raise this with flask_sqlalchemy; I am afraid we cannot do much about it.
   
   Converting this into a discussion. It would be great to know when you open the issue there and what results you get from it.

