Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/07 13:32:22 UTC
[GitHub] [airflow] mattinbits opened a new issue #10221: Deadlock Exception with MSSQL as backend DB
mattinbits opened a new issue #10221:
URL: https://github.com/apache/airflow/issues/10221
**Apache Airflow version**: 1.10.9
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
**Environment**:
- **Cloud provider or hardware configuration**: On-premise infrastructure. The scheduler runs in a CentOS 7 Docker image on a RHEL 7 server. The database is SQL Server 2016.
- **OS** (e.g. from /etc/os-release): CentOS Linux 7 (Core)
- **Kernel** (e.g. `uname -a`): 3.10.0-1127.13.1.el7.x86_64
- **Install tools**:
- **Others**:
**What happened**:
We have a DAG in which several file sensors wait for similar files in parallel, using "reschedule" mode. Periodically, one or more of these tasks fail. The logs show a deadlock reported by the database:
```
sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('40001', '[40001] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Transaction (Process ID 111) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. (1205) (SQLExecDirectW)')
[SQL: SELECT count(*) AS count_1
FROM task_instance
WHERE task_instance.pool = ? AND task_instance.state IN (?, ?)]
[parameters: ('default_pool', 'running', 'queued')]
(Background on this error at: http://sqlalche.me/e/dbapi)
```
Checking this query directly using SSMS, I can see it executes immediately and uses the ti_pool index.
And the associated stack trace:
```
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 37, in <module>
    args.func(args)
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 75, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 545, in run
    _run(args, dag, ti)
  File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 460, in _run
    run_job.run()
  File "/usr/local/lib/python3.7/site-packages/airflow/jobs/base_job.py", line 221, in run
    self._execute()
  File "/usr/local/lib/python3.7/site-packages/airflow/jobs/local_task_job.py", line 90, in _execute
    pool=self.pool):
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 851, in _check_and_change_state_before_execution
    verbose=True):
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 70, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 644, in are_dependencies_met
    session=session):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 668, in get_failed_dep_statuses
    dep_context):
  File "/usr/local/lib/python3.7/site-packages/airflow/ti_deps/deps/base_ti_dep.py", line 106, in get_dep_statuses
    for dep_status in self._get_dep_statuses(ti, session, dep_context):
  File "/usr/local/lib/python3.7/site-packages/airflow/ti_deps/deps/pool_slots_available_dep.py", line 62, in _get_dep_statuses
    open_slots = pools[0].open_slots()
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/pool.py", line 113, in open_slots
    return self.slots - self.occupied_slots(session)
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 70, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/pool.py", line 70, in occupied_slots
    .filter(TaskInstance.state.in_(STATES_TO_COUNT_AS_RUNNING))
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3469, in scalar
    ret = self.one()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3436, in one
    ret = self.one_or_none()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3405, in one_or_none
    ret = list(self)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3481, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3506, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1020, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_clauseelement
    distilled_params,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1324, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1518, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
    raise exception
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1284, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
    cursor.execute(statement, parameters)
```
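The SQL Server message itself says to rerun the transaction. As an illustration only, here is a minimal sketch of a retry wrapper that backs off and retries when the error carries SQLSTATE `40001`. This is not something Airflow 1.10.9 is known to do for this query; the names `is_deadlock` and `run_with_deadlock_retry` are hypothetical, and matching on the string `40001` is a heuristic assumption rather than a driver-provided check.

```python
import time

# SQLSTATE reported in the error message above.
DEADLOCK_SQLSTATE = "40001"


def is_deadlock(exc):
    """Heuristic (assumption): treat any exception mentioning SQLSTATE 40001 as a deadlock."""
    args = getattr(exc, "args", ())
    return any(DEADLOCK_SQLSTATE in str(a) for a in args) or DEADLOCK_SQLSTATE in str(exc)


def run_with_deadlock_retry(fn, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); if it fails as a deadlock victim, wait and try again.

    The delay doubles after each failed attempt. Any other error, or a
    deadlock on the final attempt, is re-raised unchanged.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception as exc:
            if not is_deadlock(exc) or attempt == retries:
                raise
            sleep(base_delay * (2 ** attempt))
```

With SQLAlchemy, the callable passed as `fn` would also need to roll back the session before the query is reissued, since a session whose transaction failed cannot be reused until `session.rollback()` is called.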
**What you expected to happen**:
The tasks succeed/reschedule as expected.
**How to reproduce it**:
I have struggled to reproduce this outside our production environment. I suspect it is related to the size of the task_instance table, which makes it hard to reproduce locally on a clean instance of Airflow.
**Anything else we need to know**:
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] mik-laj commented on issue #10221: Deadlock Exception with MSSQL as backend DB
Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #10221:
URL: https://github.com/apache/airflow/issues/10221#issuecomment-701095136
MSSQL is not officially supported by Airflow. See: https://github.com/apache/airflow/issues/10713
[GitHub] [airflow] boring-cyborg[bot] commented on issue #10221: Deadlock Exception with MSSQL as backend DB
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #10221:
URL: https://github.com/apache/airflow/issues/10221#issuecomment-670518366
Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] potiuk closed issue #10221: Deadlock Exception with MSSQL as backend DB
Posted by GitBox <gi...@apache.org>.
potiuk closed issue #10221:
URL: https://github.com/apache/airflow/issues/10221
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org