You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@aurora.apache.org by "Bill Farner (JIRA)" <ji...@apache.org> on 2015/07/16 19:25:04 UTC
[jira] [Created] (AURORA-1401) DB deadlock caused stuck job update
Bill Farner created AURORA-1401:
-----------------------------------
Summary: DB deadlock caused stuck job update
Key: AURORA-1401
URL: https://issues.apache.org/jira/browse/AURORA-1401
Project: Aurora
Issue Type: Bug
Components: Scheduler
Reporter: Bill Farner
We observed the symptom of a job update that was stuck in ROLLING_FORWARD, but was not active. Looking in scheduler logs revealed a DB-level deadlock:
{noformat}
E0715 19:04:53.085 THREAD1471 org.apache.aurora.scheduler.updater.JobUpdateEventSubscriber.startUp: Failed to resume job updates: org.apache.aurora.scheduler.storage.Storage$StorageException:
### Error querying database. Cause: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
### The error may exist in org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### SQL: SELECT lock_token FROM job_update_locks AS l INNER JOIN job_updates u ON l.update_row_id = u.id INNER JOIN job_keys AS j ON j.id = u.job_key_id WHERE u.update_id = ? AND j.role = ? AND j.environment = ? AND j.name = ?
### Cause: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
org.apache.aurora.scheduler.storage.Storage$StorageException:
### Error querying database. Cause: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
### The error may exist in org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### SQL: SELECT lock_token FROM job_update_locks AS l INNER JOIN job_updates u ON l.update_row_id = u.id INNER JOIN job_keys AS j ON j.id = u.job_key_id WHERE u.update_id = ? AND j.role = ? AND j.environment = ? AND j.name = ?
### Cause: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
at org.apache.aurora.scheduler.storage.db.DbStorage.write(DbStorage.java:145)
at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101)
at com.twitter.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:87)
at org.apache.aurora.scheduler.storage.log.LogStorage.doInTransaction(LogStorage.java:602)
at org.apache.aurora.scheduler.storage.log.LogStorage.write(LogStorage.java:635)
at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.write(CallOrderEnforcingStorage.java:130)
at org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.systemResume(JobUpdateControllerImpl.java:263)
at org.apache.aurora.scheduler.updater.JobUpdateEventSubscriber.startUp(JobUpdateEventSubscriber.java:85)
at com.google.common.util.concurrent.AbstractIdleService$2$1.run(AbstractIdleService.java:54)
at com.google.common.util.concurrent.Callables$3.run(Callables.java:93)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
### The error may exist in org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### SQL: SELECT lock_token FROM job_update_locks AS l INNER JOIN job_updates u ON l.update_row_id = u.id INNER JOIN job_keys AS j ON j.id = u.job_key_id WHERE u.update_id = ? AND j.role = ? AND j.environment = ? AND j.name = ?
### Cause: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:122)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:113)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectOne(DefaultSqlSession.java:73)
at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.ibatis.session.SqlSessionManager$SqlSessionInterceptor.invoke(SqlSessionManager.java:334)
at com.sun.proxy.$Proxy57.selectOne(Unknown Source)
at org.apache.ibatis.session.SqlSessionManager.selectOne(SqlSessionManager.java:165)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:69)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:53)
at org.apache.aurora.scheduler.storage.db.$Proxy64.selectLockToken(Unknown Source)
at org.apache.aurora.scheduler.storage.db.DbJobUpdateStore.getLockToken(DbJobUpdateStore.java:257)
at com.twitter.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:87)
at org.apache.aurora.scheduler.storage.ForwardingStore.getLockToken(ForwardingStore.java:178)
at org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.changeJobUpdateStatus(JobUpdateControllerImpl.java:454)
at org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.access$1000(JobUpdateControllerImpl.java:108)
at org.apache.aurora.scheduler.updater.JobUpdateControllerImpl$4.execute(JobUpdateControllerImpl.java:283)
at org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:131)
at org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult$Quiet.apply(Storage.java:148)
at org.apache.aurora.scheduler.storage.log.LogStorage$24.apply(LogStorage.java:605)
at org.apache.aurora.scheduler.storage.log.LogStorage$24.apply(LogStorage.java:602)
at org.apache.aurora.scheduler.storage.db.DbStorage.write(DbStorage.java:143)
... 10 more
Caused by: org.h2.jdbc.JdbcSQLException: Deadlock detected. The current transaction was rolled back. Details: "
Session #3 (user: ) on thread RowGarbageCollector RUNNING is waiting to lock PUBLIC.LOCKS (shared) while locking PUBLIC.JOB_KEYS (exclusive).
Session #2 (user: ) on thread JobUpdateEventSubscriber STARTING is waiting to lock PUBLIC.JOB_KEYS (shared) while locking PUBLIC.JOB_INSTANCE_UPDATE_EVENTS (exclusive), PUBLIC.JOB_UPDATE_EVENTS (exclusive), PUBLIC.LOCKS (exclusive), PUBLIC.JOB_UPDATE_LOCKS (exclusive)."; SQL statement:
SELECT
lock_token
FROM job_update_locks AS l
INNER JOIN job_updates u ON l.update_row_id = u.id
INNER JOIN job_keys AS j ON j.id = u.job_key_id
WHERE
u.update_id = ?
AND j.role = ?
AND j.environment = ?
AND j.name = ? [40001-187]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.table.RegularTable.doLock1(RegularTable.java:496)
at org.h2.table.RegularTable.lock(RegularTable.java:472)
at org.h2.table.TableFilter.lock(TableFilter.java:146)
at org.h2.command.dml.Select.queryWithoutCache(Select.java:671)
at org.h2.command.dml.Query.query(Query.java:322)
at org.h2.command.dml.Query.query(Query.java:290)
at org.h2.command.dml.Query.query(Query.java:36)
at org.h2.command.CommandContainer.query(CommandContainer.java:90)
at org.h2.command.Command.executeQuery(Command.java:197)
at org.h2.jdbc.JdbcPreparedStatement.execute(JdbcPreparedStatement.java:192)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:62)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:78)
at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:62)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:303)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:154)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:102)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:82)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:120)
... 32 more
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)