You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Giovanni Matteo Fumarola (JIRA)" <ji...@apache.org> on 2018/06/07 22:02:00 UTC

[jira] [Assigned] (YARN-8337) [FederationStateStore - MySql] Deadlock In addApplicationHome

     [ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola reassigned YARN-8337:
----------------------------------------------

    Assignee: Giovanni Matteo Fumarola

> [FederationStateStore - MySql] Deadlock In addApplicationHome
> -------------------------------------------------------------
>
>                 Key: YARN-8337
>                 URL: https://issues.apache.org/jira/browse/YARN-8337
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: federation, router
>            Reporter: Jianchao Jia
>            Assignee: Giovanni Matteo Fumarola
>            Priority: Major
>         Attachments: YARN-8337.001.patch, YARN-8337.002.patch
>
>
> We use mysql innodb as the state store engine,in router log we found dead lock error like below:
> {code:java}
> [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : Unable to insert the newly generated application application_1526295230627_127402
> com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
>         at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown Source)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
>         at com.mysql.jdbc.Util.getInstance(Util.java:408)
>         at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
>         at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973)
>         at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909)
>         at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)
>         at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680)
>         at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484)
>         at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858)
>         at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079)
>         at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013)
>         at com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104)
>         at com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418)
>         at com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887)
>         at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61)
>         at com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java)
>         at org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547)
> {code}
> Use "show engine innodb status;" command to find what happens 
> {code:java}
> 2018-05-21 15:41:40 7f4685870700
> *** (1) TRANSACTION:
> TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB 4999
> mysql tables in use 2, locked 2
> LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s)
> MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 192.168.1.138 federation executing
> INSERT INTO applicationsHomeSubCluster
> (applicationId,homeSubCluster)
> (SELECT applicationId_IN, homeSubCluster_IN
> FROM applicationsHomeSubCluster
> WHERE applicationId = applicationId_IN
> HAVING COUNT(*) = 0 )
> *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
> RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 lock_mode X locks gap before rec insert intention waiting
> Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
> 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes);
> 1: len 6; hex 00000ba5f32d; asc -;;
> 2: len 7; hex dd000000280110; asc ( ;;
> 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;;
> *** (2) TRANSACTION:
> TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB 4999
> mysql tables in use 2, locked 2
> 4 lock struct(s), heap size 1184, 2 row lock(s)
> MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 192.168.1.138 federation executing
> INSERT INTO applicationsHomeSubCluster
> (applicationId,homeSubCluster)
> (SELECT applicationId_IN, homeSubCluster_IN
> FROM applicationsHomeSubCluster
> WHERE applicationId = applicationId_IN
> HAVING COUNT(*) = 0 )
> *** (2) HOLDS THE LOCK(S):
> RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131539 lock mode S locks gap before rec
> Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
> 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes);
> 1: len 6; hex 00000ba5f32d; asc -;;
> 2: len 7; hex dd000000280110; asc ( ;;
> 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;;
> *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
> RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131539 lock_mode X locks gap before rec insert intention waiting
> Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
> 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes);
> 1: len 6; hex 00000ba5f32d; asc -;;
> 2: len 7; hex dd000000280110; asc ( ;;
> 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;;
> *** WE ROLL BACK TRANSACTION (2)
> {code}
> PROCEDURE sp_addApplicationHomeSubCluster will create gap lock .
> insert into select where clause,if the applicationId does not exists in table applicationsHomeSubCluster ,it will create a gap lock。
> At this moment ,if other threads want to insert new records of the gap,deaklock may happen.
>  
> To reproduce the situation , we use 3 sessions to insert different applicationId :application_1526528662010_001201,application_1526528662010_001202,application_1526528662010_001203
> !http://bit.jd.com/zhangmang/JDHadoop-2.7.1/uploads/78912e6d245b8009052fb24e73cbaf54/image.png!
>  
> To fix this issue,we should use insert ignore into  instead of insert into select 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org