You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@sentry.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2019/01/25 22:36:00 UTC
[jira] [Commented] (SENTRY-2491) Sentry High availability unit
tests run into deadlock sometimes
[ https://issues.apache.org/jira/browse/SENTRY-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752760#comment-16752760 ]
Hadoop QA commented on SENTRY-2491:
-----------------------------------
Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12956400/SENTRY-2491.002.patch against master.
{color:red}Overall:{color} -1 due to 2 errors
{color:red}ERROR:{color} mvn test exited 1
{color:red}ERROR:{color} Failed: org.apache.sentry.hdfs.TestSentryHDFSServiceProcessor
Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/4337/console
This message is automatically generated.
> Sentry High availability unit tests run into deadlock sometimes
> ---------------------------------------------------------------
>
> Key: SENTRY-2491
> URL: https://issues.apache.org/jira/browse/SENTRY-2491
> Project: Sentry
> Issue Type: Bug
> Components: Sentry
> Affects Versions: 2.2.0
> Reporter: Na Li
> Assignee: Na Li
> Priority: Major
> Attachments: SENTRY-2491.001.patch, SENTRY-2491.002.patch
>
>
> In sentry unit tests, we don't create schema before running a test. Instead, we use dataNucleus to create sentry tables when they are accessed. This creates potential deadlock when running test for Sentry HA setup.
> For example, the following shows the event sequence that causes deadlock
> 1) thread_1 gets shared lock of SYSTABLES in order to read table SENTRY_HMS_NOTIFICATION_ID
> 2) thread_2 gets shared lock of SYSTABLES in order to read table SENTRY_HMS_NOTIFICATION_ID
> 3) thread_1 tries to get execution lock to create table SENTRY_HMS_NOTIFICATION_ID,
> and wait for execution lock because thread_2 got shared lock already.
> 4) thread_2 tries to get execution lock to create table SENTRY_HMS_NOTIFICATION_ID,
> and wait for execution lock because thread_1 got shared lock already.
> The solution is to let the instances of sentry service start with delay. Specifically,
> let HMS follower threads separate as far as possible, i.e., half of the interval.
>
> This deadlock only exists in unit tests, and does not exist in production because schema is created before starting Sentry services. Therefore, there is no table creation after service starts.
> Example of the log message when such deadlock happens
> {code}
> 2019-01-04 18:32:46,332 (pool-13-thread-1) [INFO - org.apache.sentry.hdfs.DBUpdateForwarder.getAllUpdatesFrom(DBUpdateForwarder.java:140)] (org.apache.sentry.hdfs.PermImageRetriever) Requested sequence number 0 is less than 0 or requested deltas for that sequence number are not available. Fetch a full update
> 2019-01-04 18:32:49,346 (hms-follower) [DEBUG - org.apache.sentry.service.thrift.SentryStateBank.enableState(SentryStateBank.java:102)] HMSFollower entered state STARTED
> 2019-01-04 18:32:50,091 (hms-follower) [DEBUG - org.apache.sentry.service.thrift.SentryStateBank.enableState(SentryStateBank.java:102)] HMSFollower entered state STARTED
> 2019-01-04 18:33:00,286 (store-cleaner) [ERROR - org.datanucleus.util.Log4JLogger.error(Log4JLogger.java:115)] Error thrown executing CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
> (
> NOTIFICATION_ID BIGINT NOT NULL
> ) : A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, SYSTABLES, (1,3)
> Waiting XID : {436, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
> (
> NOTIFICATION_ID BIGINT NOT NULL
> )
> Granted XID : {419, S}
> Lock : ROW, SYSTABLES, (1,3)
> Waiting XID : {419, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
> (
> NOTIFICATION_ID BIGINT NOT NULL
> )
> Granted XID : {419, S} , {436, S}
> . The selected victim is XID : 436.
> java.sql.SQLTransactionRollbackException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, SYSTABLES, (1,3)
> Waiting XID : {436, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
> (
> NOTIFICATION_ID BIGINT NOT NULL
> )
> Granted XID : {419, S}
> Lock : ROW, SYSTABLES, (1,3)
> Waiting XID : {419, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
> (
> NOTIFICATION_ID BIGINT NOT NULL
> )
> Granted XID : {419, S} , {436, S}
> . The selected victim is XID : 436.
> at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
> at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
> at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:879)
> at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:830)
> at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:546)
> at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:609)
> at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3385)
> at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2896)
> at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119)
> at
> ...[truncated 36339941 chars]...
> eImpl(RetryClientInvocationHandler.java:94)] Calling getAllUpdatesFrom
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)