Posted to dev@kafka.apache.org by "Rajini Sivaram (Jira)" <ji...@apache.org> on 2020/03/04 09:38:00 UTC

[jira] [Resolved] (KAFKA-9632) Transient test failure: PartitionLockTest.testAppendReplicaFetchWithUpdateIsr

     [ https://issues.apache.org/jira/browse/KAFKA-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajini Sivaram resolved KAFKA-9632.
-----------------------------------
    Fix Version/s: 2.6.0
         Reviewer: Manikumar
       Resolution: Fixed

> Transient test failure: PartitionLockTest.testAppendReplicaFetchWithUpdateIsr
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-9632
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9632
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.5.0
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.6.0
>
>
> When this test is run with _numRecordsPerProducer=500_, it fails intermittently. The test uses MockTime and runs concurrent log operations, which can cause problems when a segment is rolled because Log and MockScheduler do not work well together: MockScheduler currently runs tasks while holding the MockScheduler lock. This can deadlock if a thread attempts to schedule a task while holding a lock that is also acquired within a scheduled task.
> The issue in this test occurs when these two operations happen concurrently:
> 1) LogManager.cleanupLogs is a scheduled task that acquires the Log lock. When run with MockScheduler, the thread holds the MockScheduler lock and then attempts to acquire the Log lock.
> 2) Partition.appendLogsToLeader holds the Log lock and attempts to acquire the MockScheduler lock in order to schedule a roll().
> Since the locking order is reversed in 1) and 2), this causes a deadlock.
> The test itself can easily be fixed by avoiding roll() in the test, but it would be good to fix MockScheduler so that it can be used in this case.
>  
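
The lock-order inversion described above can be illustrated with a small,
self-contained sketch. This is not Kafka's MockScheduler and not necessarily
the fix that was committed for this issue; SketchScheduler, queueLock, logLock
and runScenario are hypothetical names chosen for illustration. The sketch
assumes one possible remedy: due tasks are removed from the queue while the
scheduler lock is held, but are run only after that lock has been released,
so a task that needs the Log lock never waits for it while holding the
scheduler lock.

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a mock scheduler; not Kafka's MockScheduler. */
final class SketchScheduler {
    private static final class Task {
        final long dueMs;
        final Runnable body;
        Task(long dueMs, Runnable body) { this.dueMs = dueMs; this.body = body; }
    }

    private final Object queueLock = new Object();
    private final PriorityQueue<Task> queue =
        new PriorityQueue<>(Comparator.comparingLong((Task t) -> t.dueMs));

    /** Enqueue a task; the scheduler lock is held only for the queue update. */
    void schedule(long dueMs, Runnable body) {
        synchronized (queueLock) { queue.add(new Task(dueMs, body)); }
    }

    /** Run every task due at or before nowMs, each one outside the scheduler lock. */
    void tick(long nowMs) {
        while (true) {
            Task next;
            synchronized (queueLock) {
                Task head = queue.peek();
                if (head == null || head.dueMs > nowMs) return;
                next = queue.poll();
            }
            next.body.run(); // scheduler lock is NOT held here
        }
    }

    int pending() {
        synchronized (queueLock) { return queue.size(); }
    }

    /**
     * Replays the two operations from the report: a scheduled task that takes the
     * "log" lock, and a thread that holds the "log" lock while scheduling a task.
     * Returns true if both complete.
     */
    static boolean runScenario() {
        SketchScheduler scheduler = new SketchScheduler();
        ReentrantLock logLock = new ReentrantLock(); // stands in for the Log lock
        CountDownLatch taskStarted = new CountDownLatch(1);

        // 1) Scheduled task that needs the log lock (like a cleanup task).
        scheduler.schedule(0, () -> {
            taskStarted.countDown();
            logLock.lock();
            logLock.unlock();
        });

        Thread ticker = new Thread(() -> scheduler.tick(0));
        ticker.setDaemon(true);

        // 2) This thread holds the log lock and schedules a task (like a roll).
        logLock.lock();
        try {
            ticker.start();
            if (!taskStarted.await(5, TimeUnit.SECONDS)) return false;
            scheduler.schedule(1000, () -> { });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            logLock.unlock();
        }

        try {
            ticker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !ticker.isAlive() && scheduler.pending() == 1;
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runScenario());
    }
}
```

If tick() instead called next.body.run() inside the synchronized block, the
second schedule() call in runScenario would block on queueLock while the task
blocks on logLock, which is the reversed locking order from 1) and 2) above.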



--
This message was sent by Atlassian Jira
(v8.3.4#803005)