Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/03 13:06:00 UTC

[jira] [Updated] (KUDU-2634) token_signer-itest can get stuck when the cluster is shutting down while the leader master generates a new TSK

     [ https://issues.apache.org/jira/browse/KUDU-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke updated KUDU-2634:
------------------------------
    Component/s: test

> token_signer-itest can get stuck when the cluster is shutting down while the leader master generates a new TSK
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2634
>                 URL: https://issues.apache.org/jira/browse/KUDU-2634
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.8.0
>            Reporter: William Berkeley
>            Priority: Major
>         Attachments: token_signer-itest.log
>
>
> I saw the following sequence of events in token_signer-itest:
> 1. The test body finishes. The InternalMiniCluster is being shut down as part of cleaning up the test.
> 2. The follower masters shut down.
> 3. The leader master starts shutting down (Master::Shutdown()). The catalog manager is shutting down the background tasks (CatalogManagerBgTasks::Shutdown()), and so is joining with the bg task thread.
> 4. The bg task thread is in the middle of CatalogManagerBgTasks::Run(), where, because of the short TSK rotation intervals used by the test, it detects that it needs to generate a new TSK. It calls through to SysCatalogTable::SyncWrite() to write the new TSK.
> 5. The other two masters are already shut down, so SyncWrite() blocks forever waiting for the TSK write to replicate.
> 6. The test eventually times out because the itest thread is stuck in CatalogManagerBgTasks::Shutdown() waiting for SysCatalogTable::SyncWrite().
> Log of the failing test attached.
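
The hang described in steps 3-6 comes down to a join on a thread that is blocked in an unbounded wait. As a rough illustration only, the sketch below shows one general way such a wait can be made interruptible: the waiter watches a shutdown flag in addition to the replication signal. This is not Kudu code and not necessarily how the issue was or should be fixed; BgTaskSketch, WriteResult, AckReplication() and the other names are hypothetical stand-ins invented for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical outcome of the simulated replicated write (not a Kudu type).
enum class WriteResult { kReplicated, kAborted };

// Hypothetical stand-in for a background task that issues one replicated
// write and waits for it to be acknowledged. None of this is Kudu's API.
class BgTaskSketch {
 public:
  // Starts the background thread, which immediately blocks in the wait.
  void Start() {
    thread_ = std::thread([this] { result_ = WaitForReplication(); });
  }

  // Simulates a majority of peers acknowledging the write.
  void AckReplication() {
    {
      std::lock_guard<std::mutex> l(mu_);
      replicated_ = true;
    }
    cv_.notify_all();
  }

  // Signals shutdown before joining. Because the wait below also checks
  // shutting_down_, the join returns even if no acknowledgement ever arrives.
  WriteResult Shutdown() {
    assert(thread_.joinable());
    {
      std::lock_guard<std::mutex> l(mu_);
      shutting_down_ = true;
    }
    cv_.notify_all();
    thread_.join();  // join() makes the thread's write to result_ visible here
    return result_;
  }

 private:
  // Blocks until the write is acknowledged or shutdown is requested.
  // Waiting on replicated_ alone would reproduce the hang when all peers
  // are already gone.
  WriteResult WaitForReplication() {
    std::unique_lock<std::mutex> l(mu_);
    cv_.wait(l, [this] { return replicated_ || shutting_down_; });
    return replicated_ ? WriteResult::kReplicated : WriteResult::kAborted;
  }

  std::mutex mu_;
  std::condition_variable cv_;
  bool replicated_ = false;
  bool shutting_down_ = false;
  WriteResult result_ = WriteResult::kAborted;
  std::thread thread_;
};
```

In this sketch, calling Start() followed by Shutdown() with no acknowledgement returns WriteResult::kAborted instead of blocking, which corresponds to the situation in step 5 where the other masters are already down. A bounded wait (for example a timeout on the write) would be another option; which approach suits the real code is outside what this issue description states.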



--
This message was sent by Atlassian Jira
(v8.3.4#803005)