Posted to issues@ratis.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2019/06/17 14:09:00 UTC
[jira] [Updated] (RATIS-591) All create log requests RPCs blocked
[ https://issues.apache.org/jira/browse/RATIS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Elser updated RATIS-591:
-----------------------------
Attachment: master-3.txt
master-2.txt
master-1.txt
> All create log requests RPCs blocked
> ------------------------------------
>
> Key: RATIS-591
> URL: https://issues.apache.org/jira/browse/RATIS-591
> Project: Ratis
> Issue Type: Bug
> Components: LogService
> Reporter: Josh Elser
> Assignee: Vladimir Rodionov
> Priority: Major
> Attachments: master-1.txt, master-2.txt, master-3.txt
>
>
> I was trying out Rajeshbabu's new changes from RATIS-541 using the Docker automation, but I gave invalid options the first time, which caused the workers to exit (divide by zero).
> When I tried to rerun the VerificationTool, I found that the tool got stuck waiting for logs to be created. A thread dump from the active leader of the metadata quorum showed 150+ threads all stuck waiting to acquire a write lock. However, no thread appears to hold the lock that everyone is waiting on, which looks to me like a deadlock.
> It seems like we have some kind of bug where we orphan a lock that's still held. This doesn't happen normally, which makes me wonder whether it can happen when the leader changes. I'll attach the logs of the metadata quorum nodes from my local test. That said, I bet this could be reproduced under adequate load.
> Can you take a look into this, Vlad?
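For readers unfamiliar with the symptom described above (many threads waiting on a write lock that no live thread holds), here is a minimal sketch of one generic way a lock can end up orphaned in Java. It is not taken from the Ratis LogService code and is not a diagnosis of this issue. It assumes a java.util.concurrent ReentrantReadWriteLock, and the class, method, and thread names are hypothetical. A worker takes the write lock without try/finally, dies on a divide by zero, and never releases the lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not code from Ratis.
public class OrphanedLockSketch {

  /** Returns true if the write lock is still held after its owning thread has died. */
  static boolean writeLockOrphaned() throws InterruptedException {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    Thread worker = new Thread(() -> {
      lock.writeLock().lock();
      // Bug being illustrated: no try/finally, so the exception below skips unlock().
      int zero = 0;
      int unused = 1 / zero;      // throws ArithmeticException
      lock.writeLock().unlock();  // never reached
    }, "hypothetical-worker");
    // Swallow the expected exception so the output stays quiet.
    worker.setUncaughtExceptionHandler((t, e) -> { });
    worker.start();
    worker.join();

    // The worker thread is gone, but the lock still records a write hold.
    boolean stillLocked = lock.isWriteLocked();
    // Any other thread now waits; a timed attempt shows it cannot get in.
    boolean acquired = lock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
    if (acquired) {
      lock.writeLock().unlock();
    }
    return stillLocked && !acquired;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("write lock orphaned: " + writeLockOrphaned());
  }
}
```

Wrapping the critical section in try/finally with unlock() in the finally block avoids this particular failure mode. Whether anything similar is happening in RATIS-591 would have to be confirmed from the attached thread dumps and logs.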
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)