Posted to issues@ratis.apache.org by "Tsz-wo Sze (Jira)" <ji...@apache.org> on 2019/10/02 14:52:00 UTC

[jira] [Commented] (RATIS-692) RaftStorageDirectory.tryLock throws a very deep IOException

    [ https://issues.apache.org/jira/browse/RATIS-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942879#comment-16942879 ] 

Tsz-wo Sze commented on RATIS-692:
----------------------------------

Thanks [~clayb] for testing the patch.

r692_20191002.patch: some refactoring so that the code can be shared with RATIS-696.



> RaftStorageDirectory.tryLock throws a very deep IOException
> -----------------------------------------------------------
>
>                 Key: RATIS-692
>                 URL: https://issues.apache.org/jira/browse/RATIS-692
>             Project: Ratis
>          Issue Type: Sub-task
>          Components: server
>            Reporter: Clay B.
>            Assignee: Tsz-wo Sze
>            Priority: Major
>         Attachments: r692_20190928.patch, r692_20191002.patch
>
>
> Working with our Namazu infrastructure, the first issue I hit when dialing up the faulty I/O injection rate is as follows:
> {code}
> 2019-09-27 14:13:45 ERROR RaftStorageDirectory:336 - Failed to acquire lock on /home/vagrant/test_data/data0_slowed/64656d6f-5261-6674-4772-6f7570313233/in_use.lock. If this storage directory is mounted via NFS, ensure that the appropriate nfs lock services are running.
> java.io.IOException: Input/output error
>         at java.io.RandomAccessFile.writeBytes(Native Method)
>         at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
>         at org.apache.ratis.server.storage.RaftStorageDirectory.tryLock(RaftStorageDirectory.java:327)
>         at org.apache.ratis.server.storage.RaftStorageDirectory.lock(RaftStorageDirectory.java:291)
>         at org.apache.ratis.server.storage.RaftStorageDirectory.analyzeStorage(RaftStorageDirectory.java:264)
>         at org.apache.ratis.server.storage.RaftStorage.analyzeAndRecoverStorage(RaftStorage.java:100)
>         at org.apache.ratis.server.storage.RaftStorage.<init>(RaftStorage.java:63)
>         at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:109)
>         at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:110)
>         at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Exception in thread "main" java.io.IOException: Input/output error
>         at java.io.RandomAccessFile.writeBytes(Native Method)
>         at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
>         at org.apache.ratis.server.storage.RaftStorageDirectory.tryLock(RaftStorageDirectory.java:327)
>         at org.apache.ratis.server.storage.RaftStorageDirectory.lock(RaftStorageDirectory.java:291)
>         at org.apache.ratis.server.storage.RaftStorageDirectory.analyzeStorage(RaftStorageDirectory.java:264)
>         at org.apache.ratis.server.storage.RaftStorage.analyzeAndRecoverStorage(RaftStorage.java:100)
>         at org.apache.ratis.server.storage.RaftStorage.<init>(RaftStorage.java:63)
>         at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:109)
>         at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:110)
>         at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> However, it looks like the call chain does not retry anywhere.
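
For illustration only, a bounded retry around a lock attempt could look roughly like the sketch below. This is not the content of either attached patch and not Ratis code: the class `LockRetrySketch`, the interface `IOAction`, and the parameters `maxAttempts` and `sleepMillis` are hypothetical names. It assumes that the injected I/O error is transient and that repeating the attempt is safe.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of Ratis. */
final class LockRetrySketch {
  /** An action that may fail with an IOException, such as a lock attempt. */
  @FunctionalInterface
  interface IOAction<T> {
    T run() throws IOException;
  }

  private LockRetrySketch() {}

  /**
   * Runs the action up to maxAttempts times, sleeping sleepMillis after each
   * failed attempt except the last. If every attempt fails, the IOException
   * from the final attempt is rethrown to the caller.
   */
  static <T> T retry(int maxAttempts, long sleepMillis, IOAction<T> action)
      throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1: " + maxAttempts);
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.run();
      } catch (IOException e) {
        // Remember the failure; back off only if another attempt remains.
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(sleepMillis);
        }
      }
    }
    throw last;
  }
}
```

A caller would wrap the failing step, for example `LockRetrySketch.retry(3, 100L, () -> attemptLock())`, where `attemptLock` stands in for whatever performs the lock-file write. Whether retrying is appropriate at this level, rather than failing fast with a clearer exception, is a design decision for the patch itself.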



--
This message was sent by Atlassian Jira
(v8.3.4#803005)