Posted to issues@ratis.apache.org by "Shashikant Banerjee (Jira)" <ji...@apache.org> on 2019/12/20 11:48:00 UTC

[jira] [Created] (RATIS-785) Statemachine updater fails with assertion

Shashikant Banerjee created RATIS-785:
-----------------------------------------

             Summary: Statemachine updater fails with assertion
                 Key: RATIS-785
                 URL: https://issues.apache.org/jira/browse/RATIS-785
             Project: Ratis
          Issue Type: Bug
          Components: server
            Reporter: Shashikant Banerjee
             Fix For: 0.5.0


{code:java}
java.lang.IllegalStateException: retry cache entry should be pending: client-7E602ACF0902:70:done
        at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
        at org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
        at org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
        at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
        at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
        at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
        at java.lang.Thread.run(Thread.java:748)
2019-12-20 11:27:24,343 ERROR org.apache.ratis.server.impl.StateMachineUpdater: ed90869c-317e-4303-8922-9fa83a3983cb@group-9D552F016938-StateMachineUpdater: the StateMachineUpdater hits Throwable
java.lang.IllegalStateException: retry cache entry should be pending: client-7E602ACF0902:70:done
        at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
        at org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
        at org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
        at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
        at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
        at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
        at java.lang.Thread.run(Thread.java:748)
{code}
The issue seems to be caused by a precondition check: the reply future in the retry cache is already marked complete, whereas the check expects it to still be pending.

One possible case: if the entry gets evicted from the cache, we end up creating two different requests (two log entries) for the same client id and call id, which together form the retryCache key. If the server then restarts and starts reapplying the transactions, the entry at the earlier index may add the key to the retryCache, but when the entry at the other log index is applied, it may find the future already marked complete, since both entries share the same retry cache key.
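
To make the suspected sequence concrete, here is a minimal sketch of it. This is a simplified model, not the actual Ratis code: the class `RetryCacheSketch`, its methods, and the log indexes 10 and 11 are hypothetical and chosen only for illustration. It assumes the cache is keyed by client id plus call id and that applying a log entry completes the cached reply future.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of a retry cache; NOT org.apache.ratis.server.impl.RetryCache. */
public class RetryCacheSketch {
  // Assumed key: clientId + ":" + callId, as in "client-7E602ACF0902:70" from the trace.
  private final Map<String, CompletableFuture<String>> cache = new HashMap<>();

  /** Returns the entry for the key, creating it if absent, and requires it to be pending. */
  public CompletableFuture<String> getOrCreateEntry(String clientId, long callId) {
    final String key = clientId + ":" + callId;
    final CompletableFuture<String> reply =
        cache.computeIfAbsent(key, k -> new CompletableFuture<>());
    if (reply.isDone()) {
      throw new IllegalStateException("retry cache entry should be pending: " + key + ":done");
    }
    return reply;
  }

  /** Models applying one committed log entry: look up the entry, then complete its reply. */
  public void applyLogEntry(long index, String clientId, long callId) {
    getOrCreateEntry(clientId, callId).complete("reply@" + index);
  }

  /** Replays two log entries that share one key; returns true if the second apply fails. */
  public static boolean replayDuplicateEntries() {
    final RetryCacheSketch retryCache = new RetryCacheSketch();
    // First log entry for (client, callId): creates the cache entry and completes it.
    retryCache.applyLogEntry(10, "client-7E602ACF0902", 70);
    try {
      // Second log entry with the same (client, callId): finds the completed future.
      retryCache.applyLogEntry(11, "client-7E602ACF0902", 70);
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("second apply fails: " + replayDuplicateEntries());
  }
}
```

In this model the second apply always trips the check, because nothing distinguishes the two log entries once they map to the same key. Whether the real code path behaves the same way on replay still needs to be confirmed.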

FYI, the issue happens only after a restart.

cc [~msingh], [~ljain] [~szetszwo]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)