Posted to issues@ratis.apache.org by "Hadoop QA (Jira)" <ji...@apache.org> on 2020/01/07 11:05:00 UTC

[jira] [Commented] (RATIS-785) Statemachine updater fails with assertion

    [ https://issues.apache.org/jira/browse/RATIS-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009622#comment-17009622 ] 

Hadoop QA commented on RATIS-785:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 16s{color} | {color:orange} root: The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 19s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m  7s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.logservice.server.TestMetaServer |
|   | ratis.netty.TestRaftExceptionWithNetty |
|   | ratis.netty.TestLogAppenderWithNetty |
|   | ratis.server.raftlog.TestRaftLogMetrics |
|   | ratis.server.simulation.TestServerRestartWithSimulatedRpc |
|   | ratis.retry.TestMultipleLinearRandomRetry |
|   | ratis.server.simulation.TestRaftWithSimulatedRpc |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/ratis:date2020-01-07 |
| JIRA Issue | RATIS-785 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990084/RATIS-785.000.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  checkstyle  compile  |
| uname | Linux ff596894cdb2 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh |
| git revision | master / 66774b5 |
| maven | version: Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Default Java | 1.8.0_232 |
| checkstyle | https://builds.apache.org/job/PreCommit-RATIS-Build/1200/artifact/out/diff-checkstyle-root.txt |
| unit | https://builds.apache.org/job/PreCommit-RATIS-Build/1200/artifact/out/patch-unit-root.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-RATIS-Build/1200/testReport/ |
| Max. process+thread count | 3457 (vs. ulimit of 5000) |
| modules | C: ratis-server U: ratis-server |
| Console output | https://builds.apache.org/job/PreCommit-RATIS-Build/1200/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Statemachine updater fails with assertion
> -----------------------------------------
>
>                 Key: RATIS-785
>                 URL: https://issues.apache.org/jira/browse/RATIS-785
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>            Reporter: Sammi Chen
>            Assignee: Shashikant Banerjee
>            Priority: Major
>             Fix For: 0.5.0
>
>         Attachments: RATIS-785.000.patch
>
>
> {code:java}
> java.lang.IllegalStateException: retry cache entry should be pending: client-7E602ACF0902:70:done
>         at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
>         at org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
>         at org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
>         at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
>         at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
>         at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>         at java.lang.Thread.run(Thread.java:748)
> 2019-12-20 11:27:24,343 ERROR org.apache.ratis.server.impl.StateMachineUpdater: ed90869c-317e-4303-8922-9fa83a3983cb@group-9D552F016938-StateMachineUpdater: the StateMachineUpdater hits Throwable
> java.lang.IllegalStateException: retry cache entry should be pending: client-7E602ACF0902:70:done
>         at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
>         at org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
>         at org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
>         at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
>         at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
>         at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> The issue seems to be caused by a precondition check: the reply future in the retry cache is already marked complete at a point where the code expects it to be pending.
>  
> One possible case: if the entry gets evicted from the cache, we can end up creating two different requests (two log entries) for the same client id and call id, which together form the key to the retryCache. If the server then restarts and starts reapplying the transactions, applying the earlier index might add the entry to the retryCache, and when the apply for the other log index happens, it might find the future already marked complete, since both log entries share the same retry cache key.
> FYI, the issue happens only after a restart.
> cc [~msingh], [~ljain] [~szetszwo]
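
The suspected replay scenario described above can be illustrated with a small, self-contained Java sketch. This is not the Ratis implementation: `RetryCacheSketch`, `ToyRetryCache`, `getOrCreatePending`, and `replay` are hypothetical names invented for illustration, and the cache is reduced to a map from "clientId:callId" to a reply future. It only shows how two log entries sharing the same key could trip a strict "entry should be pending" check when the log is reapplied against an empty cache, as after a restart.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class RetryCacheSketch {

    /** Hypothetical toy cache: key is "clientId:callId", value is the reply future. */
    static final class ToyRetryCache {
        private final Map<String, CompletableFuture<String>> entries = new ConcurrentHashMap<>();

        /** Returns the entry for the key, creating it if absent; fails if it is already completed. */
        CompletableFuture<String> getOrCreatePending(String clientId, long callId) {
            String key = clientId + ":" + callId;
            CompletableFuture<String> future =
                entries.computeIfAbsent(key, k -> new CompletableFuture<>());
            if (future.isDone()) {
                // Mirrors the shape of the reported message; the real check lives in Ratis.
                throw new IllegalStateException(
                    "retry cache entry should be pending: " + key + ":done");
            }
            return future;
        }
    }

    /**
     * Reapplies a log (one call id per log entry, single client) against an
     * empty cache, as after a restart. Returns the index of the first log
     * entry that trips the pending check, or -1 if every entry applies cleanly.
     */
    public static int replay(String clientId, long[] callIdsInLogOrder) {
        ToyRetryCache cache = new ToyRetryCache();
        for (int i = 0; i < callIdsInLogOrder.length; i++) {
            try {
                // Applying a log entry completes the reply future for its key.
                cache.getOrCreatePending(clientId, callIdsInLogOrder[i]).complete("reply@" + i);
            } catch (IllegalStateException e) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Distinct call ids: every entry applies cleanly.
        System.out.println(replay("client-A", new long[] {69, 70, 71}));
        // Call id 70 was logged twice (assumed to follow an eviction): the second apply fails.
        System.out.println(replay("client-A", new long[] {69, 70, 70}));
    }
}
```

In this sketch the duplicate only fails during replay because the first apply has already completed the future stored under the shared key; whether the real code path behaves the same way would need to be confirmed against the Ratis source.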



--
This message was sent by Atlassian Jira
(v8.3.4#803005)