Posted to issues@ignite.apache.org by "Roman Puchkovskiy (Jira)" <ji...@apache.org> on 2022/11/30 10:49:00 UTC

[jira] [Created] (IGNITE-18292) Ignite 3 cluster sometimes hangs making KeyValueViewPocoTests.TestContains fail

Roman Puchkovskiy created IGNITE-18292:
------------------------------------------

             Summary: Ignite 3 cluster sometimes hangs making KeyValueViewPocoTests.TestContains fail
                 Key: IGNITE-18292
                 URL: https://issues.apache.org/jira/browse/IGNITE-18292
             Project: Ignite
          Issue Type: Bug
            Reporter: Roman Puchkovskiy
            Assignee: Roman Puchkovskiy
             Fix For: 3.0.0-beta2


The corresponding test fails regularly: [https://ci.ignite.apache.org/test/-8907008159462326767?currentProjectId=ApacheIgnite3xGradle_Test&expandTestHistoryChartSection=true&orderBy=status&order=desc]

The following can be found in the log of build #6620 [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunNetTests/6935941?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildProblemsSection=true&expandBuildChangesSection=true&expandBuildTestsSection=true&showLog=6935941_11284_971.988&logFilter=debug&logView=flowAware]

[16:47:18] :         [dotnet test] java.lang.NullPointerException
[16:47:18] :         [dotnet test]     at org.apache.ignite.network.DefaultMessagingService.send0(DefaultMessagingService.java:146)
[16:47:18] :         [dotnet test]     at org.apache.ignite.network.DefaultMessagingService.respond(DefaultMessagingService.java:124)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler$1.sendResponse(IgniteRpcServer.java:183)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor.processRequest0(AppendEntriesRequestProcessor.java:430)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor.processRequest0(AppendEntriesRequestProcessor.java:41)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.impl.core.NodeRequestProcessor.processRequest(NodeRequestProcessor.java:55)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:49)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:29)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.lambda$onReceived$0(IgniteRpcServer.java:195)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.util.concurrent.MpscSingleThreadExecutor$Worker.runTask(MpscSingleThreadExecutor.java:354)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.util.concurrent.MpscSingleThreadExecutor$Worker.run(MpscSingleThreadExecutor.java:338)
[16:47:18] :         [dotnet test]     at org.apache.ignite.raft.jraft.util.concurrent.MpscSingleThreadExecutor.lambda$doStartWorker$3(MpscSingleThreadExecutor.java:262)
[16:47:18] :         [dotnet test]     at java.base/java.lang.Thread.run(Thread.java:834)

Another interesting part of the log is this:

[16:47:09] :         [dotnet test] INFO: [default:org.apache.ignite.internal.runner.app.PlatformTestNodeRunner:cf67f7bc-4fe0-451a-ad5d-171e0e3f2b5f@172.120.0.3:3344] Member leaved without notification: default:org.apache.ignite.internal.runner.app.PlatformTestNodeRunner_2:3cf96027-2ac8-4572-a386-746b46f92cbb@172.120.0.3:3345

and then

[16:47:09] :         [dotnet test] INFO: [default:org.apache.ignite.internal.runner.app.PlatformTestNodeRunner_2:3cf96027-2ac8-4572-a386-746b46f92cbb@172.120.0.3:3345] Member leaved without notification: default:org.apache.ignite.internal.runner.app.PlatformTestNodeRunner:cf67f7bc-4fe0-451a-ad5d-171e0e3f2b5f@172.120.0.3:3344

It looks like the two nodes comprising the cluster lost each other and were never able to see each other again, even though they were running on the same machine (just on different ports). Could this be a network-related problem?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)