Posted to issues@ratis.apache.org by "Clay B. (JIRA)" <ji...@apache.org> on 2019/07/21 22:36:00 UTC
[jira] [Comment Edited] (RATIS-485) Load Generator OOMs if Ratis Unavailable
[ https://issues.apache.org/jira/browse/RATIS-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889833#comment-16889833 ]
Clay B. edited comment on RATIS-485 at 7/21/19 10:35 PM:
---------------------------------------------------------
This can still be reproduced:
* Run the Vagrant test-harness start-up:
** {{cd ./incubator-ratis/dev-support/vagrant; ./run_all_tests.sh build}}
** Wait for build completion
** {{vagrant up ratis-servers}}
** {{vagrant ssh ratis-servers}}
* Twiddle Ratis Daemons/Clients
** {{screen -x}}
* Kill all servers
** {{Ctrl-A 0}} -- switch to First Server
** {{Ctrl-C}} -- kill Server
** Repeat for Server 2 and 3 -- i.e. {{Ctrl-A 1}}; {{Ctrl-C}}; {{Ctrl-A 2}}; {{Ctrl-C}} (all servers should now be dead)
* OOM Load Generator
** Switch to Load Generator -- {{Ctrl-A 3}}
** Re-launch Load Generator -- {{Return}}{{r}}
** Wait for a large volume of log output; eventually the client fails with {{java.lang.OutOfMemoryError: unable to create new native thread}}
was (Author: clayb):
This can still be reproduced:
* Run the Vagrant test-harness start-up:
** {{cd ./incubator-ratis/dev-support/vagrant; ./run_all_tests.sh build}}
** Wait for build completion
** {{vagrant up ratis-servers}}
** {{vagrant ssh ratis-servers}}
** {{screen -x}}
** {{Ctrl-A 0}} -- switch to First Server
** {{Ctrl-C}} -- kill Server
** Repeat for Server 2 and 3 -- i.e. {{Ctrl-A 1}}; {{Ctrl-C}}; {{Ctrl-A 2}}; {{Ctrl-C}} (all servers should now be dead)
** Switch to Load Generator -- {{Ctrl-A 3}}
** Re-launch Load Generator -- {{Return}}{{r}}
** Wait for LOTS of spew; eventually, unable to launch more threads...
> Load Generator OOMs if Ratis Unavailable
> ----------------------------------------
>
> Key: RATIS-485
> URL: https://issues.apache.org/jira/browse/RATIS-485
> Project: Ratis
> Issue Type: Bug
> Components: examples
> Reporter: Clay B.
> Priority: Trivial
>
> Running the load generator without a Ratis cluster (e.g. spurious node IPs) results in an OOM.
> If one has a single Ratis server it tries seemingly indefinitely:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1{code}
> If one has two Ratis servers it OOMs:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1,n1:127.0.0.1:2
> [...]
> 1/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2019-02-14 07:47:22 DEBUG RaftClient:417 - client-272A2E13A5DD: suggested new leader: null. Failed RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2019-02-14 07:47:22 DEBUG RaftClient:437 - client-272A2E13A5DD: change Leader from n1 to n0
> 2019-02-14 07:47:22 DEBUG RaftClient:291 - schedule attempt #10740 with policy RetryForeverNoSleep for RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:323 - client-272A2E13A5DD: send* RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:338 - client-272A2E13A5DD: Failed RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
> Exception in thread "main" java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
> at org.apache.ratis.client.impl.RaftClientImpl.lambda$sendRequestAsync$14(RaftClientImpl.java:349)
> at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
> at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
> at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
> at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:334)
> at org.apache.ratis.client.impl.RaftClientImpl.sendRequestWithRetryAsync(RaftClientImpl.java:286)
> at org.apache.ratis.util.SlidingWindow$Client.sendOrDelayRequest(SlidingWindow.java:243)
> at org.apache.ratis.util.SlidingWindow$Client.retry(SlidingWindow.java:259)
> at org.apache.ratis.client.impl.RaftClientImpl.lambda$null$10(RaftClientImpl.java:293)
> at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$0(TimeoutScheduler.java:85)
> at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$1(TimeoutScheduler.java:104)
> at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:50)
> at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:91)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603)
> at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
> at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
> at org.apache.ratis.util.TimeoutScheduler.schedule(TimeoutScheduler.java:117)
> at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:104)
> at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:82)
> at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:134)
> at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.onNext(GrpcClientProtocolClient.java:234)
> at org.apache.ratis.grpc.client.GrpcClientRpc.sendRequestAsync(GrpcClientRpc.java:71)
> at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:324)
> ... 15 more
> {code}
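For discussion, here is a minimal sketch of one possible mitigation. The log above shows the policy RetryForeverNoSleep at attempt #10740, and the OutOfMemoryError is raised while ScheduledThreadPoolExecutor tries to add a worker thread under TimeoutScheduler.schedule. The sketch below caps the number of attempts, sleeps between them, and runs every retry on a single shared scheduler so that the thread count stays fixed. It is not the Ratis API and not a confirmed root-cause fix: the class BoundedRetry, the method retry, and the parameters maxAttempts and sleepMillis are hypothetical names chosen for illustration, and the failing request is simulated.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of Ratis. */
final class BoundedRetry {
  private BoundedRetry() {}

  /**
   * Runs attempt up to maxAttempts times, waiting sleepMillis between attempts.
   * Every retry is scheduled on the one scheduler passed in, so no new
   * executor (and no new thread pool) is created per attempt.
   */
  static <T> CompletableFuture<T> retry(Supplier<CompletableFuture<T>> attempt,
      int maxAttempts, long sleepMillis, ScheduledExecutorService scheduler) {
    CompletableFuture<T> result = new CompletableFuture<>();
    tryOnce(attempt, 1, maxAttempts, sleepMillis, scheduler, result);
    return result;
  }

  private static <T> void tryOnce(Supplier<CompletableFuture<T>> attempt, int n,
      int maxAttempts, long sleepMillis, ScheduledExecutorService scheduler,
      CompletableFuture<T> result) {
    CompletableFuture<T> f;
    try {
      f = attempt.get();
    } catch (RuntimeException e) {
      // Treat a synchronous failure like an asynchronous one.
      f = new CompletableFuture<>();
      f.completeExceptionally(e);
    }
    f.whenComplete((value, error) -> {
      if (error == null) {
        result.complete(value);
      } else if (n >= maxAttempts) {
        // Give up and surface the last error instead of retrying forever.
        result.completeExceptionally(error);
      } else {
        try {
          scheduler.schedule(
              () -> tryOnce(attempt, n + 1, maxAttempts, sleepMillis, scheduler, result),
              sleepMillis, TimeUnit.MILLISECONDS);
        } catch (RuntimeException e) {
          // For example RejectedExecutionException if the scheduler was shut down.
          result.completeExceptionally(e);
        }
      }
    });
  }

  public static void main(String[] args) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    AtomicInteger calls = new AtomicInteger();
    // Simulated request that always fails, standing in for an unreachable peer.
    Supplier<CompletableFuture<String>> alwaysUnavailable = () -> {
      calls.incrementAndGet();
      CompletableFuture<String> f = new CompletableFuture<>();
      f.completeExceptionally(new IOException("UNAVAILABLE (simulated)"));
      return f;
    };
    try {
      retry(alwaysUnavailable, 5, 10, scheduler).join();
    } catch (CompletionException e) {
      System.out.println("gave up after " + calls.get() + " attempts: "
          + e.getCause().getMessage());
    } finally {
      scheduler.shutdown();
    }
  }
}
```

With the simulated always-failing request, the future completes exceptionally after five attempts rather than retrying until the process runs out of native threads. Whether a bounded policy or a sleeping one is the right default for the load generator is a separate question for this issue.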
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)