Posted to issues@ratis.apache.org by "Clay B. (JIRA)" <ji...@apache.org> on 2019/07/21 22:36:00 UTC

[jira] [Comment Edited] (RATIS-485) Load Generator OOMs if Ratis Unavailable

    [ https://issues.apache.org/jira/browse/RATIS-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889833#comment-16889833 ] 

Clay B. edited comment on RATIS-485 at 7/21/19 10:35 PM:
---------------------------------------------------------

This can still be reproduced:
* Run the Vagrant test-harness start-up:
** {{cd ./incubator-ratis/dev-support/vagrant; ./run_all_tests.sh build}}
** Wait for build completion
** {{vagrant up ratis-servers}}
** {{vagrant ssh ratis-servers}}
* Attach to the screen session running the Ratis Daemons/Clients
** {{screen -x}}
* Kill all servers
** {{Ctrl-A 0}} -- switch to First Server
** {{Ctrl-C}} -- kill Server
** Repeat for Server 2 and 3 -- i.e. {{Ctrl-A 1}}; {{Ctrl-C}}; {{Ctrl-A 2}}; {{Ctrl-C}} (all servers should now be dead)
* OOM Load Generator
** Switch to Load Generator -- {{Ctrl-A 3}}
** Re-launch Load Generator -- {{Return}}{{r}}
** Wait for LOTS of log spew; eventually the Load Generator is unable to launch more threads (OOM)


was (Author: clayb):
This can still be reproduced:
* Run the Vagrant test-harness start-up:
** {{cd ./incubator-ratis/dev-support/vagrant; ./run_all_tests.sh build}}
** Wait for build completion
** {{vagrant up ratis-servers}}
** {{vagrant ssh ratis-servers}}
** {{screen -x}}
** {{Ctrl-A 0}} -- switch to First Server
** {{Ctrl-C}} -- kill Server
** Repeat for Server 2 and 3 -- i.e. {{Ctrl-A 1}}; {{Ctrl-C}}; {{Ctrl-A 2}}; {{Ctrl-C}} (all servers should now be dead)
** Switch to Load Generator -- {{Ctrl-A 3}}
** Re-launch Load Generator -- {{Return}}{{r}}
** Wait for LOTS of spew; eventually, unable to launch more threads...

> Load Generator OOMs if Ratis Unavailable
> ----------------------------------------
>
>                 Key: RATIS-485
>                 URL: https://issues.apache.org/jira/browse/RATIS-485
>             Project: Ratis
>          Issue Type: Bug
>          Components: examples
>            Reporter: Clay B.
>            Priority: Trivial
>
> Running the load generator without a Ratis cluster (e.g. spurious node IPs) results in an OOM.
> With a single Ratis server given in {{--peers}}, the client retries seemingly indefinitely:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1{code}
> With two Ratis servers given in {{--peers}}, the client OOMs:
> {code:java}
> vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1,n1:127.0.0.1:2
> [...]
> 1/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2019-02-14 07:47:22 DEBUG RaftClient:417 - client-272A2E13A5DD: suggested new leader: null. Failed RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2019-02-14 07:47:22 DEBUG RaftClient:437 - client-272A2E13A5DD: change Leader from n1 to n0
> 2019-02-14 07:47:22 DEBUG RaftClient:291 - schedule attempt #10740 with policy RetryForeverNoSleep for RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:323 - client-272A2E13A5DD: send* RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
> 2019-02-14 07:47:22 DEBUG RaftClient:338 - client-272A2E13A5DD: Failed RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
> Exception in thread "main" java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
>         at org.apache.ratis.client.impl.RaftClientImpl.lambda$sendRequestAsync$14(RaftClientImpl.java:349)
>         at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
>         at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
>         at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
>         at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:334)
>         at org.apache.ratis.client.impl.RaftClientImpl.sendRequestWithRetryAsync(RaftClientImpl.java:286)
>         at org.apache.ratis.util.SlidingWindow$Client.sendOrDelayRequest(SlidingWindow.java:243)
>         at org.apache.ratis.util.SlidingWindow$Client.retry(SlidingWindow.java:259)
>         at org.apache.ratis.client.impl.RaftClientImpl.lambda$null$10(RaftClientImpl.java:293)
>         at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$0(TimeoutScheduler.java:85)
>         at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$1(TimeoutScheduler.java:104)
>         at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:50)
>         at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:91)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:717)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
>         at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603)
>         at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
>         at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
>         at org.apache.ratis.util.TimeoutScheduler.schedule(TimeoutScheduler.java:117)
>         at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:104)
>         at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:82)
>         at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:134)
>         at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.onNext(GrpcClientProtocolClient.java:234)
>         at org.apache.ratis.grpc.client.GrpcClientRpc.sendRequestAsync(GrpcClientRpc.java:71)
>         at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:324)
>         ... 15 more
> {code}
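
The log above shows attempt #10740 being scheduled under the policy RetryForeverNoSleep before thread creation fails. As a rough illustration only, here is a minimal plain-Java sketch of the general alternative: cap the number of attempts, sleep between them, and run all retries on one shared scheduler. This is not Ratis code and not a proposed patch; the class name {{BoundedRetrySketch}}, the method {{retry}}, and its parameters are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch (not part of Ratis): bounded async retry with a fixed sleep. */
public class BoundedRetrySketch {

  /** Runs the call up to maxAttempts times, waiting sleepMillis between attempts. */
  public static <T> CompletableFuture<T> retry(Supplier<CompletableFuture<T>> call,
      int maxAttempts, long sleepMillis, ScheduledExecutorService scheduler) {
    CompletableFuture<T> result = new CompletableFuture<>();
    attempt(call, 1, maxAttempts, sleepMillis, scheduler, result);
    return result;
  }

  private static <T> void attempt(Supplier<CompletableFuture<T>> call, int n,
      int maxAttempts, long sleepMillis, ScheduledExecutorService scheduler,
      CompletableFuture<T> result) {
    CompletableFuture<T> f;
    try {
      f = call.get();
    } catch (RuntimeException e) {
      // Treat a synchronous failure like an asynchronous one.
      f = new CompletableFuture<>();
      f.completeExceptionally(e);
    }
    f.whenComplete((value, error) -> {
      if (error == null) {
        result.complete(value);
      } else if (n >= maxAttempts) {
        // Give up and surface the last failure instead of retrying forever.
        result.completeExceptionally(error);
      } else {
        // Reuse the caller's scheduler; the delay throttles the retry rate.
        scheduler.schedule(
            () -> attempt(call, n + 1, maxAttempts, sleepMillis, scheduler, result),
            sleepMillis, TimeUnit.MILLISECONDS);
      }
    });
  }

  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    AtomicInteger attempts = new AtomicInteger();
    // Simulates an unreachable server: every call fails immediately.
    Supplier<CompletableFuture<String>> unavailable = () -> {
      attempts.incrementAndGet();
      CompletableFuture<String> f = new CompletableFuture<>();
      f.completeExceptionally(new IOException("UNAVAILABLE: io exception"));
      return f;
    };
    try {
      retry(unavailable, 5, 10, scheduler).get();
    } catch (ExecutionException e) {
      System.out.println("gave up after " + attempts.get() + " attempts: "
          + e.getCause().getMessage());
    } finally {
      scheduler.shutdown();
    }
  }
}
```

The point of the sketch is only that a retry bound plus a delay on a single scheduler keeps both the attempt count and the thread count fixed when no server is reachable; whether that matches the right fix for the load generator is an open question for this issue.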


