Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2018/06/08 14:26:00 UTC

[jira] [Commented] (SOLR-12075) TestLargeCluster is too flaky

    [ https://issues.apache.org/jira/browse/SOLR-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506072#comment-16506072 ] 

Adrien Grand commented on SOLR-12075:
-------------------------------------

This test failed in 3 of the last 10 smoke-release builds:
    https://builds.apache.org/view/L/view/Lucene/job/Lucene-Solr-SmokeRelease-7.x/237/consoleFull
    https://builds.apache.org/view/L/view/Lucene/job/Lucene-Solr-SmokeRelease-7.x/235/consoleFull
    https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1043/consoleFull

I'm going to mark it as @BadApple again.



> TestLargeCluster is too flaky
> -----------------------------
>
>                 Key: SOLR-12075
>                 URL: https://issues.apache.org/jira/browse/SOLR-12075
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>            Priority: Major
>
> This test is failing a lot in Jenkins builds, with two types of failures:
>  * specific test method failures - these may be caused by bugs in the autoscaling code, bugs in the simulator, or timing issues. It should be possible to narrow down the cause by using different speeds of simulated time.
>  * suite-level failures due to leaked threads - most of these failures show Policy calculations still in progress, e.g.:
> {code}
> com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.autoscaling.sim.TestLargeCluster: 
>   1) Thread[id=21406, name=AutoscalingActionExecutor-7277-thread-1, state=RUNNABLE, group=TGRP-TestLargeCluster]
>        at java.util.ArrayList.iterator(ArrayList.java:834)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:131)
>        at org.apache.solr.common.util.Utils.makeDeepCopy(Utils.java:110)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:92)
>        at org.apache.solr.common.util.Utils.makeDeepCopy(Utils.java:108)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:92)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:74)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Row.copy(Row.java:91)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.lambda$getMatrixCopy$1(Policy.java:297)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session$$Lambda$466/1757323495.apply(Unknown Source)
>        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.getMatrixCopy(Policy.java:298)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.copy(Policy.java:287)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Row.removeReplica(Row.java:156)
>        at org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.tryEachNode(MoveReplicaSuggester.java:60)
>        at org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.init(MoveReplicaSuggester.java:34)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Suggester.getSuggestion(Suggester.java:129)
>        at org.apache.solr.cloud.autoscaling.ComputePlanAction.process(ComputePlanAction.java:98)
>        at org.apache.solr.cloud.autoscaling.ScheduledTriggers.lambda$null$3(ScheduledTriggers.java:307)
>        at org.apache.solr.cloud.autoscaling.ScheduledTriggers$$Lambda$439/951218654.run(Unknown Source)
>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$9/1677458082.run(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>        at java.lang.Thread.run(Thread.java:748)
> 	at __randomizedtesting.SeedInfo.seed([C6FA0364D13DAFCC]:0)
> {code}
> It's possible that an InterruptedException is caught somewhere and not propagated, so the Policy calculations don't terminate when the thread is interrupted as parent components are closed.
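
To illustrate the suspicion in the issue description, here is a minimal Java sketch (not taken from the Solr code base) of how a swallowed InterruptedException can keep a worker thread alive after shutdownNow(), which is the kind of situation a thread-leak check would flag. The names InterruptDemo, swallowing, propagating and terminatesOnShutdown are hypothetical, and Thread.sleep merely stands in for the real Policy calculation; whether the actual leak has this cause is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; nothing here is part of Solr.
final class InterruptDemo {

    // Stand-in for a long calculation that catches and ignores the interrupt.
    // The catch block clears the interrupt status, so the loop keeps running.
    static void swallowing(AtomicBoolean hardStop) {
        while (!hardStop.get()) {
            try {
                Thread.sleep(10); // placeholder for one unit of work
            } catch (InterruptedException e) {
                // Problem: the interrupt is dropped here and never seen again.
            }
        }
    }

    // Same loop, but the interrupt status is restored and checked,
    // so the loop exits soon after the thread is interrupted.
    static void propagating(AtomicBoolean hardStop) {
        while (!hardStop.get() && !Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(10); // placeholder for one unit of work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag
            }
        }
    }

    // Starts one worker, interrupts it via shutdownNow(), and reports whether
    // the executor terminated within 500 ms.
    static boolean terminatesOnShutdown(boolean propagate) {
        AtomicBoolean hardStop = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            started.countDown();
            if (propagate) {
                propagating(hardStop);
            } else {
                swallowing(hardStop);
            }
        });
        try {
            started.await();        // make sure the task is actually running
            executor.shutdownNow(); // interrupts the worker thread
            return executor.awaitTermination(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            hardStop.set(true);     // safety valve so the demo itself leaks nothing
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("propagating terminates: " + terminatesOnShutdown(true));
        System.out.println("swallowing terminates:  " + terminatesOnShutdown(false));
    }
}
```

In this sketch the propagating variant stops shortly after shutdownNow(), while the swallowing variant is still running when the 500 ms wait expires and only stops because of the demo's own hardStop flag. Restoring the interrupt status (or rethrowing) in catch blocks is the usual way to let a long-running task notice that its executor is being closed.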


