Posted to issues@lucene.apache.org by "Michael DeBruyn (Jira)" <ji...@apache.org> on 2020/07/06 18:04:00 UTC
[jira] [Commented] (SOLR-11208) Usage SynchronousQueue in Executors prevent large scale operations

    [ https://issues.apache.org/jira/browse/SOLR-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152227#comment-17152227 ] 

Michael DeBruyn commented on SOLR-11208:
----------------------------------------

This makes autoscaling policies virtually useless.  I'm currently running 7.7.3 (testing with 8.5.2) with 3x TLOG and 6x PULL nodes that serve 19 collections, where the PULL nodes are somewhat transient in K8S.  When a node is replaced, the node_lost_trigger and node_added_trigger we have in place fail more often than not, because the thread pool is tiny and cannot queue the requests.
{noformat}
          "response": [
            "Operation deletenode caused exception:",
            "java.util.concurrent.RejectedExecutionException:java.util.concurrent.RejectedExecutionException: Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$190/0x00007f1a4c8a3db8@3d9a7483 rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@40bcc8d4[Running, pool size = 10, active threads = 10, queued tasks = 0, completed tasks = 259]",
            "exception",
            {
              "msg": "Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$190/0x00007f1a4c8a3db8@3d9a7483 rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@40bcc8d4[Running, pool size = 10, active threads = 10, queued tasks = 0, completed tasks = 259]",
              "rspCode": -1
            }
          ]
{noformat}
 

> Usage SynchronousQueue in Executors prevent large scale operations
> ------------------------------------------------------------------
>
>                 Key: SOLR-11208
>                 URL: https://issues.apache.org/jira/browse/SOLR-11208
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 6.6
>            Reporter: Björn Häuser
>            Priority: Major
>         Attachments: response.json
>
>
> I am not sure where to start with this one.
> I tried to post this already on the mailing list: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201708.mbox/%3c48C49426-33A2-4D79-AE26-A4515B8F834E@gmail.com%3e
> In short: using a SynchronousQueue as the workQueue prevents submitting more concurrent tasks than the maximum number of threads.
> For example, taken from OverseerCollectionMessageHandler:
> {code:java}
>   ExecutorService tpe = new ExecutorUtil.MDCAwareThreadPoolExecutor(5, 10, 0L, TimeUnit.MILLISECONDS,
>       new SynchronousQueue<>(),
>       new DefaultSolrThreadFactory("OverseerCollectionMessageHandlerThreadFactory"));
> {code}
> This Executor is used when doing a REPLACENODE (= ADDREPLICA) command. When the node has more than 10 collections, this will fail with the mentioned java.util.concurrent.RejectedExecutionException.
> I am also not sure how to fix this. Just replacing the queue with a different implementation feels wrong to me or could cause unwanted side effects.
> Thanks
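
The rejection described above can be reproduced with a plain java.util.concurrent.ThreadPoolExecutor, independent of Solr. The following is only a minimal sketch: the class name SynchronousQueueRejectionDemo and the method countRejections are made up for illustration and are not part of Solr, and the tasks simply block on a latch to stand in for long-running per-collection operations. With 5 core threads, 10 max threads and a SynchronousQueue, the 11th concurrent task has no idle thread to hand off to and is rejected. The second call uses an unbounded LinkedBlockingQueue for comparison: nothing is rejected, but the pool never grows past its core size, which may be one of the side effects the reporter is worried about.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not Solr code.
class SynchronousQueueRejectionDemo {

    /**
     * Submits `tasks` tasks that all block until released, and returns how
     * many of them the pool rejected (default AbortPolicy).
     */
    static int countRejections(int corePoolSize, int maxPoolSize, int tasks,
                               BlockingQueue<Runnable> workQueue) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 0L, TimeUnit.MILLISECONDS, workQueue);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task occupies its thread until the latch opens.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();     // already queued tasks still run to completion
        }
        return rejected;
    }

    public static void main(String[] args) {
        // Same sizing as the snippet quoted from OverseerCollectionMessageHandler.
        System.out.println("SynchronousQueue rejected: "
                + countRejections(5, 10, 15, new SynchronousQueue<>()));
        System.out.println("LinkedBlockingQueue rejected: "
                + countRejections(5, 10, 15, new LinkedBlockingQueue<>()));
    }
}
```

Whether a bounded queue, a larger pool, or throttling on the caller side is the right fix for the Overseer is a separate question that this sketch does not answer.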


