Posted to issues@lucene.apache.org by "Ilan Ginzburg (Jira)" <ji...@apache.org> on 2020/06/08 10:48:02 UTC

[jira] [Commented] (SOLR-14546) OverseerTaskProcessor can process messages out of order

    [ https://issues.apache.org/jira/browse/SOLR-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128157#comment-17128157 ] 

Ilan Ginzburg commented on SOLR-14546:
--------------------------------------

I think I now understand something that appeared pretty mysterious when I wrote the Overseer doc: why {{LockTree.Session}} marks a node as busy when the lock can't be obtained (see [here|https://docs.google.com/document/d/1KTHq3noZBVUQ7QNuBGEhujZ_duwTVpAsvN3Nz5anQUY/edit#bookmark=id.ef1yf1sb14qu]). I now suspect the intent of this was to solve this very issue.

It's certainly the right direction for a fix.

Currently my understanding is that the whole {{Session}} mechanism doesn't work (see [here|https://docs.google.com/document/d/1KTHq3noZBVUQ7QNuBGEhujZ_duwTVpAsvN3Nz5anQUY/edit#bookmark=id.oh0pwq90n27s]) because {{OverseerCollectionMessageHandler.sessionId}} is always {{-1}} and never updated, so it is never equal to the task batch id. This is easy to fix (set it).
This alone is likely not sufficient: we would need to rely on the old {{Session}} (from the "previous" batch id) when processing the {{blockedTask}} that got moved to the head of the {{heads}} processing list (or, said differently, increment the batch id at a different place).
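
A minimal, self-contained sketch of just the first step (the "set it" part; the {{blockedTask}} subtlety above is not addressed here), using simplified stand-in types rather than the actual Solr classes:

{code:java}
// Hedged sketch, NOT actual Solr code: cache the lock Session per task batch.
// Today the equivalent of 'sessionId' stays at -1, so the check below never
// recognizes the current batch and the Session is not reused as intended.
class SessionCache {
  private Object session;      // stands in for LockTree.Session
  private long sessionId = -1; // id of the task batch the cached session belongs to

  Object sessionFor(long taskBatchId) {
    if (session == null || sessionId != taskBatchId) {
      session = new Object();   // stands in for lockTree.getSession()
      sessionId = taskBatchId;  // the update that is currently missing
    }
    return session;
  }
}
{code}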

Will work on a proposal along these lines.

> OverseerTaskProcessor can process messages out of order
> -------------------------------------------------------
>
>                 Key: SOLR-14546
>                 URL: https://issues.apache.org/jira/browse/SOLR-14546
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: master (9.0)
>            Reporter: Ilan Ginzburg
>            Priority: Major
>              Labels: collection-api, correctness, overseer
>
> *Summary:* 
> Collection API and ConfigSet API messages can be processed out of order by the {{OverseerTaskProcessor}}/{{OverseerCollectionConfigSetProcessor}} because of the way locking is handled.
>  
> *Some context:*
> The Overseer task processor dequeues messages from the Collection and ConfigSet ZooKeeper queue at {{/overseer/collection-queue-work}} and hands processing to an executor pool allowing at most 100 parallel tasks. ({{ClusterStateUpdater}}, on the other hand, uses a single thread to consume and process {{/overseer/queue}}, and does not suffer from the problem described here.)
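> A rough, hedged sketch of the two consumption models (plain {{java.util.concurrent}} stand-ins, not the actual Solr classes):
> {code:java}
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
>
> // Hedged sketch, not actual Solr code: why one consumer preserves order
> // while a pool of workers does not.
> class ConsumptionModels {
>   // ClusterStateUpdater style: one thread, strictly in submission order.
>   static void singleThreaded(BlockingQueue<Runnable> queue) throws InterruptedException {
>     while (true) {
>       queue.take().run(); // next message starts only after the previous one finished
>     }
>   }
>
>   // OverseerTaskProcessor style: up to 100 tasks in flight at once.
>   static void parallel(BlockingQueue<Runnable> queue) throws InterruptedException {
>     ExecutorService pool = Executors.newFixedThreadPool(100);
>     while (true) {
>       pool.submit(queue.take()); // completion order need not match submission order
>     }
>   }
> }
> {code}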
> Locking prevents tasks modifying the same or related data from running concurrently. For the Collection API, locking can be done at {{CLUSTER}}, {{COLLECTION}}, {{SHARD}} or {{REPLICA}} level and each command has its locking level defined in {{CollectionParams.CollectionAction}}.
> Multiple tasks with the same or competing locking levels do not execute concurrently. Commands locking at {{COLLECTION}} level for the same collection (for example {{CREATE}} creating a collection, {{CREATEALIAS}} creating an alias for it, and {{CREATESHARD}} creating a new shard for it) are executed sequentially, and they are expected to execute in submission order. Commands locking at different levels are also serialized when needed (for example, changes to a shard of a collection and changes to the collection itself).
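> As a hedged sketch of how each action could declare its locking level (values shown are illustrative; the authoritative ones are in {{CollectionParams.CollectionAction}}):
> {code:java}
> // Hedged sketch, not the actual Solr enum: each action carries the level it locks at.
> enum LockLevel { CLUSTER, COLLECTION, SHARD, REPLICA }
>
> enum Action {
>   CREATE(LockLevel.COLLECTION),      // creates a collection
>   CREATEALIAS(LockLevel.COLLECTION), // creates an alias for a collection
>   CREATESHARD(LockLevel.COLLECTION), // creates a new shard in a collection
>   ADDREPLICA(LockLevel.SHARD);       // illustrative: locks below collection level
>
>   final LockLevel lockLevel;
>   Action(LockLevel lockLevel) { this.lockLevel = lockLevel; }
> }
> {code}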
>  
> *The issue:*
> The way {{OverseerTaskProcessor.run()}} deals with messages that cannot be executed because competing messages are already running (i.e. the lock is unavailable) is not correct. The method basically iterates over the set of messages to execute, skips those that can’t be executed due to locking, and executes those that can be (assuming enough executors remain available).
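> In simplified form, the problematic pattern looks roughly like this (illustrative stand-ins, not the actual {{OverseerTaskProcessor.run()}} code):
> {code:java}
> import java.util.List;
> import java.util.concurrent.ExecutorService;
>
> // Hedged sketch, not actual Solr code: a locked message is skipped,
> // so a message enqueued after it can start first.
> class SkippingDispatcher {
>   interface Task { boolean tryLock(); void run(); }
>
>   static void dispatchOnce(List<Task> heads, ExecutorService pool) {
>     for (Task task : heads) {
>       if (!task.tryLock()) {
>         continue; // lock unavailable: skip this message and try the NEXT one
>       }
>       pool.submit(task::run);
>     }
>   }
> }
> {code}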
> Out-of-order execution occurs when at least 3 messages compete for a lock. The following scenario illustrates it (messages numbered in enqueue order):
>  * Message 1 gets the lock and runs,
>  * Message 2 can’t be executed because the lock is taken,
>  * Message 1 finishes execution and releases the lock,
>  * Message 3 can be executed, gets the lock and runs,
>  * Message 2 eventually runs when Message 3 finishes and releases the lock.
> We therefore get execution order 1 - 3 - 2 when the expected execution order is 1 - 2 - 3. Order 1 - 3 - 2 might not make sense, or might lead to a different final state than 1 - 2 - 3.
> (There’s a variant of this scenario in which, after Message 2 can't be executed, the maximum number of parallel tasks is reached; the remaining tasks in {{heads}}, including Message 3, are then put into the {{blockedTasks}} map. These tasks are the first ones considered for processing on the next iteration of the main {{while}} loop. The reordering is similar.)
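> A deterministic toy model of the race above (plain Java, nothing Solr-specific; the single "lock" is released between examining Message 2 and Message 3, which is exactly the window the scenario describes):
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> // Hedged toy model, not Solr code: reproduces the 1 - 3 - 2 execution order.
> public class OutOfOrderDemo {
>   public static void main(String[] args) {
>     List<Integer> executionOrder = new ArrayList<>();
>     boolean lockHeld = true;      // Message 1 holds the lock and runs
>     executionOrder.add(1);
>
>     for (int message : new int[] {2, 3}) {
>       if (message == 3) lockHeld = false; // Message 1 finished in between
>       if (lockHeld) continue;             // Message 2 skipped: lock unavailable
>       executionOrder.add(message);        // Message 3 gets the lock and runs
>     }
>     executionOrder.add(2);                // Message 2 only runs on a later pass
>
>     System.out.println(executionOrder);   // prints [1, 3, 2] instead of [1, 2, 3]
>   }
> }
> {code}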
>  
> Note that from the client perspective, there do not need to be 3 outstanding tasks; 2 are sufficient. The third task required for observing reordering could be enqueued after the first one completed.
>  
> *Impact of the issue:*
> This problem makes {{MultiThreadedOCPTest.testFillWorkQueue()}} fail because the test requires task execution in submission order (SOLR-14524).
>  
> The messages required for reordering to happen are not necessarily messages at the same locking level: a message locking at {{SHARD}} or {{REPLICA}} level prevents execution of a {{COLLECTION}} level message for the same collection. Sequences of messages can be built that lead to incorrect results or failures. For example, {{ADDREPLICA}} followed by {{CREATEALIAS}} followed by {{DELETEALIAS}} could see the alias creation and deletion reordered, which makes no sense.
> I wasn’t able to come up with a realistic production example that would be impacted by this issue (doesn't mean one doesn't exist!).


