Posted to issues@lucene.apache.org by "Mark Robert Miller (Jira)" <ji...@apache.org> on 2020/11/10 14:28:00 UTC

[jira] [Comment Edited] (SOLR-14927) Remove Overseer

    [ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229252#comment-17229252 ] 

Mark Robert Miller edited comment on SOLR-14927 at 11/10/20, 2:27 PM:
----------------------------------------------------------------------

It’s the bad implementation that limits the Overseer (due to tech debt and a variety of other reasons), not the design. You are right that ZooKeeper already owns the state, and that is why our Overseer is so silly. The solution is not to embrace ZK more; that is actually the non-scalable solution. The Overseer actually has the advantage for state updates, and the CAS approach with ZK as the state owner is the non-scalable one. The disadvantage is actually what’s stated as the advantage: with ZooKeeper owning the state, state updates are not scalable.

The proposed approach will not compete well with an Overseer approach in multiple areas, including cluster scalability.


was (Author: markrmiller):
It’s the bad implementation (due to tech debt and a variety of other reasons), not the design. You are right that ZooKeeper already owns the state, and that is why our Overseer is so silly. The solution is not to embrace ZK more; that is actually the non-scalable solution. The Overseer actually has the advantage for state updates, and the CAS approach with ZK as the state owner is the non-scalable one. The disadvantage is actually what’s stated as the advantage: with ZooKeeper owning the state, state updates are not scalable.

The proposed approach will not compete well with an Overseer approach in multiple areas, including cluster scalability.

> Remove Overseer
> ---------------
>
>                 Key: SOLR-14927
>                 URL: https://issues.apache.org/jira/browse/SOLR-14927
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Ilan Ginzburg
>            Assignee: Ilan Ginzburg
>            Priority: Major
>              Labels: cluster, collection-api, overseer, solrcloud, zookeeper
>
> This Jira is intended to capture sub-Jiras on the path to removing the Overseer component from SolrCloud and moving to all nodes being able to do the work currently done by Overseer.
> See detailed description in [this doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/].
> Copying (edited) from the above doc:
> The motivations for removing Overseer include:
>  * Mono-threaded state change is slow and doesn’t scale,
>  * Communication between cluster nodes and the Overseer uses ZooKeeper as a queueing mechanism, which is not a good idea,
>  * Nodes talking to Overseer (then Overseer talking to itself) via ZooKeeper is inefficient and adds latency,
>  * Collection API scalability is poor, because not only does a single node process commands for all Collections, but it also depends on the mono-threaded state change queue consumption,
>  * The code supporting Overseer in SolrCloud is complex (election, queue management, recovery, etc.).
> The general idea is that there’s already a central point in the SolrCloud cluster, and it’s ZooKeeper. It might not be necessary to have a second central point (Overseer) because nodes can interact directly with ZooKeeper and synchronize more efficiently by optimistic locking using “conditional updates” (a.k.a. compare-and-swap, or CAS).
>  
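
For readers unfamiliar with the "conditional updates" mentioned in the description, here is a minimal sketch of the optimistic-locking pattern. It is not Solr code and does not talk to ZooKeeper: `VersionedStore`, `setIfVersion`, `update`, and the `/state` path are hypothetical names for an in-memory stand-in. The real ZooKeeper Java client offers a comparable conditional write through `setData(path, data, version)`, which fails with `KeeperException.BadVersionException` when the expected version is stale. The sketch assumes the path already exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class CasSketch {

    /** A value together with the version at which it was read. */
    static final class Versioned {
        final String data;
        final int version;

        Versioned(String data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    /** Hypothetical in-memory stand-in for a store with versioned conditional writes. */
    static final class VersionedStore {
        private final Map<String, Versioned> nodes = new HashMap<>();

        synchronized void create(String path, String data) {
            nodes.put(path, new Versioned(data, 0));
        }

        synchronized Versioned read(String path) {
            return nodes.get(path);
        }

        /** Writes only if the stored version still equals expectedVersion. */
        synchronized boolean setIfVersion(String path, String data, int expectedVersion) {
            Versioned current = nodes.get(path);
            if (current == null || current.version != expectedVersion) {
                return false; // another writer got there first, or the path is missing
            }
            nodes.put(path, new Versioned(data, current.version + 1));
            return true;
        }
    }

    /** Read-modify-write loop; retries until the conditional write succeeds. Returns the attempt count. */
    static int update(VersionedStore store, String path, UnaryOperator<String> change) {
        int attempts = 0;
        while (true) {
            attempts++;
            Versioned seen = store.read(path); // assumes the path exists
            if (store.setIfVersion(path, change.apply(seen.data), seen.version)) {
                return attempts;
            }
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.create("/state", "replica1=down");

        // Node A reads the state, then node B updates it before A writes back.
        Versioned stale = store.read("/state");
        update(store, "/state", s -> s + ",replica2=active");

        // Node A's write is based on an old version, so it is rejected.
        boolean accepted = store.setIfVersion("/state", "replica1=active", stale.version);
        System.out.println("stale write accepted: " + accepted);

        // Node A must re-read and retry; the loop does exactly that.
        int attempts = update(store, "/state", s -> s.replace("replica1=down", "replica1=active"));
        System.out.println("attempts: " + attempts + ", state: " + store.read("/state").data);
    }
}
```

The rejected stale write is the crux of the disagreement in this thread: every writer that loses the race has to re-read and retry, so how this behaves when many nodes update the same state concurrently is what the comment above disputes.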


