Posted to dev@lucene.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2013/11/02 08:53:20 UTC

[jira] [Comment Edited] (SOLR-5381) Split Clusterstate and scale

    [ https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809096#comment-13809096 ] 

Noble Paul edited comment on SOLR-5381 at 11/2/13 7:51 AM:
-----------------------------------------------------------

OK, here is the plan to split the clusterstate on a per-collection basis.

h2. How to use this feature?
Introduce a new option when creating a collection (external=true). This keeps the state of the collection in a separate ZK node.
Example:

http://localhost:8983/solr/admin/collections?action=CREATE&name=xcoll&numShards=5&replicationFactor=2&external=true

This results in the following entry in clusterstate.json:
{code:JavaScript}
{
 "xcoll" : {"ex":true}
}
{code}
There will be another ZK entry that carries the actual collection information:
*  /collections
** /xcoll
*** /state.json
{code:JavaScript}
{"xcoll":{
    "shards":{"shard1":{
        "range":"80000000-b332ffff",
        "state":"active",
        "replicas":{
            "core_node1":{
                "state":"active",
                "base_url":"http://192.168.1.5:8983/solr",
                "core":"xcoll_shard1_replica1",
                "node_name":"192.168.1.5:8983_solr",
                "leader":"true"}}}},
    "router":{"name":"compositeId"}}}
{code}

The main Overseer thread is responsible for creating collections and managing all the events for all the collections in clusterstate.json. clusterstate.json is modified only when a collection is created/deleted or when state updates happen to "non-external" collections.
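
The routing rule above could be sketched roughly as follows (class and method names are hypothetical, not the actual Solr implementation):

```java
import java.util.Map;

// Sketch: deciding which ZK node a state update touches. Mirrors the rule
// above: clusterstate.json changes only for create/delete or for state
// updates to "non-external" collections; external collections get their
// own /collections/<name>/state.json.
public class OverseerRouting {

    // clusterstate.json view: collection name -> properties ("ex":true marks external)
    static boolean isExternal(Map<String, Map<String, Object>> clusterState, String coll) {
        Map<String, Object> props = clusterState.get(coll);
        return props != null && Boolean.TRUE.equals(props.get("ex"));
    }

    // Returns the ZK path whose contents a state update for coll would modify.
    static String stateNodeFor(Map<String, Map<String, Object>> clusterState, String coll) {
        return isExternal(clusterState, coll)
                ? "/collections/" + coll + "/state.json"
                : "/clusterstate.json";
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> cs = Map.of(
                "xcoll", Map.of("ex", true),
                "ycoll", Map.of());
        System.out.println(stateNodeFor(cs, "xcoll")); // /collections/xcoll/state.json
        System.out.println(stateNodeFor(cs, "ycoll")); // /clusterstate.json
    }
}
```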

Each external collection will have its own Overseer queue, as follows. There will be a separate thread for each external collection.

* /collections
** /xcoll
*** /overseer
**** /collection-queue-work
**** /queue
**** /queue-work
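
A trivial sketch of deriving those per-collection queue paths from the collection name (hypothetical helper, not actual Solr code):

```java
// Sketch: the per-collection Overseer queue paths listed above, built
// from the collection name.
public class ExternalCollectionPaths {
    static String[] overseerQueuePaths(String coll) {
        String base = "/collections/" + coll + "/overseer";
        return new String[] {
            base + "/collection-queue-work",
            base + "/queue",
            base + "/queue-work",
        };
    }
}
```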


h2. SolrJ enhancements
SolrJ would only listen to clusterstate.json. When a request comes in for a collection 'xcoll':
* It would first check whether such a collection exists
* If yes, it first looks up the details in the local cache for that collection
* If not found in the cache, it fetches the node /collections/xcoll/state.json and caches the information
* Any query/update will be sent with an extra query param specifying the collection name, shard name, Role (Leader/Replica), and range (example: \_target_=xcoll:shard1:L:80000000-b332ffff). A node would throw an error (INVALID_NODE) if it does not serve the collection/shard/Role/range combo.
* If SolrJ gets an INVALID_NODE error, it would invalidate the cache and fetch fresh state information for that collection (and cache it again)
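
The client-side flow above could be sketched like this (hypothetical names; the real SolrJ classes are not shown, and the fetch function stands in for a ZK read of /collections/&lt;coll&gt;/state.json):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: per-collection state is cached locally; an INVALID_NODE answer
// invalidates the entry and forces a fresh fetch, as described above.
public class CollectionStateCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> fetchFromZk; // reads /collections/<coll>/state.json

    CollectionStateCache(Function<String, String> fetchFromZk) {
        this.fetchFromZk = fetchFromZk;
    }

    // Cache hit returns the stored state; miss fetches from ZK and caches it.
    String stateFor(String coll) {
        return cache.computeIfAbsent(coll, fetchFromZk);
    }

    // Called on an INVALID_NODE error: drop the stale entry and refetch.
    String onInvalidNode(String coll) {
        cache.remove(coll);
        return stateFor(coll);
    }

    // The extra query param value, e.g. _target_=xcoll:shard1:L:80000000-b332ffff
    static String targetParam(String coll, String shard, String role, String range) {
        return coll + ":" + shard + ":" + role + ":" + range;
    }
}
```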

h2. Changes to each Solr Node
Each node would only listen to clusterstate.json and the states of the collections of which it is a member. All collections present in clusterstate.json are deemed collections it serves. If a request comes in for a collection it does not serve, it first checks for the \_target_ param:
* If the param is present and the node does not serve that collection/shard/Role/range combo, an INVALID_NODE error is thrown
** If the validation succeeds, the request is served
* If the param is not present:
** If the node is a member of the collection, the request is served
** If the node is not a member of the collection, it uses SolrJ to proxy the request to the appropriate location
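
The node-side validation could look roughly like this (hypothetical names; parsing follows the \_target_ format shown above):

```java
// Sketch: parse a _target_ value such as "xcoll:shard1:L:80000000-b332ffff"
// and validate it against what this node actually serves. A mismatch is
// what would be answered with an INVALID_NODE error.
public class TargetParam {
    final String collection, shard, role, range;

    TargetParam(String raw) {
        // coll:shard:role:range -- limit 4 so the range itself may contain "-"
        String[] parts = raw.split(":", 4);
        collection = parts[0];
        shard = parts[1];
        role = parts[2];
        range = parts[3];
    }

    // True only if this node serves exactly this collection/shard/Role/range combo.
    boolean matches(String coll, String shard, String role, String range) {
        return this.collection.equals(coll) && this.shard.equals(shard)
                && this.role.equals(role) && this.range.equals(range);
    }
}
```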

Internally, the node really does not care about the state of external collections. If/when it is required, the information is fetched in real time from ZK, used, and thrown away.

h2. Changes to admin GUI
External collections are not shown graphically in the admin UI.




> Split Clusterstate and scale 
> -----------------------------
>
>                 Key: SOLR-5381
>                 URL: https://issues.apache.org/jira/browse/SOLR-5381
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> clusterstate.json is a single point of contention for all components in SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes because there are too many updates and too many nodes need to be notified of the changes. As the number of nodes goes up, the size of clusterstate.json keeps growing, and it will soon exceed the limit imposed by ZK.
> The first step is to store the shards' information in separate nodes so that each node can just listen to the shard node it belongs to. We may also need to split each collection into its own node, with clusterstate.json just holding the names of the collections.
> This is an umbrella issue



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org