Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2011/08/18 09:02:59 UTC

[Solr Wiki] Trivial Update of "NewSolrCloudDesign" by NoblePaul

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "NewSolrCloudDesign" page has been changed by NoblePaul:
http://wiki.apache.org/solr/NewSolrCloudDesign?action=diff&rev1=1&rev2=2

  = New SolrCloud Design =
+ (Work in progress)
+ == What is SolrCloud? ==
  
+ SolrCloud is an enhancement to existing Solr that manages and operates Solr as a search service in a cloud.
+ 
+ == Glossary ==
+ 
+  * Cluster : A set of Solr nodes managed as a single unit. The entire cluster must share a single schema and solrconfig.
+  * Node : A JVM instance running Solr
+  * Partition : A partition is a subset of the entire document collection. A partition is created such that all of its documents can be contained in a single index.
+  * Shard : A partition needs to be stored on multiple nodes, as specified by the replication factor. All these nodes collectively form a shard. A node may be a part of multiple shards (see the sketch after this glossary).
+  * Leader : Each shard has one node identified as its leader. All writes for documents belonging to a partition should be routed through the leader.
+  * Replication Factor : Minimum number of copies of a document maintained by the cluster
+  * Transaction Log : An append-only log of write operations maintained by each node
+  * Partition version : A counter maintained by the leader of each shard, incremented on each write operation and sent to the peers
+  * Cluster Lock : A global lock that must be acquired in order to change the range -> partition or the partition -> node mappings.
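+ 
+ A minimal sketch (in Java) of how these concepts might fit together. The class and field names are illustrative assumptions, not part of the actual SolrCloud data model:
+ 
+ {{{
+ import java.util.List;
+ import java.util.Map;
+ 
+ // Hypothetical data model, for illustration only.
+ class ClusterState {
+     Map<HashRange, String> rangeToPartition;  // the range -> partition mapping
+     Map<String, Shard> partitionToShard;      // the partition -> node mapping
+     int replicationFactor;                    // minimum number of copies of each document
+ }
+ 
+ class Shard {
+     String leaderNodeId;    // writes for this partition are routed through the leader
+     List<String> nodeIds;   // nodes that each hold a copy of the partition's index
+     long partitionVersion;  // incremented by the leader on every write
+ }
+ 
+ class HashRange {           // equals/hashCode omitted for brevity
+     int start;              // inclusive lower bound of document hash values
+     int end;                // inclusive upper bound
+ }
+ }}}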
+ 
+ == Guiding Principles ==
+ 
+  * Any operation can be invoked on any node in the cluster.
+  * No non-recoverable single points of failure
+  * Cluster should be elastic
+  * Writes must never be lost, i.e. durability is guaranteed
+  * Order of writes should be preserved
+  * If two clients send document "A" to two different replicas at the same time, one should consistently "win" on all replicas (see the sketch after this list).
+  * Cluster configuration should be managed centrally and should be updatable through any node in the cluster. No per-node configuration other than local values, such as the port and index/log storage locations, should be required
+  * Automatic failover of reads
+  * Automatic failover of writes
+  * Automatically honour the replication factor in the event of a node failure
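+ 
+ A rough sketch (in Java) of how a shard leader might version writes so that concurrent updates to the same document resolve consistently on every replica. The class, method, and field names are illustrative assumptions, not the actual implementation:
+ 
+ {{{
+ import java.util.concurrent.atomic.AtomicLong;
+ 
+ // Hypothetical leader-side write ordering, for illustration only.
+ class ShardLeader {
+     private final AtomicLong partitionVersion = new AtomicLong(0);
+ 
+     // Assigns a monotonically increasing version to each write so that all
+     // replicas apply writes in the same order.
+     long acceptWrite(String docId, byte[] document) {
+         long version = partitionVersion.incrementAndGet();
+         // 1. Append (version, docId, document) to the local transaction log.
+         // 2. Forward the versioned write to the other nodes in the shard.
+         // If two clients update the same document concurrently, the write the
+         // leader versions last wins on every replica, so all copies converge.
+         return version;
+     }
+ }
+ }}}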
+